Coder Social home page Coder Social logo

cosmos-proto's People

Contributors

aaronc avatar amaury1093 avatar cyberbono3 avatar dependabot[bot] avatar faddat avatar fdymylja avatar julienrbrt avatar kocubinski avatar robert-zaremba avatar tac0turtle avatar technicallyty avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

cosmos-proto's Issues

Consider "slow" protobuf as a build tag

Whilst iterating over the official goproto repo, many developers asked for reasons for which code generation fast paths (specifically related to marshalling/unmarshalling) implementation were never done. One of the answers given by the owners was that the big code size was a problem for some environments.

Might this something we need to consider in the scope of the cosmos ecosystem?

A solution could be generating a build tag for slow paths, ex:

types.slow.pb.go

// +build slow

... reflection code...

types.pb.go

// +build !slow
.. fast path code ...

bring in protoc-gen-go

we want to fork in protoc-gen-go to make it easier to introduce new features and have only 1 binary

Reconcile protoreflect.Message.ProtoMethods codegen

Currently, fastreflection implements protomethods as nil, which means that in order to marshal a protobuf message, it's going to get the fields one by one using protoreflect.Message.Range. We should reconcile this with the fastmarshal implementation.

how to handle the proto api v1 and v2

Regarding the issue documented at cosmos/cosmos-sdk#19213, I am seeking clarity on the potential deprecation of the Proto API v1. In API v1, the utilization of custom type functions is facilitated by gogoproto, which appears not to be supported in API v2. Could you please provide guidance on how to address this limitation within API v2?

Implement customtype's (Cosmos Scalars)

  • cosmos_proto.scalar proto option that applies string and bytes types
  • Mapping YAML config file between scalars (language agnostic) and golang types
    Ex:
scalars:
  cosmos.Int: github.com/cosmos/cosmos-sdk/types.Int
  • define CustomType golang interface that all golang scalar type implementations need to implement (ex. Int), this is what we're using for gogo proto and can maybe reuse:
type CustomType interface {
	Marshal() ([]byte, error)
	MarshalTo(data []byte) (n int, err error)
	Unmarshal(data []byte) error
	Size() int

	MarshalJSON() ([]byte, error)
	UnmarshalJSON(data []byte) error
}

An alternative that just deals with string and bytes types could be:

type CustomStringType interface {
  MarshalString() (string, error)
  UnmarshalString(string) error
}

type CustomBytesTypes interface {
  MarshalBytes() ([]byte, error)
  UnmarshalBytes([]byte) error
}

Depending on whether cosmos_proto.scalar is used on a string or bytes type, the corresponding interface would be used.

  • generate protoreflect.Message handling
  • generate fast marshaling handling

Fast-path protoreflect.Message methods

Needed for custom types and Anys, will also speed up other codes.

Types to implement:

ProtoReflect.Messsage:

  • Descriptor
  • Type
  • New
  • Interface
  • Range
  • Has
  • Clear
  • Get:
    • Scalar types
    • Oneof types
    • Message types
    • List types
    • Map types
    • Extension Types
  • Set
  • Mutable
  • NewField
  • WhichOneof
  • GetUnknown
  • SetUnknown
  • IsValid

Protoreflect.List:

  • Len
  • Get
  • Set
  • Append
  • AppendMutable
  • Truncate
  • NewElement
  • IsValid

Protoreflect.Map:

  • Len
  • Range
  • Has
  • Clear
  • Get
  • Set
  • Mutable
  • NewValue
  • IsValid

Ex for https://pkg.go.dev/google.golang.org/[email protected]/reflect/protoreflect#Message Has method:

func (foo *Foo) Has(f FieldDescriptor) bool {
  switch f.Number() {
    case 1:
      return foo.Bar != nil
    default:
      return false
  }
}

If we wanted to reuse the default protoreflect implementation, maybe this will work but it won't speed up other code:

func (foo *Foo) Has(f FieldDescriptor) bool {
  switch f.Number() {
    case 1:
      return foo.Bar != nil
    default:
      return f.ProtoReflect().Has(f)
  }
}

Implementing all of the protoreflect.Message methods to be as fast as possible is the goal. Some benchmarks are here: cosmos/cosmos-sdk#9156 (reply in thread).

fix field/method conflict

Given the following proto message type:

message Foo {
  string type = 1;
}

We will produce a Foo.Type field and a Foo.Type() protoreflect.MessageType method, this causes code conflicts.

If a message uses a reserved name, for the sake of simplicity, we should suffix the field with an underscore _.

I would also output a warning message that this is bad, and we might break it in the future (reason for breaking is that we are incompatible with protoc field codegen which I do not think is good).

Should message structs implement protoreflect.Message directly?

Currently we're generating custom protoreflect.Message methods on the message struct itself. But this may unnecessarily pollute the methods that all structs implement. What if a user wanted to implement a method named Has or Range for instance? They wouldn't be able to do it because it would conflict with the generated code.

An alternative would be a wrapper struct that implements protoreflect.Message, i.e.:

type barReflect struct {
  *Bar
}

func (x *Bar) ProtoReflect() protoreflect.Message {
	return barReflect{x}
}

Disable proto2

Based on the discussion, since protov3 does not support extensions, which makes protoreflect fast paths implementation a lot more easier, we shoudl either disable proto2 support or fallback to the original codegen (and maybe output a warning?)

Any <> interface Conversion

This will reuse the existing cosmos_proto.implement_interface and cosmos_proto.accepts_interface: https://github.com/regen-network/cosmos-proto/blob/master/cosmos.proto

We want a mapping from proto interface names to golang interface names, ex:

interfaces:
  cosmos.gov.Content: github.com/cosmos/cosmos-sdk/x/gov/types.Content

Then the generated code should use the interface and not Any, ex:

type MsgSubmitProposal struct {
	Content        Content
	InitialDeposit []*Coin
	Proposer       string                                   
}

protoreflect.Message and fast marshaling handling will need to be generated

  • bring in protoc #23

  • struct code gen

  • protoreflection & proto methods

Utility for dynamicpb.NewMessageType + known types

Summary

Write a new utililty function that's the same as dynamicpb.NewMessageType, but when a nested field is a known type, use that resolved type, instead of a new dynamicpb.Message.

ref: cosmos/cosmos-sdk#15302 (comment)

Problem

Consider a custom Msg, suppose it's only registered in gogo, and not in protov2:

message MsgCustom {
  google.protobuf.Any a = 1;
}

Then when our Unpack utility receives this message, it sees MsgCustom and will use dynamicpb.NewMessageType to generate the whole proto.Message, including the nested field. MsgCustom.A will have type dynamicpb.Message.

Is this expected, or is it more helpful to have MsgCustom.A be an anypb.Any?

Proposal 1: Keep using official dynamicpb.NewMessageType

Using this, we know that proto.Messages created with dynamicpb are dynamicpb all the way down. There are no statically-created types used.

However, this also means that callers in general cannot type-cast. For example, in Textual, we use a switch case to handle both cases (dynamicpb or known type): https://github.com/cosmos/cosmos-sdk/blob/9a83c202cd545265f54d15090c6b150dc692f0f6/x/tx/signing/textual/any.go#L134-L148

We probably can move some utility functions like this to cosmos-proto if they are repeated.

Proposal 2: Write an utility function that does dynamicpb.NewMessageType with known types

We should pass a file resolver / type resolver to this new function, but it shouldn't be too hard to write it. The positive point is that callers can always type-cast to known types.

Support optional fields

The plugin currently does not support optional fields:

testpb/1.proto: is a proto3 file that contains optional fields, but code generator protoc-gen-go-fasteflection hasn't been updated to support optional fields in proto3.

Make ProtoReflect return the message

Currently ProtoReflect() returns the reflection based protoreflect.Message, but since we are implemented protoreflect.Message on our message type directly (for now), we need to change this to:

func (x *Bar) ProtoReflect() protoreflect.Message {
	return x
}

This also means that all the generated methods like Descriptor can't just call ProtoReflect().Descriptor() because that would be self reflexive. But we can deal with that in #3.

Use canonical protoc-gen-go instead of forking

Currently we have a LOT of code copied from protoc-gen-go.

Our initial approach was to attempt to create a protoc-gen-go-lite and only generate what was needed, but as we found a need for descriptors and protoc lang features, more code was needed from protoc-gen-go.

I think we should reconsider this approach and use plugins alongside protoc-gen-go like VT's plugin, and frojdi's fast plugin.

The issue that was brought up from using isolated plugins was that we would end up with multiple generated files per proto file. Solutions to this problem:

  • All generated files in modules could go under their own folder. For example:
    x/upgrade/generated/*.pb.go
  • Create a tool that combines multiple go src files using go/ast

fast feature: Range impl only runs on non-default values

The Range function implemented on the fastreflection structs currently check the fields against their default values to decide if we should call f on these fields
https://github.com/cosmos/cosmos-proto/blob/main/testpb/1.pulsar.go#L316-L335

We should probably just call f on all fields unless they have the possibility of being nil, since we don't really have notion of "presence" in proto3:

In proto3, field presence for scalar fields simply doesn't exist. Your mental model for proto3 should be that it's a C++ or Go struct. For integers and strings, there is no such thing as being set or not, it always has a value. For submessages, it's a pointer to the submessage instance which can be NULL, that's why you can test presence for it.

Documentation?

Hi.

I hope you are having a great new year so far.

Is there documentation on how to use this library? Someone recommended I consider it for a project we are working and I would like to know how to do so.

Thanks in advance.

goproto fork produces malformed generated code with enums and maps

To reproduce the bug generate the following proto with pulsar:

// Copyright 2018 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

syntax = "proto3";

package goproto.proto.test3;


option go_package = "github.com/cosmos/cosmos-proto/internal/testprotos2";

enum NestedEnum {
  FOO = 0;
  BAR = 1;
  BAZ = 2;
  NEG = -1;  // Intentionally negative.
}

message TestAllTypes {
  NestedEnum     singular_nested_enum     = 101;
  map <string, NestedEnum>    map_string_nested_enum    = 73;
}

The following are the codegen snippets which do not match with goproto and that cause descriptors to panic:
Sample 1:

var file_internal_testprotos_test_proto_depIdxs = []int32{
	2, // 0: goproto.proto.test3.TestAllTypes.map_string_nested_enum:type_name -> goproto.proto.test3.TestAllTypes.MapStringNestedEnumEntry
	0, // 1: goproto.proto.test3.TestAllTypes.singular_nested_enum:type_name -> goproto.proto.test3.NestedEnum
	0, // 2: goproto.proto.test3.TestAllTypes.MapStringNestedEnumEntry.value:type_name -> goproto.proto.test3.NestedEnum
	3, // [3:3] is the sub-list for method output_type
	3, // [3:3] is the sub-list for method input_type
	3, // [3:3] is the sub-list for extension type_name
	3, // [3:3] is the sub-list for extension extendee
	0, // [0:3] is the sub-list for field type_name
}

Expected:

var file_internal_testprotos2_test_proto_depIdxs = []int32{
	0, // 0: goproto.proto.test3.TestAllTypes.singular_nested_enum:type_name -> goproto.proto.test3.NestedEnum
	2, // 1: goproto.proto.test3.TestAllTypes.map_string_nested_enum:type_name -> goproto.proto.test3.TestAllTypes.MapStringNestedEnumEntry
	0, // 2: goproto.proto.test3.TestAllTypes.MapStringNestedEnumEntry.value:type_name -> goproto.proto.test3.NestedEnum
	3, // [3:3] is the sub-list for method output_type
	3, // [3:3] is the sub-list for method input_type
	3, // [3:3] is the sub-list for extension type_name
	3, // [3:3] is the sub-list for extension extendee
	0, // [0:3] is the sub-list for field type_name
}

Sample 2:

type TestAllTypes struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	MapStringNestedEnum map[string]NestedEnum `protobuf:"bytes,73,rep,name=map_string_nested_enum,json=mapStringNestedEnum,proto3" json:"map_string_nested_enum,omitempty" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"varint,2,opt,name=value,proto3,enum=goproto.proto.test3.NestedEnum"`
	SingularNestedEnum  NestedEnum            `protobuf:"varint,101,opt,name=singular_nested_enum,json=singularNestedEnum,proto3,enum=goproto.proto.test3.NestedEnum" json:"singular_nested_enum,omitempty"`
}

Expected:

type TestAllTypes struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	SingularNestedEnum  NestedEnum            `protobuf:"varint,101,opt,name=singular_nested_enum,json=singularNestedEnum,proto3,enum=goproto.proto.test3.NestedEnum" json:"singular_nested_enum,omitempty"`
	MapStringNestedEnum map[string]NestedEnum `protobuf:"bytes,73,rep,name=map_string_nested_enum,json=mapStringNestedEnum,proto3" json:"map_string_nested_enum,omitempty" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"varint,2,opt,name=value,proto3,enum=goproto.proto.test3.NestedEnum"`
}

If we replaced the malformed snippet in Sample 1 with the expected one the panic does not happen. Although generated code should match 1 on 1 what goproto produces.

Canonical marshalling for imported messages

Our deterministic marshalling implementation enforces canonical marshalling for messages.
This is achieved because we generated that code with pulsar codegen.

But what happens when we import a message that was not generated with cosmos-proto?

Ex:

message GeneratedWithCosmosProto {
  google.Protobuf.FileDescriptor not_generated_with_cosmos_proto = 1;
  string field = 1;
  map<string, int64> map_field = 2;
  ...
}

How do we detect that the import is not generated with pulsar from marshal codegen, and then how do we marshal it in a canonical way without impacting performance much?

Use cases:

  • ORM will most likely save objects protobuf file descriptors into state using the google.Protobuf.FileDescriptor message.
  • Timestamp and Duration proto

NOTE:
We can enforce deterministic marshalling for imported objects, protobuf marshal options supports this, but there is no guarantee that that deterministic marshalling follows https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-027-deterministic-protobuf-serialization.md

Need DO NOT EDIT comment

Pulsar generated files are missing a comment like:

// Code generated by protoc-gen-go-pulsar. DO NOT EDIT.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.