Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proto: panics on handwritten types that directly embed generated messages #668

Closed
tandr opened this issue Aug 2, 2018 · 12 comments
Closed

Comments

@tandr
Copy link

tandr commented Aug 2, 2018

Ok, I am trying to convert our code (pre-XXX) to the latest protoc-gen-go and... I am seriously thinking about switching back to json.

Side effects so far

  • Things that were perfectly comparable are not comparable any more (most of our structs were strings and ints, nothing fancy). Now I have to use proto.Equals(), which uses reflection. Something tells me that speedup that I got from using protobuf might be slowly disappearing into comparisons and special cases to handle it (that I need to remember when I am using it). (this is covered by @sitano in protoc-gen-go: XXX_ fields break code back compatibility #607)
  • getting panic in tests for code that serializing / deserializing the following (handrolled) code for serialization that was perfectly working before. (We have many of these little things created without proto files, and it was considered a nice feature of proto to be able to do it easily)
type leveldbModTime struct {
	ModTime *proto.Timestamp `protobuf:"bytes,1,opt,name=modTime" json:"modTime,omitempty"`
}

func (m * leveldbModTime) Reset()         { *m = leveldbModTime{} }
func (m * leveldbModTime) String() string { return proto.CompactTextString(m) }
func (* leveldbModTime) ProtoMessage()    {}

func init() {
	proto.RegisterType((* leveldbModTime)(nil), "tandr.leveldbModTime")
}

and the panic stack is something like this

=== RUN   TestStore_StoreLoad
--- FAIL: TestStore_StoreLoad (0.16s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1185538]

goroutine 29 [running]:
testing.tRunner.func1(0xc4201cab40)
	/usr/local/go/src/testing/testing.go:742 +0x29d
panic(0x15a03a0, 0x1a32e00)
	/usr/local/go/src/runtime/panic.go:502 +0x229
tandr/myexe1/vendor/github.com/golang/protobuf/proto.unmarshalStringValue(0xc4201d41e2, 0x1d4, 0x1df, 0x0, 0x2, 0xc4206db910, 0x101397c, 0xc4200ba878, 0xc4206db9d8, 0x10)
	/Users/tandrey/ww/go/src/tandr/myexe1/vendor/github.com/golang/protobuf/proto/table_unmarshal.go:1454 +0x108
tandr/myexe1/vendor/github.com/golang/protobuf/proto.(*unmarshalInfo).unmarshal(0xc4200fbea0, 0x0, 0xc4201d41e0, 0x1d6, 0x1e0, 0x1cbee40, 0xc4201fe820)
	/Users/tandrey/ww/go/src/tandr/myexe1/vendor/github.com/golang/protobuf/proto/table_unmarshal.go:171 +0x7b7
tandr/myexe1/vendor/github.com/golang/protobuf/proto.(*InternalMessageInfo).Unmarshal(0x1a45bc0, 0x16dafc0, 0x0, 0xc4201d41e0, 0x1d6, 0x1e0, 0x1, 0x1cbee40)
	/Users/tandrey/ww/go/src/tandr/myexe1/vendor/github.com/golang/protobuf/proto/table_unmarshal.go:63 +0x75
tandr/myexe1/vendor/tandr/proto-interfaces/gen/go/myexe1_proto.(*Block).XXX_Unmarshal(0x0, 0xc4201d41e0, 0x1d6, 0x1e0, 0x0, 0x0)
	/Users/tandrey/ww/go/src/tandr/myexe1/vendor/tandr/proto-interfaces/gen/go/myexe1_proto/block.pb.go:173 +0x61
tandr/myexe1/vendor/github.com/golang/protobuf/proto.Unmarshal(0xc4201d41e0, 0x1d6, 0x1e0, 0x16da200, 0xc42000e408, 0x1a64080, 0xc4201d41e0)
	/Users/tandrey/ww/go/src/tandr/myexe1/vendor/github.com/golang/protobuf/proto/decode.go:338 +0x1ad
tandr/myexe1/leveldb_store.unpackBlock(0xc4201d41e0, 0x1d6, 0x1e0, 0x1e0, 0xc4201f27e0, 0x8)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/proto_helpers.go:44 +0x71
tandr/myexe1/leveldb_store.(*leveldbStore).collectBlocks(0xc4201f41b0, 0x0, 0x0, 0x0, 0x0)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store.go:385 +0x272
tandr/myexe1/leveldb_store.(*leveldbStore).LoadOutputs(0xc4201f41b0, 0xc4207100c0, 0x0, 0x0, 0x0, 0x0, 0xc4206e0b40)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store.go:327 +0x39
tandr/myexe1/leveldb_store.(*leveldbResolver).LoadOutputs(0xc42000c080, 0xc4207100c0, 0x0, 0x0, 0xc4201cab40, 0x0, 0x0)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store_resolver.go:235 +0x97
tandr/myexe1/leveldb_store.checkStoredBlocks(0xc4201cab40, 0x16e02c0, 0xc42000c080, 0xc420030240, 0x4, 0x0, 0x0)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store_test.go:366 +0x50
tandr/myexe1/leveldb_store.setupAcAndOutputs(0xc4201cab40, 0x16e02c0, 0xc42000c080, 0x2d2, 0x10ed5c0, 0x10ed5ed, 0x17b06d0)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store_test.go:357 +0x498
tandr/myexe1/leveldb_store.TestStore_StoreLoadOutputs(0xc4201cab40)
	/Users/tandrey/ww/go/src/tandr/myexe1/leveldb_store/leveldb_store_test.go:77 +0xb1
testing.tRunner(0xc4201cab40, 0x168ebe8)
	/usr/local/go/src/testing/testing.go:777 +0xd0
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:824 +0x2e0
FAIL	tandr/myexe1/leveldb_store	1.968s
@dsnet
Copy link
Member

dsnet commented Aug 2, 2018

What version of the proto package is this? I'm having a hard time correlating table_unmarshal.go:1454 with existing Go code.

@dsnet
Copy link
Member

dsnet commented Aug 2, 2018

You mention that you are using hand-rolled messages, which is not something we support. The proto package currently assumes all messages were generated by protoc-gen-go. However, we are willing to fix certain unobtrusive cases if it helps hand-written code continue to work.

That being said, we're going to need a minimized reproduction that we can run in order to address this.

@tandr
Copy link
Author

tandr commented Aug 2, 2018

I will work to create a minimum reproducible example.

grpc code is vendrored from tag v1.14.0 that was just released
protobuf is from v1.1.0 , extract from vendor.json below

{
"checksumSHA1": "Pyou8mceOASSFxc7GeXZuVdSMi0=",
"path": "github.com/golang/protobuf/proto",
"revision": "b4deda0973fb4c70b50d226b1af49f3da59f5265",
"revisionTime": "2018-04-30T18:52:41Z",
"version": "v1.1.0",
"versionExact": "v1.1.0"
},

old proto-gen-go was checked out and built on 2018-01-19
new one on 2018-08-01

"Handrolled" is equivalent to "old code" in this case - these items are no different than any other older proto generated code.

As for the "everything proto-based need to be generated with latest version" ... that is a really fat unsafe assumption. In my case it is painful but doable - I can add/change these 10 types to adhere to new interfaces. But if I import the code (and we do) that is produced by another team, I have no control of what version of protoc-gen-go they have used! (Even worse - since there is no versioning or tagging of proto-gen-go, I cannot even tell them what version to use other than a commit id, but I digress)

So, in essence, by introducing these changes (table parsing, XXX), you now disallow to import anything that implements old proto interface contract (aka old code)?

@tandr
Copy link
Author

tandr commented Aug 3, 2018

Since I am not 100% sure how to attach multiple files to an issue, I created repo with code that is quite close to what we are doing https://github.com/tandr/bug668

@dsnet
Copy link
Member

dsnet commented Aug 3, 2018

Now I have to use proto.Equals(), which use reflection. Something tells me that speedup that I got from using protobuf might be slowly disappearing into comparisons and special cases to handle it (that I need to remember when I am using it).

Whether proto.Equals uses reflect or not is an implementation detail. Unmarshal and Marshal are faster today because they use unsafe in a table-driven implementation.

@tandr
Copy link
Author

tandr commented Aug 3, 2018

It depends on how often I have to use proto.Equals, no?

@dsnet
Copy link
Member

dsnet commented Aug 3, 2018

I'm not sure what you mean? Before replying, can you open a separate issue on the performance of proto.Equals? Let's keep this issue about fixing the nil panic you're seeing (which I'm investigating now).

@dsnet
Copy link
Member

dsnet commented Aug 3, 2018

Thank you for the reproducer. I am able to reproduce it on my end.

@tandr
Copy link
Author

tandr commented Aug 3, 2018

Thanks Joe, lets fix this one first, performance is something we will address later.

@dsnet
Copy link
Member

dsnet commented Aug 3, 2018

Alright, I struggled with this for a while, and I have some bad news and good news.

The bad news: there's not really anything we can do on our end to fix this.

  • The problem is arising because the logic first checks if a XXX_Marshal or XXX_Unmarshal method exists on the type, and if so, calls it.
  • In your situation, you have a mixture of a hand-written proto dbBlock, which embeds a *Block. Embedding in Go forwards all fields and methods from the embedded type to the parent. Per our compatibility statement, we reserve the right to add new fields and methods to a type. Since embedding forwards all methods, both those that the author intended and any new methods added in the future that the author may or may not have intended, the proto package now sees the XXX_Marshal methods (newly generated in 1.1) forwarded by the *Block type.
  • (Similar to comparability) embedding types in Go is dangerous since it may forward methods and fields that the author did not intend. See my comment here about the dangers of embedding.
  • From the Go language perspective it is impossible to determine whether a method was forwarded or not. Thus, we cannot distinguish between XXX_Marshal or XXX_Unmarshal methods that were declared on an embedded child type or directly on the parent type.

Now the good news. There are some (hopefully not too intrusive) workarounds:

  • Option 1: Stop embedding. This is my personally suggested approach. My experience with dealing with massive codebases (across all of Google) is that the dangers of embedding outweigh its benefits.

  • Option 2: Only forward the fields. It seems that the main purpose of embedding in your case is to forward the fields for convenience. To accomplish this, you can do a type-redefinition of the proto message to strip any methods, but preserve the fields.

diff --git a/bug668_test.go b/bug668_test.go
index 67654ff..b7dc95f 100644
--- a/bug668_test.go
+++ b/bug668_test.go
@@ -30,9 +31,11 @@ func TestDbBlock_packUnpack(t *testing.T) {
        }
 }

+type BlockNoMethods Block
+
 // block wrapper as possible additional meta info holder
 type dbBlock struct {
-       *Block `protobuf:"bytes,1,opt,name=block" json:"block,omitempty"`
+       *BlockNoMethods `protobuf:"bytes,1,opt,name=block" json:"block,omitempty"`
 }

 // fulfilling Message interface
@@ -52,11 +55,11 @@ func unpackBlock(value []byte) (*Block, error) {
                log.Fatalf("cannot un-marshall block %v, error=%v", dbBlock.BlockId, err)
        }

-       return dbBlock.Block, err
+       return (*Block)(dbBlock.BlockNoMethods), err
 }

 func packBlock(b *Block) ([]byte, error) {
-       dbBlock := dbBlock{b}
+       dbBlock := dbBlock{(*BlockNoMethods)(b)}

        buf, err := proto.Marshal(&dbBlock)
  • Option 3: Intentionally invalidate the forwarded methods. Since the methods causing problems are XXX_Marshal and XXX_Unmarshal, you can intentionally declare those methods with a method signature that the type assertion in the proto package won't match on. This is a hack that works around the immediate problem, but does nothing to address methods that may be added in the future that causes problems again.
diff --git bug668_test.go bug668_test.go
index 67654ff..72e2b66 100644
--- bug668_test.go
+++ bug668_test.go
@@ -35,6 +36,9 @@ type dbBlock struct {
        *Block `protobuf:"bytes,1,opt,name=block" json:"block,omitempty"`
 }

+func (*dbBlock) XXX_Unmarshal() {}
+func (*dbBlock) XXX_Marshal()   {}
+
 // fulfilling Message interface
 func (m *dbBlock) Reset()         { *m = dbBlock{} }
 func (m *dbBlock) String() string { return proto.CompactTextString(m) }

Applying options 1, 2, or 3 fixed the test you provided.

@dsnet dsnet changed the title Code generated by protoc-gen-go causes panic during runtime when used with exiting serialization code proto: panics on handwritten types that directly embed generated messages Aug 3, 2018
@tandr
Copy link
Author

tandr commented Aug 3, 2018

Thanks Joe. Stopping embedding sounds like a plan. (Other 2 solutions are just hacks - 2 is "just" a hack, 3 is a "go try to figure it out later, hehe" hack, sorry).

On a side note: For XXX_ methods: maybe it would be easier if you would create "MessageXXX" interface (that fulfills Message too) and just specify that newly generated code implements it instead. And then who wants these extra fields will have them (we don't :) )

Tagging version of protobuf (and protoc-gen-go) right before that change will be much appreciated.

(rant) As you probably can see, I don't like this XXX_ thing... It is extremely intrusive, it breaks ability for generated classed to be used as nice POCs, and it breaks code that was nicely working before (using classes as map keys). Now I need to introduce proxies or maintain POCs myself (with all the pains of getting out of sync attached...). Since autogenerated POCs were SO convenient, we are using these generated classes (types) everywhere across 3 different languages ... (/rant sorry=true)

@dsnet
Copy link
Member

dsnet commented Aug 3, 2018

The issues your bringing up about XXX is something we're aware of. We're working on the next generation implementation of proto that will address these issues. See #364. This document, Go Protocol Buffer API Improvements, gives an overview of the direction were heading.

Anyways, I'm glad removing embedding will fix this for you.

@dsnet dsnet closed this as completed Aug 3, 2018
@golang golang locked as resolved and limited conversation to collaborators Jun 25, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants