-
Notifications
You must be signed in to change notification settings - Fork 15.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make AbstractMessage
public in Ruby
#12550
Comments
The thing we are really trying to avoid here is giving the impression that protobuf messages have a common base class where common functionality goes. If we followed the common expectation of base classes, then once AbstractMessage is a public base class, people could start sending us PRs to add any number of convenient methods to it. But this is asking for conflicts, because the generated class will have accessors for every field defined in the message BadForRuby {
optional int32 to_json = 1;
optional int32 to_proto = 2;
optional int32 decode = 3;
optional int32 convenient_method_added_in_pr = 4;
} To avoid these kinds of conflicts, we want generated classes to look as much like dumb data containers as possible, with no base class, and have all interesting/common functionality come from other places. |
@haberman thanks for the quick response, really appreciate it!
Seems feasible to make it clear where the line is drawn on contributions there, both with a comment in the code and with code review.
Doesn’t this problem exist regardless of whether you define methods like decode and descriptor on AbstractMessage or you pretend they’re only accessible on each concrete message class? The PR that came before this already introduced AbstractMessage. If someone defines a proto message with fields whose names collide with names already used by AbstractMessage, they will override them, regardless of whether AbstractMessage is public or private. Also, correct me if I'm wrong, but this only applies to instance methods, right? I'm not sure that In the case of having field names collide with method names, there's even more reason to want to expose bad_for_ruby = # ... create BadForRuby message
Google::Protobuf::AbstractMessage.instance_method(:to_s).bind(bad_for_ruby).call This So my argument is:
|
I wish there was a way of writing this more like: sig do
params(
proto_class: Google::Protobuf::MessageClass,
bytes: String
)
.returns(T.instance_of)
end
def decode_and_log(proto_class, bytes)
puts("Decoding with class: #{proto_class.descriptor.name}")
proto_class.decode(bytes)
end It's really the singleton methods you are interested in here, which are like instance methods on the class. The fact that the proto has a base class isn't interesting or important, except in what it says about what singleton methods are available. On the other hand, I notice that
I think Tested here: https://github.com/protocolbuffers/protobuf/blob/main/ruby/tests/basic.rb#L110 |
ah thanks, I had only checked the C defined methods
is this a class that already exists? If not, could you say more about how you thought this would be defined? also, in your example what does T.instance_of mean? Again, I’m pretty sure that the relevant Sorbet machinery exists to write a sig, as long as the methods are defined on some class or module that already exists. If the singleton class methods were defined on Google::Protobuf::MessageExts::ClassMethods, that would good, but AbstractMessage would still be better. Basically: any solution that involves defining the methods that are common to all messages on a common ancestor of all messages is what unblocks better types. That could be a parent class, or a module, but it has to be some class/module that actually is visible at runtime and actually has the methods defined on it. Writing the type annotations is easier and involves less boilerplate when it’s a class like AbstractMessage, but if the instance methods were on MessageExts and the singleton class methods on MessageExts::ClassMethods, sorbet would be able to write types for that too (again, this approach would just add verbosity to users, but at least this is tolerable).
So I’m also interested in the instance methods, to a lesser degree. We have 120 MB of proto-generated Ruby code in our codebase, which adds about 120 + 120*4 MB to the memory used when parsing and type checking. If these methods were defined on a parent class, we could define them once and not have to make seemingly-identical definitions for all the instance and singleton class methods. This would cut down on the size of generated files, making type checking faster and use less memory. Sorry for not bringing up this point in the original post, as I only realized it yesterday afternoon. |
In our codebase we're fighting a constant battle against the codebase's growing size. Having a larger codebase makes basically everything slower, from creating an initial dev environment to type checking code to creating deploy artifacts. Generated protobuf files end up contributing a lot to the bloat of our codebase. These two options make protoc-gen-rbi generate less code. The options: - `hide_common_methods` This option hides the common methods like `decode_json` and `to_h` that protoc-gen-rbi would otherwise create. While it's most accurate to have those methods be defined on these classes, it's actually possible to get basically the same type safety by defining these methods elsewhere, using `T.attached_class`. - `use_abstract_message` A recent `google-protobuf` PR changed the superclass of all message classes: <protocolbuffers/protobuf#10281> The new superclass, `Google::Protobuf::AbstractMessage`, is technically private. However, having a common superclass is quite convenient for a number of reasons, and codebases may wish forcibly make this class publicly visible. If that's been done, it's nice to be able to mention that superclass instead of the `include` and `extend` calls, because `AbstractMessage` is exactly defined by just an empty class with those two mixins, and getting rid of those calls generates less code. For more discussions on the merits of making this class public: <protocolbuffers/protobuf#12550> I'm happy to split this change apart if you think it would be better that way. On Stripe's codebase, these changes amount to a 4% reduction in Sorbet's peak memory usage, and a 1% improvement to time spent type checking, so I'm quite eager to get this change or some equivalent version of it landed.
In our codebase we're fighting a constant battle against the codebase's growing size. Having a larger codebase makes basically everything slower, from creating an initial dev environment to type checking code to creating deploy artifacts. Generated protobuf files end up contributing a lot to the bloat of our codebase. These two options make protoc-gen-rbi generate less code. The options: - `hide_common_methods` This option hides the common methods like `decode_json` and `to_h` that protoc-gen-rbi would otherwise create. While it's most accurate to have those methods be defined on these classes, it's actually possible to get basically the same type safety by defining these methods elsewhere, using `T.attached_class`. - `use_abstract_message` A recent `google-protobuf` PR changed the superclass of all message classes: <protocolbuffers/protobuf#10281> The new superclass, `Google::Protobuf::AbstractMessage`, is technically private. However, having a common superclass is quite convenient for a number of reasons, and codebases may wish forcibly make this class publicly visible. If that's been done, it's nice to be able to mention that superclass instead of the `include` and `extend` calls, because `AbstractMessage` is exactly defined by just an empty class with those two mixins, and getting rid of those calls generates less code. For more discussions on the merits of making this class public: <protocolbuffers/protobuf#12550> I'm happy to split this change apart if you think it would be better that way. On Stripe's codebase, these changes amount to a 4% reduction in Sorbet's peak memory usage, and a 1% improvement to time spent type checking, so I'm quite eager to get this change or some equivalent version of it landed.
Right, and my concern is that I don't want messages to have common methods. As much as possible, I want message instances to have accessors for the fields defined in a We do have common methods on message classes (ie. class-level methods). That's why I would favor a solution that can somehow represent that protobuf classes have common class-level methods, but not object methods.
|
... but there already are common methods, right? If a message defines a field named my_message.hash you would be calling the accessor for the proto field named But if Google::Protobuf::AbstractMessage.instance_method(:hash).bind_call(my_message) which lets the user forcibly call the real The gem as written today, where def count_fields(message)
as_hash.to_h.size
# ^^^ expected to call `to_h` in parent, but might actually access a field
end The user will only find out that it misbehaves the first time they happen to import a proto file with an uncommon name—whether
Given what I've said above, I'm confused how to satisfy these criteria. The way I read this request suggests that you're nearly asking for the changes in #10281 to be reverted, which started to define common instance methods on all messages via a superclass. I am not suggesting that we add more common methods to all messages, but merely that the existing parent class be made public.
Without getting into a full discussion of how most type systems for Ruby work (including Sorbet's type system), suffice it to say that as long as some constant like So the feature you're describing already exists, although it has different syntax from the syntax you propose. But that syntax requires that |
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment. This issue is labeled |
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it. This issue was closed and archived because there has been no new activity in the 14 days since the |
What language does this apply to?
Ruby
Describe the problem you are trying to solve.
In this change the
AbstractMessage
parent class was added for Ruby code. The motivation was to be a pure performance win, exploiting how classes and objects are laid out in the Ruby code.But as it happens, having a common base class that has all these methods defined is particularly convenient. When using an optional static type checker in a Ruby project, it's very easy if you can write
Google::Protobuf::AbstractMessage
in a type signature, and know that it has all these methods. For example, using Sorbet, it would be possible to write a signature likeRight now it's annoying to write signatures for these methods, because methods like
descriptor
anddecode
aren't actually defined on the only other common type in the inheritance hierarchy,MessageExts::ClassMethods
:So there's not a convenient type you can write—but there would be if
AbstractMessage
were not a private constant.Describe the solution you'd like
Can we remove
private_constant :AbstractMessage
?protobuf/ruby/lib/google/protobuf/message_exts.rb
Line 56 in 5233990
Describe alternatives you've considered
An alternative would be to actually define the methods on
MessageExts::ClassMethods
.In the Sorbet type system this is slightly less ergonomic to use,
MessageExts
andMessageExts::ClassMethods
look like completely orthogonal modules to the type system. (For example, it doesn't know thatMessageExts::ClassMethods
has all the methods that a normal class would have, like.name
or.ancestors
.) That means you have to go around writing things likeT.all(Google::Protobuf::MessageExts, Class)
which is confusing.It's a lot easier to just mention the exact superclass you want in your type annotation, which in this case is
AbstractMessage
.Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: