External protos #278
Conversation
All the failures seem to be of the form: […]

Is there an issue with CI?
The doc tests have been broken for a while because of rust-lang/cargo#8531; that's what the shared library errors in Travis are.

What are the benefits of putting the protos in a separate crate, instead of keeping them in the […] crate? I'll have to think about it more when I've had more sleep.
I would prefer the proto types became public and the deprecated fields were all removed for a minor version bump 👀 It would be nice if there was a minimum number of versions a deprecated field lived for before deletion.

Well, a lot of the proto types aren't publicly exposed yet; the only way you can access them is as a […]. My hope with this is that there's less pressure to update the bindings to get certain functionality, and users can easily get the […].

Also, I got here because I was attempting to add a proto in another PR I was working on, and working out all the protos I needed to add to the existing build script for dependencies was a massive pain, as was working out how to run the codegen itself (the docs didn't quite help; I eventually worked it out from something like a CI script). My method generates code for all the protos, works out the dependencies, and sets up feature gates for every proto file you'd want to add, including its dependencies. So for any new protos in the future it's just a case of enabling a feature, which is definitely easier 👀

Another option could be to move my code inline to this repo, and then give you publish access to the protobufs crate I published. And then everything is here.
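To illustrate the "feature per proto file with dependencies worked out" idea, a hypothetical excerpt of such a generated crate's `Cargo.toml` might look like the sketch below. The feature names and dependency edges here are invented for illustration; the real generated manifest may differ.

```toml
# Hypothetical sketch: one feature per proto file, with each proto's
# imports mirrored as feature dependencies, so enabling one feature
# automatically pulls in everything it needs to compile.
[features]
default = []
graph = ["node-def", "versions"]   # graph.proto imports node_def.proto, versions.proto
node-def = ["attr-value"]          # node_def.proto imports attr_value.proto
attr-value = ["tensor", "tensor-shape"]
tensor = ["tensor-shape", "types"]
tensor-shape = []
types = []
saver = []                         # a leaf proto with no imports
```

With this shape, adding support for a new proto in the bindings is just a matter of turning on one feature in the dependency declaration rather than editing a build script.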
I agree that this project should move toward using actual protobuf types and not just […]. I don't think we want to get rid of the methods which return […], though.

One question I had was whether it made sense for the generated protos to go into the […] crate.

I'd be a bit more comfortable if we feature-gate the use of proto types in the API for now (i.e. a single feature named something like […]).

By the way, if there is any documentation which is unclear, please file an issue or submit a PR. We should fix bad or incomplete docs. I'm also definitely interested in having more contributors to the project, especially for the Windows build. I always need help with the Windows build. :(
Hmm, I can adapt my crate to generate using different protobuf crates and feature gate each one. I use prost in other projects just because tonic uses it, so it would be nice to cut down dependencies. I can probably get it to work so you can just do it like […]
That's why I suggested making you a contributor to the repo/crates.io as well. That would give you push/merge/publish rights 😄. With that in mind, would you still want it inside this repo? It might end up being substantially larger with other protobuf options included.
Either way, I'll start working on changes to my proto crate to prototype other protobuf generation crates 👍
See https://github.com/xd009642/tensorflow-protos-rs/tree/prost_support if you want to have a look at prototype prost support. However, prost generates the files differently: instead of a file for each proto, it's a file for each include directory. This means I can't generate the per-file features, and a lot more will be pulled in as a result.

I did notice there was a tensorflow_grpc proto file, so I guess for fuller support I should consider generating the gRPC service definitions as well if also adding prost support...
We shouldn't hide the `Vec<u8>` from the user if they enable a feature, because features are generally considered to be additive. Basically, if code compiles with a feature disabled, it should continue to compile with the feature enabled. That also implies that if we have both `prost_protos` and `protobuf_protos` features, it should be possible to enable both at the same time. Imagine what happens if crate A uses `tensorflow` with the `prost_protos` feature, and crate B uses `tensorflow` with the `protobuf_protos` feature, and then crate C uses both crates A and B.
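To make the crate-A/crate-B scenario concrete: Cargo unifies features across the dependency graph, so crate C gets a single build of `tensorflow` with the union of both feature sets. A hypothetical sketch (crate names and versions invented):

```toml
# Crate A's manifest (hypothetical):
#   tensorflow = { version = "0.16", features = ["prost_protos"] }
#
# Crate B's manifest (hypothetical):
#   tensorflow = { version = "0.16", features = ["protobuf_protos"] }

# Crate C depends on both. Cargo compiles `tensorflow` exactly once,
# with features = ["prost_protos", "protobuf_protos"] both enabled,
# which is why the two features must be able to coexist.
[dependencies]
a = "0.1"
b = "0.1"
```

This unification behavior is why mutually exclusive features are considered an anti-pattern in Cargo.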
If we support both protobuf and prost explicitly, we'd have to do something like e.g.

```rust
fn graph_def(&self) -> Result<Vec<u8>>

#[cfg(feature = "protobuf_protos")]
fn graph_def_protobuf(&self) -> Result<protobuf_protos::GraphDef>

#[cfg(feature = "prost_protos")]
fn graph_def_prost(&self) -> Result<prost_protos::GraphDef>
```
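For illustration, here is a self-contained sketch of how the always-available raw-bytes accessor and a feature-gated typed accessor could coexist on one type. `GraphDef`, `Graph`, and the toy "decoding" below are stand-ins invented for this sketch, not the real generated proto or the real bindings API:

```rust
// Stand-in for a generated proto message (hypothetical, not the real type).
#[derive(Debug, PartialEq)]
struct GraphDef {
    node_count: usize,
}

// Stand-in for the bindings type that owns a serialized proto.
struct Graph {
    raw: Vec<u8>,
}

impl Graph {
    // Always available: the raw serialized proto bytes.
    fn graph_def(&self) -> Result<Vec<u8>, String> {
        Ok(self.raw.clone())
    }

    // In the real API this would sit behind a proto feature and use a
    // protobuf library; here it decodes a toy format (first byte = node
    // count) purely to keep the sketch runnable without dependencies.
    fn graph_def_protobuf(&self) -> Result<GraphDef, String> {
        let first = *self.raw.first().ok_or("empty buffer")?;
        Ok(GraphDef { node_count: first as usize })
    }
}

fn main() {
    let g = Graph { raw: vec![3, 0, 0] };
    // Both accessors work on the same underlying data.
    assert_eq!(g.graph_def().unwrap(), vec![3, 0, 0]);
    assert_eq!(g.graph_def_protobuf().unwrap(), GraphDef { node_count: 3 });
    println!("ok");
}
```

Because the typed accessor is purely additive, enabling the feature never removes or changes the `Vec<u8>`-returning method, which keeps the feature semantics additive as discussed above.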
If the generated protos are going to be in a separate crate anyway, it probably makes sense to have one for `protobuf` TensorFlow protos and one for `prost` TensorFlow protos. That's simpler than having both in the same crate and feature gating between the two proto libraries.

We also don't have to add `prost` support immediately. I would just like to make sure we leave room for it later.
As mentioned in #272, I believe this will simplify the current release/build process for these bindings. Using an external crate for this means some of the fields currently serialized as byte vectors can be interacted with by users with less hassle or internal change, as well as giving greater ability to extend the bindings without relying on changes in this crate.
Also, I've generated a whole bunch of protos you don't even need yet, so it's easy peasy to just add another feature and get more stuff brought into this crate, whereas previously I personally found tensorflow-proto-codegen difficult to get working.
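From the consumer side, pulling in an extra proto would then be a one-line change. A hypothetical example (the crate name and version are assumptions based on the linked tensorflow-protos-rs repo, not confirmed published coordinates):

```toml
# Hypothetical dependency declaration: enabling another proto later is
# just a matter of appending to the features list, with no build-script
# or codegen changes in the bindings repo.
[dependencies]
tensorflow-protos = { version = "0.1", features = ["graph", "saver"] }
```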
I'm aware relying on an external crate for the proto implementations may be seen as somewhat of a risk so I'm willing to add you as a maintainer/contributor @adamcrume 😄