Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO 4091: [C#] Allow previously parsed schemas to be referenced when parsing a schema #3242

Merged
merged 3 commits into from
Nov 20, 2024

Conversation

rayokota
Copy link
Contributor

@rayokota rayokota commented Nov 16, 2024

What is the purpose of the change

Currently the C# Avro library does not have a way to parse a new schema that refers to named schemas that have been previously parsed.

The equivalent in the Java Avro library is to construct an instance of Schema.Parser and to use the same parser instance to parse the referenced schemas before the referring schema.

The change that is needed is to simply modify the following method in Schema.cs from internal to public:

internal static Schema Parse(string json, SchemaNames names, string encspace)

Without this change, tools that allow Avro schemas to reference one another, such as the Confluent Schema Registry, cannot interoperate well with C# applications.

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Documentation

  • Does this pull request introduce a new feature? (no)

@github-actions github-actions bot added the C# label Nov 16, 2024
@rayokota
Copy link
Contributor Author

@martin-g @KalleOlaviNiemitalo @zcsizmadia could I kindly get a review? Thanks in advance!

@martin-g
Copy link
Member

I am not familiar with C# and/or the Avro C# SDK ...
The change is trivial so I enabled the CI checks, but I cannot say whether the change is a good one or not. There are no new tests verifying the problem or the fix.

@KalleOlaviNiemitalo
Copy link
Contributor

@rayokota, are you planning a change in https://github.com/confluentinc/confluent-kafka-dotnet that would make use of this public method? If you can post that as a draft PR there, it will be easier to understand what the benefit is.

@KalleOlaviNiemitalo
Copy link
Contributor

If the method becomes public, then it should have argument null checks and trimming like in Parse(string json).

To avoid redundant checks and trimming when the caller has already done those, one could rename the existing internal static Schema Parse(string json, SchemaNames names, string encspace) to ParseInternal and add a public Parse wrapper with the checks and trimming.

@rayokota
Copy link
Contributor Author

rayokota commented Nov 19, 2024

Thanks for the review @martin-g and @KalleOlaviNiemitalo ! I've incorporated your review feedback.

@KalleOlaviNiemitalo , I've renamed the existing method as ParseInternal as you suggested, and added a public method that calls it after performing argument null checks and json trimming.

Yes, I plan to use this method in https://github.com/confluentinc/confluent-kafka-dotnet , since Confluent Schema Registry supports Avro schemas that can reference other schemas by name. This allows schema reuse, as well as allows our customers to decompose very Avro large schemas into smaller ones.

As suggested, I've added a test showing how I intend to use this method. The test has a root schema that refers to a nested schema by name only. In Confluent Schema Registry we also store some metadata with the root schema to indicate which schema(s) it refers to. The Schema Registry client in https://github.com/confluentinc/confluent-kafka-dotnet will retrieve both schemas independently, and then use the method in this PR to reconstruct the root schema.

I've implemented this recomposition functionality for Schema Registry clients written in Java, go, Python, and Node.js, since the Avro libraries in those other languages also provide a means to compose schemas by name. This PR will allow C# to provide the same capability.

Please let me know if you would like me to add anything else to my explanation or this PR. Thanks in advance!

@zcsizmadia zcsizmadia self-requested a review November 19, 2024 13:30
@zcsizmadia zcsizmadia merged commit b1ec3b9 into apache:main Nov 20, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants