Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DryDataProvider for baked providers #5211

Merged
merged 6 commits into from
Aug 19, 2024

Conversation

robertbastian
Copy link
Member

No description provided.

@robertbastian robertbastian requested a review from a team as a code owner July 9, 2024 17:53
@robertbastian robertbastian requested a review from sffc July 9, 2024 17:53
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only use case for DryDataProvider is supported locales, and supported locales are only queryable if #5195 lands, so I don't think this should land unless #5195 lands.

I'd be okay making it an option on BakedExporter, though.

@robertbastian
Copy link
Member Author

The only use case for DryDataProvider is supported locales

That's a very absolute statement.

and supported locales are only queryable if #5195 lands

Well they're queryable like this.

I'd be okay making it an option on BakedExporter, though.

I don't want to the API surface of the baked providers to change depending on ICU4X_DATA_DIR.

@robertbastian
Copy link
Member Author

BlobDataProvider and FsDataProvider also implement DryDataProvider even if the data is using maximal deduplication.

The implementation is also required to use forking providers between baked and a provider where dry_load is actually cheaper.

@robertbastian robertbastian requested a review from sffc July 9, 2024 22:22
@sffc
Copy link
Member

sffc commented Jul 10, 2024

There's also the question about whether we want DryDataProvider to potentially use a different trie. I think you made an argument that such behavior, making dry_load and load return different locales, should be off the table.

@sffc
Copy link
Member

sffc commented Jul 11, 2024

At some point I also remember you saying that you didn't want people using provider::Baked directly, and instead using the macros exported by the data crates.

I hear you about consistency with the blanket impls we have on BlobProvider and FsDataProvider. I think the differences with baked provider are:

  1. BakedProvider is not blanked impld. It is impld for specific markers
  2. We have been discussing different impls for Baked DryDataProvider in the other PR that might be more efficient, and this PR locks us into one particular impl

Why not just export separate macros like impldry_foo_marker_v1!?

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

@robertbastian
Copy link
Member Author

BakedProvider is not blanked impld. It is impld for specific markers

I did this as a concession to you because you worry about source code size. I'm very happy to impl for specific markers.

We have been discussing different impls for Baked DryDataProvider in the other PR that might be more efficient, and this PR locks us into one particular impl

I don't see how it locks us in. It's a correct implementation.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BakedProvider is not blanked impld. It is impld for specific markers

I did this as a concession to you because you worry about source code size. I'm very happy to impl for specific markers.

I worry about code size when we can reduce it without causing side-effects (option 3 in #5230 for instance)

We have been discussing different impls for Baked DryDataProvider in the other PR that might be more efficient, and this PR locks us into one particular impl

I don't see how it locks us in. It's a correct implementation.

It locks us out of having marker-specific impls that do different things such as having a base languages supplement trie. But, I think your position is that impls where dry_load and load return different locales (even if they return the same success status) is off the table?

@robertbastian robertbastian requested a review from sffc July 25, 2024 16:44
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concerns haven't changed since my last review

Should we get more feedback from others or discuss in a WG call?

@robertbastian robertbastian added the discuss-priority Discuss at the next ICU4X meeting label Jul 25, 2024
@robertbastian
Copy link
Member Author

I unimplemented it for our concrete data providers, I hope this removes the point of contention.

@robertbastian robertbastian requested a review from sffc August 13, 2024 17:31
@robertbastian robertbastian removed the discuss-priority Discuss at the next ICU4X meeting label Aug 13, 2024
@sffc
Copy link
Member

sffc commented Aug 19, 2024

This adds DryDataProvider impls for concrete markers in a way that users can opt-in via the macro call site. This is my preferred approach.

@sffc
Copy link
Member

sffc commented Aug 19, 2024

The code size impact seems smallish relative to the overall code size of these baked data files.

@robertbastian robertbastian merged commit d143452 into unicode-org:main Aug 19, 2024
28 checks passed
@robertbastian robertbastian deleted the baked branch August 19, 2024 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants