Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New datagen API #3564

Closed
1 task done
robertbastian opened this issue Jun 22, 2023 · 3 comments · Fixed by #4041
Closed
1 task done

New datagen API #3564

robertbastian opened this issue Jun 22, 2023 · 3 comments · Fixed by #4041
Assignees
Labels
C-data-infra Component: provider, datagen, fallback, adapters S-large Size: A few weeks (larger feature, major refactoring)

Comments

@robertbastian
Copy link
Member

robertbastian commented Jun 22, 2023

Outstanding questions:

  • DatagenDriver::with_collations or DatagenDriver::with_addtional_collations?
@robertbastian robertbastian added C-data-infra Component: provider, datagen, fallback, adapters S-large Size: A few weeks (larger feature, major refactoring) labels Jun 22, 2023
@robertbastian robertbastian self-assigned this Jun 22, 2023
@robertbastian robertbastian added this to the 1.3 Blocking ⟨P1⟩ milestone Jun 22, 2023
@robertbastian robertbastian added the discuss Discuss at a future ICU4X-SC meeting label Aug 10, 2023
@sffc
Copy link
Member

sffc commented Aug 10, 2023

Initial ideas:

  • Create a new type that takes an IterableDynamicDataProvider and plumbs it to a DataExporter
  • Possible name: DataExportDriver

@sffc
Copy link
Member

sffc commented Aug 10, 2023

In 2.0:

  • DatagenProvider::set_cldr_source(&mut self, ...) -> &mut Self
  • DataExportDriver::set_locales(&mut self, ...) -> &mut Self

But we want to do:

let driver = DataExportDriver::new().set_locales(...);

But we should prevent mutation:

let mut driver = DataExportDriver::new();
driver.set_locales();

// do some exporting

let driver = driver.with_keys();

// do some more exporting

  • @sffc - Maybe we should make this be a real builder so that we can resolve CldrSet
  • @robertbastian - We need SourceData for that
  • @Manishearth - I agree with the goals. The precise solution seems like an okay way of achieving it.

@sffc
Copy link
Member

sffc commented Sep 5, 2023

For datagen driver:

  • Okay to have ::new() because there is a reasonable default set of settings, and this is basically just an options bag; however, locales and keys must be explicitly set. If they are not, calling export should result in an Err.
  • Default should be:
    • Unset locales and keys
    • Recommended segmenter models, with method with_recommended_segmenter_models(). In docs, say "at the moment, this is all models"
    • No additional collations
    • Default fallback mode (RecommendedForExporter)
  • Silence the Clippy for missing Default impl because it is an unusable default

LGTM: @sffc @robertbastian (@Manishearth)

@robertbastian robertbastian removed the discuss Discuss at a future ICU4X-SC meeting label Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-data-infra Component: provider, datagen, fallback, adapters S-large Size: A few weeks (larger feature, major refactoring)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants