Cross-calendar skeleta data #4779
ICU4X WG discussion: Three dimensions:
Things we care about:
General directions:
@sffc to make a write-up. Agreed that this is a priority / worthwhile discussion.
The output of the test I just pushed to this branch:
My eyes pop a bit at those numbers... a unique size of 17493 seems too small. It's great if true, but I must be missing something.
^ that was running with test locales only. Here it is with all locales:
108 KB for everything is not bad at all. Doesn't include the lookup table though.
Based on my findings above, my conclusions are:
Therefore, I'd like to move forward with skeleta as auxiliary keys. I was thinking of using one-letter-per-field skeleta as the auxiliary key. For example:
So then the data would be indexed as something like:
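A minimal sketch of what one-letter-per-field skeleton keys and a table indexed by them could look like. Everything here is an assumption for illustration, not the actual ICU4X design: the `Field` enum, the `skeleton_key` and `lookup` helpers, the field ordering, the `(locale, skeleton)` tuple used as the index, and the placeholder pattern strings are all hypothetical. The letters follow the usual CLDR pattern symbols (`G`, `y`, `M`, `E`, `d`).

```rust
use std::collections::BTreeMap;

/// Hypothetical set of date fields. Declaration order doubles as the
/// canonical order of letters in the key (an arbitrary choice here).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Field {
    Era,
    Year,
    Month,
    Weekday,
    Day,
}

impl Field {
    /// One letter per field, using the CLDR pattern symbol for that field.
    fn letter(self) -> char {
        match self {
            Field::Era => 'G',
            Field::Year => 'y',
            Field::Month => 'M',
            Field::Weekday => 'E',
            Field::Day => 'd',
        }
    }
}

/// Builds the auxiliary key: sort into canonical order, drop duplicates,
/// and emit exactly one letter per field.
fn skeleton_key(fields: &[Field]) -> String {
    let mut sorted = fields.to_vec();
    sorted.sort();
    sorted.dedup();
    sorted.into_iter().map(Field::letter).collect()
}

/// Hypothetical table indexed by (locale, skeleton key).
type Table = BTreeMap<(String, String), String>;

/// Fills the table with placeholder values; real data would hold patterns.
fn example_table() -> Table {
    let key = skeleton_key(&[Field::Day, Field::Year, Field::Month]);
    let mut table = Table::new();
    for locale in ["en", "ja"] {
        let value = format!("<placeholder pattern for {locale}/{key}>");
        table.insert((locale.to_string(), key.clone()), value);
    }
    table
}

/// Looks up the entry for a locale and skeleton key, if present.
fn lookup<'a>(table: &'a Table, locale: &str, skeleton: &str) -> Option<&'a str> {
    table
        .get(&(locale.to_string(), skeleton.to_string()))
        .map(String::as_str)
}
```

With this sketch, a request for year, month, and day in any order maps to the same key, so `lookup(&example_table(), "en", "yMd")` finds an entry regardless of how the caller listed the fields.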
As a reminder, the starting point was:
So this proposal brings us from 167 KB for all skeleta in a single calendar to 108 KB for all skeleta in all calendars.*
* plus some yet-to-be-calculated overhead for the lookup table
Discussion from ICU4X-WG:
One skeleton, all calendars and locales = 42057 B (note: this is YearMonthDay which is highly likely to be among the larger skeleta due to more differences across locales)
For comparison, all date skeleta, all locales, only Gregorian = 166515 B