[wasm] Estimate ICU data size savings when only shipping English #48355

Closed
CoffeeFlux opened this issue Feb 16, 2021 · 9 comments
Labels
arch-wasm WebAssembly architecture area-System.Globalization size-reduction Issues impacting final app size primary for size sensitive workloads

@CoffeeFlux
Contributor

CoffeeFlux commented Feb 16, 2021

It should be possible to modify https://github.com/dotnet/icu/tree/maint/maint-67/icu-filters and generate the data file for English only instead of EFIGS (English, French, Italian, German, Spanish) to see how much space we save.

To generate the file, first build the full repo. After adjusting the filter, you can rebuild the data files with make -f icu.mk data-icudt TARGET_OS=browser TARGET_ARCHITECTURE=wasm.

@CoffeeFlux CoffeeFlux added arch-wasm WebAssembly architecture area-System.Globalization size-reduction Issues impacting final app size primary for size sensitive workloads labels Feb 16, 2021
@ghost

ghost commented Feb 16, 2021

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

@tannergooding can you handle getting the estimate?


@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 16, 2021
@CoffeeFlux CoffeeFlux added this to the 6.0.0 milestone Feb 16, 2021
@CoffeeFlux
Contributor Author

If the size savings look worthwhile, we can adjust this issue to split up the files further.

@CoffeeFlux CoffeeFlux removed the untriaged New issue has not been triaged by the area owner label Feb 16, 2021
@CoffeeFlux
Contributor Author

Full EFIGS:

-rw-r--r--  1 ryan  staff   602096 Feb 23 16:42 icudt_EFIGS.dat

en_US only:

-rw-r--r--  1 ryan  staff   495072 Feb 23 16:55 icudt_EFIGS.dat

These sizes are uncompressed. I'm not sure this is a significant enough win post-compression to justify the extra effort, but cc: @eerhardt

@CoffeeFlux
Contributor Author

Also, for reference, all English locales:

-rw-r--r--  1 ryan  staff   549520 Feb 23 19:25 icudt_EFIGS.dat
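For quick reference, the uncompressed savings implied by the three listings above can be worked out as follows. This is only a sketch: the byte counts are copied from the ls output in this thread, while the dictionary and the function name are made up for illustration and are not part of the ICU build.

```python
# Uncompressed sizes in bytes, copied from the ls listings in this thread.
UNCOMPRESSED = {
    "EFIGS": 602096,
    "all English locales": 549520,
    "en_US only": 495072,
}


def uncompressed_savings(baseline, candidate):
    """Return (bytes saved, percent saved) relative to the baseline size."""
    saved = baseline - candidate
    return saved, 100.0 * saved / baseline


for name in ("all English locales", "en_US only"):
    saved, pct = uncompressed_savings(UNCOMPRESSED["EFIGS"], UNCOMPRESSED[name])
    print(f"{name}: {saved} bytes saved ({pct:.1f}% of EFIGS)")
```

Under these numbers, en_US only is 107024 bytes smaller than EFIGS before compression, and all English locales is 52576 bytes smaller.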

@lewing
Member

lewing commented Feb 25, 2021

Compressed sizes?

@CoffeeFlux
Contributor Author

That one is harder to check without knowing the exact Brotli settings we use - do you know where I can find those?

@CoffeeFlux
Contributor Author

By default, we use Brotli quality 11.

Full EFIGS:

-rw-r--r--  1 ryan  staff   148657 Feb 26 17:56 icudt_EFIGS.dat.br

All English locales:

-rw-r--r--  1 ryan  staff   138302 Feb 23 19:25 icudt_EFIGS.dat.br

en_US only:

-rw-r--r--  1 ryan  staff   130624 Feb 26 17:49 icudt_EFIGS.dat.br

So shipping only en_US instead of EFIGS saves about 18 KB after compression (148657 vs. 130624 bytes). Shipping all English locales saves around 10 KB (148657 vs. 138302 bytes).
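As a rough cross-check of those figures, the sketch below compares the compressed savings against the uncompressed ones reported earlier in the thread, to show how much of the raw reduction survives Brotli. The byte counts come from the listings in this issue; the table layout and helper name are hypothetical and exist only for this illustration.

```python
# (uncompressed bytes, Brotli-compressed bytes) from the listings in this thread.
SIZES = {
    "EFIGS": (602096, 148657),
    "all English locales": (549520, 138302),
    "en_US only": (495072, 130624),
}


def compare_to_efigs(name):
    """Return (raw bytes saved, compressed bytes saved) versus the EFIGS file."""
    base_raw, base_br = SIZES["EFIGS"]
    raw, br = SIZES[name]
    return base_raw - raw, base_br - br


for name in ("all English locales", "en_US only"):
    raw_saved, br_saved = compare_to_efigs(name)
    surviving = 100.0 * br_saved / raw_saved
    print(
        f"{name}: {raw_saved} bytes saved raw, {br_saved} bytes saved compressed "
        f"({surviving:.1f}% of the raw savings remain after Brotli)"
    )
```

With these inputs, the compressed savings are 18033 bytes for en_US only and 10355 bytes for all English locales, which is a small fraction of the raw difference.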

@CoffeeFlux
Contributor Author

CoffeeFlux commented Mar 1, 2021

@lewing do you think this is worth splitting up further, or are the savings here not worth the added complexity? I'd prefer to defer that judgement to you, since I don't have a good idea of how much complexity this would add on the Blazor end, but I would assume it's not really worth it.

@CoffeeFlux
Contributor Author

CoffeeFlux commented Mar 5, 2021

This is probably only worth it if we are able to split the locale-specific data from the more generic data required.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 4, 2021