Adds in default Intl.DateTimeFormat #127

Merged on Nov 29, 2022 (21 commits)
Commits
ac7d2df
adds default Intl.DateTimeFormat
TatianaKapos Aug 30, 2022
1053de6
Merge branch 'main' of https://github.com/TatianaKapos/hermes-windows…
TatianaKapos Sep 6, 2022
792cc8f
Merge branch 'main' of https://github.com/TatianaKapos/hermes-windows…
TatianaKapos Sep 8, 2022
0963919
adds supportedLocales
TatianaKapos Sep 8, 2022
9141dbf
simple PR feedback
TatianaKapos Sep 20, 2022
84e76ed
format
TatianaKapos Sep 20, 2022
a94b954
more simple PR feedback
TatianaKapos Sep 28, 2022
e989dfb
change to nullptr
TatianaKapos Sep 28, 2022
065d043
use std:optional
TatianaKapos Sep 29, 2022
61c0b07
Create shared code file and more PR feedback
TatianaKapos Sep 29, 2022
279b2c2
free datetimeformat
TatianaKapos Sep 30, 2022
13b245e
Merge branch 'main' of https://github.com/TatianaKapos/hermes-windows…
TatianaKapos Sep 30, 2022
f8e219f
store std::string globally
TatianaKapos Oct 13, 2022
98027ea
Merge branch 'main' into tk-datetimeformat
TatianaKapos Oct 17, 2022
94ed8a6
add more error checking
TatianaKapos Nov 15, 2022
d68611f
Merge branch 'tk-datetimeformat' of https://github.com/TatianaKapos/h…
TatianaKapos Nov 15, 2022
70e70b4
Merge branch 'main' of https://github.com/TatianaKapos/hermes-windows…
TatianaKapos Nov 15, 2022
589180d
Update IntlAPIs.md
TatianaKapos Nov 17, 2022
271bdc8
fix camelCase
TatianaKapos Nov 28, 2022
784672f
Merge branch 'tk-datetimeformat' of https://github.com/TatianaKapos/h…
TatianaKapos Nov 28, 2022
c97f311
remove old comment
TatianaKapos Nov 28, 2022
98 changes: 81 additions & 17 deletions doc/IntlAPIs.md
@@ -3,16 +3,18 @@ id: intl
title: Internationalization APIs
---

This document describes the current state of Android implementation of the [ECMAScript Internationalization API Specification](https://tc39.es/ecma402/) (ECMA-402, or `Intl`). ECMA-402 is still evolving and the latest iteration is [7th edition](https://402.ecma-international.org/7.0/) which was published in June 2020. Each new edition is built on top of the last one and adds new capabilities typically as,
This document describes the current state of the Android and Windows implementations of the [ECMAScript Internationalization API Specification](https://tc39.es/ecma402/) (ECMA-402, or `Intl`). ECMA-402 is still evolving, and the latest iteration is the [7th edition](https://402.ecma-international.org/7.0/), which was published in June 2020. Each new edition builds on the last one and typically adds new capabilities as:
- New `Intl` service constructors (e.g. `Intl.Collator`, `Intl.NumberFormat` etc.) or extending existing ones by accepting more parameters
- New functions or properties in `Intl` objects (e.g. `Intl.Collator.prototype.compare`)
- New locale aware functions in standard Javascript object prototypes (e.g. `String.prototype.localeCompare`)

# Android

One popular implementation strategy, followed by other engines, is to bundle an internationalization framework (typically [ICU](http://site.icu-project.org/)) with the application package. This guarantees deterministic behaviour at the cost of application package bloat. We decided to consume the facilities provided by the Android platform for space efficiency, at the cost of some variance in behaviour across Android platforms.

# ECMA-402 Compliance
## ECMA-402 Compliance

## Supported
### Supported

- `Intl.Collator`
- `Intl.Collator.supportedLocalesOf`
@@ -49,7 +51,7 @@ One popular implementation strategy followed by other engines, is to bundle an i
- `toLocaleDateString`
- `toLocaleTimeString`

## Not yet supported
### Not yet supported

- [`Intl.PluralRules`](https://tc39.es/ecma402/#pluralrules-objects)

@@ -67,13 +69,13 @@ One popular implementation strategy followed by other engines, is to bundle an i
- [`fractionalSecondDigits`](https://github.com/tc39/ecma402/pull/347)
- [`BigInt.prototype.toLocaleString`](https://tc39.es/ecma402/#sup-bigint.prototype.tolocalestring)

## Excluded
### Excluded

- `Intl.DateTimeFormat`: [`formatMatcher`](https://tc39.es/ecma402/#sec-basicformatmatcher) parameter is not respected. The parameter enables the implementation to pick the best display format when it supports only a subset of all possible formats. ICU library in Android platform and hence our implementation allows all subsets and formats which makes this `formatMatcher` property unnecessary.

## Limitations across Android SDKs
### Limitations across Android SDKs

### Android 11
#### Android 11

- The keys of the object returned by `resolvedOptions` function in all `Intl` services are not deterministically ordered as prescribed by spec.
- DateFormat: ECMAScript [beginning of time](https://www.ecma-international.org/ecma-262/11.0/index.html#sec-time-values-and-time-range) (-8,640,000,000,000,000), is formatted as `November 271817`, instead of expected `April 271822`.
@@ -83,48 +85,48 @@ One popular implementation strategy followed by other engines, is to bundle an i
- `signDisplay`
- `currencyFormat`

### Android 10 and older (SDK < 30)
#### Android 10 and older (SDK < 30)

- `Intl.NumberFormat`: Scientific notation formatting has issues in some cases, e.g. `-Infinity` may be formatted as '-∞E0' instead of the expected '-∞'. Another manifestation is that `formatToParts` may return 4 parts instead of 2.
- `Intl.NumberFormat`: Compact notation `formatToParts` doesn't identify the unit, so we report it as 'literal'. For example, the second part of "100ac" is reported as "literal" instead of "compact".

### Android 9 and older (SDK < 29)
#### Android 9 and older (SDK < 29)

- There are some failures, likely due to older Unicode and CLDR versions, which are hard to generalize. Some examples:
  - `Intl.NumberFormat`: 'Percent' is not accepted as a unit.
  - `Intl.NumberFormat`: unit symbols differ, e.g. kph vs. km/h.
  - Some issues with significant-digit precision, which have not yet been investigated in detail.

### Android 8.0 – 8.1 and older (SDK < 28)
#### Android 8.0 – 8.1 and older (SDK < 28)

- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variance, e.g. und-u-tz-utc vs. und-u-tz-gmt.
- `Intl.NumberFormat`: CompactFormatter doesn't respect the precision inputs.

### Android 7.0 - 7.1 and older (SDK < 26)
#### Android 7.0 - 7.1 and older (SDK < 26)

- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variance, e.g. und-u-ms-imperial vs. und-u-ms-uksystem.

### Android 7.0 - 7.1 and older (SDK < 24)
#### Android 7.0 - 7.1 and older (SDK < 24)

- `Intl.Collator`: Doesn't canonically decompose the input strings. Canonically equivalent strings with non-identical code points may not match.
- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variance, e.g. und-u-ca-ethiopic-amete-alem vs. und-u-ca-ethioaa, und-u-ks-primary vs. und-u-ks-level1.
- `Intl.NumberFormat`: Unit style does not work.
- `Intl.NumberFormat`: There are issues in the precision configuration due to a lack of APIs.
- `Intl.DateFormat`: There are issues with the calendar configuration which need further investigation.

### SDK < 21 and older
#### SDK < 21 and older

On platforms before SDK 21, `Locale.forLanguageTag()` is not available, so we can't construct a `java.util.Locale` object from a locale tag. We therefore fall back to English for any locale input.

# Internationalization framework in Android Platform
## Internationalization framework in Android Platform

Our implementation is essentially projecting the Android platform provided internationalization facilities through the ECMA-402 specified services. It implies that the results of running the same code may vary between devices running different versions of Android.

Android platform internationalization libraries are based on the [ICU4j project](https://unicode-org.github.io/icu-docs/#/icu4j). The version of ICU4j and the backing [CLDR data](http://cldr.unicode.org/) vary across Android platform versions. Also, the ICU APIs were never exposed directly, but only through wrappers and aliases. This results in significant variance in internationalization API surface and data across platform versions.

The following table summarizes ICU, CLDR and Unicode versions available on the Android platforms.

### Platform 24+ where ICU4j APIs are available.
#### Platform 24+ where ICU4j APIs are available.

| Android Platform Version | ICU | CLDR | Unicode |
| --- | --- | --- | --- |
@@ -135,7 +137,7 @@ The following table summarizes ICU, CLDR and Unicode versions available on the A
| Android 7.0 - 7.1 (API levels 24 - 25) | ICU4j 56 ([ref](https://developer.android.com/guide/topics/resources/internationalization))| CLDR 28 | Unicode 8.0 |


### Pre-24 platforms
#### Pre-24 platforms

| Android Platform Version | ICU | Unicode | CLDR
| --- | --- | --- | --- |
@@ -157,7 +159,7 @@ In summary,

4. Platform 30 has introduced classes under [`android.icu.number`](https://developer.android.com/reference/android/icu/util/package-summary) namespace which will majorly improve our `Intl.NumberFormat` implementation

# Impact on Application Size
## Impact on Application Size

The following numbers were measured using a test application which takes a dependency on the Hermes library to evaluate a JavaScript snippet. Essentially, enabling Intl APIs adds 57-62K per ABI.

@@ -193,3 +195,65 @@ And finally, this is the increase in the final npm package,
| **NPM Package** | **NOINTL** | **INTL** | **DIFF** | **PERC** |
| --- | --- | --- | --- | --- |
| hermes | 214447973 | 219291220 | 4,843,247 | 2.26% |

# Windows

The Windows Intl APIs are a work in progress, and support is currently very limited. We'll track the status of each API here as we work through them.

## ECMA-402 Compliance
### Supported
- `Intl.DateTimeFormat`
  - `Intl.DateTimeFormat.supportedLocalesOf`
  - `Intl.DateTimeFormat.prototype.format`
  - `Intl.DateTimeFormat.prototype.resolvedOptions`

- `Intl.getCanonicalLocales`
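
For `supportedLocalesOf`, ECMA-402 defines a lookup in which each requested locale is progressively truncated until it matches an available locale (BestAvailableLocale), and every requested locale that finds a match is reported as supported (LookupSupportedLocales). The following is a minimal stand-alone sketch of that lookup, not the Hermes implementation itself: the names `bestAvailableLocaleSketch`, `lookupSupportedLocalesSketch`, and `lookupExample` are hypothetical, and it uses `std::string` rather than `std::u16string` for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sketch of ECMA-402 BestAvailableLocale (std::string for brevity).
std::optional<std::string> bestAvailableLocaleSketch(
    const std::vector<std::string> &availableLocales,
    std::string candidate) {
  while (true) {
    // An exact match ends the search.
    if (std::find(
            availableLocales.begin(), availableLocales.end(), candidate) !=
        availableLocales.end())
      return candidate;
    // Otherwise drop the last "-"-separated subtag, if there is one.
    std::string::size_type pos = candidate.rfind('-');
    if (pos == std::string::npos)
      return std::nullopt;
    // Also drop a preceding single-character subtag such as "-u" or "-x".
    if (pos >= 2 && candidate[pos - 2] == '-')
      pos -= 2;
    candidate.resize(pos);
  }
}

// Hypothetical sketch of ECMA-402 LookupSupportedLocales: keeps each
// requested locale (unmodified) for which some available locale matches.
std::vector<std::string> lookupSupportedLocalesSketch(
    const std::vector<std::string> &availableLocales,
    const std::vector<std::string> &requestedLocales) {
  std::vector<std::string> subset;
  for (const std::string &locale : requestedLocales)
    if (bestAvailableLocaleSketch(availableLocales, locale))
      subset.push_back(locale);
  return subset;
}

// Usage example with made-up locale lists.
void lookupExample() {
  const std::vector<std::string> available = {"en", "fr-FR"};
  const std::vector<std::string> requested = {
      "en-US", "ja-JP", "fr-FR-u-ca-gregory"};
  const std::vector<std::string> expected = {"en-US", "fr-FR-u-ca-gregory"};
  // "en-US" falls back to "en"; "ja-JP" has no match and is dropped.
  assert(lookupSupportedLocalesSketch(available, requested) == expected);
}
```

Note that the result contains the requested tags, not the matched ones: `en-US` is reported as supported because it falls back to `en`.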

### Not yet supported

- `Intl.DateTimeFormat`
  - `Intl.DateTimeFormat.prototype.formatToParts`

- `Intl.DateTimeFormat` properties
  - `dayPeriod`
  - `fractionalSecondDigits`
  - `formatMatcher`

- `Intl.Collator`
  - `Intl.Collator.supportedLocalesOf`
  - `Intl.Collator.prototype.compare`
  - `Intl.Collator.prototype.resolvedOptions`

- `Intl.NumberFormat`
  - `Intl.NumberFormat.supportedLocalesOf`
  - `Intl.NumberFormat.prototype.format`
  - `Intl.NumberFormat.prototype.formatToParts`
  - `Intl.NumberFormat.prototype.resolvedOptions`

- `String.prototype`
  - `localeCompare`
  - `toLocaleLowerCase`
  - `toLocaleUpperCase`

- `Array.prototype`
  - `toLocaleString`

- `Number.prototype`
  - `toLocaleString`

- `Date.prototype`
  - `toLocaleString`
  - `toLocaleDateString`
  - `toLocaleTimeString`

- [`Intl.PluralRules`](https://tc39.es/ecma402/#pluralrules-objects)

- [`Intl.RelativeTimeFormat`](https://tc39.es/ecma402/#relativetimeformat-objects)

- [`Intl.DisplayNames`](https://tc39.es/proposal-intl-displaynames/#sec-intl-displaynames-constructor)

- [`Intl.ListFormat`](https://tc39.es/proposal-intl-list-format/#sec-intl-listformat-constructor)

- [`Intl.Locale`](https://tc39.es/ecma402/#sec-intl-locale-constructor)

- [`BigInt.prototype.toLocaleString`](https://tc39.es/ecma402/#sup-bigint.prototype.tolocalestring)
176 changes: 176 additions & 0 deletions lib/Platform/Intl/PlatformIntlShared.cpp
@@ -0,0 +1,176 @@
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/

// This file contains code shared between the Apple and Windows
// implementations of the Intl APIs.
#include "hermes/Platform/Intl/PlatformIntl.h"

using namespace ::hermes;

namespace hermes {
namespace platform_intl {

// https://402.ecma-international.org/8.0/#sec-bestavailablelocale
std::optional<std::u16string> bestAvailableLocale(
    const std::vector<std::u16string> &availableLocales,
    const std::u16string &locale) {
  // 1. Let candidate be locale
  std::u16string candidate = locale;

  // 2. Repeat
  while (true) {
    // a. If availableLocales contains an element equal to candidate, return
    // candidate.
    if (llvh::find(availableLocales, candidate) != availableLocales.end())
      return candidate;
    // b. Let pos be the character index of the last occurrence of "-" (U+002D)
    // within candidate.
    size_t pos = candidate.rfind(u'-');

    // ...If that character does not occur, return undefined.
    if (pos == std::u16string::npos)
      return std::nullopt;

    // c. If pos ≥ 2 and the character "-" occurs at index pos-2 of candidate,
    // decrease pos by 2.
    if (pos >= 2 && candidate[pos - 2] == u'-')
      pos -= 2;

    // d. Let candidate be the substring of candidate from position 0,
    // inclusive, to position pos, exclusive.
    candidate.resize(pos);
  }
}

// https://402.ecma-international.org/8.0/#sec-lookupsupportedlocales
std::vector<std::u16string> lookupSupportedLocales(
    const std::vector<std::u16string> &availableLocales,
    const std::vector<std::u16string> &requestedLocales) {
  // 1. Let subset be a new empty List.
  std::vector<std::u16string> subset;
  // 2. For each element locale of requestedLocales in List order, do
  for (const std::u16string &locale : requestedLocales) {
    // a. Let noExtensionsLocale be the String value that is locale with all
    // Unicode locale extension sequences removed.
    // We can skip this step, see the comment in lookupMatcher.
    // b. Let availableLocale be BestAvailableLocale(availableLocales,
    // noExtensionsLocale).
    std::optional<std::u16string> availableLocale =
        bestAvailableLocale(availableLocales, locale);
    // c. If availableLocale is not undefined, append locale to the end of
    // subset.
    if (availableLocale) {
      subset.push_back(locale);
    }
  }
  // 3. Return subset.
  return subset;
}

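// Boolean-only variant of
// https://402.ecma-international.org/8.0/#sec-getoption
std::optional<bool> getOptionBool(
    vm::Runtime &runtime,
    const Options &options,
    const std::u16string &property,
    std::optional<bool> fallback) {
  // 1. Assert: Type(options) is Object.
  // 2. Let value be ? Get(options, property).
  auto value = options.find(property);
  // 3. If value is undefined, return fallback.
  if (value == options.end()) {
    return fallback;
  }
  // Steps 4-7 (type coercion and the allowed-values check) are not performed
  // here; getBool() assumes the stored option already holds a boolean.
  // 8. Return value.
  return value->second.getBool();
}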

// Implementation of
// https://402.ecma-international.org/8.0/#sec-todatetimeoptions
vm::CallResult<Options> toDateTimeOptions(
    vm::Runtime &runtime,
    Options options,
    std::u16string_view required,
    std::u16string_view defaults) {
  // 1. If options is undefined, let options be null; otherwise let options be ?
  // ToObject(options).
  // 2. Let options be OrdinaryObjectCreate(options).
  // 3. Let needDefaults be true.
  bool needDefaults = true;
  // 4. If required is "date" or "any", then
  if (required == u"date" || required == u"any") {
    // a. For each property name prop of « "weekday", "year", "month", "day" »,
    // do
    // TODO(T116352920): Make this a std::u16string props[] once we have
    // constexpr std::u16string.
    static constexpr std::u16string_view props[] = {
        u"weekday", u"year", u"month", u"day"};
    for (const auto &prop : props) {
      // i. Let value be ? Get(options, prop).
      if (options.find(std::u16string(prop)) != options.end()) {
        // ii. If value is not undefined, let needDefaults be false.
        needDefaults = false;
      }
    }
  }
  // 5. If required is "time" or "any", then
  if (required == u"time" || required == u"any") {
    // a. For each property name prop of « "dayPeriod", "hour", "minute",
    // "second", "fractionalSecondDigits" », do
    static constexpr std::u16string_view props[] = {
        u"dayPeriod", u"hour", u"minute", u"second", u"fractionalSecondDigits"};
    for (const auto &prop : props) {
      // i. Let value be ? Get(options, prop).
      if (options.find(std::u16string(prop)) != options.end()) {
        // ii. If value is not undefined, let needDefaults be false.
        needDefaults = false;
      }
    }
  }
  // 6. Let dateStyle be ? Get(options, "dateStyle").
  auto dateStyle = options.find(u"dateStyle");
  // 7. Let timeStyle be ? Get(options, "timeStyle").
  auto timeStyle = options.find(u"timeStyle");
  // 8. If dateStyle is not undefined or timeStyle is not undefined, let
  // needDefaults be false.
  if (dateStyle != options.end() || timeStyle != options.end()) {
    needDefaults = false;
  }
  // 9. If required is "date" and timeStyle is not undefined, then
  if (required == u"date" && timeStyle != options.end()) {
    // a. Throw a TypeError exception.
    return runtime.raiseTypeError(
        "Unexpectedly found timeStyle option for \"date\" property");
  }
  // 10. If required is "time" and dateStyle is not undefined, then
  if (required == u"time" && dateStyle != options.end()) {
    // a. Throw a TypeError exception.
    return runtime.raiseTypeError(
        "Unexpectedly found dateStyle option for \"time\" property");
  }
  // 11. If needDefaults is true and defaults is either "date" or "all", then
  if (needDefaults && (defaults == u"date" || defaults == u"all")) {
    // a. For each property name prop of « "year", "month", "day" », do
    static constexpr std::u16string_view props[] = {u"year", u"month", u"day"};
    for (const auto &prop : props) {
      // i. Perform ? CreateDataPropertyOrThrow(options, prop, "numeric").
      options.emplace(prop, Option(std::u16string(u"numeric")));
    }
  }
  // 12. If needDefaults is true and defaults is either "time" or "all", then
  if (needDefaults && (defaults == u"time" || defaults == u"all")) {
    // a. For each property name prop of « "hour", "minute", "second" », do
    static constexpr std::u16string_view props[] = {
        u"hour", u"minute", u"second"};
    for (const auto &prop : props) {
      // i. Perform ? CreateDataPropertyOrThrow(options, prop, "numeric").
      options.emplace(prop, Option(std::u16string(u"numeric")));
    }
  }
  // 13. Return options.
  return options;
}

} // namespace platform_intl
} // namespace hermes
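
The defaulting rule in `toDateTimeOptions` above can be tried outside Hermes with a simplified model. The sketch below is an illustration under stated assumptions, not the code from this PR: `OptionsSketch`, `toDateTimeOptionsSketch`, and `defaultsExample` are hypothetical names, options are modelled as a plain `std::map` of strings, and an empty `std::optional` stands in for the `TypeError` that the real function raises through `vm::Runtime`.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical simplified options bag: property name -> string value.
using OptionsSketch = std::map<std::string, std::string>;

// Sketch of ECMA-402 ToDateTimeOptions. Returns std::nullopt where the
// specification throws a TypeError.
std::optional<OptionsSketch> toDateTimeOptionsSketch(
    OptionsSketch options,
    std::string_view required,
    std::string_view defaults) {
  bool needDefaults = true;
  auto has = [&options](const char *key) { return options.count(key) != 0; };

  // Any explicit date or time component suppresses the defaults.
  if (required == "date" || required == "any")
    for (const char *prop : {"weekday", "year", "month", "day"})
      if (has(prop))
        needDefaults = false;
  if (required == "time" || required == "any")
    for (const char *prop :
         {"dayPeriod", "hour", "minute", "second", "fractionalSecondDigits"})
      if (has(prop))
        needDefaults = false;

  // So does dateStyle or timeStyle.
  const bool hasDateStyle = has("dateStyle");
  const bool hasTimeStyle = has("timeStyle");
  if (hasDateStyle || hasTimeStyle)
    needDefaults = false;

  // A style of the wrong kind is an error.
  if (required == "date" && hasTimeStyle)
    return std::nullopt;
  if (required == "time" && hasDateStyle)
    return std::nullopt;

  // Fill in "numeric" defaults when nothing was specified.
  if (needDefaults && (defaults == "date" || defaults == "all"))
    for (const char *prop : {"year", "month", "day"})
      options.emplace(prop, "numeric");
  if (needDefaults && (defaults == "time" || defaults == "all"))
    for (const char *prop : {"hour", "minute", "second"})
      options.emplace(prop, "numeric");
  return options;
}

// Usage example: empty options with required "any" and defaults "date"
// receive year, month and day set to "numeric".
void defaultsExample() {
  auto result = toDateTimeOptionsSketch(OptionsSketch{}, "any", "date");
  assert(result.has_value());
  assert(result->size() == 3);
  assert(result->at("year") == "numeric");
}
```

In this model, passing any explicit component (for example `hour`) leaves the options unchanged, which mirrors steps 4, 5, and 8 of the specification algorithm.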