Adds in default Intl.DateTimeFormat (#127)
* adds default Intl.DateTimeFormat

* adds supportedLocales

* simple PR feedback

* format

* more simple PR feedback

* change to nullptr

* use std::optional

* Create shared code file and more PR feedback

* free datetimeformat

* store std::string globally

* add more error checking

* Update IntlAPIs.md

* fix camelCase

* remove old comment
TatianaKapos authored Nov 29, 2022
1 parent 1e01740 commit f5d4942
Showing 5 changed files with 1,383 additions and 50 deletions.
98 changes: 81 additions & 17 deletions doc/IntlAPIs.md
@@ -3,16 +3,18 @@ id: intl
title: Internationalization APIs
---

This document describes the current state of Android implementation of the [ECMAScript Internationalization API Specification](https://tc39.es/ecma402/) (ECMA-402, or `Intl`). ECMA-402 is still evolving and the latest iteration is [7th edition](https://402.ecma-international.org/7.0/) which was published in June 2020. Each new edition is built on top of the last one and adds new capabilities typically as,
This document describes the current state of the Android and Windows implementations of the [ECMAScript Internationalization API Specification](https://tc39.es/ecma402/) (ECMA-402, or `Intl`). ECMA-402 is still evolving; the latest iteration is the [7th edition](https://402.ecma-international.org/7.0/), which was published in June 2020. Each new edition builds on the last one and typically adds new capabilities as:
- New `Intl` service constructors (e.g. `Intl.Collator`, `Intl.NumberFormat` etc.) or extending existing ones by accepting more parameters
- New functions or properties in `Intl` objects (e.g. `Intl.Collator.prototype.compare`)
- New locale-aware functions in standard JavaScript object prototypes (e.g. `String.prototype.localeCompare`)

# Android

One popular implementation strategy, followed by other engines, is to bundle an internationalization framework (typically [ICU](http://site.icu-project.org/)) along with the application package. This guarantees deterministic behaviour at the cost of application package bloat. We decided to consume the facilities provided by the Android platform for space efficiency, at the cost of some variance in behaviour across Android platforms.

# ECMA-402 Compliance
## ECMA-402 Compliance

## Supported
### Supported

- `Intl.Collator`
- `Intl.Collator.supportedLocalesOf`
@@ -49,7 +51,7 @@ One popular implementation strategy followed by other engines, is to bundle an i
- `toLocaleDateString`
- `toLocaleTimeString`

## Not yet supported
### Not yet supported

- [`Intl.PluralRules`](https://tc39.es/ecma402/#pluralrules-objects)

@@ -67,13 +69,13 @@ One popular implementation strategy followed by other engines, is to bundle an i
- [`fractionalSecondDigits`](https://github.com/tc39/ecma402/pull/347)
- [`BigInt.prototype.toLocaleString`](https://tc39.es/ecma402/#sup-bigint.prototype.tolocalestring)

## Excluded
### Excluded

- `Intl.DateTimeFormat`: the [`formatMatcher`](https://tc39.es/ecma402/#sec-basicformatmatcher) parameter is not respected. The parameter enables the implementation to pick the best display format when it supports only a subset of all possible formats. The ICU library in the Android platform, and hence our implementation, allows all subsets and formats, which makes the `formatMatcher` property unnecessary.

## Limitations across Android SDKs
### Limitations across Android SDKs

### Android 11
#### Android 11

- The keys of the object returned by `resolvedOptions` function in all `Intl` services are not deterministically ordered as prescribed by spec.
- DateFormat: ECMAScript [beginning of time](https://www.ecma-international.org/ecma-262/11.0/index.html#sec-time-values-and-time-range) (-8,640,000,000,000,000), is formatted as `November 271817`, instead of expected `April 271822`.
@@ -83,48 +85,48 @@ One popular implementation strategy followed by other engines, is to bundle an i
- `signDisplay`
- `currencyFormat`

### Android 10 and older (SDK < 30)
#### Android 10 and older (SDK < 30)

- `Intl.NumberFormat`: Scientific notation formatting has issues in some cases, e.g. `-Infinity` may get formatted as '-∞E0' instead of the expected '-∞'. Another manifestation of the issue is that `formatToParts` may return 4 parts instead of 2.
- `Intl.NumberFormat`: Compact notation `formatToParts` doesn't identify the unit, hence we report the unit as 'literal'. For example, the second part of "100ac" gets reported as "literal" instead of "compact".

### Android 9 and older (SDK < 29)
#### Android 9 and older (SDK < 29)

- There are some failures, likely due to older Unicode and CLDR versions, which are hard to generalize. Some examples are:
  - `Intl.NumberFormat`: 'Percent' is not accepted as a unit.
  - `Intl.NumberFormat`: unit symbols differ, e.g. kph vs km/h.
  - Some issues in significant digit precision, which have not yet been investigated in detail.

### Android 8.0 – 8.1 and older (SDK < 28)
#### Android 8.0 – 8.1 and older (SDK < 28)

- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variances, e.g. und-u-tz-utc vs. und-u-tz-gmt.
- `Intl.NumberFormat`: CompactFormatter doesn't respect the precision inputs.

### Android 7.0 - 7.1 and older (SDK < 26)
#### Android 7.0 - 7.1 and older (SDK < 26)

- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variances, e.g. und-u-ms-imperial vs. und-u-ms-uksystem.

### Android 6.0 and older (SDK < 24)
#### Android 6.0 and older (SDK < 24)

- `Intl.Collator`: Doesn't canonically decompose the input strings. Canonically equivalent string with non-identical code points may not match.
- `Intl.getCanonicalLocales`: Unicode/CLDR version differences result in some variances, e.g. und-u-ca-ethiopic-amete-alem vs. und-u-ca-ethioaa, und-u-ks-primary vs. und-u-ks-level1.
- `Intl.NumberFormat`: Unit style does not work.
- `Intl.NumberFormat`: There are issues in the precision configuration due to lack of APIs.
- `Intl.DateTimeFormat`: There are issues with the calendar configuration which need further investigation.

### SDK < 21 and older
#### SDK < 21 and older

On platforms before SDK 21, `Locale.forLanguageTag()` is not available, so we can't construct a `java.util.Locale` object from a locale tag. We therefore fall back to English for any locale input.

# Internationalization framework in Android Platform
## Internationalization framework in Android Platform

Our implementation is essentially projecting the Android platform provided internationalization facilities through the ECMA-402 specified services. It implies that the results of running the same code may vary between devices running different versions of Android.

Android platform internationalization libraries have been based on the [ICU4j project](https://unicode-org.github.io/icu-docs/#/icu4j). The version of ICU4j and the backing [CLDR data](http://cldr.unicode.org/) vary across Android platform versions. Also, the ICU APIs were never exposed directly, but only through wrappers and aliases. This results in significant variance in internationalization API surface and data across platform versions.

The following table summarizes ICU, CLDR and Unicode versions available on the Android platforms.

### Platform 24+ where ICU4j APIs are available.
#### Platform 24+ where ICU4j APIs are available.

| Android Platform Version | ICU | CLDR | Unicode |
| --- | --- | --- | --- |
@@ -135,7 +137,7 @@ The following table summarizes ICU, CLDR and Unicode versions available on the A
| Android 7.0 - 7.1 (API levels 24 - 25) | ICU4j 56 ([ref](https://developer.android.com/guide/topics/resources/internationalization))| CLDR 28 | Unicode 8.0 |


### Pre-24 platforms
#### Pre-24 platforms

| Android Platform Version | ICU | Unicode | CLDR
| --- | --- | --- | --- |
@@ -157,7 +159,7 @@ In summary,

4. Platform 30 has introduced classes under the [`android.icu.number`](https://developer.android.com/reference/android/icu/util/package-summary) namespace, which will significantly improve our `Intl.NumberFormat` implementation.

# Impact on Application Size
## Impact on Application Size

The following numbers are measured using a test application which takes a dependency on the Hermes library to evaluate a JavaScript snippet. Essentially, enabling Intl APIs adds 57-62K per ABI.

@@ -193,3 +195,65 @@ And finally, this is the increase in the final npm package,
| **NPM Package** | **NOINTL** | **INTL** | **DIFF** | **PERC** |
| --- | --- | --- | --- | --- |
| hermes | 214447973 | 219291220 | 4,843,247 | 2.26% |

# Windows

The Windows Intl APIs are a work in progress and support is currently very limited. We'll keep track of the status of the APIs here as we work through them.

## ECMA-402 Compliance
### Supported
- `Intl.DateTimeFormat`
  - `Intl.DateTimeFormat.supportedLocalesOf`
  - `Intl.DateTimeFormat.prototype.format`
  - `Intl.DateTimeFormat.prototype.resolvedOptions`

- `Intl.getCanonicalLocales`
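
ECMA-402 specifies `supportedLocalesOf` in terms of locale lookup. The following is a minimal, self-contained C++ sketch of that lookup, modelled on the `bestAvailableLocale` and `lookupSupportedLocales` functions that this commit adds in `PlatformIntlShared.cpp`. The `...Sketch` names and the sample locale lists are hypothetical, and `std::find` stands in for the `llvh::find` helper used in the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Sketch of ECMA-402 BestAvailableLocale: repeatedly strip the last subtag
// from the candidate until it matches an available locale or runs out.
std::optional<std::u16string> bestAvailableLocaleSketch(
    const std::vector<std::u16string> &availableLocales,
    std::u16string candidate) {
  while (true) {
    if (std::find(
            availableLocales.begin(), availableLocales.end(), candidate) !=
        availableLocales.end())
      return candidate;
    std::size_t pos = candidate.rfind(u'-');
    if (pos == std::u16string::npos)
      return std::nullopt;
    // A single-character subtag (such as the "u" in "en-u-ca") is dropped
    // together with the subtag that follows it.
    if (pos >= 2 && candidate[pos - 2] == u'-')
      pos -= 2;
    candidate.resize(pos);
  }
}

// Sketch of LookupSupportedLocales: keep each requested locale (unchanged)
// for which some available locale can be found.
std::vector<std::u16string> lookupSupportedLocalesSketch(
    const std::vector<std::u16string> &availableLocales,
    const std::vector<std::u16string> &requestedLocales) {
  std::vector<std::u16string> subset;
  for (const std::u16string &locale : requestedLocales)
    if (bestAvailableLocaleSketch(availableLocales, locale))
      subset.push_back(locale);
  return subset;
}

// Usage example with hypothetical locale lists.
void checkLocaleLookupSketch() {
  const std::vector<std::u16string> available = {u"en", u"zh-Hant"};
  assert(*bestAvailableLocaleSketch(available, u"zh-Hant-TW") == u"zh-Hant");
  assert(*bestAvailableLocaleSketch(available, u"en-u-ca-gregory") == u"en");
  assert(!bestAvailableLocaleSketch(available, u"fr-CA"));
  const std::vector<std::u16string> requested = {
      u"fr-CA", u"en-US", u"zh-Hant-TW"};
  const std::vector<std::u16string> expected = {u"en-US", u"zh-Hant-TW"};
  assert(lookupSupportedLocalesSketch(available, requested) == expected);
}
```

Note that the result contains the requested tags themselves (`en-US`, not `en`), which is what the spec's LookupSupportedLocales prescribes.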

### Not yet supported

- `Intl.DateTimeFormat`
  - `Intl.DateTimeFormat.prototype.formatToParts`

- `Intl.DateTimeFormat` properties
  - [`dayPeriod`]
  - [`fractionalSecondDigits`]
  - [`formatMatcher`]

- `Intl.Collator`
  - `Intl.Collator.supportedLocalesOf`
  - `Intl.Collator.prototype.compare`
  - `Intl.Collator.prototype.resolvedOptions`

- `Intl.NumberFormat`
  - `Intl.NumberFormat.supportedLocalesOf`
  - `Intl.NumberFormat.prototype.format`
  - `Intl.NumberFormat.prototype.formatToParts`
  - `Intl.NumberFormat.prototype.resolvedOptions`

- `String.prototype`
  - `localeCompare`
  - `toLocaleLowerCase`
  - `toLocaleUpperCase`

- `Array.prototype`
  - `toLocaleString`

- `Number.prototype`
  - `toLocaleString`

- `Date.prototype`
  - `toLocaleString`
  - `toLocaleDateString`
  - `toLocaleTimeString`

- [`Intl.PluralRules`](https://tc39.es/ecma402/#pluralrules-objects)

- [`Intl.RelativeTimeFormat`](https://tc39.es/ecma402/#relativetimeformat-objects)

- [`Intl.DisplayNames`](https://tc39.es/proposal-intl-displaynames/#sec-intl-displaynames-constructor)

- [`Intl.ListFormat`](https://tc39.es/proposal-intl-list-format/#sec-intl-listformat-constructor)

- [`Intl.Locale`](https://tc39.es/ecma402/#sec-intl-locale-constructor)

- [`BigInt.prototype.toLocaleString`](https://tc39.es/ecma402/#sup-bigint.prototype.tolocalestring)
176 changes: 176 additions & 0 deletions lib/Platform/Intl/PlatformIntlShared.cpp
@@ -0,0 +1,176 @@
/*
 * Copyright (c) Meta Platforms, Inc. and affiliates.
 *
 * This source code is licensed under the MIT license found in the
 * LICENSE file in the root directory of this source tree.
 */

// This file contains code shared between the Apple and Windows
// implementations of the Intl APIs.
#include "hermes/Platform/Intl/PlatformIntl.h"

using namespace ::hermes;

namespace hermes {
namespace platform_intl {

// https://402.ecma-international.org/8.0/#sec-bestavailablelocale
std::optional<std::u16string> bestAvailableLocale(
    const std::vector<std::u16string> &availableLocales,
    const std::u16string &locale) {
  // 1. Let candidate be locale
  std::u16string candidate = locale;

  // 2. Repeat
  while (true) {
    // a. If availableLocales contains an element equal to candidate, return
    // candidate.
    if (llvh::find(availableLocales, candidate) != availableLocales.end())
      return candidate;
    // b. Let pos be the character index of the last occurrence of "-" (U+002D)
    // within candidate.
    size_t pos = candidate.rfind(u'-');

    // ...If that character does not occur, return undefined.
    if (pos == std::u16string::npos)
      return std::nullopt;

    // c. If pos ≥ 2 and the character "-" occurs at index pos-2 of candidate,
    // decrease pos by 2.
    if (pos >= 2 && candidate[pos - 2] == '-')
      pos -= 2;

    // d. Let candidate be the substring of candidate from position 0,
    // inclusive, to position pos, exclusive.
    candidate.resize(pos);
  }
}

// https://402.ecma-international.org/8.0/#sec-lookupsupportedlocales
std::vector<std::u16string> lookupSupportedLocales(
    const std::vector<std::u16string> &availableLocales,
    const std::vector<std::u16string> &requestedLocales) {
  // 1. Let subset be a new empty List.
  std::vector<std::u16string> subset;
  // 2. For each element locale of requestedLocales in List order, do
  for (const std::u16string &locale : requestedLocales) {
    // a. Let noExtensionsLocale be the String value that is locale with all
    // Unicode locale extension sequences removed.
    // We can skip this step, see the comment in lookupMatcher.
    // b. Let availableLocale be BestAvailableLocale(availableLocales,
    // noExtensionsLocale).
    std::optional<std::u16string> availableLocale =
        bestAvailableLocale(availableLocales, locale);
    // c. If availableLocale is not undefined, append locale to the end of
    // subset.
    if (availableLocale) {
      subset.push_back(locale);
    }
  }
  // 3. Return subset.
  return subset;
}

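// Simplified form of https://402.ecma-international.org/8.0/#sec-getoption
// for boolean options; steps 4-7 of the spec algorithm are not reproduced
// here, and the stored Option is read directly as a bool.
std::optional<bool> getOptionBool(
    vm::Runtime &runtime,
    const Options &options,
    const std::u16string &property,
    std::optional<bool> fallback) {
  // 1. Assert: Type(options) is Object.
  // 2. Let value be ? Get(options, property).
  auto value = options.find(property);
  // 3. If value is undefined, return fallback.
  if (value == options.end()) {
    return fallback;
  }
  // 8. Return value.
  return value->second.getBool();
}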

// Implementation of
// https://402.ecma-international.org/8.0/#sec-todatetimeoptions
vm::CallResult<Options> toDateTimeOptions(
    vm::Runtime &runtime,
    Options options,
    std::u16string_view required,
    std::u16string_view defaults) {
  // 1. If options is undefined, let options be null; otherwise let options be ?
  // ToObject(options).
  // 2. Let options be OrdinaryObjectCreate(options).
  // 3. Let needDefaults be true.
  bool needDefaults = true;
  // 4. If required is "date" or "any", then
  if (required == u"date" || required == u"any") {
    // a. For each property name prop of « "weekday", "year", "month", "day" »,
    // do
    // TODO(T116352920): Make this a std::u16string props[] once we have
    // constexpr std::u16string.
    static constexpr std::u16string_view props[] = {
        u"weekday", u"year", u"month", u"day"};
    for (const auto &prop : props) {
      // i. Let value be ? Get(options, prop).
      if (options.find(std::u16string(prop)) != options.end()) {
        // ii. If value is not undefined, let needDefaults be false.
        needDefaults = false;
      }
    }
  }
  // 5. If required is "time" or "any", then
  if (required == u"time" || required == u"any") {
    // a. For each property name prop of « "dayPeriod", "hour", "minute",
    // "second", "fractionalSecondDigits" », do
    static constexpr std::u16string_view props[] = {
        u"dayPeriod", u"hour", u"minute", u"second", u"fractionalSecondDigits"};
    for (const auto &prop : props) {
      // i. Let value be ? Get(options, prop).
      if (options.find(std::u16string(prop)) != options.end()) {
        // ii. If value is not undefined, let needDefaults be false.
        needDefaults = false;
      }
    }
  }
  // 6. Let dateStyle be ? Get(options, "dateStyle").
  auto dateStyle = options.find(u"dateStyle");
  // 7. Let timeStyle be ? Get(options, "timeStyle").
  auto timeStyle = options.find(u"timeStyle");
  // 8. If dateStyle is not undefined or timeStyle is not undefined, let
  // needDefaults be false.
  if (dateStyle != options.end() || timeStyle != options.end()) {
    needDefaults = false;
  }
  // 9. If required is "date" and timeStyle is not undefined, then
  if (required == u"date" && timeStyle != options.end()) {
    // a. Throw a TypeError exception.
    return runtime.raiseTypeError(
        "Unexpectedly found timeStyle option for \"date\" property");
  }
  // 10. If required is "time" and dateStyle is not undefined, then
  if (required == u"time" && dateStyle != options.end()) {
    // a. Throw a TypeError exception.
    return runtime.raiseTypeError(
        "Unexpectedly found dateStyle option for \"time\" property");
  }
  // 11. If needDefaults is true and defaults is either "date" or "all", then
  if (needDefaults && (defaults == u"date" || defaults == u"all")) {
    // a. For each property name prop of « "year", "month", "day" », do
    static constexpr std::u16string_view props[] = {u"year", u"month", u"day"};
    for (const auto &prop : props) {
      // i. Perform ? CreateDataPropertyOrThrow(options, prop, "numeric").
      options.emplace(prop, Option(std::u16string(u"numeric")));
    }
  }
  // 12. If needDefaults is true and defaults is either "time" or "all", then
  if (needDefaults && (defaults == u"time" || defaults == u"all")) {
    // a. For each property name prop of « "hour", "minute", "second" », do
    static constexpr std::u16string_view props[] = {
        u"hour", u"minute", u"second"};
    for (const auto &prop : props) {
      // i. Perform ? CreateDataPropertyOrThrow(options, prop, "numeric").
      options.emplace(prop, Option(std::u16string(u"numeric")));
    }
  }
  // 13. return options
  return options;
}

} // namespace platform_intl
} // namespace hermes
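
To make the defaulting rules in `toDateTimeOptions` concrete, here is a minimal standalone sketch. It assumes a simplified, hypothetical `SketchOptions` type (a map from option name to string value) in place of the Hermes `Options`/`Option` types, and it returns `false` where the function above raises a `TypeError`. All `...Sketch` names are illustrative; this is not the Hermes implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical stand-in for the Hermes Options type: option name -> value.
using SketchOptions = std::map<std::u16string, std::u16string>;

// True if any of the named properties is present in options.
template <std::size_t N>
bool hasAnySketch(
    const SketchOptions &options,
    const std::u16string_view (&props)[N]) {
  for (std::u16string_view prop : props)
    if (options.count(std::u16string(prop)) != 0)
      return true;
  return false;
}

// Follows steps 3-13 of ToDateTimeOptions. Returns false where the spec
// throws a TypeError; otherwise fills in defaults and returns true.
bool applyDateTimeDefaultsSketch(
    SketchOptions &options,
    std::u16string_view required,
    std::u16string_view defaults) {
  static constexpr std::u16string_view dateProps[] = {
      u"weekday", u"year", u"month", u"day"};
  static constexpr std::u16string_view timeProps[] = {
      u"dayPeriod", u"hour", u"minute", u"second", u"fractionalSecondDigits"};
  static constexpr std::u16string_view dateDefaults[] = {
      u"year", u"month", u"day"};
  static constexpr std::u16string_view timeDefaults[] = {
      u"hour", u"minute", u"second"};

  // Defaults are needed only if no relevant field or style was supplied.
  bool needDefaults = true;
  if ((required == u"date" || required == u"any") &&
      hasAnySketch(options, dateProps))
    needDefaults = false;
  if ((required == u"time" || required == u"any") &&
      hasAnySketch(options, timeProps))
    needDefaults = false;

  const bool hasDateStyle = options.count(u"dateStyle") != 0;
  const bool hasTimeStyle = options.count(u"timeStyle") != 0;
  if (hasDateStyle || hasTimeStyle)
    needDefaults = false;

  // A style for the "other" half of the date/time split is an error.
  if ((required == u"date" && hasTimeStyle) ||
      (required == u"time" && hasDateStyle))
    return false;

  if (needDefaults && (defaults == u"date" || defaults == u"all"))
    for (std::u16string_view prop : dateDefaults)
      options.emplace(std::u16string(prop), std::u16string(u"numeric"));
  if (needDefaults && (defaults == u"time" || defaults == u"all"))
    for (std::u16string_view prop : timeDefaults)
      options.emplace(std::u16string(prop), std::u16string(u"numeric"));
  return true;
}

// Usage example with hypothetical inputs.
void checkDateTimeDefaultsSketch() {
  SketchOptions empty;
  assert(applyDateTimeDefaultsSketch(empty, u"any", u"date"));
  assert(empty.size() == 3 && empty.at(u"year") == u"numeric");

  SketchOptions withHour = {{u"hour", u"2-digit"}};
  assert(applyDateTimeDefaultsSketch(withHour, u"any", u"date"));
  assert(withHour.size() == 1); // an explicit field suppresses the defaults

  SketchOptions withTimeStyle = {{u"timeStyle", u"short"}};
  assert(!applyDateTimeDefaultsSketch(withTimeStyle, u"date", u"date"));
}
```

Returning a `bool` keeps the sketch free of the Hermes runtime; the real function reports the error through `runtime.raiseTypeError` and returns the options wrapped in a `vm::CallResult`.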