Feb 27 Attendees:
- Shane Carr - Google i18n (SFC), Moderator
- Romulo Cintra - CaixaBank (RCA), MessageFormat Working Group Liaison
- Long Ho - Dropbox (LHO)
- Caio Lima - Igalia (CLA)
- Zibi Braniecki - Mozilla (ZB)
- Jeff Walden - Spidermonkey (JSW)
- Leo Balter - Bocoup (LEO)
- Myles C. Maxfield - Apple (MCM)
- Craig Cornelius - Google i18n (CCN)
- Konstantin Pozin - Google (KP)
- Frank Yung-Fong Tang - Google i18n, V8 (FYT)
Standing items:
- Discussion Board
- Status Wiki -- please update!
- Abbreviations
- MDN Tracking
March 26, 10am PDT (5pm GMT)
SFC: Thanks Leo for getting this ready!
LEO: Is there anything else we want?
ZB: I would really like dateStyle/timeStyle and Intl.ListFormat, especially dateStyle/timeStyle.
LEO: I need to present the release candidate for voting in the next TC39 meeting. So the 2020 Edition should be cut before that meeting.
LEO: I'm joining another company and may not be able to continue being the Editor. So we're open for other candidates for the editor of ECMA-402. I would love to help with the transition.
SFC: Can you summarize the responsibilities you've done over the last 12 months?
LEO: Trying to be neutral to some normative decisions, making decisions on what considerations are editorial, what positions should go to specs. Being neutral, being friendly towards everyone. Git-related skills to ensure you can merge everything into the specs, making sure you can make the cut for each of the editions. Making edits to spec on how we describe the internals. Also keeping up editorial structure (ecmarkup and other tools) close for ECMA-402 and ECMA-262. Should not be that hard, there are opportunities to improve the specs.
LEO: The other important thing as the editor is helping champions to review proposals, most importantly from Stage 2 up to Stage 4. This is the most complex part, but it's also nice because you get up to speed with all the proposals and it's very interesting.
CLA: Basically I would like to ask you how complex it was to review the PRs on the spec. The work I've done on those PRs is not super-trivial to me. It takes time to get context on what's going on. How much background should you have on ECMA-402 internals?
LEO: Nothing in tech is really that easy, it's definitely complex and requires you to read spec text. ECMA-402 doesn't require you to understand the syntax grammar, but requires you to understand the tangling with other standards. It's not simple, it requires some time, maybe 10% or 20% of every week out of work, but I assume everyone here does so for JS.Intl. It is complex but it is technically and emotionally rewarding in the end. You're not on the front lines to understand the proposal as the champions are, you're just there to give support, that's good.
SFC: Everyone here should consider volunteering, it's quite rewarding. As far as minimum requirements, it might be hard if you've never done any proposal work before, but so long as you've done some proposal writing work, that should be enough.
RCA: We had our fifth meeting on Monday. We're starting to create docs for MessageFormat. We're going to start making organizational decisions soon. We've had around 20 people at every meeting.
SFC: MessageFormat meetings are sometimes even bigger than ECMA-402 meetings.
RCA: You can follow our meeting notes here
SFC: Main theme for today is user preferences. Want to take a big chunk of today's meeting to discuss where we stand and what the future looks like. Issue #6 is about Intl.getCalendarInfo(). But also want to look at calendar preferences, not just about firstDayOfWeek().
ZB: A lot of those issues came from use cases from Mozilla. There is an internal implementation being used for 3 years now on Firefox UI that can serve as inspiration. - https://firefox-source-docs.mozilla.org/intl/dataintl.html#services-intl
SFC: We’ve been hearing from Intl experts that BCP 47 (?) should also include calendar preferences, and if there are other preferences that can be encoded, we should consider that too.
ZB: The way I’ve been evaluating Intl APIs is how it can allow a man-in-the-middle to override defaults. The way it is designed, it can take data from CLDR, but it also can look into OS preference.
SFC: If we were to take the maximizing language tag approach, are there any problems with how this bubbles up to the browser? We can take that logic and put it into Navigator.acceptLanguages and getLanguage. Would that break the web because sites would not be able to parse the extra information?
JSW: People have raised concerns about Accept-Language getting bigger and increasing bandwidth.
MGR: I have had problems just because the order of the languages was different, and Strava webpage was displayed in French even though it was my 3rd language in the list.
SFC: Can we have extra info by adding navigator.locales?
ZB: If we got to it, we could use Intl.Locale to be able to parse those strings. This would then deprecate “navigator.languages”.
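To illustrate ZB's point, here is a minimal sketch of how Intl.Locale can already parse a language tag that carries Unicode extension keywords. The tag below is an assumed example, not one from the discussion, and navigator.locales itself remains hypothetical.

```typescript
// Assumed example tag: British English, Gregorian calendar, 24-hour clock.
const tag = "en-GB-u-ca-gregory-hc-h23";
const locale = new Intl.Locale(tag);

// Intl.Locale exposes the parsed subtags and keywords as getters.
const parsed = {
  language: locale.language,   // "en"
  region: locale.region,       // "GB"
  calendar: locale.calendar,   // "gregory"
  hourCycle: locale.hourCycle, // "h23"
};

console.log(parsed);
```

Keywords that have no dedicated getter are still preserved in the tag string, so they would survive a round trip through an API that returns such strings.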
JSW: It feels that getting some of this info by constructing a locale and calling resolvedOptions on it seems convoluted. CalendarInfo seems like a simpler and more straightforward approach for people to apply in their own code. That said, exposing things like first day of week in resolvedOptions makes sense in its own right.
SFC: Do you mean getters on Intl.Locale?
JSW: Yes.
SFC: ???
SFC: We have 2 different directions -- something like the mozIntl.getCalendarInfo demo, or using the Intl.Locale object. Intl.Locale is Stage 4, so we can start looking at it closely. I'd like to understand why we can't extend Navigator.language & getLanguages. If you want to do server- and client-side rendering, then you want to have them both have the same information, and the server can accept whatever extra info you give it, so why is it that the client side has more information?
JSW: Increasing the size of Accept-Language (and thus the size of every HTTP request) is undesirable and would be a hard sell.
KP: Currently according to MDN, you can specify "fr-CH, fr;q=0.9". Maybe we can extend this, at least for the primary language.
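For reference, a rough sketch of how a server might read the header form KP quotes. The function name and the extra `en;q=0.8` entry are illustrative assumptions, and the parser ignores edge cases such as wildcards and malformed weights.

```typescript
interface LanguageRange {
  tag: string;
  q: number; // quality weight; 1 when the q parameter is absent
}

// Hypothetical helper: split an Accept-Language value into weighted ranges.
function parseAcceptLanguage(header: string): LanguageRange[] {
  return header
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [tag, ...params] = part.split(";").map((s) => s.trim());
      let q = 1;
      for (const p of params) {
        const m = /^q=([0-9.]+)$/i.exec(p);
        if (m) q = Number(m[1]);
      }
      return { tag, q };
    })
    .sort((a, b) => b.q - a.q); // highest weight first; sort is stable
}

const ranges = parseAcceptLanguage("fr-CH, fr;q=0.9, en;q=0.8");
console.log(ranges.map((r) => `${r.tag} (${r.q})`).join(", "));
```

Any extension keywords added to the primary language would arrive inside `tag`, so a server that only compares plain language and region subtags would need to strip them before matching.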
ZB: if I specify the first day of the week in a locale string, does it just apply to that one primary language, or to all languages in the fallback chain?
MCM: Fingerprinting concerns are important to us. Exposing lots of user info is harmful. But we understand that some info is useful, so a calendar app has the right behavior. So we need to strike a balance of privacy and fidelity. So whatever the answer is here, it should be possible for the browser to mask or hide some of this information.
ZB: I think the settings set by the user should not apply to their fallback languages.
KP: About lying about user options, I think that's an option in Edgium.
ZB: We are integrating a lot of Tor browser features, and we have fingerprint prevention (mask Accepted Languages, mask JS Default Locale etc.).
JSW: Browsers can decide to put nothing in Accept-Languages(?) or navigator.locales, then -- and then no one’s actually any better off, even if the spec purports to allow it.
FYT: Are we mixing two levels of issues? For example, first day of week for a particular locale. Second, the user preference.
ZB: For me, those are two separate issues. My design approach to 402 is to put a man in the middle that overrides user preferences. So, for example, with the calendar info, it gives you preferences based on the language you asked for, but if there is an OS setting, I can return that info in an opaque way.
FYT: In that sense, the locale itself should be different, too. So let's say you have a French locale. If the user overrides it, the first day of week will be a different value. Should the locale also reflect that? Because we could go with this approach, or the other approach is the ECMA-402 API always returns the same thing. But a different API is to return user preferences. My belief is that the concern of Accept-Language only ties to one approach but not the other.
ZB: We’ve been going with the former of what you propose. Not sure how extensible it is. We are basically saying that whenever we have a new option, we add a Unicode extension for it. But how far can that keep going? Everyone wants to add extensions to the language code, but what about measurements, temperatures? Not everything is a language code extension; some things should be treated as user preferences.
SFC: We’ve definitely been thinking on the Unicode side to allow user preferences. It is not a hard decision if Unicode decides to add a new extension(?). That information is encoded in LDML.
FYT: If the API changed some value, the return should be changed. And then the second question: for each other field: if that API needs to return it, we have to change the locale. You're saying that we shouldn't add all those fields to the Unicode extension. Which proves that approach is not scalable. So I think we shouldn't go to the first approach. Because if we can't put everything into the Unicode extension, then we can't do it.
ZB: The alternative, as Shane was saying, is that we let Unicode decide which Unicode extensions exist, and we don’t support preferences that don’t have Unicode extension keys. It sounds kinda harsh, but we are allowing ~95% of use cases.
KP: Temperature is a counter-example, temperature is used pretty much everywhere
SFC: You can either use the -rg- extension or the -ms- extension. Mark Davis has started to consider including more fine-grained unit preferences and that could go into the LDML 35 spec.
MGR: The Canadian examples for temperature units are really confusing.
JSW: This is a humorous/horrifying Canadian measurement units flow chart to illustrate how nutty these preferences can be.
FYT: For the approach you're going for, the code can only get back the … ???
ZB: I’m OK with Intl.Locale.?? returning the first day of week. I assume your question is whether the approach I'm presenting means we can't expose info that doesn't have Unicode extension keys. The answer is no: we can expose info that isn't covered by Unicode extension keys, we just don't let users override it.
FYT: Regardless, what I would like to see is that the developer can set something that is not overridable by the user. Should provide a way to provide the default settings regardless of what the user has overridden?
ZB: In the approach I'm presenting, the solution would be to take the locale, take the base name, and take the info on the base name: the locale without extension keys. The way we override is by using extension keys. So by cutting out extension keys, you get the default for the locale.
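A small sketch of the stripping step ZB describes, using the existing Intl.Locale `baseName` getter. The helper name and the example tag (with `fw` and `hc` keywords standing in for user overrides) are assumptions for illustration.

```typescript
// Hypothetical helper: drop all extension keywords so that only the
// locale's own defaults apply.
function withoutUserOverrides(tag: string): string {
  return new Intl.Locale(tag).baseName;
}

// Assumed tag: French (France) with first-day-of-week and hour-cycle overrides.
const userTag = "fr-FR-u-fw-sun-hc-h12";
const defaultsTag = withoutUserOverrides(userTag);

console.log(defaultsTag); // "fr-FR"
```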
SFC: Can you clarify the approaches again?
ZB: (1) Use only extension keys, or (2) extract user preference information beyond the locale (Intl.userPreferences). If we went the first route, we would be saying that extension keys are how we get user preferences, and that if we don't have an extension key we can't get the user preference. The one limitation I see from my work in Firefox is that a common request, from a small and vocal minority, is to override the pattern for dateStyle/timeStyle. If we go with the locale extension key approach, then we couldn't respect that preference on the web site. This also relates to fingerprinting.
KP: Could we extend Intl.Locale to use Unicode extensions where available, custom parameters for the constructor for other preferences, and expose everything through getters on Intl.Locale?
MGR: Are we suggesting that user-specified extensions could be stripped to make a new locale? Non-verbal agreement.
FYT: The OS already allows you to do ____ (provide your own formatting pattern?).
ZB: Windows has had 3 different UIs from 3 different eras. You can go to the really nice Windows 10 UI, and then you can click a button and get the Windows 7 version which is more detailed.
SFC: We can still allow Intl.DateTimeFormat to use OS preferences for dateStyle/timeStyle preferences as an implementation detail. We don't need to expose those patterns as user preferences.
ZB: Really strongly oppose exposing anything to the user that looks like a pattern. Current practice seems to be that the user is asked, but the system can override with its interpretation of the user’s intent, without following the exact pattern.
SFC: To summarize where we're at currently: is it possible to add extension keys to Accept-Language to carry user preferences? Jeff says likely no. I think it would be elegant in a way. But maybe that opens the door for an API to accept user preferences. There is an elegance in using Unicode extension keys to keep consistency between server side and client side.
MCM: Generally the common case is for authors to want the user’s full set of distinct preferences, not some broad characterization that’s “close” but not exactly the user’s preferences. There should be fewer lines of code to get what is overridden by the user. If there is a separate userPreferences API, it's more likely authors will get it wrong.
MGR: Using navigator.locale and shoving it into the API could make this work.
SFC: What’s a good way to resolve this? Contact Steven, Addison, etc? Who is the decision maker here? Who will have the power to say “no”?
FYT: We should be careful about Accept-Language. Browser A sends out "French", which overrides the OS setting; a different browser might send out a different language in the header, and not all servers support all locales.
SFC: this is a compatibility concern.
MCM: I agree with Frank.
FYT: You could have Navigator.withExtension or a new HTTP header, or something that won't break pre-existing behavior and expose it to JavaScript. There are pros and cons to each approach, but there are risks for one and not the other.
SFC: We could add Accept-Locales as a new HTTP header.
KP: Unicode only controls the -u- extension. So how do we want to encode user preferences?
SFC: I don’t have a great sense of whether adding Unicode extensions to navigator.languages would break the web. Who else has opinions on it? Should we raise it with W3C or other decision makers? If it is not possible to get consensus on syncing server/client info, then we should consider inventing our own preferences object available only on the client side.
ZB: Would you want to have a separate user preference defined for every JS Intl object?
SFC: This is a question we need to answer once we decide to go with Preferences object.
MGR: I think Richard Ishida (w3c i18n) would also be a good contact.
FYT: Need to include IETF in any decision on HTTP headers.
SFC: We would be adding a new header with BCP-47.
FYT: From one version of browser to another, we would be sending different sets of information to the server.
SFC: Requests are already different if the version of the client is different. It is up to the server to decide what to do with this difference.
MCM: There’s already a general purpose proposal called “Client hints” making its way through WHATWG about taking info from client and delivering it to server in the initial request. This proposal should just be a client hint.
SFC: We should talk with champions of that proposal
ZB: We confirm that this is not simple.
ZB: Maybe there is some argument to say that in my JS environment, my default locale is X, but in a datetime context, the default is Y.
MIH: In Android, you can have them different, but in the same language family. So the language and script are the same, but not the region.
ZB: I had no idea about this, but at Mozilla, what I started doing was looking at the operating system, and if the regions match, we use that region for formatting, but otherwise, we ignore the OS preference for region.
MGR: There are use cases where the formatting preferences differ from the language: the language is set to English, since the localization of applications can be not as good as the English version.
LEO: My personal experience is that I have wanted to have different locales for different types of formatting (number formatting is one way, datetime formatting is another way)
MIH: We should draw the line between a separate locale and “knobs” on the locale’s preferences. A fully different locale is too far, and I am not sure that customers are really ready for this.
MGM: ???
FYT: If we look at 85 or 87, POSIX has this LC_ALL environment variable, and we have LC_TIME, LC_DATE, etc. And there's a hierarchy. And that's what we're reflecting here in the request. That's 30-40 years ago. All the programs on Linux could implement that differently. We expose default in a way. Currently we always return the same value.
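A minimal sketch of the POSIX hierarchy FYT refers to: LC_ALL overrides everything, then the per-category variable (e.g. LC_TIME), then LANG. The function name is made up for illustration, and falling back to "C" when nothing is set is an assumption (POSIX leaves that default implementation-defined).

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: resolve the locale for one category such as "LC_TIME".
function resolvePosixLocale(env: Env, category: string): string {
  // Precedence: LC_ALL, then the category variable, then LANG.
  // Unset and empty variables are skipped.
  for (const name of ["LC_ALL", category, "LANG"]) {
    const value = env[name];
    if (value) return value;
  }
  return "C"; // assumed fallback when nothing is set
}

const env: Env = { LANG: "en_US.UTF-8", LC_TIME: "de_DE.UTF-8" };
console.log(resolvePosixLocale(env, "LC_TIME"));    // "de_DE.UTF-8"
console.log(resolvePosixLocale(env, "LC_NUMERIC")); // "en_US.UTF-8"
```

This is the sense in which a process can format dates with one locale and numbers with another while still having a single overall default.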
ZB: They could be a chain. I think we should stop talking about a single locale; it's always a chain.
FYT: This ability is in POSIX for more than 30 years. So it may have a good reason to be there.
MIH: Personally I wouldn't use POSIX as a good i18n or l10n model. It's ancient history, bad models, ignore it.
SFC: It’s different between date formatting vs. collation - if there’s some knob that an extension doesn’t allow, then we could ask the Unicode people to add/change that setting option. It may be that having more information on the client side than the server has may be undesirable.
MIH: The one piece missing in the standard is the specific format of the dates.
ZB: The most popular knob is the hour cycle, then first day of the week, then 3rd is the date/time style. Most users want/will accept standard formatting without customization.
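As a concrete case of the most popular knob, the sketch below shows the hour cycle being overridden through the `-u-hc-` extension key and read back from `resolvedOptions()`. The locale and option values are assumed examples.

```typescript
const options: Intl.DateTimeFormatOptions = {
  hour: "numeric",
  minute: "numeric",
  timeZone: "UTC",
};

// Locale default for US English versus a 24-hour override via extension key.
const plain = new Intl.DateTimeFormat("en-US", options).resolvedOptions();
const overridden = new Intl.DateTimeFormat("en-US-u-hc-h23", options).resolvedOptions();

console.log(plain.hourCycle);      // "h12"
console.log(overridden.hourCycle); // "h23"
```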
SFC: Rewind to the specific question: do we allow and use different locales from various sources or do we use a single locale?
LEO: Consistency - as a web developer, I’d love to have consistency between implementations.
My argument is that I would be positive with this if all the browsers can agree to have some predictability over decisions.
SFC: How does this relate to polyfill-ability? Does this flexibility break the polyfill functionality? His preference is to use the extensions that did not exist in the 1980s rather than depend on older, perhaps not relevant standards.
JSW: I think the way defaultLocale is defined in the spec says you can't have Unicode extensions in it. Because the APIs we have in SpiderMonkey specifically don't allow that.
SFC: I was not looking specifically at defaultLocale, but on navigator.languages. I’ll follow up on that later on ECMA402 spec. Is there anyone who feels strongly?
ZB: I just checked and it appears ECMA-402 does allow BCP47 in defaultLocales. I want to follow up with FYT about locale lists.
JSW: Double-double-check: The default locale cannot include Unicode extensions, even today: the default locale must be in [[AvailableLocales]], and that list’s entries must not have Unicode extensions.
Close the issue.
MGR: We could make a single section. It's nice to have a separate section to refer to. But it's nice to have the little icon on the side.
SFC: We already have an annex that lists all the ILD data.
MGR: I feel like it is a requirement in W3C that you have at least a tag that will poke at you to have it.
CLA: What's the audience? Is it the spec authors, users, implementers? Who benefits from this disclaimer section? I'm asking because if it's just for implementers… a lot of browsers/implementers are here in the meeting. Creating a disclaimer for that can raise information for other people to find exploits and things. So I don't know if it's beneficial. It makes it more obvious that there's a vector, to make it easier for bad actors to discover.
MGR: It is very nice for specs to be written not assuming that all the browsers have seen them already.
FYT: It looks as if everything in ECMA-402 is fingerprint-able. Why not put it at the top to say everything is fingerprint-able instead of putting it into every field?
MCM: You can't put just a mark at the top and say that everything is fingerprint-able. That's not acceptable.
SFC: points out that there is a definition in Annex A that defines “finger-printable items”
MGR: We don’t want to have useless warnings everywhere.
MCM: The purpose is not to spray word "fingerprinting" everywhere. Purpose is to clarify to users what is used for what purpose. What are the tradeoffs, what do you give up in privacy, and what do you gain for that? Browsers can have lots of complicated policies around fingerprinting, so just having a comment at the top that everything is fingerprint-able is not sufficient.
ZB: There are existing parts that are not fingerprint-able.
FYT: You can pass in a long locale, and based on the browser, it can return a different list.
ZB: It's only canonicalization. It will return the same as long as everyone's canonicalization algorithm is the same.
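A short sketch of what canonicalization does to a locale list, using Intl.getCanonicalLocales. The input tags are assumed examples; the result depends only on the input and the canonicalization rules, not on user settings.

```typescript
const input = ["EN-us", "zh-hant-TW", "en-u-hc-h12-ca-gregory"];
const canonical = Intl.getCanonicalLocales(input);

// Case is normalized and extension keywords are sorted by key.
console.log(canonical);
// ["en-US", "zh-Hant-TW", "en-u-ca-gregory-hc-h12"]
```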
ZB: The only fingerprintable thing is defaultLocale. I think this is an argument to be made. Once we start talking about user preferences, we're entering this phase, but because defaultLocale is used anywhere, as a result, almost any API is fingerprintable. And maybe there's a way to mark this one bit.
MGR: More in general, if we're in a situation where we start marking everything, we should figure out what is our boundary for marking things. If we're in a situation where we mark everything, that's not useful. ZB's approach sounds reasonable. If there's one bit you want to mask, you can mask it in one place.
LEO: As an editor, I tend to agree with Allen Wirfs-Brock who asks if this is an actual concern for ECMA-402. It should probably be a concern for the HTML spec where it integrates ECMA-262. Prob should do something generic for Annex A. Understand ECMA-402 is used a lot in browsers, but that's not in the spec that it is limited only to the browser.
SFC: Can we agree that Annex A is fingerprintable, add that information, and show an icon?
SFC: Is the audience for this tag developers who will use it for, e.g., GDPR?
Mark Annex A as fingerprintable.
There was general consensus that behaviour proposed by Anba was accepted.
JSW will work on this PR and tests for it.
How about the -t- extension? (and whether duplicate variants are allowed in tlang, e.g. en-t-en-emodeng-emodeng, while en-emodeng-emodeng on its own would be invalid)
JSW: We can wait for browsers, even if the TC39 version is out of date.
SFC: Congrats to ZB for Locale and to SFC for NumberFormat Unified API
MCM: Is the public status page up to date?