Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add eslint rule for localization issues #10397

Merged
merged 1 commit into from
Feb 17, 2022
Merged

Conversation

msujew
Copy link
Member

@msujew msujew commented Nov 8, 2021

What it does

Adds a new eslint rule that helps developers of Theia to correctly localize the application:

  1. Flags calls to nls.localizeByDefault which use a default value which does not exist inside of the nls.metadata.json file. Once such a call is detected, the rule tries to find the best fitting existing default value using the levenshtein distance.
  2. Flags calls to nls.localize which could be replaced by nls.localizeByDefault.

How to test

  1. Assert that yarn lint does not yield any errors (CI is green)
  2. Changing the default value of an existing localizeByDefault call slightly should lead to an eslint error and a quickfix is provided.
  3. Changing an existing localizeByDefault(default) call to localize(key, default) should lead to an eslint error and a quickfix is provided here as well.

Review checklist

Reminder for reviewers

@msujew msujew added contributor experience issues related to the contributor experience localization issues related to localization/internalization/nls labels Nov 8, 2021
Copy link
Member

@vince-fugnitto vince-fugnitto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes overall look good to me, I'll let @colin-grant-work review as well and take a deeper look into the code:

  • warns when localizeByDefault is used incorrectly (no value exists)
  • warns when there is a typo, for instance a string does not exist but is close
  • warns when localize is used when localizeByDefault can be used

While potentially difficult, will there be plans to have user facing strings checked if they can also be localized, potentially in the future?

dev-packages/eslint-plugin/rules/localization-check.js Outdated Show resolved Hide resolved
@msujew msujew force-pushed the msujew/localization-eslint branch from 49148a0 to 89b8dd1 Compare November 9, 2021 19:41
@msujew
Copy link
Member Author

msujew commented Nov 12, 2021

While potentially difficult, will there be plans to have user facing strings checked if they can also be localized, potentially in the future?

@vince-fugnitto No plans so far, due to the difficulty. Perhaps there's some good idea on how to accomplish that, but I have no idea how one would do that without flagging a ton of false positives. I would be glad to contribute that, if we come up with a good solution for that.

@vince-fugnitto
Copy link
Member

@vince-fugnitto No plans so far, due to the difficulty. Perhaps there's some good idea on how to accomplish that, but I have no idea how one would do that without flagging a ton of false positives. I would be glad to contribute that, if we come up with a good solution for that.

@msujew I don't really have a good solution other than creating "hint" or "info" markers in the frontend for all strings as reminders for devs if they want to/should localize. The issue is even as a hint devs might find it is too verbose or distracting.

@colin-grant-work
Copy link
Contributor

One question that this raises for me is how to handle new strings that don't appear in VSCode. There are a few places where we differ in irreconcilable ways, and I've seen some examples of .localize with a key containing theia rather than VSCode, and you've mentioned machine translating new strings in the past. Is the idea that we should then modify the nls.metadata.json file directly, or how would those be managed? Basically: how do you fix this lint error if you're introducing a genuinely new string?

@msujew
Copy link
Member Author

msujew commented Nov 17, 2021

One question that this raises for me is how to handle new strings that don't appear in VSCode.

@colin-grant-work I think I went over most of this during this weeks dev-meeting. New strings will be extracted using the theia nls-extract cli and translated using #10187. I'm currently thinking that these commands will run just before the release and any mis-translations will be fixed during the month after the release (probably with the help/issues raised by the community).

Basically: how do you fix this lint error if you're introducing a genuinely new string?

Genuinely new strings should just use the nls.localize function and introduce a new key. The proposed eslint rule just catches any issues that arise due to incorrect use of nls.localizeByDefault and helps users not introduce values with nls.localize that already exist inside of the nls.metadata.json file.

@@ -14,6 +14,8 @@
* SPDX-License-Identifier: EPL-2.0 OR GPL-2.0 WITH Classpath-exception-2.0
********************************************************************************/

/* eslint-disable @theia/localization-check */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a second commit fixing these? it looks like it's mainly suggestions to use byDefault in place of the keyed version. For that violation, it might make sense to add a fixer to the plugin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this file is a weird case anyway. This file uses it's own "implementation" (if one can say so), since it's in the node directory and has no actual localizations available:

namespace nls {
export function localize(key: string, _default: string): string {
return _default;
}
}

This code has been there for quite a long time, or at least way before I started working on i18n in Theia. Actually, looking further into it, nothing that's exported in this file is referenced somewhere else. It seems to be dead code from before the vscode-extension time of Theia. I would remove this in a separate PR.

dev-packages/eslint-plugin/rules/localization-check.js Outdated Show resolved Hide resolved
@@ -109,6 +109,7 @@ export namespace GitFileStatus {
case GitFileStatus.New: return !!staged ? nls.localize('theia/git/added', 'Added') : nls.localize('theia/git/unstaged', 'Unstaged');
case GitFileStatus.Renamed: return nls.localize('theia/git/renamed', 'Renamed');
case GitFileStatus.Copied: return nls.localize('theia/git/copied', 'Copied');
// eslint-disable-next-line @theia/localization-check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the solution for things that are VSCode but aren't in our JSON (yet)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in this particular case the Modified value seems to exist somewhere in the nls.metadata.json (our version). However, I worry about semantics here, since the Modified that exists inside of vscode is used for modified preferences. I assume that this particular context does not translate correctly into other language when used for git related modifications.

Also you perhaps have noticed the key to vscode.git, which is the builtin extension, that is also translated in the language pack. As this isn't localized text from the actual vscode codebase, it doesn't appear inside of the nls.metadata.json. Now that we use localizeByDefault basically everywhere, I would be fine with using theia/* keys for these instead of relying on the extension translations. There are only around ~20 instances of these cases in Theia. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my target would be a one-tiered world: everything uses localizeByDefault. If we find that the thing translated doesn't exist in the JSON, there are two possible causes: it's a typo / rephrase, or it hasn't been translated by language packs. In the former case, we should correct it, because then it will be translated correctly. In the latter case, rewriting localizeByDefault to localize('some/theia/key') doesn't actually get us any closer to getting the thing translated, so I don't see the point. Instead, whenever we are able to create new translations ourselves, we can traverse the code base, find any uses of localizeByDefault that aren't currently working, translate them, and add it to the JSON (creating an automatic key, if that's necessary to make things work, something like theia/<package>/<file>/<first word or two as necessary>). If we want a system that requires investigating the actual origin of a particular string to determine its key and checking the JSON to see if that source is covered, then we'll need a lot more documentation about when and how to do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach. I like the simplicity (for the user, hidden behind a bit of technical complexity). I'll create a draft for this, and add it as an additional commit to this PR.

dev-packages/eslint-plugin/package.json Outdated Show resolved Hide resolved
@msujew msujew mentioned this pull request Nov 18, 2021
8 tasks
@msujew msujew force-pushed the msujew/localization-eslint branch 2 times, most recently from 7cdf340 to 5504e0c Compare November 22, 2021 13:04
@colin-grant-work
Copy link
Contributor

@msujew, I don't remember whether my latest comment was before or after you requested my review - ping me if / when you'd like me take another look.

@colin-grant-work colin-grant-work removed their request for review November 30, 2021 15:16
@msujew msujew force-pushed the msujew/localization-eslint branch 2 times, most recently from 48d0d48 to 0299576 Compare February 15, 2022 14:09
@msujew
Copy link
Member Author

msujew commented Feb 15, 2022

@colin-grant-work It took me a while to get back to this, since I was busy with other stuff and waiting for #10187 to get merged. So, I think having a one-tiered approach to this (only using localizeByDefault in Theia) will lead to too much implicit behavior. I can imagine multiple occasions where we run into the following issues:

  1. Using localizeByDefault everywhere makes this eslint rule obsolete. As a consequence, developers will be uncertain whether they actually use a "default" translation (using vscode) or introduce a new translation.
  2. Given that translation ids will be generated using this approach from the text, minor changes in the original English text might lead to stability issues down the line, i.e. English text changes -> id changes -> translations for other languages are suddenly missing.

Therefore, I believe the two-tiered approach makes it clear which method to use, especially when the eslint rule shows errors on misused localizeByDefault and localize calls. Does my line of thinking make sense here? In short: I like the explicit nature of localize(key, default) combined with the safety of the eslint rule more than implicit localizeByDefault behavior which might break translations on minor changes.

Do you mind looking into my rebased changes? :)

@colin-grant-work
Copy link
Contributor

@colin-grant-work It took me a while to get back to this, since I was busy with other stuff and waiting for #10187 to get merged. So, I think having a one-tiered approach to this (only using localizeByDefault in Theia) will lead to too much implicit behavior...

@msujew, now that there are real translations of the non-VSCode strings in Theia (or at least the potential to create them, and an associated procedure), I think it makes sense to distinguish between the two systems. I'll take a look at the code again in the next day or two.

@msujew msujew force-pushed the msujew/localization-eslint branch from 0299576 to 7fc298b Compare February 17, 2022 13:29
@colin-grant-work
Copy link
Contributor

@msujew, first thing on my agenda this morning!

@msujew msujew force-pushed the msujew/localization-eslint branch from 7fc298b to 95eae50 Compare February 17, 2022 15:11
Copy link
Contributor

@colin-grant-work colin-grant-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior looks correct to me, and the check is quite quick.

  • Detected bad use of byDefault and offered a suggestion.
  • Automatically inserted the suggestion when triggered to do so.
  • Detected use of .localize when byDefault was possible.
  • Correctly rewrote call to byDefault when triggered.

Minor comment re: variable names. It also doesn't trigger on nls.localizeByDefault('') (empty string). I'm not sure whether that's because the empty string is in the metadata.json, and I'm not sure whether it's desirable to trigger in that case, but it does seem like an odd input.

@msujew msujew force-pushed the msujew/localization-eslint branch from 95eae50 to 518fed8 Compare February 17, 2022 15:28
@msujew
Copy link
Member Author

msujew commented Feb 17, 2022

Thanks @colin-grant-work, the empty string issue was triggered by a faulty if (value) { ... } check. It will now detect '' as being an invalid value for localizeByDefault.

@msujew msujew merged commit ee716ac into master Feb 17, 2022
@github-actions github-actions bot added this to the 1.23.0 milestone Feb 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributor experience issues related to the contributor experience localization issues related to localization/internalization/nls
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants