Bilingual support #766

Open
danielnaber opened this issue Aug 11, 2017 · 9 comments

Comments

@danielnaber
Member

danielnaber commented Aug 11, 2017

See languagetool-org/languagetool-browser-addon#75 for ideas - but it's not specific to the add-on, so the issue should be discussed here.

Keywords: mixed languages support

@adamint

adamint commented Sep 10, 2017

This would also be useful for non-Cyrillic languages like French. Is this still wanted?

@danielnaber
Member Author

Yes, it's still on the list of things that should be done.

@dpelle
Member

dpelle commented Sep 10, 2017

Maybe nit-picky, but instead of "bilingual support", I would rather
generalize to "multilingual support".

When given 2 or more languages, I would then expect LT
to ignore spelling errors for words that are correct in any of the
given languages.

For grammar checking, I'm not sure what it should do.
I see 2 possibilities:

  1. Guess the language of each sentence being checked.
    It's only a heuristic, so not entirely satisfactory.

  2. Or specify multiple languages for spell checking, but
    only one for grammar checking? Grammar checking could,
    for example, always use the first of the given languages.
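
The spell-checking behaviour and option 2 above can be sketched in a few lines. This is only an illustration, not LT's implementation: the tiny word sets stand in for real dictionaries, and the names `DICTIONARIES`, `misspelled_words` and `grammar_language` are made up for the example.

```python
# Toy word lists standing in for real per-language dictionaries (assumption).
DICTIONARIES = {
    "en": {"the", "meeting", "is", "at", "noon"},
    "fr": {"le", "rendez-vous", "est", "à", "midi"},
}


def misspelled_words(words, languages):
    """Return the words that are not in the dictionary of any given language."""
    known = set().union(*(DICTIONARIES[lang] for lang in languages))
    return [w for w in words if w.lower() not in known]


def grammar_language(languages):
    """Option 2: grammar checking always uses the first given language."""
    return languages[0]


words = ["Le", "meeting", "est", "at", "midi", "tomorow"]
print(misspelled_words(words, ["fr", "en"]))  # ['tomorow']
print(grammar_language(["fr", "en"]))         # fr
```

With only `["fr"]` given, "meeting" and "at" would be reported as well, which is the current single-language behaviour.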

@tugit

tugit commented Sep 26, 2017

@dpelle I would vote for the first option, as it's far more powerful. I think it could be made more efficient if the language of the first sentence is evaluated and LT assumes the same language for the sentences that follow. The language should only be re-evaluated if too many words in the sentence are misspelled. I think most texts have a primary language and use a second one for citations.
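
A rough sketch of that "sticky" heuristic, to make the idea concrete. Everything here is an assumption for illustration: the toy dictionaries, the function names, and the 0.5 threshold are not taken from LT.

```python
# Toy dictionaries; a real checker would use full word lists (assumption).
DICTIONARIES = {
    "en": {"this", "is", "a", "short", "quote", "he", "said"},
    "de": {"das", "ist", "ein", "kurzer", "text", "er", "sagte"},
}


def misspelled_ratio(words, lang):
    """Fraction of words not found in the dictionary for `lang`."""
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w.lower() not in DICTIONARIES[lang])
    return unknown / len(words)


def guess_language(words):
    """Pick the language whose dictionary leaves the fewest unknown words."""
    return min(DICTIONARIES, key=lambda lang: misspelled_ratio(words, lang))


def assign_languages(sentences, threshold=0.5):
    """Guess once, then re-guess only when too many words look misspelled."""
    result = []
    current = None
    for sentence in sentences:
        words = sentence.split()
        if current is None or misspelled_ratio(words, current) > threshold:
            current = guess_language(words)
        result.append(current)
    return result


sentences = [
    "Das ist ein kurzer Text",
    "Er sagte",
    "this is a short quote",
    "Das ist ein Text",
]
print(assign_languages(sentences))  # ['de', 'de', 'en', 'de']
```

The expensive guess only runs for the first sentence and at the two language switches; the other sentences reuse the current language.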

@danielnaber
Member Author

danielnaber commented Aug 22, 2018

It seems there are two cases and everything in between:

  • a single e.g. English word used in a non-English text
  • a whole English paragraph used in a non-English text

For the first case, I'd suggest still showing errors for these words, but in a different color, probably the blue that's used for style suggestions. For this, we'll need to make RuleMatch carry a color code or name. Currently, all the colors are coded in the client. We cannot prevent the clients from using their own colors, but the API could suggest a color in its JSON response.
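
To illustrate what "suggest a color" could mean for a client, here is a small sketch. The response shape is simplified and the `suggestedColor` field is purely hypothetical; it is not part of the current API.

```python
import json

# Client-side defaults; clients stay free to ignore server suggestions.
CLIENT_COLORS = {"misspelling": "red", "style": "blue"}


def color_for(match, honor_suggestion=True):
    """Pick a highlight color for one match from a parsed JSON response."""
    suggested = match.get("suggestedColor")  # hypothetical field
    if honor_suggestion and suggested:
        return suggested
    return CLIENT_COLORS.get(match["type"], "yellow")


# Simplified, made-up response; real responses carry more fields.
response = json.loads("""
{"matches": [
  {"word": "Meeting", "type": "misspelling", "suggestedColor": "blue"},
  {"word": "tomorow", "type": "misspelling"}
]}
""")

for match in response["matches"]:
    print(match["word"], color_for(match))
```

A client that passes `honor_suggestion=False` keeps its own palette, so the suggestion stays optional.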

@ghost

ghost commented Aug 22, 2018

I think this is mostly about the pressure of English. English (or other) words that have become common in the first language should be in the spell-checking list for the first language. (Dutch is an example where lots of English words are accepted, as well as some German and French ones.)

But it might be that the text is of a very specialist kind, from a field of work where English is the language the innovations are coming from (ICT, engineering, medical). Allowing the user to specify a 'fallback' language could be helpful in those cases. But I guess it is still better to show the writer that it is not 'a generally known native word'; there might be a better way to write the text.

About the way it is coded: my 2 cents are that there should be no color codes in the API, just error types, so the client can choose the colors.

@Esokrates

I would agree that those words should be marked in a different way, though I would not call it an "error" unless the word is misspelled in the secondary language.
I think the color should be left to the client. If users only want to be notified when a word is misspelled and do not care about mixing languages, they should have the option to ignore the "different language, correctly spelled" warning.
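
One way to picture this is a checker that reports a type per word and leaves both colors and filtering to the client. The sketch below assumes a primary and a secondary word list; the type names and functions are invented for the example and do not come from LT.

```python
from enum import Enum


class Kind(Enum):
    OK = "ok"
    FOREIGN_WORD = "foreign-word"  # correct, but only in the secondary language
    MISSPELLING = "misspelling"    # correct in neither language


# Toy word lists standing in for real dictionaries (assumption).
PRIMARY = {"das", "ist", "ein", "wichtiges"}
SECONDARY = {"meeting", "feature"}


def classify(word):
    """Assign a type to a word; no color is decided here."""
    w = word.lower()
    if w in PRIMARY:
        return Kind.OK
    if w in SECONDARY:
        return Kind.FOREIGN_WORD
    return Kind.MISSPELLING


def report(words, ignore_foreign=False):
    """Return the (word, kind) pairs the client should show."""
    hidden = {Kind.OK}
    if ignore_foreign:
        hidden.add(Kind.FOREIGN_WORD)
    return [(w, classify(w)) for w in words if classify(w) not in hidden]


words = ["Das", "ist", "ein", "wichtiges", "Meeting", "Featur"]
print(report(words))                       # Meeting (foreign word) and Featur (misspelling)
print(report(words, ignore_foreign=True))  # only Featur (misspelling)
```

Because only types are reported, each client can map `FOREIGN_WORD` to whatever color it likes, or hide it entirely.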

@laurids-reichardt

Is this feature on the road map?

@rami-alloush

This is much needed for projects that use languages like Arabic, where, for example, the HTML markup is in English but the actual values are in Arabic.


7 participants