From 2025-01-05 paper notes...
What is necessary for accurate pronunciation of a language?
(What is "accurate pronunciation"? See #52)
My hypotheses:
Windows screen readers
This is my current understanding for NVDA. I believe JAWS behaves the same, but that still needs to be confirmed.
The user has a synthesizer, either bundled with the screen reader or installed separately.
The synthesizer has a set of voices, either bundled with the synthesizer or installed individually.
The user has configured the screen reader to use this synthesizer.
One of the synthesizer's voices offers support for language X.
The screen reader is configured for automatic language switching (the default for both JAWS and NVDA).
The user uses a modern browser. (Separate research issue: confirm in browsers if necessary, though I am not aware of any that break this.)
The user loads a web page with language X correctly marked up. (Separate research issue: "correctly marked up" will vary by language -- see below, and the markup sketch after this list.)
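
For the last condition, here is a minimal sketch of "correctly marked up" in the simple case, with illustrative languages and sample text: a page-level lang on the html element plus an inline lang on the foreign-language phrase.

```html
<!-- Minimal illustration: page-level language plus an inline override.
     The languages and sample text are chosen only for illustration. -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Language switching example</title>
  </head>
  <body>
    <p>
      The French phrase
      <!-- lang on the phrase is what lets the screen reader switch voices here -->
      <span lang="fr">bonne nuit</span>
      should be spoken by a French voice when automatic language switching is on.
    </p>
  </body>
</html>
```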
Stock Android 14 with correct web code
The user has a speech engine, either built into the Android OS or installed separately. ("Text-to-speech output" > "Preferred engine" > "Speech Recognition and Synthesis from Google" is the only built-in choice.)
The user has configured "Text-to-speech output" to use this speech engine. (Nothing to do on stock Android.)
The speech engine offers support for language X. (What does this actually mean? Is the language listed in one of the language lists in Settings?)
Speech engine settings > "Language detection" is "Off" (not the default). (Presumably "Conservative" and "Aggressive" would usually work here too, though errors are conceivable.)
The user uses a modern browser. (Separate research issue: confirm in browsers if necessary, though I am not aware of any that break this.)
The user loads a web page with language X correctly marked up.
Questions:
In speech engine settings ("Google TTS Options"), what does "Install voice data" do? If I don't install voice data for language X, can Android synthesize it in the cloud?
To confirm: I don't see an "automatic language switching" setting, and "Language detection" set to "Off" does not prevent language switching for correctly marked-up phrases in a web page.
Stock Android 14 with automatic language detection
Research issue: what does Android automatic language detection depend on? This is not a high priority for research in a web context, because best practice is to mark up the web content correctly.
Same as previous, except:
"Language detection" is set to (TBD) "Conservative" (default) or "Aggressive".
The web content is in language X.
The web content does not have a lang attribute. (Or has a lang attribute with an incorrect value?)
Script variants
When a script maps strongly to a single language, as with Japanese or Kannada, does that increase the likelihood of heuristics determining the language without a correct lang attribute? (A sketch of this scenario follows.)
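
As a concrete sketch of the scenario this section covers (the sample text is illustrative): content with no lang attribute, where only heuristics such as script detection could identify the language.

```html
<!-- Sketch of the "no lang / wrong lang" case that automatic language
     detection would have to handle. Sample text is illustrative only. -->
<!DOCTYPE html>
<html> <!-- no lang attribute at all -->
  <body>
    <!-- Japanese text: the script itself strongly suggests the language -->
    <p>こんにちは、世界。</p>
    <!-- Spanish text in Latin script: script alone cannot identify it,
         so detection would need word-level heuristics -->
    <p>Buenas noches a todos.</p>
  </body>
</html>
```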
Stock Android with language choice
TODO research issue. What does the TalkBack "language" menu do? Is it the same as "Text-to-speech" > "Language"? Combined with the "Language detection" setting, what can it repair?
This is lower priority for web content in supported languages, because best practice is to mark it up correctly.
However, it could be a higher-priority question for web content in languages not supported by the speech engine: some users might want to choose a good-enough voice for such content.
What does "correctly marked up" mean for various languages?
macrolanguage and nested language. Example: Chinese (zh) and Cantonese (yue)
nested macrolanguages. Example: Odia
sibling lects grouped under the same collective code, e.g. Lower Sorbian (dsb) and Upper Sorbian (hsb), both under Sorbian (wen)
where the content markup is more granular than the user agent expects
where the content markup is less granular than the user agent expects
in a language commonly written in more than one script, with or without the script subtag
deprecated, variant, and grandfathered subtags
Does a region subtag sometimes change the TTS voice, in good ways or in potentially unwanted ways?
For a given language, does TTS support some script variants but not others? Consider languages commonly written in more than one script, such as some Indian languages, Uzbek, and to some extent Japanese.
Note: eSpeak documentation gives examples of such limitations.
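
For concreteness, here is a sketch of markup for several of the cases above. The sample text is only illustrative; the tags themselves (zh, yue, zh-yue, hsb, dsb, uz-Latn, uz-Cyrl, pt-PT, pt-BR, iw, i-navajo) are intended to be valid BCP 47, but whether any particular screen reader or TTS engine honors each form is exactly what needs testing.

```html
<!-- Fragment illustrating lang values for several of the cases above.
     All tags are intended to be valid BCP 47; engine support varies. -->
<body>
  <!-- macrolanguage vs. encompassed individual language -->
  <p lang="zh">中文内容</p>           <!-- Chinese (macrolanguage) -->
  <p lang="yue">廣東話內容</p>        <!-- Cantonese (preferred form) -->
  <p lang="zh-yue">廣東話內容</p>     <!-- extlang form; preferred value is "yue" -->

  <!-- sibling lects under a collective code -->
  <p lang="hsb">hornjoserbšćina</p>  <!-- Upper Sorbian -->
  <p lang="dsb">dolnoserbšćina</p>   <!-- Lower Sorbian -->

  <!-- script subtags for a language written in more than one script -->
  <p lang="uz-Latn">oʻzbekcha</p>
  <p lang="uz-Cyrl">ўзбекча</p>

  <!-- region subtags that may or may not change the voice -->
  <p lang="pt-PT">português europeu</p>
  <p lang="pt-BR">português brasileiro</p>

  <!-- deprecated and grandfathered tags -->
  <p lang="iw">עברית</p>             <!-- deprecated; preferred value is "he" -->
  <p lang="i-navajo">Diné bizaad</p> <!-- grandfathered; preferred value is "nv" -->
</body>
```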
What determines the language of announcements for web element roles and states?
There will be separate research issues in "technical tests" to break this down more granularly, thinking at first of most element roles (a small markup sketch of the question appears after this list).
Hypothesis: It's the UI language of the screen reader. This can be the same as the OS UI language or it can be set separately.
Or is it the OS UI language, even when different from the screen reader UI language?
Or is it the OS preferred language for content? E.g. when the most preferred language is not an OS UI language but is available as a screen reader UI language.
Or could browser UI language make a difference?
Or could browser preferred content language make a difference?
Or could the web content language make a difference? Example: JAWS for Kiosk apparently has this; it's mentioned in a patent claim in a public patent database.
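
To make the question concrete, a minimal sketch (the French strings are illustrative): the accessible names come from French content, and the open question is which language the role and state announcements ("button", "current page", etc.) follow.

```html
<!-- Sketch of the role/state announcement question.
     The accessible names ("Envoyer", "Accueil") are French content; the open
     question is whether the role is announced as "button", "bouton", or
     something else, and which UI / content language setting decides that. -->
<!DOCTYPE html>
<html lang="fr">
  <body>
    <button type="submit">Envoyer</button>
    <a href="https://example.com" aria-current="page">Accueil</a>
  </body>
</html>
```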