Using machine translation #68

Bloke · 2025-03-05T09:53:57Z

As much as I despise the splurge of appalling so-called "AI" tools, is there any mileage in using the machine translation engines available in Crowdin for doing pre-translations that can be checked and tweaked by human translators? Or is there too much scope for the machine tools getting it wrong and providing bad or potentially embarrassing out-of-context translations?

Just wondering if this would give us a leg-up to completing some of the pophelp packs, since it may seem a daunting task to create from scratch or from a sparsely-populated pack, but might seem less daunting if it just needs a sweep to check for accuracy and then tweak. Assuming of course, the tools are freely available. If there's cost involved, it's probably not worth it.

To be clear, I would only (potentially) advocate this usage on strings where no translation has been provided yet. No way would I want a machine tinkering with what we've already translated.
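As a rough illustration of the "untranslated strings only" rule, here is a minimal sketch. It assumes the strings have been exported as plain key/value mappings; the function name `untranslated_keys` and the sample keys are hypothetical and not part of Crowdin or Textpattern.

```python
def untranslated_keys(source_strings, translated_strings):
    """Return the keys that still have no human translation.

    Both arguments are assumed to be dicts mapping a string key to its text.
    A key counts as untranslated when it is missing from the translation,
    or when its translated value is None, empty, or only whitespace.
    """
    return [
        key
        for key in source_strings
        if not (translated_strings.get(key) or "").strip()
    ]


# Hypothetical sample data: only these keys would be offered to the machine.
english = {"save": "Save", "delete": "Delete", "preview": "Preview"}
spanish = {"save": "Guardar", "delete": "   "}
candidates = untranslated_keys(english, spanish)
```

Anything already translated by a person is left out of `candidates`, so the machine never touches existing work.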

The setup wizard allows the tool(s) to "learn" context from other strings in the project at varying levels of granularity, but I'm not sure how accurate it would be.

Not sure also on the efficacy of using the tool(s) for individual textpack strings. The context there might be more difficult for a machine to ascertain.

Thoughts?

philwareham · 2025-03-05T12:12:24Z

The Crowdin AI tools are a bit hit-or-miss. For some languages they work great, for others not so much (it seems to depend on how many users have input data in that language across the whole system). I guess you could use something like ChatGPT for better translations, maybe?

Bloke · 2025-03-05T12:43:53Z

Okay, good to know, thanks.

I'm not fussed about using something as invasive as ChatGPT. If it handled more languages, I'd nearly always fall back on DeepL because it seems to make a better job of translations (from rudimentary testing) than Google Translate, etc.

Just wondered if the built-in modules offered inside Crowdin were worth pursuing. If they're not much cop, happy to leave it.

philwareham · 2025-03-05T12:46:42Z

Oh, I see there is an AI section now (in addition to the ML they had before, which I based my comment on). Let me investigate, will get back to you shortly.

philwareham · 2025-03-05T12:49:31Z

OK, so there are a few providers, any preference? This also depends on what the cost is for each service.

Bloke · 2025-03-05T13:20:16Z

> OK, so there are a few providers, any preference?

The cheapest/free-est ;)

I don't have any preference whatsoever. I have practically zero experience with any of them.

philwareham · 2025-03-05T13:23:04Z

Hmmm OK, I will add $20 to OpenAI and see what it spits out (not saying I'll use it past that initial test). I can get the money back from our Open Collective funds.

Bloke · 2025-03-05T13:26:07Z

Nice one. Gotta be worth a punt.

philwareham · 2025-03-05T13:51:39Z

OK, I used the OpenAI module through Crowdin to complete the Spanish pophelp, which was about 60% complete previously (see commit 5178f44 for reference). That cost around £2.50 in total.

So it's up to you. The translations look good, based on my limited knowledge of Spanish and on checking that the HTML and things like that are properly formed. We can apply this to some other languages ad hoc, or splurge on getting all the languages to a level of completeness.
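For anyone who would rather automate the "is the HTML properly formed" check, a minimal sketch follows. It assumes each pophelp string is available as a source/translation pair of HTML fragments; `TagCollector`, `tag_sequence` and `same_markup` are hypothetical names, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every opening and closing tag in the order it appears."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(("open", tag))

    def handle_endtag(self, tag):
        self.tags.append(("close", tag))


def tag_sequence(markup):
    """Return the ordered list of (kind, tag) pairs found in an HTML fragment."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def same_markup(source, translated):
    """True when the translation has the same tags, in the same order, as the source."""
    return tag_sequence(source) == tag_sequence(translated)


# Hypothetical sample strings: the second translation has lost a closing tag.
source = "<p>Save <strong>now</strong></p>"
good = "<p>Guardar <strong>ahora</strong></p>"
bad = "<p>Guardar <strong>ahora</p>"
```

This is a strict comparison: a translation that legitimately reorders inline tags to suit the target language's word order would be flagged too, so a mismatch is a prompt for a human look rather than proof of an error.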

Bloke · 2025-03-05T14:16:13Z

I know almost no Spanish but it seems reasonable as a starting point. Nice one.

Ad-hoc is fine by me. Maybe if we can enlist someone who speaks Spanish to check if the translations are accurate enough, it might give us confidence to proceed with others.

philwareham · 2025-03-05T15:33:24Z

I've done Spanish, German and French. That has used up the budget I put in now.

They have been put into the dev repo of Textpattern, so the feedback you get from beta 2 will determine whether we invest more funds in translations for other languages.

Bloke · 2025-03-05T15:44:26Z

Sweet, thank you. That's a good call about using beta 2 as a casting call for people to verify the machine output... which was trained by people(!)

Top work, thank you for jumping on this.
