GTNC 1.0

damiaanr released this 20 Jun 23:57
· 6 commits to main since this release

A many-to-one dataset of original texts in 50 languages, each paired with its English translation produced by Google Translate. This version contains 7,500 samples per language (375,000 in total); the English translations average about 125 characters in length.
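The figures quoted here (samples per language and mean English character length) can be re-derived from the data with a few lines of Python. The sketch below is only an illustration: the release does not describe the file layout, so the `(language, original, english)` tuple shape, the `summarize` helper, and the toy rows are all assumptions and are not taken from GTNC itself.

```python
from collections import defaultdict


def summarize(samples):
    """Count samples per language and compute the mean English length.

    `samples` is assumed to be an iterable of
    (language_code, original_text, english_translation) tuples;
    this shape is hypothetical, not the documented GTNC format.
    """
    counts = defaultdict(int)
    total_chars = 0
    n = 0
    for lang, _original, english in samples:
        counts[lang] += 1
        total_chars += len(english)  # length in characters, not bytes
        n += 1
    mean_len = total_chars / n if n else 0.0
    return dict(counts), mean_len


# Toy rows invented for illustration (not drawn from the dataset).
toy = [
    ("nl", "Goedemorgen", "Good morning"),
    ("nl", "Dank je", "Thank you"),
    ("fr", "Bonjour", "Hello"),
]
counts, mean_len = summarize(toy)
print(counts, round(mean_len, 2))
```

Run over the full release, one would expect every language to report 7,500 samples and the mean length to land near 125, provided the loader yields rows in the assumed shape.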