en-vi: add IWSLT'15 English-Vietnamese as new problem #611
Conversation
Wonderful, thanks!
Hi, I trained Transformer base on this dataset for 500K steps and evaluated with t2t-bleu, getting BLEU scores of around 28.47 (cased) and 29.32 (uncased). But when I evaluated the same output with the multi-bleu.pl script from Moses, I got only 27.69 (you can see my training/evaluation signatures here). That's weird! It seems that t2t-bleu tends to produce higher scores.
With multi-bleu.pl you need to tokenize the hypothesis and the reference yourself first (and possibly normalize Unicode punctuation). Depending on how you do that, you can get a big difference (e.g. ±5 BLEU), so multi-bleu.pl is not replicable in general.
Ah, I forgot to mention: the decoded output from tensor2tensor for this dataset is already tokenized (look at this), and the provided reference is already tokenized as well. So the evaluation with multi-bleu.pl is fair.
Hi Stefan, Thank you very much for sharing your work. I tried to reproduce the reported results with your pretrained model. Nevertheless, I got the error below:
Please help me check the log file Log.txt and give me some advice on how to solve this issue. Thank you.
@ankushagarwal Sorry for the long response time! Could you solve the problem? What version of T2T are you using for your experiments?
Hi,
with this PR a new problem is added: English-Vietnamese, using the IWSLT'15 dataset from the Stanford NLP group.
I trained an English-to-Vietnamese model for 125k steps on an NVIDIA GTX 1060. Here are some comparisons of BLEU scores on the tst2013 test set. The other BLEU scores are taken from the Towards Neural Phrase-based Machine Translation paper.
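For anyone wanting to reproduce this, the usual T2T workflow for a registered problem looks roughly like the sketch below. The problem name `translate_envi_iwslt32k`, the directory paths, and the exact flag spellings are assumptions (flag names such as `--problem` vs. `--problems` have varied across T2T releases), so check `t2t-datagen --help` and `t2t-trainer --help` for your installed version.

```shell
# Hedged sketch of a typical tensor2tensor run for this problem.
# PROBLEM name and all paths below are assumptions, not verified values.
PROBLEM=translate_envi_iwslt32k
DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_tmp
TRAIN_DIR=$HOME/t2t_train/$PROBLEM

mkdir -p "$DATA_DIR" "$TMP_DIR" "$TRAIN_DIR"

# Download the IWSLT'15 data and build the subword vocabulary + TFRecords.
t2t-datagen --problem=$PROBLEM --data_dir="$DATA_DIR" --tmp_dir="$TMP_DIR"

# Train Transformer base for 125k steps (as in the PR description).
t2t-trainer \
  --problem=$PROBLEM \
  --model=transformer \
  --hparams_set=transformer_base \
  --data_dir="$DATA_DIR" \
  --output_dir="$TRAIN_DIR" \
  --train_steps=125000
```

Decoding with t2t-decoder and scoring the output against tst2013 with t2t-bleu would follow the same pattern.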