
en-vi: add IWSLT'15 English-Vietnamese as new problem #611

Merged
merged 1 commit into from Feb 28, 2018

Conversation

stefan-it
Contributor

Hi,

with this PR a new problem is added: English-Vietnamese using the IWSLT'15 dataset from the Stanford NLP group.

I trained an English to Vietnamese model for 125k steps on a NVIDIA GTX 1060. Here are some nice comparisons of BLEU score on the tst2013 test set. Other BLEU scores are taken from the Towards Neural Phrase-based Machine Translation paper.

| Model | BLEU (beam search) |
| --- | --- |
| Luong & Manning (2015) | 23.30 |
| Sequence-to-sequence model with attention | 26.10 |
| Neural Phrase-based Machine Translation, Huang et al. (2017) | 27.69 |
| Neural Phrase-based Machine Translation + LM, Huang et al. (2017) | 28.07 |
| Transformer (Base), cased | 28.12 |
| Transformer (Base), uncased | 28.97 |
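
The reproduction sketch below shows roughly how such a run can be set up with the T2T command-line tools. Exact flag names vary between T2T versions, and the problem name `translate_envi_iwslt32k` is assumed from this PR, so treat it as a sketch rather than the exact commands:

```bash
# Sketch only: flag names vary between T2T versions and the problem name
# translate_envi_iwslt32k is assumed from this PR.
DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/envi_transformer

# Download and preprocess the IWSLT'15 en-vi data.
t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=translate_envi_iwslt32k

# Train a base Transformer for 125k steps.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problem=translate_envi_iwslt32k \
  --model=transformer \
  --hparams_set=transformer_base \
  --train_steps=125000

# Decode the tst2013 source side with beam search.
t2t-decoder \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problem=translate_envi_iwslt32k \
  --model=transformer \
  --hparams_set=transformer_base \
  --decode_hparams="beam_size=4,alpha=0.6" \
  --decode_from_file=tst2013.en \
  --decode_to_file=tst2013.vi.decoded
```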

@lukaszkaiser
Contributor

lukaszkaiser left a comment

Wonderful, thanks!

@duyvuleo
Contributor

duyvuleo commented Apr 9, 2018

Hi, I trained the Transformer base for this dataset for 500K steps, evaluated with t2t-bleu, and got BLEU scores of around 28.47 (cased) and 29.32 (uncased). But when I evaluate this result with the multi-bleu.pl script from Moses, I get only 27.69 (you can see my training/evaluation signatures here). That's weird!

It seems that t2t-bleu tends to produce higher scores.
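
For reference, the two metrics are invoked roughly like this (file names are placeholders; t2t-bleu prints both cased and uncased BLEU, while multi-bleu.pl takes the reference as an argument and the hypothesis on stdin):

```bash
# Placeholder file names: tst2013.vi.decoded is the decoded output,
# tst2013.vi is the reference.
t2t-bleu \
  --translation=tst2013.vi.decoded \
  --reference=tst2013.vi

perl multi-bleu.pl tst2013.vi < tst2013.vi.decoded
```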

@martinpopel
Contributor

With multi-bleu.pl you need to tokenize the hypothesis and reference yourself first (and possibly normalize unicode punctuation). Depending on how you do it, you can get a big difference (e.g. ±5 BLEU), so multi-bleu.pl is not replicable in general.
It seems you have forgotten to do any tokenization.
The advantage of t2t-bleu and sacrebleu is that you don't need to care about tokenization; it is handled internally.
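
A sketch of the two options, with placeholder file names (tokenizer.perl refers to the Moses tokenizer script):

```bash
# Option 1: tokenize hypothesis and reference the same way, then run multi-bleu.pl.
perl tokenizer.perl -l vi < tst2013.vi         > ref.tok.vi
perl tokenizer.perl -l vi < tst2013.vi.decoded > hyp.tok.vi
perl multi-bleu.pl ref.tok.vi < hyp.tok.vi

# Option 2: use a metric with built-in tokenization on the untokenized files.
cat tst2013.vi.decoded | sacrebleu tst2013.vi
```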

@duyvuleo
Contributor

duyvuleo commented Apr 9, 2018

Ah, I forgot to mention: the decoded output from tensor2tensor for this dataset is already tokenized (look at this), and the provided reference is already tokenized as well. So the evaluation with multi-bleu.pl is fair.

@anhtuanvn

Hi Stefan,

Thank you very much for sharing your work.

I tried to reproduce the reported results with your pretrained model, but I got the error below:

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key transformer/symbol_modality_20428_512/shared/weights_0 not found in checkpoint
[[node save/RestoreV2_1 (defined at /usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py:627) ]]

Please help me check the log file Log.txt and give me some advice on how to solve this issue.

Thank you.

@stefan-it
Contributor Author

stefan-it commented Mar 19, 2019

@anhtuanvn Sorry for the late reply! Were you able to solve the problem? What version of T2T are you using for your experiments?
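
One thing you could check is which variable names are actually stored in the checkpoint, e.g. with TensorFlow's inspect_checkpoint utility (the checkpoint path below is a placeholder):

```bash
# Placeholder checkpoint path; prints the variable names stored in the
# checkpoint so they can be compared with the keys the restored graph expects
# (e.g. transformer/symbol_modality_20428_512/shared/weights_0).
# On older TF versions, use --all_tensors instead of --all_tensor_names.
python -m tensorflow.python.tools.inspect_checkpoint \
  --file_name=path/to/model.ckpt-125000 \
  --all_tensor_names
```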
