
Bag-of-Words as Target for Neural Machine Translation #109

kweonwooj opened this issue May 30, 2018 · 0 comments

Abstract

  • Proposes a new training approach that uses both the sentence and its bag-of-words as targets during training
  • Adding the bag-of-words target encourages the model to generate potentially correct sentences, instead of punishing all non-reference sentences equally as incorrect
  • On the NIST Zh-En set, BLEU improves by +4.55

Details

Introduction

  • Problem
    • NMT training treats the reference sentence as the only golden label, so semantically or syntactically close sentences are penalized just as heavily as completely incorrect ones.
  • Solution
    • add a bag-of-words loss that rewards the model for selecting correct tokens even when they appear at incorrect positions (see the sketch after this list)
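
  The combined objective can be written roughly as below. This is my paraphrase of the setup, not the paper's exact notation; the weighting coefficient λ is a hypothetical placeholder for however the two terms are balanced in the paper.

  ```latex
  % Sentence-level cross-entropy plus an auxiliary bag-of-words term.
  % lambda is an illustrative weighting coefficient, BoW(y) is the set of
  % words in the reference sentence, and p_bow(w) is the model's score for
  % word w aggregated over decoder time steps.
  L(\theta) =
    \underbrace{-\sum_{t} \log p(y_t \mid y_{<t}, x)}_{\text{sentence loss}}
    \;+\;
    \lambda \underbrace{\Bigl(-\sum_{w \in \mathrm{BoW}(y)} \log p_{\mathrm{bow}}(w)\Bigr)}_{\text{bag-of-words loss}}
  ```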

Overview

[Screenshot: model overview figure from the paper]

  • the bag-of-words loss is computed by summing the decoder's softmax distributions over all time steps and comparing the result against the reference bag-of-words under a maximum-likelihood objective
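
  A minimal PyTorch-style sketch of such a loss, assuming the per-word bag-of-words score is the sum of the decoder's softmax probabilities over time steps as described above; normalization details and the exact loss weighting in the paper may differ.

  ```python
  import torch

  def bag_of_words_loss(step_probs: torch.Tensor,
                        bow_targets: torch.Tensor,
                        eps: float = 1e-8) -> torch.Tensor:
      """
      step_probs : (batch, tgt_len, vocab) softmax outputs of the decoder
      bow_targets: (batch, vocab) multi-hot reference bag-of-words
      """
      # Aggregate per-step distributions over time, ignoring word positions.
      bow_scores = step_probs.sum(dim=1)            # (batch, vocab)
      # Clamp so the aggregated score of each word stays in (0, 1].
      bow_scores = bow_scores.clamp(max=1.0)
      # Negative log-likelihood of the words that appear in the reference bag.
      nll = -(bow_targets * torch.log(bow_scores + eps)).sum(dim=1)
      return nll.mean()

  # Illustrative total objective (lambda_bow is a hypothetical weight):
  # total_loss = sentence_cross_entropy + lambda_bow * bag_of_words_loss(...)
  ```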

Result

  • the NIST Zh-En dataset has 1.25M sentence pairs, which is quite small for an NMT setting
    [Screenshot: BLEU results table on NIST Zh-En test sets]
  • the bag-of-words model improves over the baseline by +4.55 BLEU, but note that SMT Moses achieves almost the same performance as Seq2Seq+Att.

Personal Thoughts

  • Good problem definition, and the solution is both intuitive and straightforward
  • NMT usually performs much better than SMT, but on this dataset the performance of SMT Moses and the Seq2Seq+Att baseline is surprisingly close. So what exactly is bag-of-words improving?

Link : https://arxiv.org/pdf/1805.04871v1.pdf
Authors : Ma et al. 2018
