2003-statistical-phrase-based-translation.md

| Field | Value |
| --- | --- |
| title | Statistical Phrase-Based Translation |
| authors | Philipp Koehn, F. Och, D. Marcu |
| fieldsOfStudy | Computer Science |
| meta_key | 2003-statistical-phrase-based-translation |
| numCitedBy | 3757 |
| reading_status | TBD |
| ref_count | 21 |
| tags | gen-from-ref, other-default, paper |
| urls | |
| venue | NAACL |
| year | 2003 |

semanticscholar url

Statistical Phrase-Based Translation

Abstract

We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to better understand and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models do not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.
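
As an illustration of what "heuristic learning of phrase translations from word-based alignments" can look like, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used consistency criterion (a phrase pair is kept only if no word inside it is aligned to a word outside it), which the abstract itself does not spell out. The function name `extract_phrases`, the `max_len` parameter, and the toy sentence pair are hypothetical; the default of three words mirrors the abstract's remark that longer phrases add little. The sketch also skips the usual extension over unaligned boundary words.

```python
from typing import List, Set, Tuple


def extract_phrases(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
    max_len: int = 3,
) -> Set[Tuple[str, str]]:
    """Collect phrase pairs consistent with a word alignment.

    `alignment` holds (source_index, target_index) links.
    Assumption: only the minimal target span is taken for each
    source span; unaligned boundary words are not added.
    """
    pairs: Set[Tuple[str, str]] = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Consistency check: no target word inside the span may be
            # aligned to a source word outside the source span.
            violates = any(
                t_start <= t <= t_end and not (s_start <= s <= s_end)
                for (s, t) in alignment
            )
            if violates:
                continue
            pairs.add(
                (
                    " ".join(src[s_start : s_end + 1]),
                    " ".join(tgt[t_start : t_end + 1]),
                )
            )
    return pairs


# Toy example (hypothetical data): the adjective and noun swap order.
toy_pairs = extract_phrases(
    ["la", "casa", "verde"],
    ["the", "green", "house"],
    {(0, 0), (1, 2), (2, 1)},
)
```

On the toy pair, the span "la casa" is rejected because its minimal target span also covers "green", which is aligned to "verde" outside the source span, while "casa verde" / "green house" is kept.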

Paper References

  1. Improved Alignment Models for Statistical Machine Translation
  2. Improved Statistical Alignment Models
  3. The Mathematics of Statistical Machine Translation - Parameter Estimation
  4. A Syntax-based Statistical Translation Model
  5. A Phrase-Based, Joint Probability Model for Statistical Machine Translation
  6. An Efficient A* Search Algorithm for Statistical Machine Translation
  7. Three Generative, Lexicalised Models for Statistical Parsing
  8. Robust German Noun Chunking With a Probabilistic Context-Free Grammar
  9. Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora
  10. Bleu - a Method for Automatic Evaluation of Machine Translation
  11. Statistical Methods for Speech Recognition
  12. Statistical Language Modeling Using the CMU-Cambridge Toolkit
  13. The Mathematics of Statistical Machine Translation
  14. Application of Translation Knowledge Acquired by Hierarchical Phrase Alignment for Pattern-Based MT
  15. Fast Decoding and Optimal Decoding for Machine Translation