Improve statistical machine translation with context-sensitive bilingual semantic embedding model

  • Haiyang Wu
  • Daxiang Dong
  • Wei He
  • Xiaoguang Hu
  • Dianhai Yu
  • Hua Wu
  • Haifeng Wang
  • Ting Liu
  • Baidu Inc
  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

We investigate how to improve bilingual embedding, which has been successfully used as a feature in phrase-based statistical machine translation (SMT). Despite the success of bilingual embedding, previous work ignored contextual information, which is of critical importance to translation quality. To exploit contextual information, we propose a simple and memory-efficient model for learning bilingual embedding that takes both the source phrase and the context around the phrase into account. Bilingual translation scores generated by the proposed model are used as features in our SMT system. Experimental results show that the proposed method achieves significant improvements on a large-scale Chinese-English translation task.

Original language: English
Title of host publication: EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 142-146
Number of pages: 5
ISBN (Electronic): 9781937284961
State: Published - 2014
Externally published: Yes
Event: 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 - Doha, Qatar
Duration: 25 Oct 2014 to 29 Oct 2014

Publication series

Name: EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

Conference

Conference: 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014
Country/Territory: Qatar
City: Doha
Period: 25/10/14 to 29/10/14
