Authors
Sadaf Abdul-Rauf, Mark Fishel, Patrik Lambert, Sandra Noubours, Rico Sennrich
Publication date
2012/5/27
Pages
6-10
Publisher
University of Zurich
Description
Parallel corpora are usually collections of documents that are translations of each other. To be useful in NLP applications such as word alignment or machine translation, they first have to be aligned at the sentence level. This paper is a user study that briefly reviews several sentence aligners and evaluates them by the performance of the SMT systems trained on their output. We conducted experiments on two language pairs and showed that using a more advanced sentence alignment algorithm may yield gains of 0.5 to 1 BLEU points.