Inventors
Yuqing Gao, Bing Xiang, Bowen Zhou
Publication date
2013/1/8
Patent office
US
Patent number
8352244
Application number
12506483
Description
Embodiments of the present invention utilize active learning to update a parallel corpus with increased speed and decreased cost. An active learning approach, where a machine can partially teach itself, does not rely solely on human translators and provides a great benefit to statistical machine translation systems by increasing translation performance while using fewer human resources. Described herein is a method for creating or updating a parallel corpus in a machine translation system. The method prepares a test set E to be updated, translates the test set E from a first language to a second language so as to create set F in the second language, translates set F back to the first language so as to create set E' in the first language, computes confidence scores for the translation of each item in the set based on the similarity of E and E', creates a subset of the highest confidence scores and adds the translations in the …
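The round-trip scoring steps named in the description can be sketched as follows. This is a minimal illustration, not the patented implementation: the `forward` and `backward` translator callables, the function names, the `top_k` parameter, and the token-level `difflib` similarity are all assumptions chosen for the example, since the description does not say how the similarity of E and E' is measured. The sketch stops at selecting the highest-confidence subset because the description is truncated after that point.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def round_trip_confidence(source: str, back_translated: str) -> float:
    """Score in [0, 1] for how closely E' matches E.

    Assumption: similarity is a token-level SequenceMatcher ratio.
    The source does not specify the actual similarity measure.
    """
    e_tokens = source.lower().split()
    e_prime_tokens = back_translated.lower().split()
    return SequenceMatcher(None, e_tokens, e_prime_tokens).ratio()


def select_confident_pairs(
    test_set: List[str],
    forward: Callable[[str], str],   # hypothetical: first language -> second
    backward: Callable[[str], str],  # hypothetical: second language -> first
    top_k: int,
) -> List[Tuple[float, str, str]]:
    """Return the top_k (score, e, f) triples ranked by round-trip confidence."""
    scored = []
    for e in test_set:
        f = forward(e)          # item of set F
        e_prime = backward(f)   # item of set E'
        scored.append((round_trip_confidence(e, e_prime), e, f))
    # Highest confidence first; ties keep their original order.
    scored.sort(key=lambda triple: triple[0], reverse=True)
    return scored[:top_k]
```

Any real system would replace the two callables with its own translation engines and would likely use a translation-specific similarity metric rather than a generic sequence ratio.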
Total citations
[Chart: citations per year, 2014-2023]