Authors
Oskar Kohonen, Sami Virpioja, Krista Lagus
Publication date
2010/7
Conference
Proceedings of the 11th meeting of the ACL Special Interest Group on Computational Morphology and Phonology
Pages
78-86
Description
We consider morphology learning in a semi-supervised setting, where a small set of linguistic gold standard analyses is available. We extend Morfessor Baseline, which is a method for unsupervised morphological segmentation, to this task. We show that known linguistic segmentations can be exploited by adding them into the data likelihood function and optimizing separate weights for unlabeled and labeled data. Experiments on English and Finnish are presented with varying amounts of labeled data. Results of the linguistic evaluation of Morpho Challenge improve rapidly even with small amounts of labeled data, surpassing the state-of-the-art unsupervised methods at 1000 labeled words for English and at 100 labeled words for Finnish.
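As a rough illustration of the weighting idea in the description, the sketch below computes a data cost in which the unlabeled and labeled parts of the corpus each get their own weight. It is a minimal sketch under assumptions, not the authors' implementation: the function name, the `alpha` and `beta` parameters, the toy morph probabilities, and the fixed segmentations are all hypothetical, and the model prior and the search over segmentations of unlabeled words are left out.

```python
import math

def weighted_data_cost(unlabeled, labeled, morph_probs, alpha=1.0, beta=1.0):
    """Weighted negative log-likelihood of segmented words (hypothetical helper).

    unlabeled, labeled: lists of segmentations, each a list of morph strings.
    morph_probs: assumed unigram probability of each morph.
    alpha, beta: assumed weights for the unlabeled and labeled data terms.
    """
    def neg_log_likelihood(segmentations):
        # Each word's probability is taken as the product of its morph probabilities.
        return -sum(math.log(morph_probs[m]) for seg in segmentations for m in seg)

    return alpha * neg_log_likelihood(unlabeled) + beta * neg_log_likelihood(labeled)

# Toy data: four equally likely morphs, one unlabeled and one labeled word.
morph_probs = {"walk": 0.25, "talk": 0.25, "ed": 0.25, "ing": 0.25}
unlabeled = [["walk", "ed"]]
labeled = [["talk", "ing"]]

# Doubling beta makes the labeled word count twice as much in the total cost.
cost = weighted_data_cost(unlabeled, labeled, morph_probs, alpha=1.0, beta=2.0)
print(round(cost, 4))
```

In a setup like this, the two weights would typically be tuned on held-out annotated words, so that a small labeled set is not drowned out by a much larger unlabeled one.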
Total citations
[Chart: citations per year, 2010 to 2024]
Scholar articles
O Kohonen, S Virpioja, K Lagus - Proceedings of the 11th meeting of the ACL Special …, 2010