Authors
Mathias Johan Philip Creutz, Krista Hannele Lagus
Publication date
2005/6
Conference
International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR'05)
Pages
106-113
Description
This work presents an algorithm for the unsupervised learning, or induction, of a simple morphology of a natural language. It uses a probabilistic maximum a posteriori model that builds hierarchical representations for a set of morphs, morpheme-like units discovered from unannotated text corpora. The induced morph lexicon stores parameters related to both the “meaning” and “form” of the morphs it contains, and these parameters affect the role of the morphs in words. The model is applied to a task of unsupervised morpheme segmentation of Finnish and English words. Very good results are obtained for Finnish, and almost as good results for English.