Authors
Masahiro Ito, Kotaro Nakayama, Takahiro Hara, Shojiro Nishio
Publication date
2008/10/26
Book
Proceedings of the 17th ACM conference on Information and knowledge management
Pages
817-826
Description
Wikipedia, a huge-scale, Web-based encyclopedia, attracts great attention as an invaluable corpus for knowledge extraction because it has various impressive characteristics, such as a huge number of articles, live updates, a dense link structure, brief anchor texts, and URL identification for concepts. We have already proved that we can use Wikipedia to construct a huge-scale, accurate association thesaurus. The association thesaurus we constructed covers almost 1.3 million concepts, and its accuracy is proved in detailed experiments. However, we still need scalable methods to analyze the huge number of Web pages and hyperlinks among articles in the Web-based encyclopedia.
In this paper, we propose a scalable method for constructing an association thesaurus from Wikipedia based on link co-occurrences. Link co-occurrence analysis is more scalable than link structure analysis because it is a one-pass …
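The abstract is cut off before it describes the method, so the following is only a minimal sketch of what one-pass link co-occurrence counting could look like, not the authors' algorithm. It assumes that two concepts co-occur when links to both appear in the same article, and that raw pair counts are the signal of interest. The function name, the input layout, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_link_cooccurrences(articles):
    """Count how often two link targets appear in the same article.

    `articles` maps an article title to the list of link targets found in it
    (a hypothetical input layout). Each article is visited exactly once, so
    the whole corpus is processed in a single pass.
    """
    pair_counts = Counter()
    for links in articles.values():
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(links)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy data, invented purely for illustration.
articles = {
    "Article A": ["Tokyo", "Japan", "Osaka"],
    "Article B": ["Japan", "Tokyo"],
    "Article C": ["Osaka", "Japan", "Japan"],
}
counts = count_link_cooccurrences(articles)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Because each article is read once and only a counter is kept, no iteration over the full link graph is needed. How the paper turns such counts into association strengths is not stated in the truncated abstract and is left out here.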
Total citations
[Chart: citations per year, 2007 to 2024]
Scholar articles
M Ito, K Nakayama, T Hara, S Nishio - Proceedings of the 17th ACM conference on …, 2008