Ontologies improve text document clustering

Authors

Andreas Hotho, Steffen Staab, Gerd Stumme

Publication date

2003/11/22

Conference

Third IEEE international conference on data mining

Pages

541-544

Publisher

IEEE

Description

Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large sets of documents into a small number of meaningful clusters. The bag-of-words representation used for these clustering methods is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. In order to deal with the problem, we integrate core ontologies as background knowledge into the process of clustering text documents. Our experimental evaluations compare clustering techniques based on pre-categorizations of texts from Reuters newsfeeds and on a smaller domain of an eLearning course about Java. In the experiments, improvements of results by background knowledge compared to a baseline without background knowledge can be shown in many interesting combinations.
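
The limitation described above can be illustrated with a minimal sketch. It is not the authors' method: the toy ontology, the `concept:` prefix, and the function names are all hypothetical and chosen only for illustration. The sketch shows two documents that share no literal term, so their plain bag-of-words cosine similarity is zero, and how adding one count per mapped concept gives them a non-zero similarity.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy ontology (assumption for illustration only):
# maps a term to a more general concept.
TOY_ONTOLOGY = {
    "beef": "meat",
    "pork": "meat",
    "java": "programming_language",
    "python": "programming_language",
}


def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens."""
    return Counter(text.lower().split())


def add_concepts(bow, ontology):
    """Return a copy of the term vector with each mapped concept's count added."""
    enriched = Counter(bow)
    for term, count in bow.items():
        concept = ontology.get(term)
        if concept is not None:
            enriched["concept:" + concept] += count
    return enriched


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc1 = bag_of_words("beef exports rise")
doc2 = bag_of_words("pork imports fall")

# No shared literal terms, so the plain vectors are orthogonal.
plain_similarity = cosine(doc1, doc2)

# Both documents now also carry the shared "concept:meat" dimension.
enriched_similarity = cosine(
    add_concepts(doc1, TOY_ONTOLOGY),
    add_concepts(doc2, TOY_ONTOLOGY),
)

print(plain_similarity, enriched_similarity)
```

A clustering algorithm that relies on such similarities would have no basis for grouping the two documents under the plain representation, while the enriched vectors give it one. How concepts are selected and weighted in the paper itself is not stated in this description.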

Total citations

Cited by 864

Citations per year:
2004: 13, 2005: 32, 2006: 28, 2007: 45, 2008: 53, 2009: 59, 2010: 55,
2011: 58, 2012: 62, 2013: 50, 2014: 57, 2015: 63, 2016: 71, 2017: 49,
2018: 46, 2019: 41, 2020: 25, 2021: 19, 2022: 13, 2023: 11, 2024: 1

Scholar articles

Ontologies improve text document clustering

A Hotho, S Staab, G Stumme - Third IEEE international conference on data mining, 2003

Ontologies to improve text document clustering

A Hotho, S Staab, G Stumme - Proc. of the 20th ICML, 2003

Cited by 8

Ontologies improve text clustering*

A Hotho, S Staab, G Stumme - Proc. ICDM'03 3rd IEEE Int. Conf. on Data Mining, 2003

Cited by 3

Ontologies improve text clustering*

A Hotho, S Staab, G Stumme - ICDM. Third IEEE International Conference on, 2003

Cited by 2