HUMB: Automatic key term extraction from scientific articles in GROBID

Authors

Patrice Lopez, Laurent Romary

Publication date

2010/7

Conference

Proceedings of the 5th international workshop on semantic evaluation

Pages

248-251

Description

SemEval Task 5 was an opportunity to experiment with the key term extraction module of GROBID, a system for extracting and generating bibliographical information from technical and scientific documents. The tool first uses GROBID’s facilities for analyzing the structure of scientific articles, which yields a first set of structural features. A second set of features captures content properties based on phraseness, informativeness and keywordness measures. Two knowledge bases, GRISP and Wikipedia, are then exploited to produce a last set of lexical/semantic features. Bagged decision trees appeared to be the most efficient machine learning algorithm for generating a ranked list of key term candidates. Finally, a post-ranking step was applied, based on co-usage statistics of keywords in HAL, a large Open Access publication repository.
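
The ranking step described above can be illustrated with a minimal sketch of bagging over decision trees. It is not GROBID's implementation: the two features (`in_title`, `tfidf`), the toy training values, the helper names, and the use of depth-1 trees (decision stumps) in place of full decision trees are all assumptions made to keep the example short and dependency-free.

```python
import random

def train_stump(rows, labels):
    """Fit a depth-1 decision tree: keep the (feature, threshold, polarity)
    split that misclassifies the fewest training rows."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            # polarity 1: predict "key term" when value >= t; polarity 0: the reverse.
            for polarity in (1, 0):
                errors = sum(
                    (polarity if r[f] >= t else 1 - polarity) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    _, f, t, polarity = best
    return lambda r: polarity if r[f] >= t else 1 - polarity

def train_bagged(rows, labels, n_trees=25, seed=0):
    """Bagging: train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return trees

def score(trees, row):
    """Fraction of trees voting that the candidate is a key term."""
    return sum(tree(row) for tree in trees) / len(trees)

def rank_candidates(trees, candidates):
    """Sort candidate terms by decreasing ensemble score."""
    return sorted(candidates, key=lambda term: score(trees, candidates[term]), reverse=True)

# Hypothetical toy data: [in_title (0 or 1), tfidf-like weight], label 1 = key term.
train_rows = [[1, 0.9], [1, 0.7], [0, 0.8], [0, 0.2], [0, 0.1], [1, 0.3]]
train_labels = [1, 1, 1, 0, 0, 0]

trees = train_bagged(train_rows, train_labels)
candidates = {"key term extraction": [1, 0.95], "paper": [0, 0.05]}
ranking = rank_candidates(trees, candidates)
print(ranking)
```

The averaged vote gives each candidate a score between 0 and 1, which is what makes a ranked list possible rather than a plain yes/no decision. A post-ranking step such as the one based on HAL co-usage statistics would then reorder this list and is not shown here.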

Total citations

Cited by 219

Citations per year: 2010: 1, 2011: 1, 2012: 5, 2013: 18, 2014: 12, 2015: 24, 2016: 25, 2017: 26, 2018: 24, 2019: 24, 2020: 16, 2021: 12, 2022: 14, 2023: 10, 2024: 6
