N³ - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format.

Authors

Michael Röder, Ricardo Usbeck, Sebastian Hellmann, Daniel Gerber, Andreas Both

Publication date

2014/5/26

Conference

LREC

Pages

3529-3533

Description

Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N³). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Total citations

Cited by 112

Citations per year: 2013: 1, 2014: 8, 2015: 5, 2016: 9, 2017: 12, 2018: 13, 2019: 11, 2020: 15, 2021: 12, 2022: 8, 2023: 10, 2024: 7
