Building a semantically annotated corpus of clinical texts

Authors

Angus Roberts, Robert Gaizauskas, Mark Hepple, George Demetriou, Yikun Guo, Ian Roberts, Andrea Setzer

Publication date

2009/10/1

Journal

Journal of Biomedical Informatics

Volume

Issue

Pages

950-966

Publisher

Academic Press

Description

In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high …

Total citations

Cited by 201

Citations per year: 2009: 4; 2010: 3; 2011: 15; 2012: 21; 2013: 10; 2014: 17; 2015: 18; 2016: 11; 2017: 17; 2018: 16; 2019: 12; 2020: 11; 2021: 16; 2022: 12; 2023: 16
