Authors
Richard Eckart, Elke Teich
Publication date
2007
Journal
Data Structures for Linguistic Resources and Applications. Gunter Narr, Tübingen, Germany
Description
We present an XML-based data model that is deployed in a system for querying corpora with multiple layers of linguistic annotation. The model is based on the simple but effective idea of leaving each layer of annotation intact at annotation time and relating the layers to each other only at query time. Queries select parts of the layers or of the text and then use interval operations based on stand-off anchors to relate the results to each other. The queries are performed by the XQuery engine of a native XML database, which has been extended with custom functions for interval operations and for access to the annotated text.
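The idea of relating independent annotation layers only at query time can be illustrated with a minimal sketch. It is written in Python rather than XQuery and is not the authors' implementation: the `Span` type, the `contains`, `overlaps`, and `relate` helpers, the sample sentence, and the two layers are all hypothetical, and stand-off anchors are assumed to be character offsets into the text.

```python
from typing import Callable, List, NamedTuple, Tuple


class Span(NamedTuple):
    """A stand-off annotation: offsets into the text plus a label (assumed layout)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open intervals share at least one character."""
    return a.start < b.end and b.start < a.end


def relate(left: List[Span], right: List[Span],
           op: Callable[[Span, Span], bool]) -> List[Tuple[Span, Span]]:
    """Pair up spans from two separately stored layers using an interval operation."""
    return [(l, r) for l in left for r in right if op(l, r)]


text = "The cat sat on the mat."

# Two layers annotated independently; neither refers to the other.
pos_layer = [
    Span(0, 3, "DET"), Span(4, 7, "NOUN"), Span(8, 11, "VERB"),
    Span(12, 14, "ADP"), Span(15, 18, "DET"), Span(19, 22, "NOUN"),
]
phrase_layer = [Span(0, 7, "NP"), Span(12, 22, "PP"), Span(15, 22, "NP")]

# Query: select parts of each layer, then relate the results by containment.
nouns = [s for s in pos_layer if s.label == "NOUN"]
nps = [s for s in phrase_layer if s.label == "NP"]
pairs = relate(nps, nouns, contains)

# Access the annotated text through the anchors of the matched spans.
covered = [text[n.start:n.end] for _, n in pairs]
print(covered)  # ['cat', 'mat']
```

Because the layers share only the offsets into the text, a new layer can be added without touching the existing ones; the cost of combining them is paid at query time by operations such as `contains`.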