Authors
Robert Gaizauskas, George Demetriou, Peter J. Artymiuk, Peter Willett
Publication date
2003/1
Journal
Bioinformatics
Volume
19
Issue
1
Pages
135-143
Publisher
Oxford University Press
Description
Motivation: The rapid increase in volume of protein structure literature means useful information may be hidden or lost in the published literature and the process of finding relevant material, sometimes the rate-determining factor in new research, may be arduous and slow.
Results: We describe the Protein Active Site Template Acquisition (PASTA) system, which addresses these problems by performing automatic extraction of information relating to the roles of specific amino acid residues in protein molecules from online scientific articles and abstracts. Both the terminology recognition and extraction capabilities of the system have been extensively evaluated against manually annotated data and the results compare favourably with state-of-the-art results obtained in less challenging domains. PASTA is the first information extraction (IE) system developed for the protein structure domain and one of the most …
Total citations
[Chart: citations per year, 2002 to 2023]