Authors
Nai-Wen Chang, Jitendra Jonnagaddala, Feng-Duo Wang, Hong-Jie Dai
Publication date
2017
Journal
Proceedings of the BioCreative VI Challenge and Workshop (October 18–20). DoubleTree by Hilton Hotel, Bethesda, Maryland, USA
Pages
28-31
Description
The detection of organism mentions in scientific literature helps researchers find relevant subsets of papers through species-specific queries. Furthermore, most biological articles describe pathway or regulation information in figure captions to aid the understanding of experimental results. Extracting miRNA and organism mentions from figure captions is useful for characterizing research studies. In this study, we adopted openly available organism recognition tools and our statistical principle-based miRNA recognizer to identify organism and miRNA mentions in the figure captions of an article. The miRNA recognizer is extended by generating scores for matched slots and indexes for matched terms in order to normalize recognized miRNAs to identifiers in the Rfam database. By evaluating these tools on the BioCreative VI Bio-ID dataset, we study how well existing tools recognize terms in figure captions and which challenges remain to be addressed. We believe the Bio-ID corpus provides a good starting point for evaluating the performance of miRNA normalization systems. In the future, we would like to undertake a more comprehensive evaluation of existing tools for organism/species extraction and to enhance the consistency and comprehensiveness of the miRNA annotations in the dataset.