Authors
Roman Klinger, Corinna Kolářik, Juliane Fluck, Martin Hofmann-Apitius, Christoph M Friedrich
Publication date
2008/7/1
Journal
Bioinformatics
Volume
24
Issue
13
Pages
i268-i276
Publisher
Oxford University Press
Description
Motivation: Chemical compounds like small signal molecules or other biologically active chemical substances are an important entity class in life science publications and patents. Several representations and nomenclatures for chemicals exist, such as SMILES, InChI, IUPAC or trivial names. Only SMILES and InChI names allow a direct structure search, but in biomedical texts trivial names and IUPAC-like names are used more frequently. While trivial names can be found with a dictionary-based approach and thereby mapped to their corresponding structures, it is not possible to enumerate all IUPAC names. In this work, we present a new machine learning approach based on conditional random fields (CRF) to find mentions of IUPAC and IUPAC-like names in scientific text, together with its evaluation and the conversion rate achieved with available name-to-structure tools.
Results: We present an IUPAC name …
Total citations
[Chart: citations per year, 2008-2024]
Scholar articles
R Klinger, C Kolářik, J Fluck, M Hofmann-Apitius… - Bioinformatics, 2008