Authors
Anders Gorm Pedersen, Henrik Nielsen
Publication date
1997/6/21
Journal
ISMB
Volume
5
Pages
226-233
Description
Translation in eukaryotes does not always start at the first AUG in an mRNA, implying that context information also plays a role. This makes prediction of translation initiation sites a non-trivial task, especially when analysing EST and genome data where the entire mature mRNA sequence is not known. In this paper, we employ artificial neural networks to predict which AUG triplet in an mRNA sequence is the start codon. The trained networks correctly classified 88% of Arabidopsis and 85% of vertebrate AUG triplets. We find that our trained neural networks use a combination of local start codon context and global sequence information. Furthermore, analysis of false predictions shows that AUGs in frame with the actual start codon are more frequently selected than out-of-frame AUGs, suggesting that our networks use reading frame detection. A number of conflicts between neural network predictions and database annotations are analysed in detail, leading to identification of possible database errors.
Total citations
[Chart: citations per year, 1996 to 2024; per-year counts not recoverable from the extracted text]