Authors
Asif U Tamuri, Nick Goldman, Mario dos Reis
Publication date
2014/2/14
Journal
Genetics
Volume
197
Issue
5
Pages
257-271
Publisher
Genetics Society of America
Description
We develop a maximum penalized-likelihood (MPL) method to estimate the fitnesses of amino acids and the distribution of selection coefficients (S = 2Ns) in protein-coding genes from phylogenetic data. This improves on a previous maximum-likelihood method. Various penalty functions are used to penalize extreme estimates of the fitnesses, thus correcting overfitting by the previous method. Using a combination of computer simulation and real data analysis, we evaluate the effect of the various penalties on the estimation of the fitnesses and the distribution of S. We show the new method regularizes the estimates of the fitnesses for small, relatively uninformative data sets, but it can still recover the large proportion of deleterious mutations when present in simulated data. Computer simulations indicate that as the number of taxa in the phylogeny or the level of sequence divergence increases, the distribution of …
Total citations
201420152016201720182019202020212022202326371099644