Authors
Russell B Davidson, Mark Coletti, Mu Gao, Jerry M Parks, Ada Sedova
Publication date
2022/4/11
Publisher
Oak Ridge National Lab.(ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Description
A total of 3,446 proteins are predicted for Pseudodesulfovibrio mercurii. Each protein has five predicted structures from an AlphaFold run, as well as structural alignment results obtained with the TM-score-based structural alignment method in the APoc program. Specifically, AlphaFold outputs the atoms and coordinates of each protein model in human-readable PDB files and quantitative prediction metrics in Python pickle files. The five models are ranked by predicted TM-score (pTMS), a quantitative confidence metric output by AlphaFold that reports on protein model quality. The top-ranked model has undergone an energy minimization calculation to relax the structure and remove any potential clashes in the atomic coordinates. Structural alignment results are stored in two files for each protein; the top-ranked model (as discussed above) is used for all alignment analyses. Both files are gzip-compressed and are human-readable once unpacked. The first file holds the TMalign score results: the quantitative metrics for the top alignments between the predicted structure and experimental structures from the PDB70, a curated non-redundant database of about 80,000 experimental structures developed by the Söding lab. Each data point in this file is directly associated with one experimental structure; the PDB ID and brief metadata about the protein, taken from the PDB70 file, are reported alongside the quantitative metrics. The second results file contains the raw results associated with each alignment reported in the score results file. Specifically, the translation and rotation arrays for each alignment are provided so that the structural …
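The two operations the description mentions, ranking models by pTMS and using a rotation and translation to superimpose a structure, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names are hypothetical, the score values are made up, and the convention x' = R x + t (rotate first, then translate) is assumed rather than taken from the APoc output files, whose exact layout is not given in this record.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rank_models_by_ptm(ptm_scores: Dict[str, float]) -> List[str]:
    """Return model names ordered from highest to lowest pTM score.

    `ptm_scores` is a hypothetical mapping of model name to score; in practice
    the values would be read from each model's pickle file.
    """
    return sorted(ptm_scores, key=ptm_scores.get, reverse=True)


def apply_rigid_transform(
    coords: Sequence[Vec3],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Vec3]:
    """Apply x' = R x + t to every coordinate (assumed convention).

    `rotation` is a 3x3 matrix given as rows; `translation` is a 3-vector.
    """
    transformed: List[Vec3] = []
    for x, y, z in coords:
        new_point = tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        )
        transformed.append(new_point)
    return transformed


if __name__ == "__main__":
    # Illustrative scores only; not values from the dataset.
    scores = {"model_1": 0.71, "model_2": 0.85, "model_3": 0.64}
    print(rank_models_by_ptm(scores))

    # A 90-degree rotation about the z axis followed by a shift.
    rot_z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    shift = [1.0, 2.0, 3.0]
    print(apply_rigid_transform([(1.0, 0.0, 0.0)], rot_z90, shift))
```

If the alignment files store the transform under a different convention (for example, translating before rotating, or storing the matrix as columns), the arithmetic in `apply_rigid_transform` would need to be adjusted accordingly.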