Authors
Xi Han, Liheng Zhang, Kang Zhou, Xiaonan Wang
Publication date
2019/12/5
Journal
Computers & Chemical Engineering
Volume
131
Pages
106533
Publisher
Pergamon
Description
Protein solubility plays a critical role in improving the production yield of recombinant proteins in biocatalysis applications. To some extent, protein solubility can represent the function and activity of biocatalysts, which are mainly composed of recombinant proteins. In the literature, many machine learning models have been investigated to predict protein solubility from protein sequence, but the parameters of those models were underdetermined because protein solubility data were insufficient. Here we propose a deep neural network (DNN) as a more accurate regression predictive model. Moreover, to tackle the insufficient data problem, a novel data augmentation algorithm, Protein Solubility Generative Adversarial Nets (ProGAN), is proposed for improving the prediction of protein solubility. After adding mimic data produced by ProGAN, the prediction performance measured by R² was improved compared with that without …
Total citations
Per-year citation chart, 2020 to 2024 (bar values extracted as an unsplit digit run: 2810138)