Bioinformatics-aided Protein Sequence Analysis and Engineering

Wei Zhang; Tianwen Wang

doi:10.2174/1389203724666230509124300

Abstract

Most of the currently available knowledge about protein structure and function has been obtained from laboratory experiments. As a complement to this classical approach to knowledge discovery, bioinformatics-assisted sequence analysis, which relies primarily on the manipulation of biological data, is becoming indispensable for the discovery of new knowledge, especially now that large numbers of protein-encoding sequences can be easily identified from the annotation of high-throughput genomic data. Here, we review the advances in bioinformatics-assisted protein sequence analysis to highlight how such analysis aids in understanding protein structure and function. We first discuss analyses that take individual protein sequences as input, from which some basic parameters of proteins (e.g., amino acid composition, molecular weight [MW] and post-translational modifications [PTMs]) can be predicted. In addition to these basic parameters, which can be predicted directly from a single protein sequence, many predictions are based on principles drawn from knowledge of many well-studied proteins and take multiple sequence comparisons as input. This category includes the identification of conserved sites by comparing multiple homologous sequences, the prediction of the folding, structure or function of uncharacterized proteins, the construction of phylogenies of related sequences, the analysis of the contribution of conserved related sites to protein function by statistical coupling analysis (SCA) or direct coupling analysis (DCA), the elucidation of the significance of codon usage, and the extraction of functional units from protein sequences and coding spaces. We then discuss the revolutionary invention of the "QTY code", which can be applied to convert membrane proteins into water-soluble proteins at the cost of only marginal structural and functional changes. As in other scientific fields, machine learning has profoundly impacted protein sequence analysis.
In summary, we highlight the relevance of bioinformatics-assisted analysis to protein research as a valuable guide for laboratory experiments.
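
The single-sequence parameters mentioned in the abstract, such as amino acid composition and MW, can be computed directly from the sequence string. The following is a minimal sketch, not the tool set reviewed in the article: the function names, the example peptide, and the use of average residue masses for the 20 canonical amino acids are assumptions made for illustration, and PTMs are not modeled.

```python
from collections import Counter

# Average residue masses in daltons for the 20 canonical amino acids
# (mass of the free amino acid minus one water). Illustrative constants.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def amino_acid_composition(sequence):
    """Return the fraction of each residue type present in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in sorted(counts.items())}


def molecular_weight(sequence):
    """Estimate the average molecular weight (Da) of an unmodified chain."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"non-canonical residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    peptide = "MKTAYIAKQR"  # hypothetical example sequence
    print(amino_acid_composition(peptide))
    print(round(molecular_weight(peptide), 2))
```

The estimate covers only the unmodified polypeptide chain; predicting PTM sites or properties that depend on homologous sequences requires the dedicated methods the review surveys.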

Keywords: Protein sequence, structure and function, bioinformatics, evolution, knowledge discovery, QTY code.





DOI: https://dx.doi.org/10.2174/1389203724666230509124300
Print ISSN: 1389-2037
Online ISSN: 1875-5550
Publisher: Bentham Science Publisher
Journal: Current Protein & Peptide Science
