Research Article

Prediction of LncRNA-protein Interactions Using Auto-Encoder, SE-ResNet Models and Transfer Learning

Author(s): Jiang Huiwen and Song Kai*

Volume 13, Issue 2, 2024

Published on: 08 April, 2024

Page: [155 - 165] Pages: 11

DOI: 10.2174/0122115366288068240322064431

Abstract

Background: Long non-coding RNAs (lncRNAs) play crucial roles in various biological processes, and mutations or imbalances of lncRNAs can lead to several diseases, including cancer, Prader-Willi syndrome, autism, Alzheimer's disease, cartilage-hair hypoplasia, and hearing loss. Understanding lncRNA-protein interactions (LPIs) is vital for elucidating basic cellular processes, human diseases, viral replication, transcription, and plant pathogen resistance. Although several computational methods for LPI prediction have been developed, the task remains challenging, and current LPI research focuses on the selection of input variables and the design of the deep learning architecture.

Methods: We propose a deep learning framework called AR-LPI, which extracts sequence and secondary structure features from both proteins and lncRNAs. The framework uses an auto-encoder for feature extraction and SE-ResNet for prediction. In addition, we apply transfer learning to the SE-ResNet deep neural network so that it can make predictions on small-sample datasets.
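To make the SE-ResNet idea concrete, the following is a minimal sketch of a single squeeze-and-excitation step with an identity shortcut, written in plain Python. It is not the AR-LPI implementation: the function name `se_residual_block`, the weight matrices `w1` and `w2`, the reduction ratio `r`, and the 1-D feature-map layout are all assumptions made for illustration, and the convolution layers that a real residual branch would contain are omitted.

```python
import math
import random


def se_residual_block(x, w1, w2):
    """Squeeze-and-excitation gating followed by an identity shortcut.

    x  : list of C channels, each a list of L floats (hypothetical 1-D feature map)
    w1 : (C // r) x C weights of the bottleneck layer (hypothetical)
    w2 : C x (C // r) weights that restore the channel count (hypothetical)
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C channels; sigmoid gives gates in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale each channel by its gate, then add the identity shortcut.
    return [[v * g + v for v in ch] for ch, g in zip(x, s)]


# Illustrative usage with random inputs (values are arbitrary, not from the paper).
random.seed(0)
C, L, r = 4, 6, 2
x = [[random.uniform(-1, 1) for _ in range(L)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y = se_residual_block(x, w1, w2)
```

The output keeps the shape of the input; each channel is rescaled by a learned importance weight before the shortcut is added, which is the channel-attention behaviour that distinguishes an SE-ResNet block from a plain residual block.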

Results: Through comprehensive experimental comparison, we demonstrate that the AR-LPI architecture outperforms other LPI prediction tools. Specifically, its accuracy increases by 2.86% to 94.52%, and its F-value increases by 2.71% to 94.73%.
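For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and an F-value can be computed from confusion-matrix counts. It assumes that "F-value" refers to the F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `accuracy_and_f1` and the example counts are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Return (accuracy, F1) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
    # of precision and recall (assumed meaning of "F-value").
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts for illustration only.
acc, f1 = accuracy_and_f1(tp=90, fp=10, tn=85, fn=15)
print(f"accuracy={acc:.4f}, F1={f1:.4f}")
```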

Conclusion: Our experimental results show that the overall performance of AR-LPI is better than that of other LPI prediction tools.

Keywords: LncRNA-protein interactions, auto-encoder, SE-ResNet module, sequence feature, secondary structure characteristics, transfer learning, feature extraction, feature selection.


© 2024 Bentham Science Publishers