Title: Sia-m7G: Predicting m7G Sites through the Siamese Neural Network with an Attention Mechanism
Volume: 19
Issue: 10
Author(s): Jia Zheng*, Yetong Zhou
Affiliation:
- School of Science, Dalian Maritime University, Dalian, 116026, China
Keywords:
N7-methylguanosine, predictor, deep learning, Siamese neural network, protein synthesis, genomic sequences.
Abstract:
Background: The chemical modification of RNA plays a crucial role in many biological
processes. N7-methylguanosine (m7G), one of the most important epigenetic modifications,
is involved in gene expression, processing metabolism, and protein synthesis. Detecting
the exact location of m7G sites in the transcriptome is key to understanding how they act
in gene expression. On the basis of experimentally validated data, several machine learning and
deep learning tools have been designed to identify internal m7G sites and have shown advantages
over traditional experimental methods in terms of speed, cost-effectiveness and robustness.
Aims: In this study, we aim to develop a computational model to help predict the exact location of
m7G sites in humans.
Objective: To design encoding methods, both simple and advanced, together with deep learning
networks that predict m7G sites accurately and efficiently.
Methods: Three feature extraction methods and six classification algorithms were tested to identify
m7G sites. Our final model, named Sia-m7G, adopts one-hot encoding and a carefully designed Siamese
neural network with an attention mechanism. In addition, multiple 10-fold cross-validation tests were
conducted to evaluate our predictor.
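As an illustration of the one-hot encoding named in Methods, the following is a minimal sketch only, not the authors' implementation. The helper name `one_hot_encode`, the nucleotide order A, C, G, U, the mapping of T to U, and the all-zero row for unknown bases are assumptions made for this example.

```python
# Minimal sketch of one-hot encoding for an RNA fragment.
# Assumptions (not from the paper): column order A, C, G, U;
# T is treated as U; any other symbol becomes an all-zero row.
NUCLEOTIDES = "ACGU"


def one_hot_encode(sequence):
    """Return one 4-element row per nucleotide in the sequence."""
    rows = []
    for base in sequence.upper().replace("T", "U"):
        row = [0, 0, 0, 0]
        if base in NUCLEOTIDES:
            row[NUCLEOTIDES.index(base)] = 1
        rows.append(row)
    return rows


encoded = one_hot_encode("ACGUN")
```

Each fragment of length L becomes an L x 4 matrix, which is the kind of input a convolutional or Siamese network would typically consume.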
Results: Sia-m7G achieved the highest sensitivity, specificity and accuracy on 10-fold cross-validation
tests compared with the other six m7G predictors. Nucleotide preference and model visualization
analyses were conducted to strengthen the interpretability of Sia-m7G and provide a further
understanding of m7G site fragments in genomic sequences.
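For reference, the three metrics reported in Results follow their standard definitions from confusion-matrix counts. The sketch below uses a hypothetical helper, `classification_metrics`, and made-up counts chosen only to illustrate the arithmetic; they are not results from the paper.

```python
# Standard definitions of sensitivity, specificity and accuracy.
# The function name and the example counts are illustrative assumptions.
def classification_metrics(tp, tn, fp, fn):
    """Compute metrics from true/false positive and negative counts."""
    positives = tp + fn
    negatives = tn + fp
    total = positives + negatives
    return {
        "sensitivity": tp / positives if positives else 0.0,  # true positive rate
        "specificity": tn / negatives if negatives else 0.0,  # true negative rate
        "accuracy": (tp + tn) / total if total else 0.0,
    }


metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

In a 10-fold cross-validation setting, these values would usually be computed on each held-out fold and then averaged.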
Conclusion: Sia-m7G has significant advantages over other classifiers and predictors, which supports
the superiority of the Siamese neural network algorithm in identifying m7G sites.