Title: A Three-gene-based Type 1 Diabetes Diagnostic Signature
Volume: 27
Issue: 24
Author(s): Rongrong Wang, Yanan Zhou, Yan Zhang, Shaoqing Li, Runzhou Pan and Yongcai Zhao*
Affiliation:
- Endocrine Diabetes Department, Cangzhou Central Hospital, Cangzhou, Hebei, 061000, China
Keywords:
Type 1 diabetes, bioinformatic analysis, logistic regression model, diagnosis, GO, KEGG.
Abstract:
Background: Type 1 diabetes is a chronic autoimmune disease characterized by insulin deficiency
caused by pancreatic β-cell loss, which leads to hyperglycaemia.
Objective: Currently, there is no cure for this disease, and patients require lifelong
insulin injections. Identifying potential diagnostic biomarkers by analysing large-scale data with bioinformatics
tools and machine learning is therefore important for type 1 diabetes.
Methods: We collected two mRNA expression datasets of peripheral blood samples from patients with type 1 diabetes from GEO,
screened differentially expressed genes (DEGs) using R, and performed GO and KEGG pathway enrichment
analysis on the DEGs. The STRING database and Cytoscape were then used to build a protein-protein interaction (PPI) network and predict
hub genes. We constructed a logistic regression model from the hub genes to classify sample type.
Results: Bioinformatic analysis of the GEO datasets revealed 92 DEGs in GSE50098 and 75 DEGs in GSE9006,
with 10 DEGs shared between the two datasets. The PPI network of these 10 DEGs identified 7 hub genes, namely
EGR1, LTF, CXCL1, TNFAIP6, PGLYRP1, CHI3L1 and CAMP. We built a logistic regression model based
on these hub genes and then reduced it to a three-gene model (LTF, CAMP and PGLYRP1).
The area under the curve (AUC) was 0.8452 for the training set GSE50098 and 0.8083 for the testing set GSE9006,
indicating that the model discriminates sample types well in both datasets.
Conclusion: The integrated bioinformatic analysis of gene expression in type 1 diabetes and the logistic regression
model built in this study may provide a promising diagnostic approach for type 1 diabetes.
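
As an illustration only, the kind of three-feature logistic regression model and AUC evaluation described in the abstract can be sketched in Python. The data below are synthetic and are not taken from GSE50098 or GSE9006. The three features merely mirror the three-gene signature. All function names, the learning rate, the number of epochs, and the gradient-descent fitting procedure are assumptions made for this sketch and do not reproduce the authors' R-based workflow.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    # Predicted probability of the positive class for each sample.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n_per_class, rng):
    # Hypothetical data: three standardized "expression" features per sample,
    # with the positive class shifted by one standard deviation.
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * 1.0, 1.0) for _ in range(3)])
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(60, rng)  # stands in for a training set
X_test, y_test = make_synthetic(40, rng)    # stands in for a testing set

w, b = fit_logistic(X_train, y_train)
train_auc = auc(y_train, predict_proba(X_train, w, b))
test_auc = auc(y_test, predict_proba(X_test, w, b))
print(f"train AUC = {train_auc:.4f}, test AUC = {test_auc:.4f}")
```

Evaluating the AUC on a dataset that was not used for fitting, as the abstract does with GSE9006, gives a less optimistic estimate of performance than the training AUC alone.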