6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset Collection
2.2. Architecture of 6mAPred-MSFF
2.2.1. Sequence Embedding Module
2.2.2. Feature Extraction Module
2.2.3. Feature Fusion Module
2.2.4. Prediction Module
2.2.5. Evaluation Metrics
3. Results and Discussion
3.1. Performance Comparison on Seven Benchmark Datasets
3.2. Validation on Other Species
3.3. Performance Impact by Different Encoding Schemes and Feature Descriptors
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Dataset | Positives | Negatives | Total |
---|---|---|---|
6mA-rice-Lv | 154,000 | 154,000 | 308,000 |
6mA-rice-Chen | 880 | 880 | 1,760 |
A. thaliana | 15,937 | 15,937 | 31,874 |
R. chinensis | 11,815 | 11,815 | 23,630 |
F. vesca | 22,700 | 22,700 | 45,400 |
H. sapiens | 18,335 | 18,335 | 36,670 |
D. melanogaster | 11,191 | 11,191 | 22,382 |

Benchmark Dataset | Method | SN (%) | SP (%) | ACC (%) | MCC | AUC |
---|---|---|---|---|---|---|
6mA-rice-Lv | 6mAPred-MSFF | 97.88 | 94.64 | 96.26 | 0.926 | 0.989 |
6mA-rice-Lv | Deep6mA | 94.73 | 93.72 | 94.23 | 0.885 | 0.981 |
6mA-rice-Lv | SNNRice6mA-large | 94.97 | 92.22 | 93.59 | 0.872 | 0.978 |
6mA-rice-Lv | MM-6mAPred | 93.27 | 88.84 | 91.05 | 0.822 | 0.955 |
6mA-rice-Chen | 6mAPred-MSFF | 96.59 | 89.43 | 93.01 | 0.862 | 0.976 |
6mA-rice-Chen | Deep6mA | 86.70 | 93.75 | 90.22 | 0.807 | 0.958 |
6mA-rice-Chen | SNNRice6mA-large | 87.61 | 86.70 | 87.16 | 0.743 | 0.942 |
6mA-rice-Chen | MM-6mAPred | 87.50 | 88.86 | 88.18 | 0.764 | 0.943 |
A. thaliana | 6mAPred-MSFF | 89.92 | 87.85 | 88.88 | 0.778 | 0.954 |
A. thaliana | Deep6mA | 84.72 | 89.52 | 87.12 | 0.743 | 0.942 |
A. thaliana | SNNRice6mA-large | 85.28 | 88.15 | 86.71 | 0.735 | 0.939 |
A. thaliana | MM-6mAPred | 82.62 | 88.82 | 83.82 | 0.664 | 0.904 |
R. chinensis | 6mAPred-MSFF | 95.61 | 96.56 | 96.09 | 0.922 | 0.988 |
R. chinensis | Deep6mA | 94.97 | 96.66 | 95.81 | 0.917 | 0.987 |
R. chinensis | SNNRice6mA-large | 91.96 | 94.55 | 93.25 | 0.865 | 0.977 |
R. chinensis | MM-6mAPred | 92.27 | 90.72 | 91.50 | 0.830 | 0.959 |
F. vesca | 6mAPred-MSFF | 98.79 | 98.44 | 98.62 | 0.972 | 0.998 |
F. vesca | Deep6mA | 98.35 | 97.10 | 98.48 | 0.970 | 0.998 |
F. vesca | SNNRice6mA-large | 97.72 | 97.65 | 97.72 | 0.955 | 0.996 |
F. vesca | MM-6mAPred | 95.13 | 94.72 | 94.93 | 0.899 | 0.980 |
H. sapiens | 6mAPred-MSFF | 94.65 | 94.91 | 94.78 | 0.896 | 0.985 |
H. sapiens | Deep6mA | 90.16 | 95.18 | 92.67 | 0.854 | 0.985 |
H. sapiens | SNNRice6mA-large | 88.62 | 93.56 | 91.09 | 0.823 | 0.972 |
H. sapiens | MM-6mAPred | 88.08 | 87.22 | 87.65 | 0.753 | 0.937 |
D. melanogaster | 6mAPred-MSFF | 97.08 | 96.72 | 96.90 | 0.938 | 0.990 |
D. melanogaster | Deep6mA | 95.12 | 95.04 | 95.08 | 0.902 | 0.984 |
D. melanogaster | SNNRice6mA-large | 94.64 | 93.36 | 94.00 | 0.880 | 0.982 |
D. melanogaster | MM-6mAPred | 90.65 | 89.94 | 90.30 | 0.806 | 0.955 |
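
The SN, SP, ACC, and MCC columns can be reproduced from confusion-matrix counts. The sketch below uses the standard textbook definitions, which are assumed (not confirmed by the text here) to match those of Section 2.2.5; the function name and the example counts are hypothetical.

```python
import math


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute SN, SP, ACC (in percent) and MCC from confusion-matrix counts.

    Assumes the standard definitions; the paper's own formulas are not
    reproduced in this section.
    """
    sn = tp / (tp + fn)                      # sensitivity: recall on positives
    sp = tn / (tn + fp)                      # specificity: recall on negatives
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; return 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": 100 * sn, "SP": 100 * sp, "ACC": 100 * acc, "MCC": mcc}


# Hypothetical counts, for illustration only (not taken from the paper).
print(classification_metrics(tp=90, fn=10, tn=80, fp=20))
```

With equal numbers of positives and negatives, as in the dataset table, ACC reduces to (SN + SP) / 2; for instance, (97.88 + 94.64) / 2 = 96.26 for 6mAPred-MSFF on 6mA-rice-Lv.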

Species | Method | SN (%) | SP (%) | ACC (%) | MCC | AUC |
---|---|---|---|---|---|---|
R. chinensis | 6mAPred-MSFF | 90.66 | 94.26 | 92.46 | 0.850 | 0.966 |
R. chinensis | Deep6mA | 88.87 | 95.29 | 92.07 | 0.843 | 0.966 |
R. chinensis | SNNRice6mA-large | 89.79 | 94.13 | 91.96 | 0.840 | 0.964 |
R. chinensis | MM-6mAPred | 88.51 | 91.35 | 89.93 | 0.799 | 0.949 |
F. vesca | 6mAPred-MSFF | 93.95 | 95.15 | 94.55 | 0.891 | 0.982 |
F. vesca | Deep6mA | 92.52 | 96.22 | 94.37 | 0.888 | 0.982 |
F. vesca | SNNRice6mA-large | 93.26 | 95.15 | 94.20 | 0.884 | 0.980 |
F. vesca | MM-6mAPred | 90.33 | 93.48 | 91.91 | 0.839 | 0.965 |
H. sapiens | 6mAPred-MSFF | 72.51 | 93.33 | 82.92 | 0.673 | 0.884 |
H. sapiens | Deep6mA | 68.82 | 94.44 | 81.63 | 0.655 | 0.881 |
H. sapiens | SNNRice6mA-large | 70.93 | 93.22 | 82.07 | 0.658 | 0.894 |
H. sapiens | MM-6mAPred | 71.85 | 88.67 | 80.26 | 0.614 | 0.873 |
D. melanogaster | 6mAPred-MSFF | 74.30 | 92.62 | 83.46 | 0.680 | 0.903 |
D. melanogaster | Deep6mA | 68.81 | 93.78 | 81.30 | 0.646 | 0.901 |
D. melanogaster | SNNRice6mA-large | 70.73 | 92.27 | 81.50 | 0.645 | 0.906 |
D. melanogaster | MM-6mAPred | 64.96 | 88.84 | 76.90 | 0.554 | 0.868 |
A. thaliana | 6mAPred-MSFF | 58.66 | 94.28 | 76.47 | 0.567 | 0.837 |
A. thaliana | Deep6mA | 54.56 | 95.26 | 74.91 | 0.545 | 0.834 |
A. thaliana | SNNRice6mA-large | 57.92 | 94.11 | 76.01 | 0.558 | 0.853 |
A. thaliana | MM-6mAPred | 57.14 | 91.44 | 74.30 | 0.517 | 0.845 |

Method | SN (%) | SP (%) | ACC (%) | MCC | AUC |
---|---|---|---|---|---|
6mAPred-MSFF | 97.88 | 94.64 | 96.26 | 0.926 | 0.989 |
1-gram | 97.15 | 93.68 | 95.42 | 0.909 | 0.987 |
2-grams | 95.63 | 93.56 | 94.59 | 0.892 | 0.983 |
3-grams | 96.10 | 93.60 | 94.85 | 0.897 | 0.984 |
1-gram and 2-grams | 97.05 | 94.67 | 95.86 | 0.917 | 0.988 |
1-gram and 3-grams | 95.97 | 93.49 | 94.73 | 0.895 | 0.983 |
2-grams and 3-grams | 95.77 | 94.51 | 95.14 | 0.903 | 0.985 |
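
The 1-gram, 2-gram, and 3-gram rows refer to how a DNA sequence is split into tokens before embedding. A minimal sketch of one way to do this is shown below, assuming overlapping windows with stride 1 and a vocabulary enumerated in lexicographic order over A, C, G, T; both choices and all function names are illustrative assumptions, not details confirmed by this section.

```python
from itertools import product


def ngram_tokens(seq: str, n: int) -> list[str]:
    """Split a sequence into overlapping n-grams (stride 1, an assumption)."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def ngram_vocab(n: int, alphabet: str = "ACGT") -> dict[str, int]:
    """Map every possible n-gram over the alphabet to an integer index."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def encode(seq: str, n: int) -> list[int]:
    """Turn a sequence into the integer ids an embedding layer would consume."""
    vocab = ngram_vocab(n)
    return [vocab[token] for token in ngram_tokens(seq, n)]


# A toy sequence, for illustration only.
for n in (1, 2, 3):
    print(n, ngram_tokens("ACGTA", n), encode("ACGTA", n))
```

A sequence of length L yields L - n + 1 tokens at scale n, and the vocabulary grows as 4^n (4, 16, and 64 entries for n = 1, 2, 3).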

Feature | Method | SN (%) | SP (%) | ACC (%) | MCC | AUC |
---|---|---|---|---|---|---|
Kmer (1mer + 2mer + 3mer) | 6mAPred-MSFF | 67.17 | 64.94 | 66.08 | 0.321 | 0.719 |
Kmer (1mer + 2mer + 3mer) | Deep6mA | 69.00 | 60.35 | 64.67 | 0.294 | 0.700 |
Kmer (1mer + 2mer + 3mer) | RF | 66.96 | 62.33 | 64.65 | 0.293 | 0.697 |
Kmer (1mer + 2mer + 3mer) | ANN | 66.87 | 62.35 | 64.61 | 0.293 | 0.697 |
NCP | 6mAPred-MSFF | 96.05 | 94.56 | 95.31 | 0.906 | 0.986 |
NCP | Deep6mA | 95.82 | 92.95 | 94.38 | 0.888 | 0.983 |
NCP | RF | 92.90 | 91.51 | 92.20 | 0.844 | 0.966 |
NCP | ANN | 92.92 | 91.53 | 92.22 | 0.845 | 0.966 |
EIIP | 6mAPred-MSFF | 95.35 | 91.10 | 93.22 | 0.865 | 0.975 |
EIIP | Deep6mA | 94.20 | 88.77 | 91.48 | 0.831 | 0.966 |
EIIP | RF | 92.56 | 91.21 | 91.89 | 0.838 | 0.964 |
EIIP | ANN | 92.61 | 91.15 | 91.88 | 0.838 | 0.964 |
ENAC | 6mAPred-MSFF | 95.34 | 91.89 | 93.62 | 0.872 | 0.979 |
ENAC | Deep6mA | 95.35 | 90.29 | 92.82 | 0.858 | 0.972 |
ENAC | RF | 90.82 | 91.16 | 90.99 | 0.820 | 0.959 |
ENAC | ANN | 90.80 | 91.17 | 90.99 | 0.820 | 0.959 |
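
To make the feature descriptors in this comparison concrete, the sketch below implements three of them in their commonly used form: Kmer as normalized k-mer frequencies, NCP as a three-bit chemical-property code per nucleotide, and EIIP as one electron-ion interaction pseudopotential value per nucleotide. The NCP and EIIP constants are the values widely used in the literature; the function names are hypothetical, and the exact settings used in the experiments above are not stated in this section.

```python
from collections import Counter
from itertools import product

# Commonly used NCP code: (ring structure, functional group, hydrogen bonding).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}
# Commonly used EIIP values for DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def kmer_frequencies(seq: str, k: int) -> list[float]:
    """Normalized frequency of every possible k-mer; position is discarded."""
    total = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(total))
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def ncp_encode(seq: str) -> list[int]:
    """Position-wise NCP encoding: three bits per nucleotide."""
    return [bit for base in seq for bit in NCP[base]]


def eiip_encode(seq: str) -> list[float]:
    """Position-wise EIIP encoding: one value per nucleotide."""
    return [EIIP[base] for base in seq]


# A toy sequence, for illustration only.
print(kmer_frequencies("AACG", 1))
print(ncp_encode("AC"))
print(eiip_encode("GT"))
```

Kmer frequency vectors discard where each k-mer occurs relative to the central adenine, whereas NCP and EIIP keep one entry per position; this may be one reason the Kmer rows score far lower, though the table alone does not establish the cause.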
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zeng, R.; Liao, M. 6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism. Appl. Sci. 2021, 11, 7731. https://doi.org/10.3390/app11167731