New Multi-View Feature Learning Method for Accurate Antifungal Peptide Detection
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset
2.2. Classifiers
2.2.1. Support Vector Machine (SVM)
2.2.2. Logistic Regression (LR)
2.2.3. Naive Bayes (NB)
2.2.4. AdaBoost
2.2.5. Random Forest (RF)
2.2.6. Stochastic Gradient Descent (SGD)
2.2.7. Decision Tree (DT)
2.3. Feature Extraction
2.3.1. Amino Acid Composition (AAC)
2.3.2. Composition of Tripeptide (CTDC, CTDT, CTDD)
2.3.3. Dipeptide Composition (DPC)
2.3.4. Grouped Amino Acid Composition (GAAC)
2.3.5. Global Descriptors of Protein Composition (GDPC)
2.3.6. Grouped Tripeptide Composition (GTPC)
2.3.7. Tripeptide Position-Specific Composition (TPC)
2.4. Feature Selection
2.5. Performance Evaluation
2.6. Evaluation Metrics
3. Results and Discussion
3.1. Performance of the Model for Different Classifiers
3.2. Results Achieved on the Selected Feature Set
3.3. Comparison of the Proposed Model with Existing Models
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Bongomin, F.; Gago, S.; Oladele, R.O.; Denning, D.W. Global and Multi-National Prevalence of Fungal Diseases—Estimate Precision. J. Fungi 2017, 3, 57.
- Richardson, M.D. Changing patterns and trends in systemic fungal infections. J. Antimicrob. Chemother. 2005, 56, i5–i11.
- Miceli, M.H.; Diaz, J.A.; Lee, S.A. Emerging opportunistic yeast infections. Lancet Infect. Dis. 2011, 11, 142–151.
- Brown, G.D.; Denning, D.W.; Gow, N.A.R.; Levitz, S.M.; Netea, M.G.; White, T.C. Hidden Killers: Human Fungal Infections. Sci. Transl. Med. 2012, 4, 165rv13.
- Perfect, J.R. The antifungal pipeline: A reality check. Nat. Rev. Drug Discov. 2017, 16, 603–616.
- Butts, A.; Krysan, D.J. Antifungal Drug Discovery: Something Old and Something New. PLOS Pathog. 2012, 8, e1002870.
- Dhama, K.; Chakrabort, S.; Verma, A.K.; Tiwari, R.; Barathidas, R.; Kumar, A.; Singh, S.D. Fungal/mycotic diseases of poultry-diagnosis, treatment and control: A review. Pak. J. Biol. Sci. 2013, 16, 1626–1640.
- Lestrade, P.P.; Bentvelsen, R.G.; Schauwvlieghe, A.F.; Schalekamp, S.; van der Velden, W.J.; Kuiper, E.J.; van Paassen, J.; van der Hoven, B.; van der Lee, H.A.; Melchers, W.J.; et al. Voriconazole resistance and mortality in invasive aspergillosis: A multi-center retrospective cohort study. Clin. Infect. Dis. 2019, 68, 1463–1471.
- Fang, Y.; Xu, F.; Wei, L.; Jiang, Y.; Chen, J.; Wei, L.; Wei, D.-Q. AFP-MFL: Accurate identification of antifungal peptides using multi-view feature learning. Brief. Bioinform. 2023, 24, bbac606.
- Agrawal, P.; Bhalla, S.; Chaudhary, K.; Kumar, R.; Sharma, M.; Raghava, G.P. In silico approach for prediction of antifungal peptides. Front. Microbiol. 2018, 9, 323.
- Fisher, M.C.; Hawkins, N.J.; Sanglard, D.; Gurr, S.J. Worldwide emergence of resistance to antifungal drugs challenges human health and food security. Science 2018, 360, 739–742.
- Ahmad, A.; Akbar, S.; Khan, S.; Hayat, M.; Ali, F.; Ahmed, A.; Tahir, M. Deep-AntiFP: Prediction of antifungal peptides using distanct multi-informative features incorporating with deep neural networks. Chemom. Intell. Lab. Syst. 2021, 208, 104214.
- Akbar, S.; Mohamed, H.G.; Ali, H.; Saeed, A.; Khan, A.A.; Gul, S.; Ahmad, A.; Ali, F.; Ghadi, Y.Y.; Assam, M. Identifying Neuropeptides via Evolutionary and Sequential Based Multi-Perspective Descriptors by Incorporation With Ensemble Classification Strategy. IEEE Access 2023, 11, 49024–49034.
- Yao, L.; Zhang, Y.; Li, W.; Chung, C.; Guan, J.; Zhang, W.; Chiang, Y.; Lee, T. DeepAFP: An effective computational framework for identifying antifungal peptides based on deep learning. Protein Sci. 2023, 32, e4758.
- Wang, K.; Dang, W.; Xie, J.; Zhu, R.; Sun, M.; Jia, F.; Zhao, Y.; An, X.; Qiu, S.; Li, X.; et al. Antimicrobial peptide protonectin disturbs the membrane integrity and induces ROS production in yeast cells. Biochim. Biophys. Acta (BBA)-Biomembr. 2015, 1848, 2365–2373.
- Landon, C.; Meudal, H.; Boulanger, N.; Bulet, P.; Vovelle, F. Solution structures of stomoxyn and spinigerin, two insect antimicrobial peptides with an α-helical conformation. Biopolym. Orig. Res. Biomol. 2006, 81, 92–103.
- Mousavizadegan, M.; Mohabatkar, H. Computational prediction of antifungal peptides via Chou’s PseAAC and SVM. J. Bioinform. Comput. Biol. 2018, 16, 1850016.
- Ahmed, S.; Muhammod, R.; Khan, Z.H.; Adilina, S.; Sharma, A.; Shatabda, S.; Dehzangi, A. ACP-MHCNN: An accurate multi-headed deep-convolutional neural network to predict anticancer peptides. Sci. Rep. 2021, 11, 23676.
- Fang, C.; Moriwaki, Y.; Li, C.; Shimizu, K. Prediction of antifungal peptides by deep learning with character embedding. IPSJ Trans. Bioinform. 2019, 12, 21–29.
- Fan, L.; Sun, J.; Zhou, M.; Zhou, J.; Lao, X.; Zheng, H.; Xu, H. DRAMP: A comprehensive data repository of antimicrobial peptides. Sci. Rep. 2016, 6, 24482.
- Ahmad, A.; Akbar, S.; Tahir, M.; Hayat, M.; Ali, F. iAFPs-EnC-GA: Identifying antifungal peptides using sequential and evolutionary descriptors based multi-information fusion and ensemble learning approach. Chemom. Intell. Lab. Syst. 2022, 222, 104516.
- Sharma, R.; Shrivastava, S.; Kumar Singh, S.; Kumar, A.; Saxena, S.; Kumar Singh, R. Deep-AFPpred: Identifying novel antifungal peptides using pretrained embeddings from seq2vec with 1DCNN-BiLSTM. Brief. Bioinform. 2022, 23, bbab422.
- He, W.; Jiang, Y.; Jin, J.; Li, Z.; Zhao, J.; Manavalan, B.; Su, R.; Gao, X.; Wei, L. Accelerating bioactive peptide discovery via mutual information-based meta-learning. Brief. Bioinform. 2022, 23, bbab499.
- Lv, Z.; Ao, C.; Zou, Q. Protein function prediction: From traditional classifier to deep learning. Proteomics 2019, 19, e1900119.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Available online: https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (accessed on 1 May 2024).
- Bhasin, M.; Raghava, G.P.S. SVM based method for predicting HLA-DRB1* 0401 binding peptides in an antigen sequence. Bioinformatics 2004, 20, 421–423.
- Zhang, Y. Support vector machine classification algorithm and its application. In Information Computing and Applications: Proceedings of the Third International Conference, ICICA 2012, Chengde, China, 14–16 September 2012; Part II 3; Springer: Berlin/Heidelberg, Germany, 2012; pp. 179–186.
- Lata, S.; Mishra, N.K.; Raghava, G.P. AntiBP2: Improved version of antibacterial peptide prediction. BMC Bioinform. 2010, 11, S19.
- Ding, X.; Liu, J.; Yang, F.; Cao, J. Random radial basis function kernel-based support vector machine. J. Frankl. Inst. 2021, 358, 10121–10140.
- Westreich, D.; Lessler, J.; Funk, M.J. Propensity score estimation: Neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 2010, 63, 826–833.
- Heinze, G.; Schemper, M. A solution to the problem of separation in logistic regression. Stat. Med. 2002, 21, 2409–2419.
- Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, WA, USA, 4 August 2001; Volume 3, pp. 41–46.
- Cao, J.; Kwong, S.; Wang, R. A noise-detection based AdaBoost algorithm for mislabeled data. Pattern Recognit. 2012, 45, 4451–4465.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Kirasich, K.; Smith, T.; Sadler, B. Random forest vs logistic regression: Binary classification for heterogeneous datasets. SMU Data Sci. Rev. 2018, 1, 9.
- Haji, S.H.; Abdulazeez, A.M. Comparison of optimization techniques based on gradient descent algorithm: A review. PalArch’s J. Archaeol. Egypt/Egyptol. 2021, 18, 2715–2743.
- Bottou, L. Large-scale machine learning with stochastic gradient descent. In Proceedings of the COMPSTAT’2010: 19th International Conference on Computational Statistics, Paris, France, 22–27 August 2010.
- Wu, C.-C.; Chen, Y.-L.; Liu, Y.-H.; Yang, X.-Y. Decision tree induction with a constrained number of leaf nodes. Appl. Intell. 2016, 45, 673–685.
- Chen, Z.; Zhao, P.; Li, F.; Leier, A.; Marquez-Lago, T.T.; Wang, Y.; Webb, G.I.; Smith, A.I.; Daly, R.J.; Chou, K.C.; et al. iFeature: A python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics 2018, 34, 2499–2502.
- Kawashima, S.; Pokarowski, P.; Pokarowska, M.; Kolinski, A.; Katayama, T.; Kanehisa, M. AAindex: Amino acid index database, progress report 2008. Nucleic Acids Res. 2007, 36, D202–D205.
- Bhasin, M.; Raghava, G.P. Classification of nuclear receptors based on amino acid composition and dipeptide composition. J. Biol. Chem. 2004, 279, 23262–23266.
- Saravanan, V.; Gautham, N. Harnessing computational biology for exact linear b-cell epitope prediction: A novel amino acid composition-based feature descriptor. OMICS J. Integr. Biol. 2015, 19, 648–658.
- Chang, K.Y.; Yang, J.-R. Analysis and prediction of highly effective antiviral peptides based on random forests. PLoS ONE 2013, 8, e70166.
- Schaduangrat, N.; Nantasenamat, C.; Prachayasittikul, V.; Shoombuatong, W. ACPred: A computational tool for the prediction and analysis of anticancer peptides. Molecules 2019, 24, 1973.
- Liu, J.; Li, M.; Chen, X. AntiMF: A deep learning framework for predicting anticancer peptides based on multi-view feature extraction. Methods 2022, 207, 38–43.
- Pareek, J.; Jacob, J. Data compression and visualization using PCA and T-SNE. In Advances in Information Communication Technology and Computing: Proceedings of AICTC 2019; Springer: Singapore, 2021; pp. 327–337.
- Charoenkwan, P.; Schaduangrat, N.; Moni, M.A.; Manavalan, B.; Shoombuatong, W. SAPPHIRE: A stacking-based ensemble learning framework for accurate prediction of thermophilic proteins. Comput. Biol. Med. 2022, 146, 105704.
- Charoenkwan, P.; Ahmed, S.; Nantasenamat, C.; Quinn, J.M.; Moni, M.A.; Lio’, P.; Shoombuatong, W. AMYPred-FRL is a novel approach for accurate prediction of amyloid proteins by using feature representation learning. Sci. Rep. 2022, 12, 7697.
Dataset | Split | AFPs | Non-AFPs | Description of AFPs and Non-AFPs
---|---|---|---|---
Antifp_DS1 | Train / Test | 1168 / 291 | 1168 / 291 | The non-AFPs were chosen at random from the Swiss-Prot database, and antimicrobial peptides other than antifungals were obtained from the DRAMP database.
Antifp_DS2 | Train / Test | 1168 / 291 | 1168 / 291 | The non-AFPs were antimicrobial peptides other than antifungal peptides.
Antifp_DS3 | Train / Test | 1168 / 291 | 1168 / 291 | The non-AFPs were chosen at random from the Swiss-Prot database.
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 91.5 | 0.91 | 0.83 | 91.2 | 90.5 | 91.4 |
RF | 93.5 | 0.92 | 0.89 | 93.1 | 93.2 | 93.8 |
AdaBoost | 91.2 | 0.93 | 0.90 | 90.2 | 89.5 | 91.1 |
NB | 78.4 | 0.77 | 0.57 | 81.4 | 80.2 | 78.8 |
LR | 90.9 | 0.90 | 0.81 | 90.5 | 90.1 | 90.5 |
SGD | 89.3 | 0.88 | 0.80 | 89.1 | 88.3 | 86.1 |
Bernoulli NB | 87.4 | 0.87 | 0.74 | 88.7 | 88.3 | 87.6 |
DT | 91.7 | 0.91 | 0.83 | 90.8 | 89.5 | 90.0 |
RT | 91.7 | 0.90 | 0.91 | 92.4 | 92.3 | 91.7 |
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 91.4 | 0.91 | 0.82 | 91.1 | 90.5 | 91.5 |
RF | 93.8 | 0.93 | 0.80 | 96.6 | 93.2 | 93.5 |
AdaBoost | 91.7 | 0.92 | 0.89 | 90.8 | 89.5 | 91.2 |
NB | 78.8 | 0.78 | 0.57 | 79.7 | 80.2 | 78.4 |
LR | 90.5 | 0.90 | 0.81 | 90.4 | 90.1 | 90.9 |
SGD | 86.1 | 0.85 | 0.71 | 86.7 | 88.3 | 89.3 |
Bernoulli NB | 87.6 | 0.87 | 0.75 | 90.1 | 88.3 | 87.4 |
DT | 90.0 | 0.90 | 0.80 | 89.1 | 89.5 | 91.7 |
RT | 91.7 | 0.91 | 0.90 | 92.5 | 92.3 | 91.7 |
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 93.5 | 0.93 | 0.87 | 92.6 | 90.5 | 91.4 |
RF | 96.4 | 0.95 | 0.92 | 97.7 | 93.2 | 93.8 |
AdaBoost | 92.6 | 0.91 | 0.90 | 89.0 | 89.5 | 91.1 |
NB | 88.9 | 0.89 | 0.82 | 86.3 | 80.2 | 78.8 |
LR | 92.5 | 0.91 | 0.85 | 92.8 | 90.1 | 90.5 |
SGD | 90.9 | 0.91 | 0.81 | 90.9 | 88.3 | 86.1 |
Bernoulli NB | 91.1 | 0.90 | 0.82 | 93.9 | 88.3 | 87.6 |
DT | 91.6 | 0.91 | 0.83 | 90.8 | 89.5 | 90.0 |
RT | 92.5 | 0.92 | 0.92 | 92.2 | 92.3 | 91.7 |
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 93.1 | 0.93 | 0.86 | 92.3 | 93.5 | 91.4 |
RF | 96.2 | 0.96 | 0.92 | 96.8 | 96.4 | 93.8 |
AdaBoost | 92.3 | 0.91 | 0.90 | 91.4 | 92.6 | 91.1 |
NB | 85.6 | 0.89 | 0.86 | 80.5 | 88.9 | 78.8 |
LR | 93.6 | 0.94 | 0.87 | 92.9 | 92.5 | 90.5 |
SGD | 91.4 | 0.91 | 0.83 | 91.4 | 90.9 | 86.1 |
Bernoulli NB | 90.4 | 0.90 | 0.80 | 92.1 | 91.1 | 87.6 |
DT | 90.4 | 0.90 | 0.80 | 89.2 | 91.6 | 90.0 |
RT | 91.1 | 0.91 | 0.92 | 91.2 | 92.5 | 91.7 |
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 91.4 | 0.91 | 0.82 | 90.9 | 90.5 | 92.3 |
RF | 93.7 | 0.92 | 0.86 | 94.3 | 93.2 | 96.8 |
AdaBoost | 88.3 | 0.87 | 0.85 | 85.2 | 89.5 | 87.2 |
NB | 77.2 | 0.74 | 0.55 | 84.2 | 80.2 | 80.6 |
LR | 91.7 | 0.91 | 0.83 | 90.8 | 90.1 | 92.9 |
SGD | 88.1 | 0.88 | 0.88 | 86.8 | 88.3 | 91.4 |
Bernoulli NB | 86.1 | 0.86 | 0.72 | 86.4 | 88.3 | 92.1 |
DT | 87.3 | 0.87 | 0.74 | 88.7 | 89.5 | 89.2 |
RT | 91.2 | 0.92 | 0.90 | 92.6 | 92.3 | 91.3 |
Algo | Acc (%) | F1 | MCC | Precision (%) | SN (%) | SE (%) |
---|---|---|---|---|---|---|
SVM | 90.8 | 0.91 | 0.81 | 89.9 | 90.5 | 92.3 |
RF | 94.1 | 0.93 | 0.87 | 95.1 | 93.2 | 96.8 |
AdaBoost | 83.4 | 0.82 | 0.81 | 82.1 | 89.5 | 82.0 |
NB | 78.0 | 0.75 | 0.57 | 85.5 | 80.2 | 80.5 |
LR | 90.1 | 0.90 | 0.80 | 89.3 | 90.1 | 92.9 |
SGD | 88.3 | 0.88 | 0.76 | 89.6 | 88.3 | 91.4 |
Bernoulli NB | 86.9 | 0.86 | 0.73 | 89.6 | 88.3 | 92.1 |
DT | 87.6 | 0.87 | 0.75 | 90.1 | 89.5 | 89.2 |
RT | 92.2 | 0.90 | 0.92 | 91.8 | 92.3 | 91.2 |
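The columns reported in the classifier tables above (accuracy, precision, F1, MCC, sensitivity) are standard confusion-matrix quantities. The following is a minimal sketch of how they are usually computed, assuming the common textbook definitions rather than the authors' own code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's results; specificity is included as an assumption about what a full report would contain.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; ratios with a zero denominator
    are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, SN
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient, in the range [-1, 1]
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Made-up counts for a balanced test set of 291 positives and 291 negatives
# (the set sizes match the dataset table; the counts themselves are invented).
example = binary_metrics(tp=280, fp=10, tn=281, fn=11)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Accuracy, precision, and sensitivity are fractions here; multiply by 100 to compare with the percentage columns in the tables.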
Datasets | Model | ACC (%) | PRE (%) | F1 Score | MCC |
---|---|---|---|---|---|
Antifp_DS1 | Without Feature Selection | 93.5 | 93.1 | 0.92 | 0.89 |
Antifp_DS1 | With Feature Selection | 97.9 | 98.4 | 0.98 | 0.96 |
Datasets | Model | ACC (%) | PRE (%) | F1 score | MCC |
---|---|---|---|---|---|
Antifp_DS1 | MIMML [23] | 91.3 | - | - | 0.83 |
Antifp_DS1 | AntiMF [45] | 90.2 | - | - | 0.80 |
Antifp_DS1 | Antifp [10] | 86.3 | - | - | 0.73 |
Antifp_DS1 | AFPDeep [19] | 90.2 | - | - | - |
Antifp_DS1 | Deep-AntiFP [12] | 89.1 | - | - | 0.78 |
Antifp_DS1 | iAFPs-EnC-GA [21] | 93.9 | - | - | 0.90 |
Antifp_DS1 | AFP-MFL [9] | 95.8 | 97.1 | 0.96 | 0.92 |
Antifp_DS1 | AFP-MVFL | 97.9 | 98.4 | 0.98 | 0.96 |
Datasets | Model | ACC (%) | PRE (%) | F1 score | MCC |
---|---|---|---|---|---|
Antifp_DS2 | AFPDeep | 93.5 | - | - | - |
Antifp_DS2 | Antifp | 85.9 | - | - | 0.72 |
Antifp_DS2 | AFP-MFL | 94.4 | 95.9 | 0.94 | 0.88 |
Antifp_DS2 | AFP-MVFL | 98.3 | 99.1 | 0.98 | 0.97 |
Antifp_DS3 | AFPDeep | 88.7 | - | - | - |
Antifp_DS3 | Antifp | 90.4 | - | - | 0.81 |
Antifp_DS3 | AFP-MFL | 96.8 | 97.6 | 0.96 | 0.93 |
Antifp_DS3 | AFP-MVFL | 97.4 | 98.3 | 0.97 | 0.95 |
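To make the size of the improvement in the comparison tables concrete, the short sketch below computes, for each dataset, the accuracy margin of AFP-MVFL over the best competing model. The accuracy values are copied from the tables above; the dictionary layout and the helper name `accuracy_margin` are illustrative assumptions and not part of the original work.

```python
# Accuracy (%) values as reported in the comparison tables above.
reported_acc = {
    "Antifp_DS1": {
        "MIMML": 91.3, "AntiMF": 90.2, "Antifp": 86.3, "AFPDeep": 90.2,
        "Deep-AntiFP": 89.1, "iAFPs-EnC-GA": 93.9, "AFP-MFL": 95.8,
        "AFP-MVFL": 97.9,
    },
    "Antifp_DS2": {
        "AFPDeep": 93.5, "Antifp": 85.9, "AFP-MFL": 94.4, "AFP-MVFL": 98.3,
    },
    "Antifp_DS3": {
        "AFPDeep": 88.7, "Antifp": 90.4, "AFP-MFL": 96.8, "AFP-MVFL": 97.4,
    },
}


def accuracy_margin(scores, proposed="AFP-MVFL"):
    """Return (best competing model, margin in percentage points)."""
    others = {name: acc for name, acc in scores.items() if name != proposed}
    best_name = max(others, key=others.get)
    # Round to one decimal to match the precision of the reported values.
    return best_name, round(scores[proposed] - others[best_name], 1)


for dataset, scores in reported_acc.items():
    rival, margin = accuracy_margin(scores)
    print(f"{dataset}: +{margin} points over {rival}")
```

On these reported figures the margin over AFP-MFL is 2.1, 3.9, and 0.6 percentage points for Antifp_DS1, Antifp_DS2, and Antifp_DS3, respectively.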
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ferdous, S.M.; Mugdha, S.B.S.; Dehzangi, I. New Multi-View Feature Learning Method for Accurate Antifungal Peptide Detection. Algorithms 2024, 17, 247. https://doi.org/10.3390/a17060247