Cancer Research Trend Analysis Based on Fusion Feature Representation
Abstract
1. Introduction
2. Materials and Methods
2.1. Background
2.2. Method
2.2.1. Feature Fusion Representation Model
2.2.2. Cancer Research Trend Analysis Model
Correlation Analysis Based on Similarity
Keyword Trend Analysis Model
Improved Keyword Trend Analysis Model
3. Results
3.1. Datasets
3.2. Results
3.2.1. Comparison Results of Feature Fusion Methods
3.2.2. Cancer Trend Analysis Results
Correlation Analysis Results Based on Similarity
Results of the Keyword Trend Analysis Model
Results of Analysis on Research Trend
4. Limitation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Cancer | 2014 | 2015 | 2016 | 2017 | 2018 |
---|---|---|---|---|---|
Lung | 9322 | 9966 | 9446 | 9508 | 10,149 |
Breast | 12,328 | 12,825 | 12,600 | 12,286 | 12,743 |
Gastric | 3747 | 3572 | 3637 | 3414 | 3561 |
Colorectal | 8950 | 9174 | 8778 | 8617 | 8868 |
Liver | 6651 | 6871 | 6517 | 6431 | 6555 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wu, J.; Feng, X.; Guan, R.; Liang, Y. Cancer Research Trend Analysis Based on Fusion Feature Representation. Entropy 2021, 23, 338. https://doi.org/10.3390/e23030338