Spatiotemporal Hotspots of Study Areas in Research of Gastric Cancer in China Based on Web-Crawled Literature
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source
2.2. Methods
2.2.1. Bibliography Crawler
2.2.2. Toponym Extraction and Accuracy Test
2.2.3. Spatial Processing and Analysis
3. Results
3.1. Accuracy of the Toponym Extraction
3.2. Global Quantitative Characteristics
3.3. Spatiotemporal Characteristics
3.3.1. Spatial Pattern
3.3.2. Spatial and Temporal Changes
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
GC | Gastric Cancer |
GCR | Gastric Cancer Research |
GCRWT | Gastric Cancer Research with Toponyms |
GNER | Geographical Named Entity Recognition |
GNERPP | Post-processing after Geographical Named Entity Recognition |
SEW | Systematic Exclusion Words |
ST | Standard Toponyms |
HT | Historical Toponyms |
ASTC | Ascending Scale Toponym Counting |
Database | Search Strategy | Request URLs (accessed on 4 May 2020) | Parsing Method | Remark |
---|---|---|---|---|
CNKI | SU = ‘wei ai’ OR (SU = ‘wei’ AND SU = ‘e xing zhong liu’) | Search: http://kns/request/searchhandler.ashx; Results: http://kns.cnki.net/kns/brief/brief.aspx; Result details: http://kns.cnki.net/kns/detail/detail.aspx | Beautiful Soup, regular expressions | Crawled year by year, because a single query returns a limited number of results (fewer than 5000). Pinyin terms: ‘wei ai’ = gastric cancer; ‘wei’ = stomach; ‘e xing zhong liu’ = malignant tumor |
WOS | (TS = gastric cancer OR TS = cancer of the stomach OR TS = gastric carcinoma) AND CU = CHINA | Search: http://apps.webofknowledge.com/WOS_AdvancedSearch.do; Results: http://apps.webofknowledge.com//OutboundService.do?action=go&& | Automatic export | Exported in batches of 500 records using the automatic export function |
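The Remark column above describes exporting WOS records in batches of 500. A minimal sketch of how such batch ranges could be planned is shown here; the function name `export_batches` and the example record count are hypothetical and do not come from the article, and no request to either database is made.

```python
from typing import Iterator, Tuple


def export_batches(total_records: int, batch_size: int = 500) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first, last) record ranges that cover total_records.

    batch_size defaults to 500, the export size stated in the table;
    everything else here is an illustrative assumption.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for first in range(1, total_records + 1, batch_size):
        # The last batch may be shorter than batch_size.
        yield first, min(first + batch_size - 1, total_records)


if __name__ == "__main__":
    # 1234 is an arbitrary example count, not a figure from the article.
    for first, last in export_batches(1234):
        print(f"export records {first} to {last}")
```

The same idea applies to the CNKI crawl: issuing one query per publication year keeps each result set under the per-query limit mentioned in the table.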
GNERPP | Text in the Literature | Mis-Extraction Result (for SEW) / Standard Toponym (for HT) |
---|---|---|
SEW | Yunnan Baiyao | Yunnan Province |
SEW | Pingyang Meisu | Pingyang County, Wenzhou City, Zhejiang Province |
SEW | Shanghai Ruijin Hospital | Ruijin City, Ganzhou City, Jiangxi Province |
SEW | Zhongshan Hospital of Fudan University | Zhongshan City, Guangdong Province |
SEW | Gansu Hexi | Hexi District, Tianjin |
SEW | ... | ... |
HT | Xiangfan City | Xiangyang City, Hubei Province (2010) |
HT | Linxian County | Linzhou City, Anyang City, Henan Province (1994) |
HT | Changle County | Changle City, Fuzhou City, Fujian Province (1994); Changle District (2017) |
HT | Chongwen District | Dongcheng District, Beijing (2010) |
HT | ... | ... |
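The two post-processing steps illustrated in the table above (masking systematic exclusion words, then mapping historical toponyms to standard ones) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the English strings stand in for the original Chinese text, substring matching stands in for the GNER step, and all function and variable names are hypothetical.

```python
from typing import Dict, List

# Systematic exclusion words (SEW): phrases that contain a toponym-like
# substring but do not name a study area. Entries come from the table.
SEW: List[str] = [
    "Yunnan Baiyao",
    "Pingyang Meisu",
    "Shanghai Ruijin Hospital",
    "Zhongshan Hospital of Fudan University",
    "Gansu Hexi",
]

# Historical toponyms (HT) mapped to standard toponyms, from the table.
HT_TO_ST: Dict[str, str] = {
    "Xiangfan City": "Xiangyang City, Hubei Province",
    "Linxian County": "Linzhou City, Anyang City, Henan Province",
    "Chongwen District": "Dongcheng District, Beijing",
}


def mask_exclusions(text: str, exclusions: List[str] = SEW) -> str:
    """Blank out exclusion phrases so they cannot be matched as toponyms."""
    # Longest phrases first, so a short phrase never splits a longer one.
    for phrase in sorted(exclusions, key=len, reverse=True):
        text = text.replace(phrase, " ")
    return text


def normalize_toponym(name: str, mapping: Dict[str, str] = HT_TO_ST) -> str:
    """Return the standard toponym for a historical one, else the name itself."""
    return mapping.get(name, name)


def post_process(text: str, gazetteer: List[str]) -> List[str]:
    """Match gazetteer entries in the masked text and normalize each hit."""
    cleaned = mask_exclusions(text)
    return [normalize_toponym(name) for name in gazetteer if name in cleaned]


if __name__ == "__main__":
    sentence = "Patients from Linxian County were treated with Yunnan Baiyao."
    print(post_process(sentence, ["Yunnan", "Linxian County"]))
```

In this toy run, "Yunnan" is not reported because it only occurs inside the excluded phrase, while "Linxian County" is reported under its standard name.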
Predicted | GNER: Actual Positive | GNER: Actual Negative | GNER: Sum | GNER: Precision (%) | GNERPP: Actual Positive | GNERPP: Actual Negative | GNERPP: Sum | GNERPP: Precision (%) |
---|---|---|---|---|---|---|---|---|
Positive | 193 | 69 | 262 | 73.66 | 203 | 10 | 213 | 95.31 |
Negative | 21 | 2317 | 2338 | / | 11 | 2376 | 2387 | / |
Sum | 214 | 2386 | 2600 | / | 214 | 2386 | 2600 | / |
Recall (%) | 90.19 | / | / | / | 94.86 | / | / | / |
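The precision and recall values in the table above follow directly from the confusion-matrix counts. A short sketch that recomputes them is given here; the counts are taken from the table, while the class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary confusion matrix."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    def precision(self) -> float:
        """Share of predicted positives that are correct, in percent."""
        return 100.0 * self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        """Share of actual positives that were found, in percent."""
        return 100.0 * self.tp / (self.tp + self.fn)


# Counts as reported in the table.
GNER = Confusion(tp=193, fp=69, fn=21, tn=2317)
GNERPP = Confusion(tp=203, fp=10, fn=11, tn=2376)

if __name__ == "__main__":
    for label, cm in (("GNER", GNER), ("GNERPP", GNERPP)):
        print(f"{label}: precision={cm.precision():.2f}%, recall={cm.recall():.2f}%")
```

Post-processing lowers false positives from 69 to 10, which accounts for most of the gain in precision.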
Time Stage | Province: Moran’s I | Province: Z-Score | Province: p-Value | City: Moran’s I | City: Z-Score | City: p-Value | County: Moran’s I | County: Z-Score | County: p-Value |
---|---|---|---|---|---|---|---|---|---|
91–95 | 0.130 ** | 2.063 | 0.039 | 0.064 *** | 6.850 | <0.001 | 0.048 *** | 13.976 | <0.001 |
96–00 | 0.159 ** | 2.409 | 0.016 | 0.062 *** | 6.890 | <0.001 | 0.017 *** | 5.594 | <0.001 |
01–05 | 0.155 ** | 2.460 | 0.014 | 0.074 *** | 7.994 | <0.001 | 0.031 *** | 9.492 | <0.001 |
06–10 | 0.176 *** | 2.630 | 0.009 | 0.081 *** | 8.599 | <0.001 | 0.021 *** | 6.454 | <0.001 |
11–15 | 0.220 *** | 3.181 | 0.001 | 0.108 *** | 11.068 | <0.001 | 0.052 *** | 15.240 | <0.001 |
16–19 | 0.117 * | 1.841 | 0.066 | 0.115 *** | 11.749 | <0.001 | 0.050 *** | 14.269 | <0.001 |
91–19 | 0.186 *** | 2.719 | 0.007 | 0.118 *** | 12.072 | <0.001 | 0.048 *** | 14.229 | <0.001 |
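For readers unfamiliar with the statistic reported above, global Moran's I for values $x_i$ and spatial weights $w_{ij}$ is $I = \frac{n}{S_0} \cdot \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}$, with $z_i = x_i - \bar{x}$ and $S_0 = \sum_i \sum_j w_{ij}$. The sketch below computes only this statistic on toy data; the weights matrix and values are invented for illustration, since the spatial weights used in the study are not given in this section, and the Z-scores and p-values are not reproduced.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for n values and an n x n spatial weights matrix."""
    n = len(values)
    if n == 0 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")
    mean = sum(values) / n
    z = [v - mean for v in values]           # deviations from the mean
    s0 = sum(sum(row) for row in weights)    # sum of all weights
    denom = sum(d * d for d in z)
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / denom)


# Toy example: four regions in a row, each adjacent to its immediate neighbors.
CHAIN_WEIGHTS = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

if __name__ == "__main__":
    print(morans_i([1, 2, 3, 4], CHAIN_WEIGHTS))  # smooth gradient: positive
    print(morans_i([1, 0, 1, 0], CHAIN_WEIGHTS))  # alternating values: negative
```

Positive values, as in every row of the table, indicate that units with similar counts tend to be neighbors; values near zero indicate a spatially random pattern.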
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Z.; Ren, H.; Zhang, A.; Zhuang, D. Spatiotemporal Hotspots of Study Areas in Research of Gastric Cancer in China Based on Web-Crawled Literature. Int. J. Environ. Res. Public Health 2021, 18, 3997. https://doi.org/10.3390/ijerph18083997