Use of Social Media for the Detection and Analysis of Infectious Diseases in China
Abstract
1. Introduction
- To explore the spatiotemporal relationship between the evolution of dengue fever and related Weibo posts; and
- To model the spread of dengue fever in space and time.
2. Data Collection and Methods
2.1. Data Collection
2.2. Topics in Posts
2.3. Kalman Filter
3. Spatial Analysis of Dengue Fever
3.1. Distribution in Guangdong Province
3.2. Distribution in China
4. Temporal Analysis of Dengue Fever
4.1. Distribution on Specific Days
4.2. The Effect of the Day of the Week
4.3. Distribution on a Single Day
4.4. Noise Canceling and Prediction Based on Kalman Filtering
5. Spatiotemporal Pattern of Dengue Fever
5.1. Epidemic Diffusion Process
5.2. Spatial Pattern
6. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
Dataset | Number of Records | Data Fields | Time Period and Scope |
---|---|---|---|
1 | 23,144 | ID, text, post-time, userid, area, latitude, longitude, province, city | From 1 June 2014 to 28 October 2014 in Guangdong province |
2 | 1910 | ID, text, post-time, userid, area, latitude, longitude, province, city | Georeferenced posts in Dataset 1 |
3 | 5270 | #posts with “de”, #posts with “dengue”, date, province | 1 June 2014 to 2 November 2014, in China |
4 | 504 | #posts with “dengue”, hour, city | 22 September 2014, in Guangdong province |
5 | 504 | #new dengue cases, city | 22 September 2014, in Guangdong province |
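
The record counts in the table above support a quick sanity check on the data. The following is a minimal sketch, assuming that the 1910 records of Dataset 2 are a subset of the 23,144 records of Dataset 1 and that the collection window of Dataset 1 includes both end dates. The variable and function names are illustrative and do not come from the paper.

```python
from datetime import date

# Counts and dates copied from the dataset table above.
DATASET1_POSTS = 23_144           # all collected posts in Guangdong province
DATASET2_POSTS = 1_910            # georeferenced posts within Dataset 1
WINDOW_START = date(2014, 6, 1)   # first day of Dataset 1
WINDOW_END = date(2014, 10, 28)   # last day of Dataset 1


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, counting both ends."""
    return (end - start).days + 1


def fraction(part: int, whole: int) -> float:
    """Share of `part` in `whole` as a value between 0 and 1."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


window_days = inclusive_days(WINDOW_START, WINDOW_END)
geo_share = fraction(DATASET2_POSTS, DATASET1_POSTS)
posts_per_day = DATASET1_POSTS / window_days

print(f"collection window: {window_days} days")
print(f"georeferenced share: {geo_share:.1%}")
print(f"average posts per day: {posts_per_day:.1f}")
```

Under these assumptions, only a small fraction of the collected posts carries coordinates, which is worth keeping in mind when reading the spatial results.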
Prevention Word | Probability | Detection Word | Probability | Fear Word | Probability |
---|---|---|---|---|---|
Carried | 0.063 | Case | 0.104 | Mosquito | 0.186 |
Prevention | 0.061 | Infection | 0.057 | No | 0.044 |
Work | 0.058 | Our city | 0.049 | Now | 0.026 |
Inspection | 0.044 | Current | 0.048 | Know | 0.024 |
Strengthen | 0.034 | Find | 0.038 | Really | 0.018 |
Health | 0.033 | Patient | 0.034 | Family | 0.017 |
Staff | 0.025 | Severe | 0.026 | Feeling | 0.017 |
Construction site | 0.019 | Input | 0.026 | Feel | 0.015 |
Company | 0.015 | Happen | 0.024 | Should | 0.014 |
Situation | 0.015 | Treatment | 0.022 | Recently | 0.013 |
Recently | 0.013 | Yesterday | 0.020 | Easy | 0.013 |
Neighboring | 0.012 | Disease | 0.020 | Terrible | 0.013 |
Emphasis | 0.011 | Yesterday | 0.019 | Powerful | 0.012 |
Management | 0.010 | Reporter | 0.019 | See | 0.010 |
Increase | 0.009 | Center | 0.019 | Scary | 0.010 |
Efforts | 0.009 | Risk | 0.015 | Danger | 0.010 |
Area | 0.009 | This year | 0.014 | Actually | 0.009 |
Unit | 0.009 | Understand | 0.013 | Is not it | 0.009 |
Recent | 0.009 | Arise | 0.013 | Really | 0.009 |
Public | 0.009 | Hospitalized | 0.011 | A little | 0.008 |
Symptom Word | Probability | Care Word | Probability |
---|---|---|---|
Symptom | 0.055 | Mosquito | 0.166 |
Occur | 0.039 | No | 0.051 |
Fever | 0.035 | Find | 0.030 |
Headache | 0.024 | Action | 0.020 |
Rash | 0.023 | Need | 0.020 |
Pathogenesis | 0.023 | Family | 0.018 |
Muscle | 0.022 | Easy | 0.017 |
Decrease | 0.022 | Issue | 0.016 |
Virus | 0.019 | Now | 0.012 |
Main | 0.019 | Introduce | 0.011 |
Cause | 0.017 | Mother | 0.011 |
Acute | 0.016 | Terrible | 0.011 |
Prevention | 0.015 | Should | 0.011 |
Disease | 0.015 | See | 0.011 |
Performance | 0.014 | Friend | 0.010 |
Hemorrhage | 0.014 | Worry | 0.009 |
Treatment | 0.013 | Like | 0.008 |
Arthralgia | 0.013 | Actually | 0.008 |
Infect | 0.013 | Phone | 0.008 |
Clinical | 0.012 | Detect | 0.008 |
Topic | Number of Sub-Topics | Number of Weibo Posts |
---|---|---|
Prevention | 9 | 7669 |
Detection | 8 | 7751 |
Symptom | 1 | 2414 |
Fear | 1 | 1003 |
Care | 1 | 890 |
Total | 20 | 19,727 |
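
The relative weight of each topic can be read off the table above by dividing its post count by the total. The following is a minimal sketch of that arithmetic, using only the counts from the table. The dictionary and function names are illustrative and are not taken from the paper.

```python
# Post counts per topic, copied from the topic summary table above.
posts_per_topic = {
    "Prevention": 7669,
    "Detection": 7751,
    "Symptom": 2414,
    "Fear": 1003,
    "Care": 890,
}

# Sub-topic counts per topic, copied from the same table.
subtopics_per_topic = {
    "Prevention": 9,
    "Detection": 8,
    "Symptom": 1,
    "Fear": 1,
    "Care": 1,
}


def topic_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each topic's share of all posts as a value between 0 and 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no posts to compute shares from")
    return {topic: n / total for topic, n in counts.items()}


total_posts = sum(posts_per_topic.values())
total_subtopics = sum(subtopics_per_topic.values())
shares = topic_shares(posts_per_topic)

# Print topics from largest to smallest share.
for topic, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{topic:<10} {posts_per_topic[topic]:>6} posts  {share:6.1%}")
print(f"{'Total':<10} {total_posts:>6} posts  {total_subtopics} sub-topics")
```

The sums reproduce the Total row of the table, and the shares show that Prevention and Detection together account for most of the classified posts.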
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
Ye, X.; Li, S.; Yang, X.; Qin, C. Use of Social Media for the Detection and Analysis of Infectious Diseases in China. ISPRS Int. J. Geo-Inf. 2016, 5, 156. https://doi.org/10.3390/ijgi5090156