Measuring the Impact of Natural Hazards with Citizen Science: The Case of Flooded Area Estimation Using Twitter
Abstract
1. Introduction
2. Author Contributions and Position
3. Spatial Mapping Problem Definition
4. Feature Vector Mapping Function
5. Data and Preprocessing
5.1. Target Map Construction
5.2. Twitter Data Collection
5.3. Textual Representation
5.4. Spatial Information Extraction
5.5. Named Entity Recognition
6. Experiments
6.1. Experimental Protocol
6.2. Quantitative and Qualitative Results
7. Discussion
8. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Spatial Information | With w | | With | |
---|---|---|---|---|
 | Support (# Pixels) | Imbalance | Support (# Pixels) | Imbalance |
Place | 49,077 | 11% | 23,652 | 16% |
Geotag | 57,616 | 10% | 29,405 | 11% |
Place+geotag | 83,514 | 10% | 45,628 | 12% |
Polygon stack | 56,549 | 11% | 28,444 | 15% |
Pre-Processing | Feature Vector | F1 | P | R | Top-10 F1 | Top-10 P | Top-10 R |
---|---|---|---|---|---|---|---|
Place + geotag | keyword-match + + | 0.167 | 0.101 | 0.552 | 0.115 | 0.065 | 0.531 |
Place + geotag | TFIDF | 0.319 | 0.273 | 0.386 | 0.684 | 0.561 | 0.878 |
Place + geotag | TFIDF + | 0.365 | 0.313 | 0.440 | 0.710 | 0.621 | 0.829 |
Place + geotag | TFIDF + | 0.272 | 0.184 | 0.517 | 0.485 | 0.339 | 0.859 |
Place + geotag | TFIDF + + | 0.319 | 0.239 | 0.483 | 0.521 | 0.390 | 0.792 |
Place only | TFIDF | 0.313 | 0.220 | 0.546 | 0.564 | 0.407 | 0.917 |
Place only | TFIDF + | 0.404 | 0.305 | 0.603 | 0.777 | 0.665 | 0.937 |
Place only | TFIDF + | 0.305 | 0.210 | 0.563 | 0.554 | 0.388 | 0.972 |
Place only | TFIDF + + | 0.413 | 0.320 | 0.586 | 0.799 | 0.674 | 0.984 |
Geotag only | TFIDF | 0.327 | 0.320 | 0.336 | 0.682 | 0.633 | 0.739 |
Geotag only | TFIDF + | 0.309 | 0.388 | 0.258 | 0.609 | 0.612 | 0.610 |
Geotag only | TFIDF + | 0.287 | 0.216 | 0.431 | 0.586 | 0.454 | 0.830 |
Geotag only | TFIDF + + | 0.262 | 0.244 | 0.286 | 0.455 | 0.381 | 0.575 |
Poly. stack | TFIDF | 0.333 | 0.272 | 0.432 | 0.658 | 0.549 | 0.821 |
Poly. stack | TFIDF + | 0.406 | 0.324 | 0.545 | 0.796 | 0.725 | 0.884 |
Poly. stack | TFIDF + | 0.315 | 0.232 | 0.496 | 0.623 | 0.482 | 0.882 |
Poly. stack | TFIDF + + | 0.399 | 0.310 | 0.558 | 0.775 | 0.680 | 0.902 |
Poly. + geotag | TFIDF | 0.325 | 0.252 | 0.459 | 0.658 | 0.527 | 0.879 |
Poly. + geotag | TFIDF + | 0.403 | 0.311 | 0.575 | 0.792 | 0.706 | 0.902 |
Poly. + geotag | TFIDF + | 0.307 | 0.221 | 0.502 | 0.597 | 0.448 | 0.898 |
Poly. + geotag | TFIDF + + | 0.392 | 0.307 | 0.544 | 0.767 | 0.666 | 0.903 |
Place + geotag | Tweet2Vec | 0.267 | 0.182 | 0.501 | 0.467 | 0.334 | 0.775 |
Place + geotag | Tweet2Vec + | 0.315 | 0.230 | 0.499 | 0.458 | 0.331 | 0.746 |
Place + geotag | Tweet2Vec + | 0.275 | 0.191 | 0.497 | 0.512 | 0.359 | 0.895 |
Place + geotag | Tweet2Vec + + | 0.331 | 0.256 | 0.469 | 0.625 | 0.489 | 0.867 |
Place only | Tweet2Vec | 0.301 | 0.207 | 0.547 | 0.533 | 0.379 | 0.900 |
Place only | Tweet2Vec + | 0.410 | 0.314 | 0.600 | 0.748 | 0.634 | 0.912 |
Place only | Tweet2Vec + | 0.310 | 0.222 | 0.518 | 0.578 | 0.413 | 0.965 |
Place only | Tweet2Vec + + | 0.425 | 0.342 | 0.567 | 0.834 | 0.728 | 0.977 |
Geotag only | Tweet2Vec | 0.289 | 0.219 | 0.427 | 0.582 | 0.466 | 0.776 |
Geotag only | Tweet2Vec + | 0.278 | 0.251 | 0.312 | 0.466 | 0.386 | 0.590 |
Geotag only | Tweet2Vec + | 0.286 | 0.216 | 0.424 | 0.534 | 0.405 | 0.790 |
Geotag only | Tweet2Vec + + | 0.270 | 0.252 | 0.291 | 0.458 | 0.373 | 0.595 |
Poly. stack | Tweet2Vec | 0.307 | 0.222 | 0.500 | 0.548 | 0.398 | 0.879 |
Poly. stack | Tweet2Vec + | 0.373 | 0.275 | 0.579 | 0.727 | 0.597 | 0.931 |
Poly. stack | Tweet2Vec + | 0.327 | 0.245 | 0.489 | 0.630 | 0.489 | 0.886 |
Poly. stack | Tweet2Vec + + | 0.399 | 0.316 | 0.541 | 0.797 | 0.718 | 0.895 |
Poly. + geotag | Tweet2Vec | 0.306 | 0.220 | 0.506 | 0.539 | 0.386 | 0.893 |
Poly. + geotag | Tweet2Vec + | 0.384 | 0.280 | 0.610 | 0.677 | 0.531 | 0.938 |
Poly. + geotag | Tweet2Vec + | 0.326 | 0.244 | 0.496 | 0.592 | 0.454 | 0.853 |
Poly. + geotag | Tweet2Vec + + | 0.406 | 0.327 | 0.535 | 0.785 | 0.703 | 0.893 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Bruneau, P.; Brangbour, E.; Marchand-Maillet, S.; Hostache, R.; Chini, M.; Pelich, R.-M.; Matgen, P.; Tamisier, T. Measuring the Impact of Natural Hazards with Citizen Science: The Case of Flooded Area Estimation Using Twitter. Remote Sens. 2021, 13, 1153. https://doi.org/10.3390/rs13061153