According to a recent report from a major insurance company, flooding was both the costliest and the deadliest natural disaster of 2016 [
1]. More than half of the natural hazards in 2016 were hydrological, and this category was the most damaging financially, with losses of 59 billion USD. Hydrological disasters are dominated by floods, with 164 flood occurrences against 13 landslides. In the same year, floods caused the greatest loss of life of any disaster type, with 4731 deaths [
2]. There is therefore a growing need to quantify flood impact, both to help response authorities mitigate damage and set priorities during an emergency and to support insurance companies in assessing the losses sustained by each property. A thorough understanding of potential flood risk can also help development agencies build resilient communities.
In the event of flooding, a clear, cloud-free image acquired promptly is needed to obtain a synoptic view of the affected area. Remotely sensed images are well suited to mapping inundations, particularly when weather conditions are harsh and access to the affected site is impractical [
3]. Moreover, satellite-borne Synthetic Aperture Radar (SAR) sensors have been used extensively over the last decade to monitor flooding events, because they operate independently of sunlight and in the cloudy conditions that are common during inundations. Thanks to the large number of SAR satellites in orbit, users have a wide choice of datasets at different wavelengths and resolutions. At present, X-band sensors such as TerraSAR-X and COSMO-SkyMed provide the highest spatial resolution among SAR sensors [
4]. At this metre-scale resolution, floods can be detected even in complex scenes such as urban settlements with relatively narrow streets [
5]. Furthermore, the systematic acquisition plan of the Sentinel-1 C-band satellite increases the likelihood of finding a reference image in the Sentinel Hub archive. A deeper understanding of the flood hazard is gained by extracting the flood extent and depth and by assessing the velocity of the floodwater, which helps to manage the inundation risk efficiently.
Operational flood mapping aims to reduce the delay between the acquisition of a satellite image and the delivery of the resulting flood extent map to the civil protection authorities for immediate relief efforts. This objective is achieved with a fully automated flood detection service [
6]. The flood mapping service developed in [
6] is an improvement of [
7], where the workflow was adapted to operational situations covered by TerraSAR-X. Briefly, the service is triggered when the TerraSAR-X product is downloaded to the FTP server, and at the end of the process the flood extent map can be viewed online through a web interface. It should be noted that the radar pulse from COSMO-SkyMed, which operates in the X-band like TerraSAR-X, was found to be attenuated by precipitation because of its relatively short wavelength [
8]. In addition, because of the delay between tasking TerraSAR-X during a flood emergency and the actual image acquisition, the peak of the inundation may be missed, since this satellite needs at least 2.5 days to access the requested site [
9]. In this case, the systematic acquisition mode of the Sentinel-1 constellation would be of great help. In [
10], the flood mapping service proposed in [
6] was modified to process Sentinel-1 SAR images. In particular, [
6] was improved by adding a post-processing step that removes from the flood map those areas whose height above the nearest drainage network exceeds a threshold. The process in [
10] is in principle completely unsupervised, yet this threshold was determined empirically. The Sentinel-1 images were automatically downloaded and preprocessed using the Sentinel Application Platform (SNAP), and the classification was then carried out in a similar way to [
7]. It was found in [
10] that among the polarizations offered by the Sentinel-1 sensor, VV-polarized SAR images result in more accurate flood maps than cross-polarized (VH) products. In the same context, [
11] suggested that HH polarization achieves the highest flood mapping accuracy. Also with a view to operational flood mapping, [
12] proposed detecting the flood in vegetated, forested and built-up areas, in addition to the usual low-backscatter flood (open water), using a fuzzy logic approach that makes it possible to combine data from different sources (a DEM and a land cover map). The threshold values are retrieved from three selected backscattering models (for agricultural, forested, and urban areas) applied with varying radar parameters to a number of land covers. To keep the process simple, the threshold values are calculated for only a few specific flood scenarios. As a result, mapping the flood becomes challenging when, for instance, the plant characteristics change and the preconditions of the backscattering model are no longer satisfied. Although this issue has been addressed by allowing the user to adjust the default thresholds manually, doing so reduces the automation of the process. An automated flood mapping method based on approximating the Probability Density Function (PDF) of the water backscatter was proposed by [
13]. The threshold is then defined as the backscatter value at which the PDF of the image backscatter starts to diverge from the gamma distribution that models the water class, which occupies the backscatter values below this threshold. The thresholding is followed by region growing and pixel-based change detection. The authors in [
14] addressed the challenging task of urban flood mapping in an unsupervised way, by improving the process introduced in [
13] with a more objective estimation of the tolerance criterion used in region growing. In image segmentation, region growing cannot be generalized because its cost function is not defined a priori but is set empirically for each specific application. Furthermore, it fails in practice when the edges of the object to be detected are too smooth [
15]. Another way the literature has addressed the automation of flood mapping is with a regularly running service, as in [
16]. In this project, the multiyear ENVISAT ASAR dataset is split into tiles of 1° longitude by 1° latitude, and the training step uses the SRTM-derived water mask (SWBD) together with the features extracted from each tile, mainly the backscatter and the incidence angle. Pixels from unlabeled images are then classified using Bayes’ theorem to obtain a probability map of water and land, after the probability distributions of each class have been estimated from the training histograms. Nevertheless, the low spatial resolution of the water mask, which is crucial to the training phase, could degrade the precision of the classification, especially for smaller rivers.
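The histogram-based Bayesian step described above can be illustrated with a minimal sketch. It is not the implementation of the cited service: it assumes a single feature (backscatter in dB, leaving out the incidence angle), and the function names, the bin edges and the add-one smoothing are hypothetical choices made for illustration.

```python
import numpy as np

def fit_histograms(backscatter_db, is_water, bin_edges):
    """Estimate P(bin | water), P(bin | land) and the water prior from labelled pixels."""
    backscatter_db = np.asarray(backscatter_db, dtype=float)
    is_water = np.asarray(is_water, dtype=bool)
    water_counts, _ = np.histogram(backscatter_db[is_water], bins=bin_edges)
    land_counts, _ = np.histogram(backscatter_db[~is_water], bins=bin_edges)
    n_bins = len(bin_edges) - 1
    # Add-one smoothing (an assumption) avoids zero likelihoods in empty bins.
    p_bin_given_water = (water_counts + 1) / (water_counts.sum() + n_bins)
    p_bin_given_land = (land_counts + 1) / (land_counts.sum() + n_bins)
    prior_water = is_water.mean()
    return p_bin_given_water, p_bin_given_land, prior_water

def water_probability(backscatter_db, bin_edges, p_bin_given_water,
                      p_bin_given_land, prior_water):
    """Apply Bayes' theorem per pixel to obtain P(water | backscatter bin)."""
    backscatter_db = np.asarray(backscatter_db, dtype=float)
    # Map each value to its histogram bin; out-of-range values go to the edge bins.
    idx = np.clip(np.digitize(backscatter_db, bin_edges) - 1, 0, len(bin_edges) - 2)
    numerator = p_bin_given_water[idx] * prior_water
    denominator = numerator + p_bin_given_land[idx] * (1.0 - prior_water)
    return numerator / denominator

# Toy training pixels: low backscatter labelled as water, high as land.
edges = np.linspace(-30.0, 0.0, 7)
train_db = np.array([-22.0, -21.0, -23.0, -19.0, -9.0, -8.0, -12.0, -7.0])
train_is_water = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
p_w, p_l, prior = fit_histograms(train_db, train_is_water, edges)
probs = water_probability(np.array([-22.0, -8.0]), edges, p_w, p_l, prior)
```

In this toy case the dark pixel receives a high water probability and the bright pixel a low one; a coarse or misaligned water mask would corrupt `train_is_water` and therefore both histograms, which is the limitation noted above.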
Supervised and unsupervised learning methods have already been applied in a few studies to SAR flood extent mapping. The work in [
17] used a self-organizing map (SOM), which is essentially an unsupervised artificial neural network, to segment and then classify a flooded SAR image. Since the SOM was originally a dimensionality reduction technique, a moving window centered on each SAR image pixel forms a vector of neighbouring pixels that is passed to the neural network as a training input. At the end of the learning process, the central pixel of each sliding window is mapped onto one of the neurons of a 2D grid, with multiple image pixels possibly being assigned to the same neuron. The flooded SAR image is thereby segmented, with each neuron representing a cluster. However, the final classification of the neurons on the grid into water and non-water was performed with the help of manually extracted ground truth pixels. The authors in [
18] presented several semi-automatic and manual methods for mapping inundations using free multispectral and SAR satellite data. As in the current paper, the authors took advantage of water and vegetation indices such as the Normalized Difference Vegetation Index (NDVI) and the Modified Normalized Difference Water Index (MNDWI), although there it was the variation in these indices that was expected to reveal the presence of flooding. In another experiment, a supervised classifier was investigated for the same purpose, with samples chosen manually from different types of land cover. However, these two methods were applied separately and to multispectral optical images, which can be affected by cloud cover. When flood mapping was carried out on SAR images, the threshold was adjusted manually, either on a single flooded image or on the log ratio of a pair of images captured before and after the flood. To map urban flooding, the authors of [
19] first employ an active contour model (snake) to detect the flood in rural areas. A supervised Bayesian classification is then carried out on adjacent flooded urban areas, where the training data for the flood and non-flood classes are chosen, respectively, from the previously obtained rural flood map and from urban areas situated higher than the rural water level on the LiDAR DSM (Digital Surface Model). This method is semi-automated, since the initialization of the snake and the selection of the training dataset are both done manually.
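The log-ratio thresholding mentioned above for pairs of pre- and post-flood SAR images can be sketched as follows. This is only an illustration under stated assumptions: the inputs are taken to be co-registered backscatter images in linear power units, and the function name and the default threshold of -3 dB are hypothetical. In the cited work the threshold was adjusted manually.

```python
import numpy as np

def log_ratio_flood_mask(pre_flood, post_flood, threshold_db=-3.0, eps=1e-10):
    """Flag pixels whose backscatter dropped by more than |threshold_db| dB."""
    pre_flood = np.asarray(pre_flood, dtype=float)
    post_flood = np.asarray(post_flood, dtype=float)
    # Log ratio in dB; eps guards against division by zero and log of zero.
    log_ratio_db = 10.0 * np.log10((post_flood + eps) / (pre_flood + eps))
    # Open water appears dark in SAR, so flooding shows up as a strong decrease.
    return log_ratio_db < threshold_db

# Toy pair: the first pixel loses 10 dB (flagged), the second barely changes.
pre = np.array([0.10, 0.10])
post = np.array([0.01, 0.09])
mask = log_ratio_flood_mask(pre, post)
```

Because the decision depends on a single number, the quality of the resulting map hinges on how that threshold is chosen, which is why manual adjustment limits the automation of such methods.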
This study focuses on mapping the flood extent by proposing a fully automated classifier trained on a dataset retrieved from a pre-flood SAR image with the help of an optical Sentinel-2 image. The optical image makes it possible to derive a water mask from the Normalized Difference Water Index (NDWI) without any human intervention; multiplying this mask by the pre-flood SAR image, pixel by pixel, yields a training dataset of backscatter values for the water and non-water classes. The labelled training dataset is thereby extracted from the optical and pre-flood SAR images in an automated fashion. The preprocessing of the dataset and the classification are invoked from an online web application to map the extent of the inundation on a post-flood SAR image. This application is intended mainly for emergency situations, so it is extremely important to extract the flood map as quickly as possible and without manual supervision. The current paper is structured as follows. In the next section (
Section 2), two datasets are presented: one acquired with X-band TerraSAR-X over the 2007 inundation in Tewkesbury and one acquired with C-band Sentinel-1 over the 2015 inundation in Myanmar. The SAR images of these flood events are used later to assess the proposed algorithm against a validation dataset. Next, the theory behind the supervised classifier used to separate the flooded SAR image into two classes, the post-processing used to refine the classified flood map, and the entire automated flood mapping process are explained in detail in
Section 3. The results of the flood mapping using the proposed method are validated and discussed in the following section (
Section 4). Finally, the paper closes with a conclusion on the strengths and limitations of the algorithm.
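As an illustration of how a labelled training set can be derived automatically from an optical image and a pre-flood SAR image, the sketch below computes the NDWI in its common green/near-infrared form, thresholds it into a water mask, and pairs the mask with the SAR backscatter. The function names, the NDWI threshold of 0 and the assumption that all rasters are co-registered on the same grid are hypothetical choices for this sketch, not details taken from the method presented in Section 3.

```python
import numpy as np

def ndwi(green, nir, eps=1e-12):
    """NDWI = (green - NIR) / (green + NIR), computed per pixel."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (green - nir) / (green + nir + eps)

def build_training_set(green, nir, sar_pre_flood, ndwi_threshold=0.0):
    """Return per-pixel backscatter features and water (1) / non-water (0) labels."""
    # Pixels with NDWI above the threshold are treated as water (assumed rule).
    water_mask = ndwi(green, nir) > ndwi_threshold
    features = np.asarray(sar_pre_flood, dtype=float).ravel()
    labels = water_mask.ravel().astype(int)
    return features, labels

# Toy 2x2 scene: the left column is water (high green, low NIR, dark SAR).
green_band = np.array([[0.30, 0.10], [0.40, 0.05]])
nir_band = np.array([[0.10, 0.40], [0.10, 0.30]])
sar_db = np.array([[-20.0, -8.0], [-19.0, -7.0]])
features, labels = build_training_set(green_band, nir_band, sar_db)
water_backscatter = features[labels == 1]
```

The arrays `features` and `labels` could then be passed to any supervised classifier; no manual sample selection is involved at any point, which is the property the proposed approach relies on.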