1. Introduction
Studying the spatial and temporal distribution of surface water resources is critical, especially in highly populated areas and in regions under climate change pressure. With an increased number of Earth-observation satellites providing a large diversity of remote sensing data, there is now the potential to monitor the surface water at regional to global scale. However, mapping surface water is still challenging. It is difficult to provide products with the accuracy required for a large range of applications (e.g., agriculture, disaster management, and hydrology).
Several methods have already been proposed to detect and monitor surface water with visible and Near-Infrared (NIR) images. Ref. [
1] used positive values of the Normalized Difference Water Index (NDWI) to classify water bodies. Ref. [
2] applied a threshold on NIR reflectances of the NOAA/AVHRR satellite to delineate lakes. Ref. [
3] detected surface water by identifying the positive values of the Modified Normalized Difference Water Index (MNDWI). Ref. [
4] combined NIR data and the Normalized Difference Vegetation Index (NDVI) to detect surface water bodies. However, cloud contamination is a stringent constraint for these methods: it limits their application to cloud-free conditions, which is very restrictive in some regions (e.g., in the Tropics). Vegetation can also mask the surface water partly or totally, making water detection under canopy difficult or impossible. In addition, the NIR reflectance over highly turbid water can be higher than the red reflectance, which introduces confusion in the indices used for water detection.
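The index-based detection described above can be illustrated with a minimal sketch. It assumes the usual normalized-difference definitions, NDWI = (green − NIR) / (green + NIR) and MNDWI = (green − SWIR) / (green + SWIR); the function names and the reflectance values are hypothetical and are not taken from the cited studies.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return None if total == 0 else (a - b) / total


def ndwi(green, nir):
    """NDWI computed from green and NIR reflectances."""
    return normalized_difference(green, nir)


def mndwi(green, swir):
    """MNDWI: same form as NDWI, with a shortwave-infrared band replacing NIR."""
    return normalized_difference(green, swir)


def is_water(index_value):
    """Positive index values are labelled as water."""
    return index_value is not None and index_value > 0


# Hypothetical reflectances: clear water is dark in the NIR, vegetation is bright.
water_pixel = ndwi(green=0.08, nir=0.02)
vegetation_pixel = ndwi(green=0.06, nir=0.40)
print(is_water(water_pixel), is_water(vegetation_pixel))
```

Such a rule only works on cloud-free pixels, which is the limitation that motivates the use of SAR data in the rest of the paper.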
Synthetic Aperture Radar (SAR) instruments have become an important source of data to detect floods or monitor surface water, as they allow observations regardless of the cloud cover, day and night, with a spatial resolution comparable to that of visible and near-infrared satellite images [
5]. SAR instruments have flown on many platforms (Envisat ASAR, PALSAR, or RADARSAT, for example), providing observations for different areas all over the globe (although often with only a limited number of images per year in some regions). Many authors have studied flood detection using different SAR observations, showing the advantages of SAR over optical instruments for monitoring floods. Ref. [
6] used a single decision tree classifier on two sets of JERS-1 SAR data to classify surface water within the states of North Carolina and South Carolina into five land cover types (water, marsh, flooded forest, field, and non-flooded forest). Although the classifier was simple, they reported an overall classification accuracy of nearly 90%. Ref. [
7] showed the potential of the COSMO-SkyMed data for flood detection through case studies in several locations all over the globe (e.g., Tanaro River overflow, Italy, April 2009; Pakistan inundation, July–September 2010; Thailand flood, October 2010; and Australia flood, January 2011). COSMO-SkyMed instruments provided very high resolution X-band SAR images, but covered limited areas (the highest spatial resolution is ∼1 m for an observation area of 10 km × 10 km). X-band data from the TerraSAR-X instrument were also reported to be suitable for flood mapping under forest canopy in the temperate forest zone in Estonia [
8]. Ref. [
9] compared four flood detection approaches over five areas (Vietnam, the Netherlands, Mali, Germany, and China) using SAR data from the TanDEM-X mission. Although these four approaches were designed according to different requirements, their performances were satisfactory over the studied areas (17 out of 20 water masks reaching an overall accuracy larger than 90%). Other studies using SAR data for water monitoring locally and regionally under different environments can be listed, such as [
10,
11,
12]. Mapping water bodies at the global scale using SAR data has been limited by the lack of global observations and by the fact that SAR data have not been easy to access freely. Ref. [
13] used multi-year (2005–2012) Envisat ASAR observations to create, for the first time, a global potential water body map at a spatial resolution of 150 m. Errors are concentrated along shorelines and coastlines, but this global water map has an accuracy of ∼80% compared to the reference data.
The Mekong Delta in Southeast Asia (one of the largest deltas in the world) is a vast triangular plain of approximately 55,000 km², most of it lower than 5 m above sea level. The seasonal variation in water level results in rich and extensive wetlands. For instance, the Mekong Delta region covers only 12% of Vietnam but produces ∼50% of the annual rice (with two or three harvests per year depending on the province), and accounts for ∼50% of the fisheries and ∼70% of the fruit production. In the Delta, the dry season extends from November to April and the rainy season from May to October. Many studies have been carried out to monitor the surface water in the Delta, using both optical and active microwave satellite images. Ref. [
14] produced a monthly mean climatology of the water extent from 2000 to 2004 with a spatial resolution of 500 m, using visible and NIR MODIS/Terra data. However, with 85% to 95% cloud cover during the wet season over the Mekong Delta [
15], remote sensing methods derived from visible and NIR images present some limitations. Different SAR observations have also been exploited to study floods and wetlands over the Delta. Ref. [
16] mapped flood occurrence for the year 1996 over the Delta using five ERS-2 observations. Ref. [
17] used 60 Envisat ASAR observations during the years 2007–2011 to study the flood regime in the Delta. Thanks to the launch of the Sentinel-1A and Sentinel-1B satellites, as well as the free data policy of the European Space Agency (ESA), Sentinel-1 SAR observations are now regularly and freely accessible for scientific and educational purposes over large parts of the globe. Like previous SAR instruments, Sentinel-1 instruments show strong potential for detecting open water bodies at high spatial resolution [
18,
19]. With the advantage of a higher temporal resolution than previous SAR instruments, Sentinel-1 can monitor the seasonal cycle of water extent every six days over Europe and the boreal region, and with slightly reduced temporal sampling elsewhere. In this study, we propose a methodology using Sentinel-1A SAR observations for monitoring surface water extent within Cambodia and the Mekong Delta for the year 2015. It is based on a Neural Network (NN) algorithm, trained on visible Landsat-8 images (30 m spatial resolution). At the time of this study, the temporal resolution of Sentinel-1 over the Delta was 12 days; it improved to 6 days after the launch of Sentinel-1B in April 2016.
The Sentinel-1 SAR data and the ancillary observations are described in
Section 2, including the pre-processing steps.
Section 3 presents the NN methodology, along with sensitivity tests. Results and comparisons with other products are provided and discussed in
Section 4.
Section 5 concludes this study.
3. Methodology
3.1. Surface Water Information from the Sentinel-1 SAR Images
Flat water surfaces act like mirrors and reflect almost all incoming energy in the specular direction, thus providing very low backscatter. With this physical principle, detection of surface water is often based, at least partly, on the application of a threshold on the SAR backscatter coefficient, with the low backscatter values attributed to water bodies [
6,
7,
16,
17]. However, SAR backscatter coefficients over water surfaces are also affected by several mechanisms related to the interaction of the signal with vegetation or with possible surface roughness. The backscattered signals over flooded vegetation in wetlands can be enhanced due to the double-bounce scattering mechanism [
26,
27,
28]. On the other hand, the backscatter coefficients can be affected by the vegetation canopy (e.g., rice) above the water surface, due to volume scattering from the plant components (stems or leaves) [
29]. The backscatter coefficients (especially the VV polarization) can also be influenced by the wind-induced surface roughness over open water [
17,
30]. Finally, there might be ambiguities between surface water and other very flat surfaces (such as arid regions) that could provide very similar backscatter signatures [
31].
Based on a reference water mask derived from Landsat-8 NDVI,
Figure 3 presents the histograms of the backscatter coefficients for VH and VV polarizations, separately for water and non-water pixels over the incidence angle range of 30°–45° for the area shown in
Figure 2. For both polarizations, the water and non-water histograms are rather well separated, with thresholds of −22 dB and −15 dB for the VH and VV polarizations, respectively. Using these thresholds, the surface water has been classified separately for each polarization. The classification derived from the VH polarized image had a stronger spatial linear correlation with the reference water mask than the one derived from the VV polarized image (72% compared to 62%), confirming a higher sensitivity of the VH polarization to the presence of surface water [
19]. Using both polarizations for the classification increased the correlation (76%), confirming that the two polarizations carry different information and that using both of them increases the retrieval accuracy. These findings are consistent with the study by [
32] where water detection with VV polarization was further refined using multiple-polarization.
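The thresholding test described above can be sketched as follows. The thresholds (−22 dB for VH, −15 dB for VV) come from the text. The rule used to combine the two polarizations (a pixel is water only if both agree) is an assumption, since the combination rule is not specified here. The agreement score is a simple stand-in for the spatial correlation reported above, and all pixel values are hypothetical.

```python
VH_THRESHOLD_DB = -22.0  # histogram-based threshold quoted in the text
VV_THRESHOLD_DB = -15.0  # histogram-based threshold quoted in the text


def classify_single(backscatter_db, threshold_db):
    """Label a pixel as water (1) when its backscatter is below the threshold."""
    return [1 if value < threshold_db else 0 for value in backscatter_db]


def classify_dual(vh_db, vv_db):
    """Assumed combination rule: water only if both polarizations say water."""
    vh_mask = classify_single(vh_db, VH_THRESHOLD_DB)
    vv_mask = classify_single(vv_db, VV_THRESHOLD_DB)
    return [a & b for a, b in zip(vh_mask, vv_mask)]


def agreement(mask, reference):
    """Fraction of pixels where the mask matches the reference mask."""
    matches = sum(1 for m, r in zip(mask, reference) if m == r)
    return matches / len(reference)


# Hypothetical backscatter values (dB) for five pixels and a reference mask.
vh = [-25.1, -18.0, -23.4, -16.2, -21.0]
vv = [-17.3, -9.5, -12.0, -8.8, -16.1]
reference = [1, 0, 0, 0, 0]

vh_only = classify_single(vh, VH_THRESHOLD_DB)
dual = classify_dual(vh, vv)
print(agreement(vh_only, reference), agreement(dual, reference))
```

In this toy case, the third pixel is dark in VH but not in VV, so the dual-polarization rule removes a false detection that the VH threshold alone would keep.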
The effect of the incidence angle on the backscatter is also tested here. For a collection of pixels located over water (rivers, reservoirs, or lakes), the backscatter coefficient is plotted as a function of the incidence angle between 30° and 45° (
Figure 4). The backscatter decreases as the incidence angle increases; similar negative correlations between incidence angle and backscatter coefficients can also be found in [
13] with ASAR data over water bodies (from ∼−5 dB at 20° to ∼−20 dB at 45° of incidence angle).
In conclusion, the SAR backscatter coefficients (VH and VV polarizations) are both sensitive to the presence of water, but with slightly different sensitivities. The effect of the incidence angle, although rather limited within the 29°–46° range of Sentinel-1 SAR, has to be accounted for if a high detection accuracy is required. Simple tests with thresholding techniques illustrated the limitations of these approaches, and we therefore propose a new scheme to delineate the surface water based on Neural Networks.
The temporal dynamics of the backscatter coefficients can also be a source of information and can help disentangle the influence of the other surface parameters [
13]. However, this temporal information will not be investigated here.
3.2. A Neural Network-Based Classification
Here, we propose training a NN to produce surface water maps from SAR images over the Mekong Delta. In the remote sensing field, NNs are often used as a regression tool to estimate a quantity. For each pixel, the NN input satellite observations are represented by a vector x, and the network output (i.e., the retrieval) is represented by a vector y. However, NNs can also be used as classifiers. In this case, when trained with binary output values (y = 0 for non-water, 1 for water surfaces), the NN becomes a statistical model for the conditional probability P(y = 1 | x), i.e., the probability of the surface being covered by water given the satellite observations x. The NN output can then be used directly as an index of water presence probability, but a threshold can also be applied to classify the pixel as covered by water or not. The threshold needs to be optimized to satisfy some quality criteria, such as overall accuracy or false alarm rate.
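To make the classifier interpretation concrete, the sketch below passes one input vector x through a very small network and reads the output as a water probability. The architecture (one hidden tanh layer, sigmoid output) and all weights are hypothetical, since the network design is not specified in this section. Only the idea of thresholding an output in (0, 1) follows the text.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden tanh layer followed by a sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(z)


def classify(probability, threshold):
    """Return 1 (water) when the NN output reaches the threshold, else 0."""
    return 1 if probability >= threshold else 0


# Hypothetical, untrained weights for a 2-input, 2-hidden-unit network.
hidden_weights = [[-0.3, -0.2], [0.1, -0.4]]
hidden_biases = [-5.0, -4.0]
output_weights = [2.0, 1.5]
output_bias = 0.5

x = [-24.0, -17.0]  # illustrative VH and VV backscatter (dB)
p_water = forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
print(round(p_water, 3), classify(p_water, threshold=0.85))
```

Keeping the probability and the thresholding step separate allows the same trained network to be re-used with different thresholds, which is how a threshold can be tuned against quality criteria after training.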
The NN classifier needs to be trained in order to perform an optimal discrimination between water and non-water states. A supervised learning approach is chosen: the NN is designed to reproduce an already existing classification. A dataset including a collection of SAR information
x and associated surface water state
y is first built. Part of it is then used during the training stage in order to determine the optimal parameters of the NN model. The reference dataset in the selected area is provided here by a Landsat-8 surface water map (NN outputs), in spatial and temporal coincidence with the Sentinel-1 SAR data (NN inputs). A maximum time difference of 3 days is tolerated, as the two satellites do not fly in phase. Six Landsat-8 surface water maps are selected, along with the corresponding Sentinel-1 SAR observations (see
Table 1 for more details on the training dataset). The selection process for the Landsat-8 images has been described in
Section 2.2.1. The images cover parts of the lower Mekong Delta in Vietnam and Cambodia. For each image in the training dataset, the number of non-water pixels is much higher than the number of water pixels. To avoid giving too much weight to the non-water pixels, an equalization of the training dataset is performed: an equal number of non-water and water pixels is selected in the training dataset. For this purpose, non-water pixels are selected randomly in the images, to match the number of water pixels. The total number of training samples is ∼10 million pixels, half water pixels, half non-water pixels. It takes ∼5 h to train the NN (with the use of a personal computer), but when the training is completed, a surface water map can be produced quickly (after ∼3–4 min) from any new set of satellite inputs
x. A test dataset is chosen to measure the performance of the NN retrieval scheme with data not used in the training process. The NN methodology is summarized in
Figure 5.
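The equalization step described above amounts to random undersampling of the majority class. A minimal sketch is given below, assuming that samples are stored as a list of input vectors with a parallel list of 0/1 labels. The function name, the fixed seed, and the toy data are illustrative choices, not part of the original processing chain.

```python
import random


def equalize(samples, labels, seed=0):
    """Randomly undersample non-water pixels to match the water pixel count."""
    water = [i for i, y in enumerate(labels) if y == 1]
    non_water = [i for i, y in enumerate(labels) if y == 0]
    rng = random.Random(seed)  # fixed seed so that the selection is repeatable
    kept_non_water = rng.sample(non_water, k=min(len(water), len(non_water)))
    kept = water + kept_non_water
    rng.shuffle(kept)  # avoid presenting all water pixels first during training
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical pixels: 3 water samples among 30 non-water ones.
labels = [1] * 3 + [0] * 30
samples = [[-24.0 + 0.1 * i, -17.0 + 0.1 * i] for i in range(len(labels))]

balanced_x, balanced_y = equalize(samples, labels)
print(len(balanced_y), sum(balanced_y))
```

All water pixels are kept and only the non-water class is subsampled, so no information on the minority class is lost.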
Several tests were necessary to determine the optimum inputs to the NN, in addition to the obvious ones, i.e., the backscatter coefficients for both polarizations. To limit ambiguities between flat arid surfaces and surface water, and to better capture small rivers, the spatial homogeneity of the backscatter coefficients appeared to be a relevant parameter. The standard deviation of the backscatter coefficients is computed locally over 100 m × 100 m boxes. As a result, the NN uses a maximum of five different inputs x:
SAR backscatter coefficient VH polarization (BS_VH);
SAR backscatter coefficient VV polarization (BS_VV);
SAR incidence angle;
SAR standard deviation of backscatter coefficient VH over 100 m × 100 m (STD_VH);
SAR standard deviation of backscatter coefficient VV over 100 m × 100 m (STD_VV).
Figure 6 presents an example of the set of five input images and the target surface water map used to train the NN. Missing areas in the maps correspond to Landsat-8 low quality pixels and are excluded from the training. The NN model is asked to find a relationship between these five input parameters and the corresponding water and non-water state.
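The two texture inputs (STD_VH and STD_VV) can be sketched as a local standard deviation filter. The version below assumes a moving square window centred on each pixel and clipped at the image borders. Whether the original boxes are moving windows or fixed blocks is not stated, and the window size in pixels depends on the pixel spacing, so both are left as assumptions. The image values are hypothetical.

```python
import statistics


def local_std(image, window=3):
    """Population standard deviation in a square window centred on each pixel.

    The window is clipped at the image borders. The window size in pixels that
    corresponds to 100 m depends on the pixel spacing (not fixed here).
    """
    half = window // 2
    n_rows, n_cols = len(image), len(image[0])
    result = []
    for r in range(n_rows):
        row_out = []
        for c in range(n_cols):
            values = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                for cc in range(max(0, c - half), min(n_cols, c + half + 1))
            ]
            row_out.append(statistics.pstdev(values))
        result.append(row_out)
    return result


# Hypothetical VH backscatter (dB): smooth water on the left, land on the right.
bs_vh = [
    [-24.0, -24.0, -24.0, -14.0],
    [-24.0, -24.0, -24.0, -14.0],
    [-24.0, -24.0, -24.0, -14.0],
]
std_vh = local_std(bs_vh, window=3)
print(std_vh[1][0], round(std_vh[1][2], 3))
```

The texture is zero inside the homogeneous water area and rises at the water/land boundary, which is the kind of spatial information the backscatter value alone does not carry. Plain lists keep the sketch dependency-free; an array library would be the practical choice for full scenes.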
3.3. NN Sensitivity Tests
In this section, we use a test dataset of three SAR Sentinel-1 images and three corresponding Landsat-8 reference surface water maps to make several sensitivity tests in order to optimize the performance of the NN classification (see details of the test data sets in
Table 1). Three different sensitivity tests were carried out: (1) selecting the best threshold of the NN output to classify land/water surface; (2) understanding the effect of the equalization of the water and non-water pixels in the NN training dataset; (3) finding the most important satellite NN inputs. The NN performances have been evaluated based on the spatial correlation between the SAR and Landsat-8 surface water maps, the overall accuracy of the NN, and the true positive (TP) and true negative (TN) percentages. The true positive percentage indicates the NN ability to correctly detect water pixels, while the true negative percentage illustrates its ability to correctly detect non-water pixels (compared to the Landsat-8 surface water maps).
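The evaluation scores listed above can be computed from a confusion matrix, as in the sketch below. It assumes that the true positive (true negative) percentage is expressed relative to the number of reference water (non-water) pixels, which is consistent with the way true positive and false negative percentages add up to 100% in this section. The masks are hypothetical.

```python
def confusion_metrics(predicted, reference):
    """Return TP, TN, and overall accuracy percentages for two binary masks.

    Assumes the reference contains both water (1) and non-water (0) pixels.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)
    return {
        "true_positive_pct": 100.0 * tp / (tp + fn),
        "true_negative_pct": 100.0 * tn / (tn + fp),
        "overall_accuracy_pct": 100.0 * (tp + tn) / len(pairs),
    }


# Hypothetical reference (Landsat-8-like) and predicted (SAR-like) masks.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = confusion_metrics(predicted, reference)
print(metrics)
```

Because non-water pixels usually dominate a scene, the overall accuracy alone can look high even when many water pixels are missed, which is why the true positive percentage is reported separately.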
3.3.1. Selection of an Optimized Threshold for the NN Output
The first test is conducted to optimize the output threshold to distinguish water from non-water pixels.
Figure 7 shows the histogram of the output of the NN, separating the water and non-water pixels according to the related Landsat-8 surface water map. The histograms of the water and non-water clusters intersect around 0.9, meaning that the optimal threshold to separate water from non-water pixels is close to this number. Different thresholds on the NN output values were tested (0.80, 0.85, and 0.90): for each one, the confusion matrix and the overall accuracy are calculated, with the corresponding Landsat-8 images as references. The overall accuracy and the spatial correlation increase from 98% to 99% when the threshold increases from 0.80 to 0.90 (
Table 3), but the true positive pixel detection decreases from 92% (with threshold 0.80) to 89% (with threshold 0.90) and the false negative pixel detection increases from 8% to 11%. A threshold of 0.85 is selected here because of its good water detection performance and because it results in the predicted water surface closest to the reference map: 4430 km² from the Landsat-8 versus 4450 km² from the SAR results, i.e., a limited overestimation of 0.4% as compared to the reference map.
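The threshold selection can be sketched as a sweep over candidate thresholds, recording for each one the true positive percentage and the relative difference between predicted and reference water areas. The NN outputs, the reference mask, and the 30 m × 30 m pixel area used below are hypothetical. The selection rule shown (smallest absolute area difference) reflects only one of the two criteria mentioned above.

```python
PIXEL_AREA_KM2 = 0.0009  # assumes 30 m x 30 m pixels


def water_area_km2(mask, pixel_area_km2):
    """Water surface covered by a binary mask."""
    return sum(mask) * pixel_area_km2


def sweep_thresholds(nn_output, reference, thresholds, pixel_area_km2):
    """Score each threshold against the reference mask."""
    reference_area = water_area_km2(reference, pixel_area_km2)
    rows = []
    for t in thresholds:
        mask = [1 if p >= t else 0 for p in nn_output]
        hits = sum(1 for m, r in zip(mask, reference) if m == 1 and r == 1)
        area = water_area_km2(mask, pixel_area_km2)
        rows.append({
            "threshold": t,
            "true_positive_pct": 100.0 * hits / sum(reference),
            "area_bias_pct": 100.0 * (area - reference_area) / reference_area,
        })
    return rows


# Hypothetical NN outputs and reference mask for eight pixels.
nn_output = [0.97, 0.93, 0.88, 0.83, 0.86, 0.40, 0.10, 0.05]
reference = [1, 1, 1, 1, 0, 0, 0, 0]

rows = sweep_thresholds(nn_output, reference, [0.80, 0.85, 0.90], PIXEL_AREA_KM2)
best = min(rows, key=lambda row: abs(row["area_bias_pct"]))
print(best["threshold"])
```

A positive area bias means that the predicted map contains more water than the reference (overestimation), a negative one means less.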
3.3.2. Equalization of Water and Non-Water Pixel Number
For this test, instead of using an equal number of water and non-water pixels in the training dataset, 10% of each Sentinel-1 image is selected randomly to train the neural network, meaning that the number of non-water pixels is several times higher (10–15 times depending on each image in the training dataset) than the number of water pixels (as seen in
Figure 7). The intersection between histograms of the NN outputs for water pixels (blue) and non-water pixels (red) moves to 0.5 (see the histogram in
Figure 8), meaning that the value 0.5 should be selected to separate water from non-water clusters. As shown in
Table 4, the resulting NN is very efficient at detecting non-water pixels, with a true negative detection of 99.7%, but it misses 14% of the actual water pixels (a true positive detection of only 86%, compared to 91% with the equalized training dataset; see
Table 3). The true positive detection of water pixels decreases because, in the training database, the non-water pixels are more numerous and as such have more weight in the retrieval than the water pixels. As a consequence, the NN is more effective at detecting non-water pixels, and less effective at detecting water pixels. It is concluded that the use of an equalized training dataset is very important in this classification framework.
3.3.3. Analyzing the Weight of Each NN Satellite Input
To identify the most relevant inputs for the NN classification of the water surface, 31 NNs are trained based on all 31 different combinations of five input parameters, and their performances are evaluated following various criteria.
Table 5 presents the best results with one to five inputs and illustrates how the overall accuracy of the NN classification increases when the number of satellite inputs increases, as compared to the reference Landsat-8 dataset. The NN trained with only the VH backscatter coefficient has a spatial correlation of 78% and a true positive accuracy (correctly detecting water pixels) of 77% compared to the reference data. The spatial correlation increases to 79%, and the true positive accuracy rises to 85% when the standard deviation of the VV backscatter coefficient is added as an input to the NN. The VV backscatter coefficient helps to increase the performance of the NN since both spatial correlation and true positive accuracy increase to 87% and 90%, respectively. The standard deviation of the VH backscatter coefficient does not significantly improve the accuracy of the NN classification. This is due to the strong linear correlation (88%) between the spatial standard deviations of the VH and the VV backscatter coefficients (the other linear correlations among the five input parameters of the NN are provided in
Table 6). Similar to the standard deviation of the VH backscatter coefficient, the incidence angle does not have a strong impact on the performance of the NN since its accuracy remains nearly the same after adding the incidence angle as a new input. The input parameters of the NN classification are listed below, from the most important to the least important one in the NN processing:
Backscatter coefficient VH polarization (BS_VH)
Standard deviation of backscatter coefficient VV polarization (STD_VV)
Backscatter coefficient VV polarization (BS_VV)
Incidence angle
Standard deviation of backscatter coefficient VH polarization (STD_VH)
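The redundancy between inputs discussed above can be checked with pairwise linear (Pearson) correlations, as in the following sketch. The input values and the 0.8 limit used to flag a pair as redundant are hypothetical. Only the idea of comparing inputs through their linear correlation follows the text.

```python
import math


def pearson(a, b):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def redundant_pairs(inputs, limit=0.8):
    """List input pairs whose absolute correlation reaches the limit."""
    names = sorted(inputs)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(inputs[first], inputs[second])
            if abs(r) >= limit:
                pairs.append((first, second, round(r, 2)))
    return pairs


# Hypothetical per-pixel values for three of the five NN inputs.
inputs = {
    "STD_VH": [0.5, 1.0, 2.0, 3.5, 4.0],
    "STD_VV": [0.6, 1.1, 2.4, 3.0, 4.2],
    "ANGLE": [38.0, 31.0, 44.0, 30.0, 36.0],
}

pairs = redundant_pairs(inputs)
print(pairs)
```

An input that is strongly correlated with one already in use brings little new information, which is one way to read the small gain obtained when STD_VH is added after STD_VV.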
To conclude, the water detection ability of the proposed NN increases when the input parameters are carefully selected and when the output threshold is optimized. An equal number of water and non-water pixels should be used in the training dataset to ensure that the NN performs equally well in classifying water and non-water clusters. The STD_VH provides limited additional information to the NN due to its strong linear correlations with the other NN inputs. The incidence angle also plays a limited role in the NN performance. This is partly due to the rather narrow range of incidence angles, from 29° to 46°.
5. Conclusions and Perspectives
This study presents a methodology to monitor and quantify surface water under all weather conditions within Cambodia and the Mekong Delta in Vietnam, using high quality Sentinel-1 SAR observations, freely available online. The methodology is based on a neural network classification trained with optical Landsat-8 images at 30 m spatial resolution. The information content of each satellite input is analyzed and the inputs are selected to optimize the performance of the classification. This method allows the detection of surface water with good accuracy when compared to visible and NIR data under clear sky conditions, as well as when compared to a floodability map derived from topography data. Surface water maps derived from the proposed NN show a spatial correlation of ∼90% when compared to Landsat-8 water maps, with a true positive water detection of ∼90%. Compared to MODIS/Terra water maps over the Delta in 2015, our products share the same wetland seasonal cycle and dynamics, with a temporal correlation of ∼99%.
In the future, we will first apply the method to other areas under similar environments in Southeast Asia and in other parts of the globe, and second we will test it in more vegetated environments. The final goal is to develop a general method capable of performing at the global scale to exploit the full spatial coverage of the Sentinel-1 mission. For this purpose, several approaches will be tested to improve the retrieval scheme. First, the introduction of a priori information from a topography-based floodability index will increase information on flooding and reduce ambiguities in the SAR signal with other surface parameters. Second, with the launch of the optical Sentinel-2 satellite, Sentinel-2 observations could be used to replace Landsat-8 data, and to train the SAR surface water classification under clear sky conditions. The classification could then be extended to the cloudy areas using the SAR data. Third, the temporal information in the SAR backscatter could also be exploited (i.e., minimum or standard deviation of the time series) as this information has been shown to improve the detection of floods [
13]. Finally, the high-resolution inundation extent retrieval maps could be post-processed in order to reduce the inherent noise in such high-spatial retrievals. We plan to test random walk techniques for that purpose.