1. Introduction
Flood extent maps based on synthetic aperture radar (SAR) have increasingly been used in recent emergency response operations. For example, in the Sentinel Asia consortium (
https://sentinel.tksc.jaxa.jp), where national space agencies, research institutions, and end-user organizations work together on emergency observation requests in the Asia-Pacific region, the responses to 20 of 23 activated flood-related events in the year of 2017 used ALOS-2 SAR flood-mapping results [
1]. These statistics demonstrate the need for radar’s all-weather, day-and-night sensing capability, where in most cases cloud cover and rains persist for the duration of a flood.
Flood extent maps are primarily extracted from SAR intensity (the squared amplitude of the SAR return) information. The backscattering coefficient (σ
0, log intensity in dB) decreases due to the specular surface of open flood waters if radar energy is mostly forward-scattered. In other cases, σ
0 may increase if the radar wave bounces first off the water surface (away from the satellite) and then off a semivertical structure, such as a building wall, tree trunk, or even a car in the flood (towards the satellite); this is the “double bounce” effect. Compared with rural floods, urban flood mapping suffers more from layover and shadow effects due to SAR’s side-looking nature. The layover zone is in front of buildings, toward the satellite direction, and the length is usually longer than the building height at incidence angles lower than 45° (layover length = building height × cot(incidence angle) [
2]). Within this zone there is a good chance of seeing stronger backscattering due to the double bounce effect, and its strength is a function of the oblique angle between the flight direction and the building orientation. The double-bounce intensity increase can be larger than 10 dB at 0° and remains high up to 5°; at an angle larger than 10° the increase will drop to a constant level [
3,
4]. The shadow zone, on the other hand, has a smaller length (shadow length = building height × tan(incidence angle) [
2]), and the backscattering stays low at all times. The low intensity within the shadow may lead to false flood detection if one uses a during-flood SAR image alone.
Several studies have tried to improve the accuracy of urban flood mapping by addressing layover and shadow effects. For example, Mason et al. and Giustarini et al. [
2,
5,
6], for a case study using TerraSAR-X data, masked out the layover and shadow zones by using a SAR simulator. The accuracy of open water flood mapping increases after applying the mask, but flooded pixels that show increased brightness due to double-bounce scattering are left out from the flood map. In another example, Pulvirenti et al. [
7] developed an algorithm that adopted the double-bounce intensity values from electromagnetic modeling as initial fuzzy thresholds, and they used fuzzy logic to map out the urban flood with intensity increase in COSMO-SkyMed images. Mason et al. [
8] also demonstrated the use of an electromagnetic scattering model and high-resolution elevation data to simulate the double-bounce effects. The simulated double-bounce scattering strength agrees with the observed data; hence, it can provide information about the statistical distribution of pixels showing double bounce for better thresholding decisions. Both model-based approaches have the advantage of being independent from user biases when handling double-bounce effects, although the auxiliary data of a high-resolution building model (or digital surface model) is not always available. Tanguy et al. [
9] proposed to use ancillary hydraulic data to refine the flood detected with RADARSAT-2 so as to overcome the SAR limitations associated with viewing geometry. The availability of such high-resolution hydraulic data is not, however, guaranteed. Recently, some studies have found that using interferometric coherence in conjunction with intensity will improve the detection accuracy particularly associated with double bounce [
4,
10,
11,
12]. The efficacy of augmenting moderate resolution interferometric coherence (from Sentinel-1 SAR, for example) in urban flood detection still awaits systematic, quantitative validation [
10].
In parallel to the developments in urban flood mapping, multitemporal SAR data analyses are also gradually being adopted by scientists to study flood extents. Hostache et al. [
13] suggested to select the most appropriate pre-flood image for change detection from a time series perspective. Schlaffer et al. [
14,
15] carried out harmonic analysis on seven years of ENVISAT ASAR data and identified floods by looking at anomalies in the time series. D’Addabbo et al. [
16] utilized Bayesian networks to jointly consider SAR intensity and InSAR coherence time series of COSMO-SkyMed data, as well as ancillary information including several hydraulic parameters, to study flood extent variation over time. Clement et al. [
17] utilized Sentinel-1 SAR intensity difference (change detection) time series to identify the evolution of a flood with time. Ouled Sghaier et al. [
18] utilized texture analysis on multitemporal RADARSAT-2 and Sentinel-1 intensity data to study flood history. Among these studies, some of them focus on flood evolution and treat each epoch independently, while others emphasize the use of pixel history to improve the accuracy of flood detection on particular events.
The approach that we propose in this study focuses on both the urban and time series dimensions. We would like to obtain the statistics of each pixel from the time domain and use them in a Bayesian flood probability function. The flood probability is calculated based on the historical pixel intensity values. The test data are Sentinel-1 SAR, whose open access and short repeat time make them a priority choice in many emergency response cases. We test our approach on the Hurricane Matthew flood in early October 2016 at the town of Lumberton, in North Carolina, USA. The Lumberton flood offers a great opportunity in which nearly concurrent airborne optical imagery and spaceborne SAR observations coexist for the flooding epoch. It is also an ideal target to examine the efficacy of urban flood mapping algorithms, as more than half of the flooded areas are in urban settings.
3. Data Processing and Analysis
3.1. Validation Dataset
Radar intensity data contain more complicated responses than a simple binary dissection between wet and dry areas. We therefore decided to create a validation dataset that also honored possible SAR σ
0 responses based on different land cover types. We manually classified the during-flood aerial image into the following six classes (
Table 1 and
Figure 2):
Flood: This class was the open standing water surface as directly seen in the aerial image. It was considered as an area of flooding associated with Hurricane Matthew. Intensity may decrease or increase in the during-event epoch depending on the incidence angle of the SAR image and the proximity of the pixel to any nearby vertical structures, such as trees or buildings.
Permanent Water: This class was identified based on the water bodies in the pre-flood SPOT-6 image. The SAR σ0 tended to stay low both in the non- and during-event images given the constant specular reflection surface at all times.
Flooded Vegetation: This class represented small to medium-size patches of trees surrounded by flood water from Hurricane Matthew. They mainly appeared in the urban area of Lumberton. Whether the radar wave penetrated the tree crowns was subject to factors like radar wavelength, polarization, and the density and structure of vegetation. If it did, double bounce and enhanced backward scattering may occur, resulting in an increase in the during-event σ0. If the tree crown was too dense to penetrate, SAR σ0 stayed relatively stable with potential seasonal variations.
Dry Vegetation: This class stood for small to medium-size patches of trees standing on dry land. The SAR σ0 should also stay at a relatively constant level with potential seasonal variations.
Dry: The dry class represented buildings, roads, dry bare land or lawns. Any pixels that appeared to be dry without ambiguity of underlying flood water were included in this class. The pixel values may vary with time depending on the detailed land cover type (paved road, lawn, bare soil, etc.). We did not expect to see significant intensity anomalies in buildings and roads on the during-event epoch, although intensity changes associated with soil moisture may be observed on the bare land or lawns.
Uncertain: A large area of the Lumberton region is covered by dense forest where the Lumber River flows through. Careful investigation of the spatial context showed that many of these large forest patches were surrounded by flood water on all sides. What remains unclear is whether if there was also flood water under the tree canopy. To honor this unidentifiable condition based on the high-resolution optical imagery, we classified these large forest patches as uncertain. The SAR σ0 may stay at a relatively high level due to the dense tree tops, but the actual level and the during-event response may vary within the same forest patch. We did not include pixels in this class in the final confusion matrix calculation.
After manual classification was done, we applied a nearest-neighbor sampling to convert the validation vector data of 30 cm resolution (
Figure 2b) into a raster image of 15 m resolution (
Figure 2c). We took cautious steps to ensure that each pixel in the converted raster image registered to the pixels in the SAR image. Note that there was potential change or loss of information during the down-sampling. We also tested converting the vector file into a 30 m raster image, and the result showed significant loss of spatial context, especially in urban regions. For example, buildings were concatenated together or coalesced with trees; and smaller buildings could fully disappear. Our conclusion is that 15 m ground resolution is minimally needed to undertake proper flood mapping in urban regions.
3.2. Sentinel-1 SAR Data Processing
We developed our amplitude stack processing pipeline using the NASA Jet Propulsion Laboratory’s InSAR Scientific Computing Environment (ISCE) version 2 (ISCE2 is now open-sourced at GitHub:
https://github.com/isce-framework/isce2). In this pipeline (
Figure 3), we started from SLC files in VV polarization mode, and incorporated Sentinel-1 radiometric calibration and thermal noise calibration [
22] on a burst-by-burst basis. Then we co-registered all slave images to a single master image before merging the bursts. In this study, to ensure proper comparison with the validation dataset, we produced a stack of intensity images in georeferenced coordinates. For an actual operational system, we would switch to stack processing in radar coordinates.
After producing the geocoded amplitude stack of 15 × 15 m posting, we applied a 5 × 5 Lee filter [
23,
24] to reduce speckle noise. We also tested 3 × 3 and 7 × 7 Lee filters, and comparison with the validation dataset shows that the 5 × 5 window gave the optimal result at a ground pixel spacing of 15 × 15 m. Since the incidence angle variation was small across the study area, from 35.7° at near range to 36.4°at far range, and the Lumberton region is relatively flat, we skipped the local incidence angle correction proposed by [
25]. Finally, we computed the backscattering coefficients (σ
0) in decibels (dB) by the following definition:
where
A stands for the amplitude of the SAR image in the complex domain after radiometric calibration and thermal noise calibration [
22]. We then looked into these σ
0 values on a pixel-by-pixel basis in the following multitemporal analysis.
3.3. SAR Intensity Time Series
We first tried to observe if there was any characteristic temporal pattern associated with each of the six classes defined in
Section 3.1. For the Flood class (
Figure 4a), we picked a pixel sitting in the middle of open standing water (sample pixel 1 in
Figure 2), and we observed a significant intensity decrease (~15 dB) for the during-event epoch, while the background value remained between −5 and −12 dB. It is worth pointing out that the histogram for this pixel (
Figure 4b) looked similar to the Dry class (
Figure 4f), reflecting the nature that this pixel was dry under ordinary conditions. The flood epoch was, therefore, an outlier (vertical line in
Figure 4b) from the dry epochs, and this formed the core of our flood detection approach (
Section 4).
We picked the sample pixel for the Permanent Water class in the middle of a pond. Its σ
0 slightly varied with time but mostly stayed low, around −20 dB (
Figure 4a; sample pixel 2 in
Figure 2). In some epochs the value may go 5 dB higher, close to some of the nonflooded epochs in the flood class. This natural variation may be associated with the changes of floating aquatic plants in the pond, which can be observed in the WorldView/QuickBird images in Google Earth.
For the Flooded Vegetation and Dry Vegetation classes, we selected one sample each from the flooded neighborhood in the south (sample pixel 3) and from the dry neighborhood in the north (sample pixel 4). They both showed σ
0 values between −5 and −8 dB in general, with the Dry Vegetation class higher on average than the other (
Figure 4b). This difference may simply represent different vegetation density or structures between two sample pixels. The more important difference was the intensity increase (~3 dB) in the time series of the Flooded Vegetation class on the during-event epoch, revealing possible double-bounce backscattering between the tree and flood water surfaces.
The σ
0 values for Dry class (sample pixel 5) stayed at a constant low level around −15 dB, even more stable than the Permanent Water sample pixel (
Figure 4c). This pattern reflects the characteristics of paved road (
Figure 2a,b), with the asphalt layer showing low backscattering intensity. We also examined another pixel in the Dry class (sample pixel 5-1), and the values fluctuated within a wider range, possibly due to the changes between land cover type (bare ground vs. lawn) and/or changes in soil moisture. Regardless of different temporal patterns in these two sample pixels, neither of them showed any intensity anomalies for the during-event epoch.
Figure 4.
(
a–
c) σ
0 time series of selected pixels for the 6 classes in the validation dataset. The location of each sample pixel is marked in
Figure 2, with the same ID number as shown in the legend (e.g., “1-Flood” in the legend of
Figure 4a is from the sample pixel 1 in
Figure 2). The grey bars in the background are 3-day cumulative precipitation from Global Precipitation Measurement daily solutions [
26]. The during-event epoch is marked by the precipitation record of >250 mm. (
d–
f) Histogram of the time series. The vertical lines stand for the σ
0 values on the during-event epoch. (
g–
i) Histogram of normalized time series. Vertical lines stand for the during-event σ
0 values after normalization.
Figure 4.
(
a–
c) σ
0 time series of selected pixels for the 6 classes in the validation dataset. The location of each sample pixel is marked in
Figure 2, with the same ID number as shown in the legend (e.g., “1-Flood” in the legend of
Figure 4a is from the sample pixel 1 in
Figure 2). The grey bars in the background are 3-day cumulative precipitation from Global Precipitation Measurement daily solutions [
26]. The during-event epoch is marked by the precipitation record of >250 mm. (
d–
f) Histogram of the time series. The vertical lines stand for the σ
0 values on the during-event epoch. (
g–
i) Histogram of normalized time series. Vertical lines stand for the during-event σ
0 values after normalization.
Compared to the Dry class, the σ0 values for the Uncertain class seemed to show low-frequency seasonal variations between the years of 2017 and 2018. As the pixel was located in the middle of a dense forest (sample pixel 6), the undulations in the time series may reflect seasonal changes in the forest.
We computed the histograms of σ
0 for each of the time series. In
Figure 4d, we can see that the histogram for the Flood and Permanent Water classes looked very different, but their during-event σ
0 values were almost identical. For the Flooded Vegetation and Dry Vegetation classes (
Figure 4e), however, the histograms were more similar, with also small differences in the σ
0 values on the during-event epoch.
Figure 4d,e together demonstrated that, in general, it was easier to map out the open-water flood, whereas the double-bounce effect associated with flooded vegetation cannot be easily identified. Given the high-frequency variation of σ
0 values in almost every class, plus a much smaller during-event σ
0 change in the Flooded Vegetation class than that in the Flood class, identifying the double bounce effect in the SAR image will never be an easy task.
5. Results
Figure 6a shows the mapping result at the cutoff probability of
(black pixels) and
(grey pixels). The largest open-water flood body (with a w-shape) near the central part of the AOI was well depicted. In the urban area (
Figure 6b), we saw that a large proportion of the urban floods were mapped by the criterion of
, indicating that these pixels experienced an intensity increase during the event epoch as compared with other epochs in their own history, and the increase was significant enough such that the probability of being flooded was over 50%. However, we could still observe that a visible portion of the urban flood could not be mapped by this p50-ts method (
Figure 6b). The unmapped pixels represented flooded areas without significant changes of during-event backscattering in the time series. The mosaicking pattern of intensity decrease, intensity increase, and even unchanged intensity signified the complexity of radar backscattering patterns of floods in urban areas.
Next, we looked at the probability maps (
Figure 6c–f). The transition zone of intermediate probability (yellow color) was narrow, with the majority of the mapped flood at the high probability end (
p > 0.9; red color). The effect of underprediction was also demonstrated by having low probability values in the pixels within the Flood and Flooded Vegetation classes (
Figure 6b vs.
Figure 6d,f). Their spatial distributions were more clearly shown by the contingency map (
Figure 7), where the mapped urban flood was mainly surrounded by FN pixels. Overprediction (FP pixels), on the other hand, was not as common within the urban area.
On the reliability diagram (
Figure 8), we saw that the plots were mostly below the 1:1 line, with the flood probability larger than the observed frequency, except for the first and second bin (
). In the first bin,
was mainly determined by FN/(FN+TN), so the ~5% higher in
than
came from the larger number of FN pixels than what
predicted. In the last bin,
was mainly determined by TP/(TP+FP), so the ~14% lower value in
than
came from the larger number of FP pixels than that predicted by
(overprediction). It would be wrong to interpret the numbers as indicating that the mapping results suffered more overprediction than underprediction because the sample numbers varied greatly in each bin. The best way to view reliability is by looking at the weighted root-mean-square error between the plot and the 1:1 line (
), which was around 13%. This can be interpreted as the average error in the probability map.
The evaluation metrics echoed what we found in the maps and plots (
Table 2). The comparison with the best grid search result of the same normalized dataset showed that the p50-ts method could reach identical performance as the best choice of thresholds. When comparing with the best grid search result of log intensity ratio, we could see that the p50-ts method had higher overall CSI (34% vs. 24%) and OA (81% vs. 64%). There was a slight decrease in PA (~6%), but UA went from 30% for the log intensity ratio method to 57% for our technique, suggesting significant improvement in reducing the number of overpredicted (FP) pixels. In the urban area, the metrics from our method and the log intensity ratio method were similar, but the resulting flood maps showed different flood patterns (
Figure 7). The difference was due to the fact that although global thresholds were used in both approaches, the time series normalization accounted for a certain level of spatial difference by considering the different history for each pixel; hence, the corresponding σ
0 thresholds spatially varied.
In summary, the p50-ts method improved flood mapping in the case of Hurricane Matthew flood within the Lumberton area. The result is close to the best of what can be done with the optimal uniform thresholding on a pair of SAR images. However, the result still suffers obvious underprediction and overprediction. Next, we discuss the potential sources for the false predictions.
6. Discussion
The time series normalization allows us to identify the flood-related double bounce pixels and specular reflection pixels of statistical significance. With that we can study the statistical distribution for these two effects. We sorted out the pixels that were in the Flood class and mapped as flooded due to intensity increase, and we interpreted these as due to double-bounce effect. We compared their histogram with histograms for (1) pixels in the Flood class mapped as flooded due to intensity decrease, interpreted as open water flood that saw specular reflection, and (2) pixels in the Dry class mapped as nonflooded. The plots in
Figure 9 indicate that, from the histogram perspective, the normalized during-event intensity (
Figure 9c) could better separate the pixels that saw double-bounce scattering than the log intensity ratio (
Figure 9b). For the open water flood, the normalized during-event intensity and the log intensity ratio had similar efficacies in separating them from the nonflooded. The single during-event intensity (
Figure 9a) had the worst histogram separation among all three. The better separation capability for double-bounce scattering marks the value of this method in studying urban floods.
We also looked at the intensity changes with respect to the p50 threshold in the six validation classes individually (
Figure 10). In the Flood class, only 50% of the pixels were detected as flood, among which 32% were identified with σ
0 decrease and 18% with σ
0 increase. In the remaining 50% there was no clear σ
0 anomaly based on the thresholds given. In the Flooded Vegetation class, we saw more pixels with σ
0 increase (18%) than those with σ
0 decrease (12%). However, about 70% of the pixels in the Flooded Vegetation class did not see significant σ
0 changes. In the Permanent Water class, 22% of the pixels were identified as flood by σ
0 decrease. As for the Dry and Dry Vegetation classes, we saw a small fraction of false positives (6%–10%). Next, we would like to address the potential sources that cause the underprediction in the Flood and Flooded Vegetation class as well as the overprediction in the Permanent Water and Dry class.
6.1. Uncertainties in the Validation Dataset
One possible source of error is uncertainties in the validation dataset. We generated the validation data by rasterizing the validation vectors of ~30 cm resolution into 15 × 15 m pixels. At a regional scale, the rasterized image agreed in general with the vector file (
Figure 2). However, when we zoomed in to the local scale, we observed discrepancies between the two (
Figure 10a,b and
Figure 11a,b). The discrepancies were associated with the sizes and orientations of the objects as well as their relative portion within a pixel. The difference was particularly obvious in a setting where the objects were densely distributed but isolated from one another. Thus, although the number of pixels falsely validated may only account for a small portion in the AOI, we should always be aware of the existence of such uncertainties, especially when utilizing moderate-resolution SAR images in flood mapping.
6.2. Source of Underprediction
Underprediction (FN) accounted for the majority of false predictions. We first used patch 1 (
Figure 6a) to discuss the possible sources of underprediction. In
Figure 11d, there was a clear pattern of underprediction, with four elongated zones of FN subparallel to each other. According to the pre-flood optical image, these zones were asphalt roads (
Figure 11c). Several studies have pointed out that asphalt will show up as dark pixels in radar images due to its smooth surface and low subsurface soil moisture content [
32,
33,
34]. What we saw in this study was that the backscattering from asphalt surfaces could be as low as that from a water surface and, hence, indistinguishable from the during-event flood in the time series. This agrees with the point made by [
6], that smooth surfaces such as tarmac, paved road, and parking lots may serve as water surface-like radar response areas. In the cases where they are actually flooded, there may not be any intensity anomaly compared to dry conditions. This is one of the major sources of underprediction.
In patch 2, we saw another source of underprediction. This patch was inside a region with trees, houses, and parallel small roads. These densely packed houses and trees may cause serious shadow effects, leading to undetectable zones of constant low intensity (
Figure 12c,d). Therefore, shadow was an effect that the time series normalization would not be able to deal with.
One thing worth pointing out is that shadow and smooth surfaces usually cause overprediction in the flood mapping with a single intensity image, while it causes underprediction in the log intensity ratio method as well as our time series normalization method.
There is one more phenomenon that drew our attention: only a small portion of the Flooded Vegetation was mapped as flooded (
Figure 11e and
Figure 12e). Some of the tree patches showed significant during-event intensity changes while others did not (
Figure 11e). The logic behind the Flooded Vegetation class is that since the small tree patches were surrounded by open water flood, backscattering could be stronger when the radar wave penetrated the tree crown and reflected at the tree trunks and at the water surface, or vice versa. This expected behavior was, however, only seen in 18% of the Flooded Vegetation class, with another 12% seeing intensity decrease that could not be explained by the double-bounce mechanism. Therefore, despite the higher probability of observing intensity increase as compared with the Dry Vegetation class (
Figure 10), it is likely that the backscattering for this class was the combination of changes in trees and flood. There is no easy way to separate the contribution from these two mechanisms. We had quickly tested to re-categorize this class as nonflooded, and this gave better overall OA (+2%) and PA (+4.5%) but lower CSI (−1.4%) and UA (−8%) (
Table 2). As tree penetration capability is also a function of radar wavelength and polarization [
35], there may not be a single answer to what would be the most appropriate way to validate the flood mapping in this class.
6.3. Source of Overprediction
A large proportion of overprediction (FP) was seen in the Permanent Water class (
Figure 10). One example is in patch 3, where part of the pond water was mapped as flooded (
Figure 13a-c). When we compared the pre-event and during-event optical images, we observed that some of the ponds were possibly vegetated with aquatic plants. As aquatic plants such as macrophytes are known to cause high backscattering intensity in C-band SAR [
35], it is likely that we would see a during-event intensity decrease when the pond water level increased higher than the plants. Vegetation in the permanent waterbody is, therefore, one potential source of overprediction.
Another source of overprediction comes from the during-event intensity increase in the Dry class (
Figure 11). One example is shown in patch 4, in which the soil underneath the sparse low meadow appeared darker in the during-event aerial image (
Figure 12e) than the pre-event satellite image (
Figure 12d); However, there was no water visible on the surface. This is a known effect of soil moisture, where the rise in soil moisture will also give rise to a radar backscattering intensity increase [
36,
37]. This effect is more obvious in bare soil or sparse meadow land cover types and, hence, affects more the prediction of rural floods than urban floods.
Finally, the increase in backscattering intensity from buildings, possibly due to cumulated water on the roof, also accounted for some overprediction in the Dry class, although the proportion was relatively small in this case study.