Article

The PCA-NDWI Urban Water Extraction Model Based on Hyperspectral Remote Sensing

1
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2
University of Chinese Academy of Sciences, Beijing 100049, China
*
Author to whom correspondence should be addressed.
Water 2024, 16(7), 963; https://doi.org/10.3390/w16070963
Submission received: 20 February 2024 / Revised: 14 March 2024 / Accepted: 20 March 2024 / Published: 27 March 2024

Abstract

Accurate extraction of water bodies is the basis of remote sensing monitoring of water environments. Due to the complex types of ground objects around urban water bodies, high spectral and spatial resolution are needed to achieve accurate extraction of water bodies. Addressing the limitation that most spectral index methods used for water body extraction are more suitable for open waters such as oceans and lakes, this study proposes a PCA-NDWI accurate extraction model for urban water bodies based on hyperspectral remote sensing, which combines Principal Component Analysis (PCA) with Normalized Difference Water Index (NDWI). Furthermore, aiming at the common water shadow problem in urban hyperspectral remote sensing images, the advantages of the PCA-NDWI model were further verified by experiments. By comparing the accuracy and F1-Measure of the PCA-NDWI, NDWI, HDWI, and K-means models, the results demonstrated that the PCA-NDWI model was better than the other tested methods. The accuracy and F1-Measure of the PCA-NDWI model water extraction data were 0.953 and 0.912, respectively, and the accuracy and F1-Measure of the PCA-NDWI model water shadow extraction data were 0.858 and 0.872, respectively. Therefore, the PCA-NDWI model can effectively separate shadows and the surrounding features of urban water bodies, accurately extract water body information, and has great application potential in water resources management.

1. Introduction

Water resources are the key to human survival and development. In recent years, the indiscriminate discharge of large volumes of industrial wastewater and domestic sewage has led to the rapid deterioration of water quality in rivers and lakes, which has disrupted the ecological balance and hindered the sustainable development of society. Therefore, the prevention and control of water pollution is particularly important and urgent [1,2]. Remote sensing is the main technology applied to wide-area water body extraction and water body information analysis, and it has been widely used because of its real-time and macroscopic advantages. As the first step of water quality monitoring, accurate river area extraction plays a great role in water resource protection [3]. At present, the commonly used water extraction methods mainly include the single-band threshold method, the water spectral index method [4,5,6], the image processing method [7,8], the machine learning method [9,10,11,12,13], the deep learning method [14,15,16], etc. Among them, the water spectral index method is the most frequently used because of its relatively fast calculation and good extraction accuracy [17].
The earliest spectral index is the Normalized Difference Water Index (NDWI) proposed by McFeeters [18]. As the types of waters under study have expanded, scholars have put forward new spectral indexes for water extraction, such as the modified Normalized Difference Water Index (MNDWI), the Automated Water Extraction Index (AWEI), water index 2015 (WI2015), etc., to suit water extraction in different types of areas such as mountains, lakes, towns, and oceans [19,20,21,22,23,24,25,26,27,28]. These methods are mainly based on multispectral satellite remote sensing data, and most of them focus on open waters such as rivers, lakes, and seas. However, the spectral and spatial resolution of multispectral satellite remote sensing is not fine enough to distinguish small ground objects, so it struggles with the accurate extraction of urban water bodies, which have complex surroundings and areas that are too small for accurate detection. Compared with multispectral satellite remote sensing, uncrewed aerial vehicle (UAV) hyperspectral remote sensing has a higher spectral and spatial resolution, and because of its relatively low flying altitude, it greatly reduces the influence of atmospheric radiation on the water spectrum and is more suitable for urban rivers with complex and changeable water environments [29,30]. Because hyperspectral data contain many bands, the multispectral bands in the NDWI model cannot be mapped one-to-one onto the hyperspectral bands. At the same time, if only a few bands in hyperspectral data are considered when constructing the spectral index, a large amount of useful information in the remaining bands is ignored. Because hyperspectral data have many bands that are highly correlated with one another, data dimensionality reduction is usually needed before modeling.
Spectral dimensionality reduction methods are mainly divided into two categories: feature band selection and feature band extraction. Feature band selection refers to selecting representative feature bands from the original bands [31], while feature band extraction refers to mapping the original high-dimensional spectral space to a low-dimensional space. As a universal feature band extraction method, Principal Component Analysis (PCA) is widely used in image compression, spectral data analysis, machine learning, and so on [32]. After spectral dimensionality reduction, hyperspectral data can be extended to the construction of a spectral index. In addition, due to the extremely high spatial resolution of hyperspectral remote sensing, the water shadow in the image is clearly visible, including the shadows reflected by plants along the river bank and the shadows transmitted by weeds at the bottom of the river, etc. The spectral difference between the water body with shadows and the water body without shadows is quite large, so it is usually difficult to identify this part of the water body accurately in routine water body extraction. The threshold for extracting the water body by the spectral index is usually set to 0, but Fen Chen showed in experiments that the optimal threshold of the water body spectral index depended on the band data and the spectral characteristics of the sensor, and that the threshold needed to be adjusted in specific applications [33]. Some scholars have used manual threshold selection, and others automatic threshold selection. Although manual threshold selection is laborious, it is relatively accurate. Automatic threshold methods, such as the widely used maximum between-class variance method (OTSU), usually deviate somewhat from the optimal threshold [34,35].
Especially, compared with lakes and oceans where the surrounding ground objects are relatively simple, the background of urban areas is complex, and there are many ground objects that are easily confused with water bodies, resulting in the calculated gray histogram peaks and valleys not being obvious and bimodal characteristics not being detected. Therefore, it is difficult to accurately extract water bodies by using the OTSU method.
In view of this, based on the hyperspectral remote sensing UAV data, this study took a coastal city and a developed city in the south of China as typical study areas to explore the accurate extraction method of wide-area urban water bodies. The PCA-NDWI model used for accurate extraction of water bodies was constructed by combining NDWI with PCA dimensionality reduction, that is, using PCA dimensionality reduction principal components in green band and near-infrared band to build a band ratio, so as to increase the spectral difference between water bodies and non-water bodies, and at the same time solve the problem of water body extraction accuracy decline caused by water shadow. In the experiment, the water extraction accuracy of the NDWI model constructed by hyperspectral green light at 563 nm and near-infrared light at 865 nm, the NDWI model constructed by the mean of hyperspectral green light at 520 nm–600 nm and the mean of near-infrared light at 760 nm–950 nm, the HDWI model, the K-means model, and the PCA-NDWI model were compared, respectively. Then, the shadow recognition ability of the HDWI and PCA-NDWI models with high accuracy was discussed, which fully verified the feasibility of the PCA-NDWI model in accurate extraction of urban water bodies, and expanded the application of hyperspectral remote sensing data in the spectral index method. This paper can provide a new approach for the study of the spectral index method, and at the same time provide a strong technical guarantee for wide-area monitoring of hyperspectral remote sensing of water areas and the formulation of water environmental protection schemes in complex urban areas.

2. Materials and Methods

2.1. Acquisition and Preprocessing of UAV Hyperspectral Data

The study area of this research included some rivers in a coastal city and a developed city in southern China, as shown in Figure 1. The selected representative cities have developed water systems and high urbanization processes, and there is an urgent need for wide-area monitoring of the water environments. The water areas are located in the urban areas, and the rivers are long and narrow. The environment temperature is suitable and the nutrients are abundant, so there are weeds, islands, and other targets in the water bodies, and the environments around the shores are complex. The study area in Figure 1a includes a water body, buildings, a bridge, water surface flare, underwater interference targets, etc. In addition, the image of the study area in Figure 1b not only has the above-mentioned features, but also has a large number of water shadows, which easily interfere with the extraction of water, and then affect the accuracy of water extraction. Thus, it is appropriate to take these areas as typical study areas for water extraction, which can fully verify the universality of the water extraction method proposed in this paper.
This experiment adopted the UAV hyperspectral imaging system integrated by Hangzhou Hyperspectral Imaging Technology Co., Ltd. (Hangzhou, China), in which the core module, the hyperspectral imaging instrument, was independently developed by our project group (CIOMP). As shown in Figure 2, the hyperspectral imaging instrument covered the wavelength range of 400 nm–1000 nm with 300 spectral channels, and the spectral resolution was better than 2.5 nm. The experiment was carried out in clear and cloudless weather. The flying height of the UAV was set to 100 m, and the flights were concentrated between 10:00 and 14:00. In order to reduce the influence of the specular reflection of sunlight, the above-water observation method was used to collect the water spectrum, that is, the angle between the observation plane of the instrument and the incident plane of the sun was 135°, and the angle between the instrument and the normal of the water surface was 45°. In this way, the synchronous acquisition of hyperspectral data, high-definition visible light data, and GPS data was realized.
Because the hyperspectral images of the water body were acquired outdoors, the CMOS detector of the system was very sensitive to factors such as solar light intensity and dark current noise. In addition, the response of each detector unit to the signal is not perfectly consistent. The reflectance of the water data therefore needs to be corrected to reduce the interference of these factors, which means that, when obtaining the hyperspectral image of the water body, it is also necessary to obtain hyperspectral images of a standard reflectance plate and of a dark background. A standard reflectance plate with a surface reflectance of 10% was selected for the experiment, and the spectral reflectance of the water body was calculated by Formula (1) [36,37]:
$$R_o = \frac{DN_o - DN_b}{DN_w - DN_b} \times R_w \tag{1}$$
where $DN_o$ and $R_o$ represent the initial spectral data of the water body and its spectral reflectance data, respectively; $DN_w$ and $R_w$ represent the initial spectral data and spectral reflectance data of the standard reflectance plate, respectively; and $DN_b$ represents the initial spectral data under a dark background.
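As an illustration of Formula (1), the following minimal sketch applies the correction element-wise to a few channels. The function name, the NumPy dependency, and the digital numbers are hypothetical; only the formula and the 10% plate reflectance come from the text.

```python
import numpy as np

def calibrate_reflectance(dn_target, dn_white, dn_dark, plate_reflectance=0.10):
    """Formula (1): R_o = (DN_o - DN_b) / (DN_w - DN_b) * R_w, element-wise."""
    dn_target = np.asarray(dn_target, dtype=float)
    dn_white = np.asarray(dn_white, dtype=float)
    dn_dark = np.asarray(dn_dark, dtype=float)
    denom = dn_white - dn_dark
    # Channels where the plate and dark signals coincide carry no information;
    # mark them as NaN instead of dividing by zero (a choice made for this sketch).
    safe = np.where(denom == 0, np.nan, denom)
    return (dn_target - dn_dark) / safe * plate_reflectance

# Hypothetical digital numbers for three spectral channels.
dn_o = [300.0, 500.0, 200.0]     # water body
dn_w = [1100.0, 2100.0, 1100.0]  # standard reflectance plate
dn_b = [100.0, 100.0, 100.0]     # dark background
reflectance = calibrate_reflectance(dn_o, dn_w, dn_b)
```

The same function works on a full image cube as long as the plate and dark arrays broadcast against it.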
In this study, PCA was used to reduce the dimensionality of the hyperspectral data band to construct a spectral index model. If the dimension of the actual water spectrum is directly reduced, the principal component value after dimensionality reduction will contain random noise in the actual spectrum, which will lead to a decline in water extraction accuracy. Therefore, before dimensionality reduction, data preprocessing is needed to reduce noise interference.
The Savitzky–Golay smoothing filter was used to reduce noise [38]. In Figure 3, the red curve shows the corrected original spectral reflectance of the water body, and the blue curve shows the spectral reflectance curve of the water body after 9 points and 3 times smoothing. In the curve, the reflectance of green light in the 520–600 nm band is higher than that of near-infrared light in the 760–950 nm band. The reflectance of the green light band is at the peak position of the curve, and the reflectance of the near-infrared light band is at the peak and valley position of the curve. This difference can be used to construct an appropriate spectral index for accurate water extraction in complex urban areas.
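The smoothing step can be sketched as follows. This assumes that "9 points and 3 times" means a 9-sample window with a third-order polynomial, which the text does not state explicitly; the function names and the edge padding are also choices made for this sketch, not details from the study.

```python
import numpy as np

def savgol_coefficients(window=9, order=3):
    """Weights that give the least-squares polynomial value at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0 ... offsets**order.
    design = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse evaluates the fitted polynomial at offset 0.
    return np.linalg.pinv(design)[0]

def savgol_smooth(spectrum, window=9, order=3):
    """Smooth a 1-D spectrum; the ends are padded by repeating the edge values."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = window // 2
    coeffs = savgol_coefficients(window, order)
    padded = np.pad(spectrum, half, mode="edge")
    # Reversing the kernel turns np.convolve into a sliding weighted sum.
    return np.convolve(padded, coeffs[::-1], mode="valid")

# A cubic test signal: a third-order filter should leave its interior unchanged.
x = np.linspace(0.0, 1.0, 50)
cubic = 2.0 * x**3 - x + 0.5
smoothed = savgol_smooth(cubic)
```

Because the local fit is a cubic, slowly varying features such as the green reflectance peak are preserved while point-to-point noise is averaged out.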

2.2. Method Principle

The spectral index method is a common method for extracting water by spectral remote sensing. In this study, the NDWI method, the HDWI method for hyperspectral data, the PCA-NDWI method proposed in this paper, and the K-means unsupervised clustering method commonly used in machine learning were selected to extract water from the study area and the results were compared. These methods are introduced below.
The NDWI method is a mainstream method based on Landsat multispectral remote sensing images [18], and its specific formula is as follows:
$$NDWI = \frac{R_G - R_N}{R_G + R_N} \tag{2}$$
where $R_G$ is the green band reflectance and $R_N$ is the near-infrared band reflectance. The NDWI method increases the numerical difference between water and background objects by combining the green band and near-infrared band of multispectral data. Because hyperspectral bands are more numerous and narrower than multispectral bands, the NDWI bands were selected in two ways to resolve the band correspondence between the two types of data.
The first method selects one wavelength in the hyperspectral green band and one wavelength in the near-infrared band to construct the NDWI. Based on an earlier comparison of the water extraction accuracy of the NDWI at different wavelengths, the hyperspectral reflectance at 563 nm ($R_{563\,\mathrm{nm}}$) and at 865 nm ($R_{865\,\mathrm{nm}}$) were selected as the reflectance of the green light band and near-infrared band, respectively, in Formula (3), $NDWI_1$.
$$NDWI_1 = \frac{R_{563\,\mathrm{nm}} - R_{865\,\mathrm{nm}}}{R_{563\,\mathrm{nm}} + R_{865\,\mathrm{nm}}} \tag{3}$$
In the second method, $\bar{R}_G$ and $\bar{R}_N$ were obtained by averaging the reflectance of hyperspectral green light at 520–600 nm and near-infrared light at 760–950 nm, respectively, and were used to construct Formula (4), $NDWI_2$.
$$NDWI_2 = \frac{\bar{R}_G - \bar{R}_N}{\bar{R}_G + \bar{R}_N} \tag{4}$$
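The two band-selection strategies can be sketched on a reflectance cube as below. The cube layout (rows, columns, bands), the 2 nm wavelength grid, the function names, and the two synthetic pixels are assumptions for illustration; the nearest available channel stands in for 563 nm and 865 nm.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with 0 where the denominator vanishes."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.zeros_like(total)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

def ndwi_single_band(cube, wavelengths, green_nm=563.0, nir_nm=865.0):
    """Single-wavelength variant: use the channels nearest to the two wavelengths."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    g = np.argmin(np.abs(wavelengths - green_nm))
    n = np.argmin(np.abs(wavelengths - nir_nm))
    return normalized_difference(cube[..., g], cube[..., n])

def ndwi_band_mean(cube, wavelengths, green=(520.0, 600.0), nir=(760.0, 950.0)):
    """Band-mean variant: average reflectance over each wavelength range."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    g_mask = (wavelengths >= green[0]) & (wavelengths <= green[1])
    n_mask = (wavelengths >= nir[0]) & (wavelengths <= nir[1])
    return normalized_difference(cube[..., g_mask].mean(axis=-1),
                                 cube[..., n_mask].mean(axis=-1))

# Hypothetical 1 x 2 image: a water-like pixel and a vegetation-like pixel.
wavelengths = np.arange(400.0, 1000.0, 2.0)
cube = np.empty((1, 2, wavelengths.size))
cube[0, 0] = np.where(wavelengths < 700, 0.06, 0.02)  # green > NIR
cube[0, 1] = np.where(wavelengths < 700, 0.08, 0.40)  # NIR > green
ndwi_1 = ndwi_single_band(cube, wavelengths)
ndwi_2 = ndwi_band_mean(cube, wavelengths)
```

In this toy case both variants give a positive value for the water-like pixel and a negative value for the vegetation-like pixel; on real imagery the separating threshold still has to be chosen per dataset.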
The HDWI is a method proposed by Huan Xie for hyperspectral urban water data in 2014 [39]. This method amplifies the contrast between water and non-water areas by integrating and differencing the spectra of 650–700 nm and 700–850 nm. After practical training and testing, this paper selected the spectral reflectance integrals of 520–600 nm green light and 760–950 nm near-infrared light to improve the separability of water from other backgrounds, and the formula is as follows:
$$HDWI = \frac{\int_{520\,\mathrm{nm}}^{600\,\mathrm{nm}} R(\lambda)\,d\lambda - \int_{760\,\mathrm{nm}}^{950\,\mathrm{nm}} R(\lambda)\,d\lambda}{\int_{520\,\mathrm{nm}}^{600\,\mathrm{nm}} R(\lambda)\,d\lambda + \int_{760\,\mathrm{nm}}^{950\,\mathrm{nm}} R(\lambda)\,d\lambda} \tag{5}$$
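A minimal numerical sketch of the HDWI with the integration limits used in this paper is given below. The trapezoidal rule, the function names, and the synthetic spectrum are assumptions; the text does not say how the integrals were evaluated.

```python
import numpy as np

def band_integral(cube, wavelengths, lo, hi):
    """Trapezoidal integral of reflectance over [lo, hi] nm along the band axis."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    lam = wavelengths[mask]
    vals = np.asarray(cube, dtype=float)[..., mask]
    widths = np.diff(lam)
    return np.sum(0.5 * (vals[..., 1:] + vals[..., :-1]) * widths, axis=-1)

def hdwi(cube, wavelengths, green=(520.0, 600.0), nir=(760.0, 950.0)):
    """Difference of the two band integrals divided by their sum."""
    g = band_integral(cube, wavelengths, *green)
    n = band_integral(cube, wavelengths, *nir)
    total = g + n
    out = np.zeros_like(total)
    np.divide(g - n, total, out=out, where=total != 0)
    return out

# Hypothetical water-like spectrum: 0.06 below 700 nm, 0.02 above.
wavelengths = np.arange(400.0, 1000.0, 2.0)
water = np.where(wavelengths < 700, 0.06, 0.02)
value = hdwi(water, wavelengths)
```

Note that the near-infrared window (190 nm) is wider than the green window (80 nm), so the integrals weight the two ranges differently from the band means in $NDWI_2$.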
The K-means algorithm can realize unsupervised clustering of samples [40,41,42]. By calculating the Euclidean distance from each of the n sample points to the centers of the k clusters, each point is assigned to the nearest cluster. The Euclidean distance formula is as follows:
$$dis(X_i, C_j) = \sqrt{\sum_{t=1}^{m} (X_{it} - C_{jt})^2} \tag{6}$$
where $X_i$ represents the i-th sample (1 ≤ i ≤ n), $C_j$ represents the j-th cluster center (1 ≤ j ≤ k), m represents the dimension of the sample, and $X_{it}$ and $C_{jt}$ represent the attribute values of the t-th dimension of the sample and the cluster center, respectively (1 ≤ t ≤ m). According to the comparison of water extraction accuracy under different K values, when the K value is greater than 20, the water extraction accuracy changes little, so the K value selected in this study was 20. After the sample points were divided, the center of each cluster was recalculated, and the sample points were reassigned repeatedly until the sum of squared errors (SSE) between each point and its corresponding cluster center no longer changed. The calculation formula is as follows:
$$SSE = \sum_{j=1}^{k} \sum_{X_i \in C_j} \left| dis(X_i, C_j) \right|^2 \tag{7}$$
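The assign-and-update loop described above can be sketched as follows. The random initialization from the samples, the function name, and the toy data with k = 2 are assumptions made for illustration; the study used K = 20 on PCA-reduced spectra.

```python
import numpy as np

def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-means: assign to the nearest centre, update centres, stop when SSE is stable."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    # Assumption: initial centres are k distinct samples drawn at random.
    centers = samples[rng.choice(len(samples), size=k, replace=False)]
    prev_sse = None
    for _ in range(max_iter):
        # Euclidean distance from every sample to every cluster centre.
        diff = samples[:, None, :] - centers[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=2))
        labels = dist.argmin(axis=1)
        # Sum of squared distances to the assigned centres.
        sse = float((dist[np.arange(len(samples)), labels] ** 2).sum())
        if prev_sse is not None and np.isclose(sse, prev_sse):
            break
        prev_sse = sse
        for j in range(k):
            members = samples[labels == j]
            if len(members):  # keep the old centre if a cluster becomes empty
                centers[j] = members.mean(axis=0)
    return labels, centers, sse

# Two hypothetical, well-separated groups of 2-D samples.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers, sse = kmeans(points, k=2)
```

For water extraction, the clusters whose members are predominantly water would then be merged into the water class; how that labelling was done is not described in the text.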
To extend hyperspectral data to the construction of NDWI, it is necessary to reasonably select the band information of hyperspectral data without losing useful information. In order to improve the applicability of the spectral index in image extraction of water bodies with shadows in complex urban backgrounds, the PCA-NDWI accurate extraction model for urban water bodies was constructed by combining PCA and NDWI:
$$PCA\text{-}NDWI = \frac{P_G - P_N}{P_G + P_N} \tag{8}$$
where $P_G$ and $P_N$ are the first principal components of the hyperspectral green light reflectance at 520–600 nm and the near-infrared reflectance at 760–950 nm, respectively, after PCA calculation. PCA is a commonly used data dimensionality reduction method, which maps high-dimensional data to low-dimensional data through linear transformation, selects the key factors that affect the dependent variables in the original spectral data, reduces the data dimensionality while inheriting the characteristics of the original variables, and largely preserves the information contained in the original variables. As a result, the extracted new variables are linearly independent, which effectively solves the problem of band redundancy in hyperspectral data [43,44,45]. Because the projection variables generated by PCA are not interpretable, the method has certain limitations in tasks that require interpretable results [46]. However, it is simple to implement, reduces the calculation cost, and makes the model considerably more lightweight, which suits the application needs of today's wide-area dynamic real-time monitoring of water environments. Compared with the above-mentioned spectral indices, the PCA-NDWI accurate extraction model for urban water bodies extracts the most useful information contained in the two broad bands of hyperspectral green light and near-infrared light and greatly improves the utilization of hyperspectral information.
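One possible reading of the PCA-NDWI construction is sketched below. The text does not specify how the principal component scores are centered, scaled, or signed, so this sketch makes its own assumptions: the principal axis is fitted on mean-centered spectra via SVD, its sign is fixed so that the loadings sum to a positive value, and the uncentered reflectance is projected onto it so that the scores stay positive. All names and the synthetic cube are hypothetical.

```python
import numpy as np

def first_component_scores(block):
    """Scores on the first principal axis of a (pixels x bands) block."""
    centered = block - block.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    if axis.sum() < 0:  # the sign of a principal axis is arbitrary
        axis = -axis
    explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)
    # Assumption: project the uncentered reflectance onto the axis.
    return block @ axis, explained

def pca_ndwi(cube, wavelengths, green=(520.0, 600.0), nir=(760.0, 950.0)):
    """Normalized difference of the first-component scores of the two ranges."""
    cube = np.asarray(cube, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)
    pixels = cube.reshape(-1, cube.shape[-1])
    g_mask = (wavelengths >= green[0]) & (wavelengths <= green[1])
    n_mask = (wavelengths >= nir[0]) & (wavelengths <= nir[1])
    p_g, var_g = first_component_scores(pixels[:, g_mask])
    p_n, var_n = first_component_scores(pixels[:, n_mask])
    total = p_g + p_n
    index = np.zeros_like(total)
    np.divide(p_g - p_n, total, out=index, where=total != 0)
    return index.reshape(cube.shape[:-1]), (var_g, var_n)

# Hypothetical 2 x 2 image: top row water-like, bottom row vegetation-like.
wavelengths = np.arange(400.0, 1000.0, 2.0)
levels = [[(0.06, 0.02), (0.05, 0.015)], [(0.08, 0.40), (0.10, 0.45)]]
cube = np.empty((2, 2, wavelengths.size))
for r in range(2):
    for c in range(2):
        vis, nir_level = levels[r][c]
        cube[r, c] = np.where(wavelengths < 700, vis, nir_level)
index, explained = pca_ndwi(cube, wavelengths)
```

Under these assumptions the scores scale with the number of channels in each range, so the index is not centered on 0 in general, which is consistent with selecting the threshold from the histogram rather than fixing it at 0.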

2.3. Evaluation Method

In order to evaluate the effect of the water extraction method, the comprehensive evaluation index (F1-Measure) and accuracy were used to evaluate the performance of the above four methods, and the formulas are as follows:
$$F1 = \frac{2 \times \frac{TP}{TP + FP} \times \frac{TP}{TP + FN}}{\frac{TP}{TP + FP} + \frac{TP}{TP + FN}} \tag{9}$$
$$Accuracy = \frac{TP + TN}{TP + FN + TN + FP} \tag{10}$$
where true positive (TP) represents the number of pixels that are actually water and predicted to be water. True negative (TN) represents the number of pixels that are actually non-water and predicted to be non-water. False positive (FP) indicates the number of pixels that are actually not water but are predicted to be water, that is, the number of false detections. False negative (FN) indicates the number of pixels that are actually water but are predicted to be non-water, that is, the number of missed detections. F1-Measure is a commonly used comprehensive evaluation index, which is often used to evaluate the model. The higher the F1, the better the method. The accuracy reflects the percentage of accurate areas predicted by the model, which can basically reflect the effectiveness of the model. The higher the accuracy, the better the method.
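The two metrics can be computed from flattened label sequences as in the sketch below, where 1 denotes water and 0 denotes non-water. The function names and the eight example pixels are hypothetical.

```python
def confusion_counts(truth, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = water, 0 = non-water)."""
    tp = tn = fp = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1
        elif not t and not p:
            tn += 1
        elif p:
            fp += 1  # predicted water, actually non-water
        else:
            fn += 1  # predicted non-water, actually water
    return tp, tn, fp, fn

def f1_and_accuracy(tp, tn, fp, fn):
    """F1 is the harmonic mean of precision TP/(TP+FP) and recall TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return f1, accuracy

# Hypothetical ground truth and prediction for eight pixels.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(truth, predicted)
f1, accuracy = f1_and_accuracy(*counts)
```

Accuracy counts the background pixels as well, so it can stay high when water covers a small share of the image; F1 ignores TN and is therefore more sensitive to missed and false water detections.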

3. Experiment and Discussion

3.1. Spectral Analysis of Typical Ground Objects

Based on the RGB images of a coastal city study area in southern China, the water area was manually marked with green to evaluate the accuracy of the water extraction model. The marked RGB images are shown in Figure 4, and the study area includes water bodies, vegetation, aquatic plants, buildings, roads, architectural shadows, and other typical features.
To analyze the waters in this area, the typical study area in the red box in Figure 4 can be used as an example. The initial RGB image is shown in Figure 5a. There are many kinds of ground objects in this typical urban water image, and the water to be extracted is surrounded by environmental disturbances such as river banks, aquatic plants, trees, bridges, and shaded roads. The interior of the green dotted box is the river area, which contains aquatic plants and islands. The shaded road surface inside the red box, the parking lot road surface inside the blue box, and the water body inside the yellow box all appear dark to varying degrees. The average spectral reflectance of each of these boxes was calculated, and the resulting curves are shown in Figure 5b. It can be found that the average spectral reflectance of these easily confused objects is similar to some extent beyond 750 nm in the near-infrared band.

3.2. Accuracy Analysis of Water Extraction in the Study Area of a Coastal City in Southern China

Part of the typical study area in Figure 5a was used for training, and the above-mentioned five methods were each used for water extraction. Among them, K-means clustering and the PCA-NDWI model first reduced the dimension of the spectral features by PCA. In the experiment, the number of principal components retained for K-means clustering was 3, which contained more than 99% of the useful spectral information in the whole band. The PCA-NDWI model used the first principal component in the green band and in the near-infrared band, each of which contained more than 98% of the useful spectral information.
In addition, to construct the $NDWI_1$, $NDWI_2$, HDWI, and PCA-NDWI models, it was necessary to manually select their optimal thresholds in combination with a gray histogram. Compared with the automatic threshold selection method, the manual threshold selection method is more accurate, which makes the segmentation of the water body and background better.
Because it is difficult to completely separate the pixel clusters of some small shadows, roads, and other ground objects from the water body only by using spectral information, the water image information was combined in the modeling, and only the connected regions with more than 500 pixels were reserved as the water body part to improve the accuracy of water body extraction. Finally, the accuracy and F1-Measure of each water extraction method for training water bodies were calculated and the results are summarized in Table 1.
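The connected-region filter can be sketched with a breadth-first search over a binary water mask. The text does not state the connectivity rule, so 4-connectivity is assumed here; the function name and the tiny demonstration mask (filtered with a threshold of 3 instead of 500) are hypothetical.

```python
from collections import deque

def keep_large_regions(mask, min_pixels=500):
    """Keep only connected regions with more than min_pixels pixels (4-connectivity assumed)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    kept = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > min_pixels:
                for y, x in region:
                    kept[y][x] = True
    return kept

# Tiny hypothetical mask: one 4-pixel region and two isolated pixels.
demo_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
filtered = keep_large_regions(demo_mask, min_pixels=3)
```

With the 500-pixel threshold from the text, small dark patches such as isolated shadows are discarded while the long, narrow river regions are retained.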
From the results of the water extraction of training waters in Table 1, it can be found that the accuracy and F1-Measure of the PCA-NDWI model were the highest among the methods, at 0.984 and 0.946, respectively. The accuracy of the HDWI model, which was also designed for hyperspectral remote sensing data, was 0.960, which was similar to that of the PCA-NDWI model, indicating that the percentage of accurate segmentation and extraction of water areas by the two methods was close, but the F1-Measure of the HDWI model was lower, at 0.847. The results of the $NDWI_1$ model were close to those of the HDWI model, and its accuracy and F1-Measure were 0.959 and 0.844, respectively. Although the accuracy of the $NDWI_2$ model was 0.936, its F1-Measure was the lowest of all methods, only 0.729, and the water extraction effect of the model was the worst. The F1-Measure of K-means clustering was slightly better than that of the $NDWI_2$ model, but its accuracy of 0.801 was the lowest among the methods.
The water body extraction results of the five algorithms are shown in Figure 6. Figure 6a is the manually marked water image of this area, in which the water area is covered with green. Figure 6b–f show the extraction results of the five methods, in which correctly detected water (TP), false detections (FP), and missed detections (FN) are marked with green, red, and blue, respectively, and correctly detected background (TN) is displayed with the initial RGB values of the image background.
Comparing the results of water extraction in Figure 6, it can be found that in the five results in Figure 6b–f, there were more or less a few red and blue areas on both sides of the river, which were consistent with the river boundary in Figure 6a. This was caused by errors in manual boundary demarcation, that is, it was divided into inside or outside the boundary, which could be ignored. In addition, because there were relatively many mixed pixels at the edge, it would also have a certain impact on the extraction effect, and a small number of misclassified areas appeared at the river boundary.
The PCA-NDWI model proposed in this paper has the lowest false detection rate and missed detection rate for water bodies, as shown in Figure 6f. However, overall, the extracted island areas in the water tend to shrink, that is, some island areas are classified as water, which causes some false detection. At the same time, there was missed detection in the upper right corner of the river in all five result images, but the missed detection rate of the PCA-NDWI model was lower than that of the other methods. In the other four results (Figure 6b–e), there were a large number of missed detection areas in the water, most of which were adjacent to small islands in the water or the edge of the river. The $NDWI_2$ model had the highest missed detection rate, so it was difficult for it to accurately divide the boundary between water and small islands.
Under the condition that the model parameters were fixed, the parameter stability and precision reliability of the model in the same dynamic water area were investigated. The remaining wide-area river sections not used for training in the study area were selected for testing, and the accuracy and F1-Measure of the five models were calculated and the results are summarized in Table 2.
From the comparison results of water extraction accuracy of the test river reach in Table 2, it can be found that the accuracy of each water extraction method in the wide-area test water area was lower than that in the training water area, but in general, the K-means clustering model was the worst method among several methods. When the PCA-NDWI model was applied to a small part of the water area and extended to different dynamic river sections of the same water area, the model parameters could maintain a relatively stable state, and the accuracy of water extraction could be reliably maintained at a certain level.
The overall effect image of the PCA-NDWI model proposed in this paper for water body extraction in a coastal city study area in southern China is shown in Figure 7. Although there were a few errors in water body boundary extraction, it was relatively complete. The easily misclassified objects in the image could be separated accurately, and there were a few missed detections and false detections in the division of some islands and meadows.
The importance of accurate water extraction is not only manifested in the application of dynamic monitoring of water area but also in the monitoring of wide-area water quality. For lakes and seas, because of their wide water surface and large area, the accuracy of their water quality analysis is not determined by the accurate extraction of water boundary and aquatic plants, and the low proportion of false positives and false negatives will not affect the reliability of subsequent water environment monitoring. Therefore, traditional multispectral satellite remote sensing combined with the NDWI method can meet the application requirements. Unlike the above-mentioned open waters, urban water bodies are located in a special position, and urban water pollution is sudden and unpredictable. In particular, sewage outfalls in most cities are located at the river boundary. Monitoring the dynamic data of these outlets is an important step in maintaining river water quality [47], so it is particularly important to extract the boundary of urban water bodies. Compared with other methods, the method proposed in this paper has the lowest missed detection rate, the clearest boundary extraction, and the highest integrity of water extraction, which enhances the reliability of wide-area monitoring of urban water bodies to some extent.

3.3. Accuracy Analysis of Water Shadow Recognition

Water shadow is one of the main issues that affect the accuracy of water extraction. Because of the spectral difference between water with shadow and water without shadow, water shadow areas cannot be correctly divided into water areas. In view of this, in order to evaluate the recognition and segmentation ability of the PCA-NDWI model proposed in this paper, the HDWI and PCA-NDWI models, two methods with high accuracy, were used to extract water from the hyperspectral data of a developed city in southern China, where the water surface shadow problem was significant. The comparison results are shown in Table 3.
Table 3 shows the overall water extraction accuracy of the study area and the shadow extraction accuracy of water shadow areas. It can be found that the PCA-NDWI model was superior to the HDWI model in both the extraction accuracy of the overall water area and the extraction accuracy of water shadow areas, in which the accuracy and F1-Measure of the PCA-NDWI model for water shadow extraction were 0.858 and 0.872, respectively, which were much higher than those of the HDWI model.
The water body and shadow extraction ability of the above-mentioned two methods were discussed by using some images with water shadow. Figure 8a shows the RGB images of some study areas with significant shadow problems. Figure 8b shows a manually marked water image corresponding to this part of the area, in which the green area is a water body without shadow, and the orange area is a water body with the shadow of trees on the shore or weeds at the bottom of the river. Figure 8c,d show the water body extraction effect images of the HDWI and PCA-NDWI models, respectively. The TP area, FP false detection area, and FN missed detection area are marked with green, red, and blue, and the TN area is displayed with the initial RGB value of the image background area.
Comparing Figure 8c,d, it can be found that there was a significant problem of missed detection in the HDWI model’s water body extraction, and most areas in the water shadow range could not be correctly divided into water bodies, as shown in the blue area in Figure 8c. However, the PCA-NDWI model had a good ability to divide water shadows into water bodies, but this method also had some false detection problems, that is, a small amount of background was divided into water bodies, as shown in the red area in Figure 8d.

3.4. Limitations of Application Scenarios with Insufficient Sunlight

The experiments described above verified the advantages of the proposed water extraction model in a complex urban environment. However, we also found that poor lighting conditions during acquisition of the UAV hyperspectral images degraded the performance of the subsequent water extraction. Taking the local study area in the lower part of the third image in Figure 7 as an example, the RGB image and water extraction image of this part are shown in Figure 9.
The RGB image in Figure 9 contains two typical areas: a well-lit area and a poorly lit area. Although we acquired the UAV hyperspectral images in clear, cloudless weather and at fixed low-altitude angles to reduce the influence of illumination on the model, the images still contain some shaded areas blocked by high-rise buildings. Because of the lack of illumination, a small number of dark surface targets in the shaded areas appeared even darker. Comparing the mean and standard deviation curves of the original Digital Number in the FP area (red) and the water area (green) of the water extraction image shows that the mean values of the two areas are highly similar, which causes the model to produce false positives on these dark surfaces. The performance of the proposed model is therefore reduced under poor lighting conditions, and the pre-processing of UAV hyperspectral images affected by cloud cover and building shadows needs further improvement.
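The comparison of mean and standard deviation curves can be sketched as follows. This is a minimal illustration that assumes the hyperspectral image is stored as a (rows, cols, bands) array of Digital Numbers and that the FP and water areas are given as boolean masks. The helper names, the similarity measure, and the toy values are hypothetical and do not come from the paper.

```python
import numpy as np

def region_band_stats(cube, mask):
    """Per-band mean and standard deviation of the pixels selected by mask."""
    cube = np.asarray(cube, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    pixels = cube[mask]  # shape: (n_pixels, n_bands)
    return pixels.mean(axis=0), pixels.std(axis=0)

def mean_relative_gap(mean_a, mean_b):
    """Average per-band |a - b| relative to b; small values mean similar curves."""
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    return float(np.mean(np.abs(mean_a - mean_b) / np.maximum(np.abs(mean_b), 1e-12)))

# Toy 2 x 4 image with 3 bands (hypothetical Digital Numbers).
cube = np.zeros((2, 4, 3))
cube[0, :, :] = [10.0, 20.0, 30.0]  # row 0: water pixels
cube[1, :, :] = [11.0, 19.0, 30.0]  # row 1: shaded dark surface (FP area)

water_mask = np.zeros((2, 4), dtype=bool)
water_mask[0, :] = True
fp_mask = np.zeros((2, 4), dtype=bool)
fp_mask[1, :] = True

water_mean, water_std = region_band_stats(cube, water_mask)
fp_mean, fp_std = region_band_stats(cube, fp_mask)
gap = mean_relative_gap(fp_mean, water_mean)
print(f"mean relative gap between FP and water curves: {gap:.3f}")
```

A small gap between the two mean curves, as in this toy case, is the situation in which a spectral index has little information to separate shaded dark surfaces from water.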

4. Conclusions

Water extraction is the key to remote sensing monitoring of water environments and plays an indispensable role in water quality monitoring, water pollution early warning, and the evaluation of water environment management. In this study, an accurate water body extraction model combining PCA and the NDWI model was constructed from UAV hyperspectral remote sensing images of urban areas. The PCA-NDWI model improved the applicability of the traditional spectral index algorithm in complex urban areas and provided standardized data input for accurate inversion of key monitoring indicators of urban water bodies. In the experiments, the PCA-NDWI, NDWI, HDWI, and K-means models were compared for water extraction in complex urban areas of a coastal city in southern China. The PCA-NDWI model had the highest accuracy and F1-Measure, 0.953 and 0.912, respectively. We also examined water shadow extraction with the PCA-NDWI model in a developed city in southern China with a significant water shadow problem; its accuracy and F1-Measure there were 0.858 and 0.872, respectively, which were superior to those of the other methods. This result further supports the universality of the proposed model and addresses the inefficiency of existing spectral index methods on complex urban water bodies. The results demonstrated that the PCA-NDWI model can accurately obtain effective information from hyperspectral bands for the construction of a spectral index model. They also showed that the model can accurately distinguish water bodies from other ground objects when the surroundings are complex, the water contains many weeds, and shadows interfere significantly with the water surface.
In general, the extraction accuracy of the model was within a reasonable range. Two representative Chinese cities with high land utilization rates were selected as the research areas in this study; the applicability of the model to other cities and environments still needs to be verified. To address this, the data set will be expanded in future work to improve the extraction accuracy of the PCA-NDWI model for urban water bodies.

Author Contributions

Conceptualization, Z.Z. and J.Y.; writing (review and editing), Z.Z. and S.F.; software, M.W.; resources, C.S. and N.S.; validation, J.C. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jilin Province Special Funds for High-tech Industrialization in cooperation with the Chinese Academy of Sciences (CAS) (2022SYHZ0025) and the Jilin Province Science & Technology Development Program Project in China (20210204216YY, 20210204146YY, and 20200403009SF).

Data Availability Statement

The data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Jin, W.; Li, Y.; Lu, L.; Zhang, D.; He, S.; Shentu, J.; Chai, Q.; Huang, L. Water quality assessment of east Tiaoxi River, China, based on a comprehensive water quality index model and Monte-Carlo simulation. Sci. Rep. 2022, 12, 10042. [Google Scholar] [CrossRef]
  2. Dörnhöfer, K.; Oppelt, N. Remote Sens for lake research and monitoring–Recent advances. Ecol. Indic. 2016, 64, 105–122. [Google Scholar] [CrossRef]
  3. Pekel, J.F.; Cottam, A.; Gorelick, N.; Belward, A.S. High-resolution mapping of global surface water and its long-term changes. Nature 2016, 540, 418–422. [Google Scholar] [CrossRef] [PubMed]
  4. Huang, X.; Xie, C.; Fang, X.; Zhang, L. Combining pixel-and object-based machine learning for identification of water-body types from urban high-resolution remote-sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 2097–2110. [Google Scholar] [CrossRef]
  5. Gamshadzaei, M.H.; Rahimzadegan, M. Stable and accurate methods for identification of water bodies from Landsat series imagery using meta-heuristic algorithms. J. Appl. Remote Sens. 2017, 11, 045005. [Google Scholar] [CrossRef]
  6. Li, X.; Ding, J.; Ilyas, N. Machine learning method for quick identification of water quality index (WQI) based on Sentinel-2 MSI data: Ebinur Lake case study. Water Supply 2021, 21, 1291–1312. [Google Scholar] [CrossRef]
  7. Tymków, P.; Jóźków, G.; Walicka, A.; Karpina, M.; Borkowski, A. Identification of water body extent based on remote sensing data collected with unmanned aerial vehicle. Water 2019, 11, 338. [Google Scholar] [CrossRef]
  8. Liu, Z.; Yao, Z.; Wang, R. Automatic identification of the lake area at Qinghai–Tibetan Plateau using remote sensing images. Quat. Int. 2019, 503, 136–145. [Google Scholar] [CrossRef]
  9. Guneroglu, N.; Acar, C.; Dihkan, M.; Karsli, F.; Guneroglu, A. Green corridors and fragmentation in South Eastern Black Sea coastal landscape. Ocean. Coast. Manag. 2013, 83, 67–74. [Google Scholar] [CrossRef]
  10. Crasto, N.; Hopkinson, C.; Forbes, D.L.; Lesack, L.; Marsh, P.; Spooner, I.; Van Der Sanden, J.J. A LiDAR-based decision-tree classification of open water surfaces in an Arctic delta. Remote Sens. Environ. 2015, 164, 90–102. [Google Scholar] [CrossRef]
  11. Acharya, T.D.; Lee, D.H.; Yang, I.T.; Lee, J.K. Identification of water bodies in a Landsat 8 OLI image using a J48 decision tree. Sensors 2016, 16, 1075. [Google Scholar] [CrossRef] [PubMed]
  12. Vignesh, T.; Thyagharajan, K.K. Water bodies identification from multispectral images using Gabor filter, FCM and canny edge detection methods. In Proceedings of the 2017 international conference on information communication and embedded systems (ICICES), Chennai, India, 23–24 February 2017; IEEE: Piscataway, NJ, USA; pp. 1–5. [Google Scholar]
  13. Chang, N.B.; Chen, H.W.; Ning, S.K. Identification of river water quality using the fuzzy synthetic evaluation approach. J. Environ. Manag. 2001, 63, 293–305. [Google Scholar] [CrossRef] [PubMed]
  14. Wang, G.; Wu, M.; Wei, X.; Song, H. Water identification from high-resolution remote sensing images based on multidimensional densely connected convolutional neural networks. Remote Sens. 2020, 12, 795. [Google Scholar] [CrossRef]
  15. Ding, C.; Li, Y.; Xia, Y.; Wei, W.; Zhang, L.; Zhang, Y. Convolutional neural networks based hyperspectral image classification method with adaptive kernels. Remote Sens. 2017, 9, 618. [Google Scholar] [CrossRef]
  16. Zou, Q.; Ni, L.; Zhang, T.; Wang, Q. Deep learning based feature selection for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2015, 12, 2321–2325. [Google Scholar] [CrossRef]
  17. Xu, H.Q. Development of remote sensing water indices: A review. J. Fuzhou Univ. 2021, 49, 613–625. [Google Scholar]
  18. McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
  19. Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
  20. Fang-fang, Z.; Bing, Z.; Jun-sheng, L.; Qian, S.; Yuanfeng, W.; Yang, S. Comparative analysis of automatic water identification method based on multispectral remote sensing. Procedia Environ. Sci. 2011, 11, 1482–1487. [Google Scholar] [CrossRef]
  21. Feyisa, G.L.; Meilby, H.; Fensholt, R.; Proud, S.R. Automated Water Extraction Index: A new technique for surface water mapping using Landsat imagery. Remote Sens. Environ. 2014, 140, 23–35. [Google Scholar] [CrossRef]
  22. Fisher, A.; Flood, N.; Danaher, T. Comparing Landsat water index methods for automated water classification in eastern Australia. Remote Sens. Environ. 2016, 175, 167–182. [Google Scholar] [CrossRef]
  23. Du, Y.; Zhang, Y.; Ling, F.; Wang, Q.; Li, W.; Li, X. Water bodies’ mapping from Sentinel-2 imagery with modified normalized difference water index at 10-m spatial resolution produced by sharpening the SWIR band. Remote Sens. 2016, 8, 354. [Google Scholar] [CrossRef]
  24. Li, W.; Du, Z.; Ling, F.; Zhou, D.; Wang, H.; Gui, Y.; Sun, B.; Zhang, X. A comparison of land surface water mapping using the normalized difference water index from TM, ETM+ and ALI. Remote Sens. 2013, 5, 5530–5549. [Google Scholar] [CrossRef]
  25. Jiang, W.; Ni, Y.; Pang, Z.; Li, X.; Ju, H.; He, G.; Lv, J.; Yang, K.; Fu, J.; Qin, X. An effective water body extraction method with new water index for sentinel-2 imagery. Water 2021, 13, 1647. [Google Scholar] [CrossRef]
  26. Rad, A.M.; Kreitler, J.; Sadegh, M. Augmented Normalized Difference Water Index for improved surface water monitoring. Environ. Model. Softw. 2021, 140, 105030. [Google Scholar] [CrossRef]
  27. Zhai, K.; Wu, X.; Qin, Y.; Du, P. Comparison of surface water extraction performances of different classic water indices using OLI and TM imageries in different situations. Geo-Spat. Inf. Sci. 2015, 18, 32–42. [Google Scholar] [CrossRef]
  28. Hou, T.; Sun, W.; Chen, C.; Yang, G.; Meng, X.; Peng, J. Marine floating raft aquaculture extraction of hyperspectral remote sensing images based decision tree algorithm. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102846. [Google Scholar] [CrossRef]
  29. Zang, W.; Lin, J.; Wang, Y.; Tao, H. Investigating small-scale water pollution with UAV remote sensing technology. In Proceedings of the World Automation Congress, Puerto Vallarta, Mexico, 24–28 June 2012; IEEE: Piscataway, NJ, USA; pp. 1–4. [Google Scholar]
  30. Schaeffer, B.A.; Schaeffer, K.G.; Keith, D.; Lunetta, R.S.; Conmy, R.; Gould, R.W. Barriers to adopting satellite remote sensing for water quality management. Int. J. Remote Sens. 2013, 34, 7534–7544. [Google Scholar] [CrossRef]
  31. Sun, W.; Yang, G.; Peng, J.; Meng, X.; He, K.; Li, W.; Li, H.; Du, Q. A multiscale spectral features graph fusion method for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–12. [Google Scholar] [CrossRef]
  32. Feng, Y.; He, M.Y.; Song, J.H.; Wei, J. ICA-based dimensionality reduction and compression of hyperspectral images. J. Electron. Inf. Technol. 2007, 29, 2871–2875. [Google Scholar]
  33. Chen, F.; Chen, X.; Van de Voorde, T.; Roberts, D.; Jiang, H.; Xu, W. Open water detection in urban environments using high spatial resolution remote sensing imagery. Remote Sens. Environ. 2020, 242, 111706. [Google Scholar] [CrossRef]
  34. Sekertekin, A. A survey on global thresholding methods for mapping open water body using Sentinel-2 satellite imagery and normalized difference water index. Arch. Comput. Methods Eng. 2021, 28, 1335–1347. [Google Scholar] [CrossRef]
  35. Li, C.; Shao, Z.; Zhang, L.; Huang, X.; Zhang, M. A comparative analysis of index-based methods for impervious surface mapping using multiseasonal Sentinel-2 satellite data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3682–3694. [Google Scholar] [CrossRef]
  36. Burger, J.; Geladi, P. Hyperspectral NIR image regression part I: Calibration and correction. J. Chemom. A J. Chemom. Soc. 2005, 19, 355–363. [Google Scholar] [CrossRef]
  37. Polder, G.; van der Heijden, G.W.; Keizer, L.P.; Young, I.T. Calibration and characterisation of imaging spectrographs. J. Near Infrared Spectrosc. 2003, 11, 193–210. [Google Scholar] [CrossRef]
  38. Schafer, R.W. What is a Savitzky-Golay filter? [lecture notes]. IEEE Signal Process. Mag. 2011, 28, 111–117. [Google Scholar] [CrossRef]
  39. Xie, H.; Luo, X.; Xu, X.; Tong, X.; Jin, Y.; Pan, H.; Zhou, B. New hyperspectral difference water index for the extraction of urban water bodies by the use of airborne hyperspectral images. J. Appl. Remote Sens. 2014, 8, 085098. [Google Scholar] [CrossRef]
  40. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Society. Ser. C (Appl. Stat.) 1979, 28, 100–108. [Google Scholar] [CrossRef]
  41. Sinaga, K.P.; Yang, M.S. Unsupervised K-means clustering algorithm. IEEE Access 2020, 8, 80716–80727. [Google Scholar] [CrossRef]
  42. Imani, M.; Ghassemian, H. Band clustering-based feature extraction for classification of hyperspectral images using limited training samples. IEEE Geosci. Remote Sens. Lett. 2013, 11, 1325–1329. [Google Scholar] [CrossRef]
  43. Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37–52. [Google Scholar] [CrossRef]
  44. Shahdoosti, H.R.; Ghassemian, H. Combining the spectral PCA and spatial PCA fusion methods by an optimal filter. Inf. Fusion 2016, 27, 150–160. [Google Scholar] [CrossRef]
  45. Ringnér, M. What is principal component analysis? Nat. Biotechnol. 2008, 26, 303–304. [Google Scholar] [CrossRef] [PubMed]
  46. Palo, H.K.; Sahoo, S.; Subudhi, A.K. Dimensionality reduction techniques: Principles, benefits, and limitations. In Data Analytics in Bioinformatics: A Machine Learning Perspective; Wiley: Hoboken, NJ, USA, 2021; pp. 77–107. [Google Scholar]
  47. Giri, S. Water quality prospective in Twenty First Century: Status of water quality in major river basins, contemporary strategies and impediments: A review. Environ. Pollut. 2021, 271, 116332. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Geographical location map of the study area and image examples for rivers in (a) a coastal city and (b) a developed city in southern China.
Figure 2. UAV hyperspectral imaging system.
Figure 3. Preprocessing of water spectral data.
Figure 4. Water mark image of a coastal city study area in southern China.
Figure 5. Part of the typical study area of a coastal city in southern China: (a) water RGB images and (b) average spectral reflectance of confusing ground objects.
Figure 6. Comparison of extraction results of different water spectral indexes: (a) initial water mark image, (b) NDWI1, (c) NDWI2, (d) HDWI, (e) K-means, and (f) PCA-NDWI.
Figure 7. Overall effect image of water extraction in a coastal city study area in southern China.
Figure 8. Initial marker image and water extraction effect map in the study area: (a) initial RGB image, (b) water mark image, (c) HDWI water extraction effect image, and (d) PCA-NDWI water extraction effect image.
Figure 9. Initial marker image and water extraction effect map in the study area: (a) RGB image of the local study area, (b) image of water extraction effect of the corresponding area, and (c) mean and standard deviation curves of Digital Number in FP area and water area.
Table 1. Effect comparison of different water extraction methods in training waters.
| Methods | Accuracy | F1 |
| --- | --- | --- |
| NDWI1 | 0.959 | 0.844 |
| NDWI2 | 0.936 | 0.729 |
| HDWI | 0.960 | 0.847 |
| K-means | 0.801 | 0.749 |
| PCA-NDWI | 0.984 | 0.946 |
Table 2. Effect comparison of different water extraction methods in testing waters.
| Methods | Accuracy | F1 |
| --- | --- | --- |
| NDWI1 | 0.913 | 0.794 |
| NDWI2 | 0.881 | 0.736 |
| HDWI | 0.897 | 0.859 |
| K-means | 0.808 | 0.700 |
| PCA-NDWI | 0.953 | 0.912 |
Table 3. Comparison of different water extraction methods.
| Methods | Water Extraction: Accuracy | Water Extraction: F1 | Shadow Extraction in Water Shadow Area: Accuracy | Shadow Extraction in Water Shadow Area: F1 |
| --- | --- | --- | --- | --- |
| HDWI | 0.885 | 0.929 | 0.632 | 0.483 |
| PCA-NDWI | 0.960 | 0.977 | 0.858 | 0.872 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, Z.; Yang, J.; Wang, M.; Chen, J.; Sun, C.; Song, N.; Wang, J.; Feng, S. The PCA-NDWI Urban Water Extraction Model Based on Hyperspectral Remote Sensing. Water 2024, 16, 963. https://doi.org/10.3390/w16070963

Chicago/Turabian Style

Zhao, Zitong, Jin Yang, Mingjia Wang, Jiaqi Chen, Ci Sun, Nan Song, Jinyu Wang, and Shulong Feng. 2024. "The PCA-NDWI Urban Water Extraction Model Based on Hyperspectral Remote Sensing" Water 16, no. 7: 963. https://doi.org/10.3390/w16070963

