Article

Estimating Rainfall Intensity Using an Image-Based Convolutional Neural Network Inversion Technique for Potential Crowdsourcing Applications in Urban Areas

by Youssef Shalaby 1, Mohammed I. I. Alkhatib 1, Amin Talei 1,*, Tak Kwin Chang 1, Ming Fai Chow 1 and Valentijn R. N. Pauwels 2

1 Department of Civil Engineering, School of Engineering, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Selangor, Malaysia
2 Department of Civil Engineering, Monash University, Clayton, VIC 3800, Australia
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2024, 8(10), 126; https://doi.org/10.3390/bdcc8100126
Submission received: 20 August 2024 / Revised: 25 September 2024 / Accepted: 26 September 2024 / Published: 29 September 2024

Abstract

High-quality rainfall data are essential in many water management problems, including stormwater management and water resources management. Due to high spatial-temporal variations, rainfall measurement can be challenging and costly, especially in urban areas. This is even more pronounced in tropical regions, with their typically short-duration, high-intensity rainfall events, as many undeveloped or developing countries in those regions lack a dense rain gauge network and have limited resources for radar and satellite observations. Exploring alternative rainfall estimation methods could therefore help compensate for these shortcomings. Recently, a few studies have examined citizen science methods for collecting rainfall data as a complement to existing rain gauge networks. However, these attempts are at an early stage, and little work has been published on improving the quality of such data. This study therefore focuses on image-based rainfall estimation with potential usage in citizen science. A novel convolutional neural network (CNN) model is developed to predict rainfall intensity by processing images captured by citizens (e.g., with smartphones or security cameras) in an urban area. The developed model is merely a complementary sensing tool (e.g., offering better spatial coverage) for the existing rain gauge network in an urban area and is not meant to replace it. This study also presents one of the most extensive rain image datasets published in the literature. The rainfall intensities estimated by the proposed CNN model from images captured by surveillance cameras and smartphone cameras are compared with observations from a weather station and exhibit strong R² values of 0.955 and 0.840, respectively.

1. Introduction

Estimating rainfall in urban areas is essential for water management tasks such as rainfall–runoff (R-R) modelling, stormwater management, and flash flood nowcasting [1]. Accurate rainfall data are therefore essential for authorities managing water resources and floods in urban areas. Furthermore, urban catchments are known for their high spatial variability and relatively short catchment response times [2]. Thus, rainfall estimation in an urban catchment requires high spatial and temporal data resolution to better represent the hydrological processes in the catchment [2].
To date, several techniques have been developed for measuring rainfall, such as conventional rain gauges, weather radars, and satellite imagery, all of which are well accepted and widely utilised [1,3]. However, a universal methodology for rainfall estimation remains elusive, as each technique, though advantageous in some respects, is limited in others [4]. For instance, conventional rain gauges provide only a direct measurement of the rainfall reaching the ground surface at a single point. Although rain gauges offer point estimates with reliable temporal resolution, placing enough of them over a catchment to achieve a desired spatial resolution may not be feasible due to the required hardware maintenance and operating costs [5].
On another front, weather radars and satellite imagery are indirect areal estimation techniques that capture rainfall's spatial and temporal characteristics better than rain gauges. Yet, rainfall estimates obtained by weather radars are affected by multiple error sources and require constant calibration to ensure their reliability [6,7]. Satellite imagery, on the other hand, has issues in specific areas such as coastal regions, snow-covered areas, valleys, mountains, and hills with shallow clouds [8]. Additionally, it provides a limited spatial resolution (i.e., 25 km × 25 km), which downscaling methods can improve to 1 km × 1 km at best [9]. These limitations highlight potential inaccuracies in capturing the localised precipitation events that may trigger flash floods.
Given these limitations, constant efforts have been made to develop alternative rainfall-sensing approaches, such as telecommunication networks, citizen science, acoustic sensors, and image sensors. These alternatives provide low-cost complementary observations alongside the data measured by existing rainfall estimation techniques.
Among these alternative approaches, citizen science stands out due to its successful application to quantitative rainfall measurement in the US over the last two decades through the CoCoRaHS project [10]. Established in 1997, the CoCoRaHS project equips citizens with a mini rain gauge to collect daily rainfall estimates. These rain gauges can be installed in backyards or any open area; citizens only need to read the measurements and upload them to a website, where all readings are collected and presented for different agencies to utilise. So far, more than 20,000 participants have been involved in this project. Several efforts followed the success of the CoCoRaHS project, involving regular public rain data collection across North America as a great example of shared data.
However, while citizen science rain gauges can be easily installed in rural areas and around townhouses, they are impractical in urban areas with tall buildings, and they provide only a single daily rainfall measurement. This highlights the need for new techniques that citizens can use with more ease and acceptability to collect sub-daily rainfall measurements in urban areas for use in hydrological modelling applications. Nowadays, most urban citizens are equipped with smartphones and internet connections, while most buildings have surveillance cameras. This study therefore proposes using smartphones or surveillance cameras as image sensors for measuring rainfall, with the notion that such data collection can be conducted through citizen science. The rainfall data predicted by this method would complement existing rainfall measurement techniques, such as rain gauges and radars, by offering additional spatial coverage. Images were selected in this study because of the constant development of, and growing attention in the literature towards, image sensors and image processing techniques for rainfall sensing.
A few studies share our objective of measuring rainfall using camera-based techniques. For instance, Allamano et al. [11] employed a statistical framework based on fundamental camera optics to estimate rainfall intensity from images captured by smartphones. The framework comprises five processing phases: (1) drop detection, (2) blur effect removal, (3) estimation of drop velocities, (4) drop positioning in the control volume, and (5) rain rate estimation. The drop detection approach was based entirely on identifying the brightness threshold in a camera setting that allows drop size detection. The study utilised 104 min of rainfall data to develop the framework. The results were compared with tipping-bucket rain gauge data, yielding a root mean square error of 3.01 mm/h.
Similarly, Dong et al. [12] developed a rainfall intensity estimation technique using video recordings. Their methodology identifies the raindrop size distribution (DSD) in a video frame (image) and estimates rainfall intensity by fitting the DSD to a gamma distribution model. The methodology comprises two stages: (1) extracting grey tone features from images to detect the presence of raindrops and (2) extracting the average colour tensor and average intensity difference features for each raindrop to focus on it in the image and eventually calculate its diameter. The authors highlighted that the main challenge with this methodology is finding the best camera settings, such as focal length and focal plane, to effectively capture the small drop sizes in an image. The algorithm was compared with pluviometer rain gauge data (9 min) for three rainfall intensities (low, moderate, and heavy rain), showing acceptable agreement between the two methods.
Jiang et al. [13] attempted to improve on the work of Allamano et al. [11] and Dong et al. [12] and apply it to surveillance cameras with real-world backgrounds (moving cars). The enhanced algorithm adds a removal framework that eliminates unfocused raindrops and unsatisfying size–velocity relationships. The authors utilised a total of 403 min of data for algorithm development, resulting in a mean absolute percentage error (MAPE) of 21.8%, slightly better than the previously published methods of Allamano et al. [11] (26.0%) and Dong et al. [12] (31.8%).
Yin et al. [14] utilised state-of-the-art convolutional neural networks (CNNs) to estimate rainfall intensity from captured images. The authors used a pre-trained model (ResNet) with transfer learning to develop the irCNN model (a rainfall estimator from images). The ground-truth rainfall values were collected using a tipping-bucket rain gauge with 1 min and 0.1 mm resolution. The model was trained on a synthetic surveillance camera dataset of 4000 images and tested using a smartphone dataset (918 one-second frames) collected on the campus of Zhejiang University, China, resulting in a MAPE of 18.5%. The same model was then trained using a real-time surveillance camera dataset of 7117 images from six rainfall events, resulting in a MAPE of 16.5%. The authors further tested the model using an event-based approach, selectively changing the number of real-time surveillance camera events used for training and testing, which resulted in an average MAPE of 21.9%.
Recently, Wang et al. [15] proposed a near-infrared surveillance-video-based rain gauge using a one-dimensional (1D) CNN. Their model applies a rain streak extraction algorithm and uses the extracted feature as the input to the CNN. A total of 4368 min of data, collected using a siphon rain gauge capable of reporting rainfall every 0.1 mm at a 1 min resolution, was used for model development. The developed model showed varying levels of performance, with mean absolute error (MAE) ranging from 8.86 to 84.84 mm/h. The authors noted that the model is only recommended where the surveillance area's lighting conditions and the cameras' main parameters do not differ significantly from those of the camera used in their study.
To this end, the above literature review shows a successful effort toward developing an image-based rainfall estimator. However, those studies utilised a limited number of rainfall events, focusing more on surveillance cameras and less on smartphones. In addition, the AI-based techniques in past studies can still be improved to result in more accurate rainfall estimation. Therefore, this study focuses on three contributions: (1) extending the input images from surveillance cameras to a mix of smartphones and surveillance cameras, (2) collecting an extensive dataset (one of the largest ever published in the literature) of observed rainfall and its corresponding images captured by smartphones and surveillance cameras to train the AI-based model better, and (3) introducing a CNN algorithm that utilises state-of-the-art image processing techniques before feeding images to the CNN model.
In this study, we hypothesise that, given the diverse capabilities of smartphones and CCTV cameras on the market to capture HD images, the representation of rainfall would differ across devices and would adversely affect the CNN model's ability to translate rainfall images into rainfall intensities. Therefore, this study examines the capability of image pre-processing techniques to achieve reliable performance on diversified data for future implementation of image-based rainfall-sensing applications.
The image-based rainfall estimation technique developed in this study has the potential to be used in citizen science applications. This could enhance the spatial coverage of rainfall data in urban areas and improve the spatial representation of rainfall for stormwater network design and rainfall–runoff studies, providing more confidence in the models as a further complementary source of data.

2. Materials and Methods

The methodology of this study comprises four main stages. Stage 1 focuses on collecting rainfall images from different sources (i.e., surveillance cameras and smartphones) to conduct rainfall image analysis using an event-based approach. Stage 2 compares five image pre-processing configurations (including a baseline without pre-processing) based on CNN model performance on the surveillance camera dataset. Stage 3 evaluates the model developed on the surveillance camera dataset using both the surveillance camera and smartphone datasets. The final stage (Stage 4) improves the CNN model by retraining the surveillance-camera CNN model with a transfer-learning technique [16] to adapt it to smartphone data.
The basic idea of the suggested method encompasses both rainfall image capture and CNN model development. When it rains, rain images are collected from already-installed sensors, such as cameras, which are common in urban areas. The suggested CNN model is then used to estimate rainfall intensity from these images. The rainfall intensity estimated by the CNN model at each location yields rainfall data with high spatiotemporal resolution.

2.1. Study Site and Data Collection

The study site and its surroundings are located at Monash University’s Malaysia campus (see Figure 1). The data were collected in different locations with different spatiotemporal characteristics to develop a diverse rain image dataset. The rainfall data were captured using a tipping-bucket rain gauge at 1 min resolution. The rainfall image data were captured using both a surveillance camera and smartphones. The aforementioned data collection for rain and its corresponding images took place between May and December 2022 to represent Malaysia’s southeast (May–September) and northeast (November–March) monsoon seasons [17].

2.1.1. Rain Data Collection

Since the extracted snapshots (rain images) were at a resolution of one frame per second, the rainfall intensity data were linearly interpolated to assign a specific rainfall intensity to each image frame. In this process, the rainfall intensities recorded at two consecutive minutes were used to interpolate values for the frames captured between the centroids of those minutes.
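As an illustration, a minimal NumPy sketch of this per-frame labelling is given below; the minute-by-minute intensities are hypothetical, and anchoring each value at the 30 s mark is our reading of the centroid-based interpolation described above, since the paper does not publish its code.

```python
import numpy as np

# Hypothetical 1 min rain gauge readings (mm/h); values are illustrative only.
minute_intensity = np.array([2.4, 6.0, 12.6, 9.0])
# Anchor each minute's value at its centroid (the 30 s mark).
minute_centroids = 60.0 * np.arange(len(minute_intensity)) + 30.0  # seconds

# One extracted frame per second over the four minutes of footage.
frame_times = np.arange(0.0, 240.0, 1.0)
frame_intensity = np.interp(frame_times, minute_centroids, minute_intensity)
# frame_intensity[i] is now the interpolated intensity label for the frame at second i.
```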

2.1.2. Rain Images from Surveillance Cameras and Smartphones

The first and most important task for developing a reliable deep-learning model is obtaining sufficient high-quality data.
This study utilises natural rainfall data, and from a hydrological point of view, rainfall is imbalanced in nature, where more low-intensity rainfall is observed in a rainfall event than high-intensity rainfall.
The collected dataset reflects this fact, and the authors intentionally kept it as is to encourage acceptance from hydrologists, who tend to be sceptical of methods that rely on synthetic data.
In this study, an outdoor solar surveillance camera was installed approximately 700 m from the rain gauge station. The video files had a maximum resolution of 1080p and were in mp4 format. The surveillance camera recorded 805 min of video during rainfall events from 1 May 2022 to 31 December 2022, depicting a variety of rainfall events with different intensities and durations. Data were collected with a fixed background to better show how rain affects the images' characteristics. Eventually, a total of 6121 images were extracted from these video recordings. Additionally, rainfall image data were collected using smartphones at a few points on the Monash University campus, around the rain gauge station (see Figure 1), during rainfall events between May and December 2022. As a result, 1984 rainfall images covering various rainfall intensities and durations were collected. Figure 2 shows sample rainfall images taken by smartphones during the data collection period.
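For readers who wish to reproduce the frame extraction step, the following is a minimal sketch using OpenCV; the file name and output directory are illustrative assumptions, as the paper's extraction code is not published.

```python
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("rain_event.mp4")   # hypothetical recording file
fps = cap.get(cv2.CAP_PROP_FPS)            # native frame rate of the video

saved, idx = 0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx % int(round(fps)) == 0:         # keep roughly one frame per second
        cv2.imwrite(os.path.join("frames", f"frame_{saved:05d}.png"), frame)
        saved += 1
    idx += 1
cap.release()
```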
The dataset was collected at two locations, as shown in Figure 1. Location 1 (labelled as the CCTV station) was mainly used for the CCTV dataset, with one CCTV camera set to a static position with a fixed view angle. Location 2 (labelled as the smartphone stations), in contrast, was where the smartphone dataset was captured using one mobile phone taking photos at different points within that location (or site).
The geographical setting was urban, with typical city scenery, such as buildings, trees, and streets, in the background (see Figure 2). Weather conditions ranged from cloudy to slightly cloudy, and most of the photos in the datasets were taken during the daytime. As a pilot study, we intentionally limited the diversity of the dataset to explore the possibility of achieving a working model; the next phase of this study will challenge the model further and expose its limitations.
Figure 1. Locations of rainfall images captured using surveillance cameras and smartphones during rain events near the Monash University campus in Malaysia.
Figure 2. Sample images captured by smartphone on campus during rainfall events.

2.2. Image Pre-Processing

Image pre-processing is essential after gathering the image records and rain gauge readings. Real-world raw images often lack focus on certain features or trends; therefore, once collected, pre-processing the images into a simpler, more meaningful format for deep-learning applications is an essential step. Image thresholding effectively and efficiently reduces the number of elements and locates the objects in complex images. Such pre-processing prevents the CNN model from misinterpreting or favouring specific images during learning [18]. This study examines four techniques to highlight important rainfall features in an image and potentially improve the model's performance in estimating rainfall intensity: image sharpening, pixel intensity, and the Otsu and Yen thresholding methods [19,20,21].
In the current study, the choice of pre-processing methods was based on the most widely accepted methods in the literature, so that future research can replicate the study easily. More advanced texture descriptors were avoided because they treat rainfall as noise and remove it from the image, which would defeat the purpose here; such descriptors could nevertheless be considered and analysed for rainfall image sensing in future studies.
Image sharpening is a crucial image processing step that enhances the perceived sharpness of an image. Digital cameras often incorporate sharpening algorithms, and professional photographers also use sharpening to improve the quality of their images [21]. The most widely used and significant characteristic for categorisation is the pixel intensity value, the basic information held within pixels [22]. In a greyscale image, each pixel holds a single intensity value; in a colour image, it holds three (one per colour channel). Rain affects an image by changing pixel intensities: background pixels covered by raindrops show altered intensity values.
Otsu's thresholding distinguishes objects from the background by assigning an intensity threshold T so that each pixel can be categorised as a point in the background or a point on the object [23]. Otsu's approach chooses the appropriate threshold (separating foreground pixels from background pixels) for images whose two classes of pixels support a bi-modal histogram: for each 2D image, the method creates a histogram and calculates the weights, means, and variances of the background and foreground pixels at a given threshold. Yen's thresholding method, in contrast, was established to segment images using automatic multilevel thresholding, producing a 2D image that facilitates separating the foreground from the background [19].
After reviewing the pros and cons of these four image processing techniques, this study adopted four combinations for developing CNN-based rainfall estimation models: Model 1 using Otsu's threshold method, Model 2 using Yen's threshold method, Model 3 combining Yen's method with the sharpening and pixel intensity methods, and Model 4 combining Otsu's method with the sharpening and pixel intensity methods. The sharpening and pixel intensity techniques are not used alone because the overall performance of Otsu's and Yen's methods is consistently better in the literature [19,21].
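A minimal sketch of the four variants is shown below, assuming scikit-image implementations of Otsu's and Yen's thresholds and an unsharp mask for sharpening; the order in which the paper composes the operations is not stated, so the composition here is our assumption.

```python
import numpy as np
from skimage import color, io
from skimage.filters import threshold_otsu, threshold_yen, unsharp_mask

def preprocess(path, method="otsu", combined=False):
    """Return a binary foreground/background map for one rain image."""
    grey = color.rgb2gray(io.imread(path))   # pixel-intensity (greyscale) step
    img = unsharp_mask(grey, radius=2, amount=1.5) if combined else grey  # sharpening
    t = threshold_otsu(img) if method == "otsu" else threshold_yen(img)
    return (img > t).astype(np.float32)

# Model 1: preprocess(p, "otsu")             Model 2: preprocess(p, "yen")
# Model 3: preprocess(p, "yen", combined=True)
# Model 4: preprocess(p, "otsu", combined=True)
```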

2.3. CNN Model Development

CNNs are deep-learning models initially developed for image classification and object detection. They have shown excellent performance in image recognition, image quality enhancement, object identification, and rain image analysis [23,24]. Such studies have led to the development of rain filters that remove rain from images; due to their original purpose, they focus mostly on rain removal and image restoration. This study instead uses a CNN framework tuned specifically for estimating rainfall intensity from rain images.
To the authors' knowledge, regression CNN models using images as inputs are limited or lacking in the literature, and no open-source models are available to allow transfer learning. The only available or accessible CNN models are those used for classification problems, such as VGG, ResNet, GoogLeNet, etc. In our initial examination of classification CNN models on regression problems, those models yielded poor results, which led us to build models from scratch.
Nonetheless, the transfer-learning approach is considered in this study for enhancing the model performance on the smartphone dataset, where the initial model was built on a surveillance camera dataset, and the smartphone dataset was introduced to the model through transfer learning, yielding a slight model improvement.
The CNN model in this study was developed in the Konstanz Information Miner (KNIME) [25], a free and open-source data analytics platform commonly used to create CNNs. KNIME integrates Keras, one of the most popular deep-learning frameworks [25]. This deep-learning framework can compare the images with the rainfall data produced by the rain gauge to estimate rainfall intensities as time series. Given raw image data, the CNN model automatically finds the features required for detection or estimation. In other words, it estimates continuous rainfall intensity values by analysing the complex relationships between image features and the single output associated with each image, namely a rainfall intensity value.
Before training the model, the rainfall images need to be pre-processed, which involves extracting image frames at a rate of one frame per second from the rain videos. The corresponding rainfall intensity from the rain gauge (at 1 min intervals) was used to characterise each minute of rain video, and linear interpolation was then used to assign a rainfall intensity value to each extracted frame at each second [25]. The convolutional design of a typical CNN includes an input layer, convolutional layers, subsampling layers, fully connected layers, and an output layer [26]. For example, an image containing a few raindrops, encoded as a pixel matrix, is delivered to the input layer. Various feature maps are then created within the convolutional layer using different convolutional kernels (typically k × k matrices), each with a different weight vector. The subsampling layer performs local averaging or maximum pooling to lower the resolution of the feature map and reduce the sensitivity of the output to shifts and distortions in the raw image input. Each repetition of the convolution and subsampling operations captures a different characteristic of the input image, and the number of feature maps at each convolutional and subsampling layer must be predetermined in the model. In a classification setting, the final output of such a CNN would be a probability vector (e.g., for the number of raindrops in the image) produced from the feature maps and fully connected layers; in this study, the output is instead a single regression value. This process is described and illustrated in Figure 3.
Various CNN types have been reported in the literature for detecting raindrops from rain images. These models varied in terms of their design, including the number of layers, the size of convolutional kernels, the method of facilitating subsampling, etc. Further information about such CNN models can be found in [23,24,27].
Although deep-learning models have been effective in different applications, model training is still challenging due to the numerous model parameters involved [14]. This study proposes a CNN model based on the general CNN architecture to estimate rainfall intensity from rainfall images.
A CNN is a complex model with several internal parameters requiring optimisation, such as the number of convolutional layers, kernel size, activation function type, input size, learning rate, batch size, etc. In the current study, a trial-and-error approach was used, as in several studies in the literature [13,14,15], to arrive at the final model topology. Several input sizes and batch sizes were examined, and the configuration documented in this paper was optimal as far as we investigated, allowing the model to be trained on the given network without crashing. In future studies, a Bayesian optimisation method [28,29] could be used for CNN model optimisation and performance improvement.
The architecture of the regression CNN starts with an input layer containing a greyscale rain image of 250 × 250 pixels. In the second layer, 64 convolutional kernels, each a 7 × 7 matrix, create 64 distinct feature maps from the raw input image. The convolutional kernel is applied to the input image's pixel matrix with a stride of two (s = 2) to reduce the resolution of the feature maps. After adjusting the number of convolutional and pooling layers and setting the kernels and strides, a dropout layer was added and set to 30%, meaning that 30% of the neurons in this layer are dropped randomly in every epoch to prevent overfitting. Since most CNN models are used to classify objects in images, a regression layer was added to the proposed CNN model to calculate rainfall intensity using the rectified linear unit (ReLU) activation, a piecewise linear function that outputs the input directly when it is positive and zero otherwise [29]. The final CNN architecture in this study consists of 17 convolutional layers, as schematically depicted in Figure 4, with a schematic structure shown in Figure 5; the layers have different convolutional kernel sizes and an increasing number of feature maps (ranging from 64 to 512). The model employs three subsampling layers that apply max pooling to each feature map.
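A condensed Keras sketch of this architecture is given below. The 250 × 250 greyscale input, the 64 7 × 7 stride-2 kernels in the first layer, the 30% dropout, and the ReLU regression head follow the text; the exact arrangement of the remaining convolutional blocks and pooling stages is our own illustrative reconstruction, since the full layer listing appears only in Figure 4.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(250, 250, 1))                 # greyscale rain image
x = layers.Conv2D(64, 7, strides=2, padding="same", activation="relu")(inputs)
# Sixteen further convolutions (17 in total) with 64 to 512 feature maps and
# three max-pooling (subsampling) stages, as described above.
for filters, n_convs in [(64, 4), (128, 4), (256, 4), (512, 4)]:
    for _ in range(n_convs):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    if filters < 512:
        x = layers.MaxPooling2D(2)(x)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)                                # 30% dropout
outputs = layers.Dense(1, activation="relu")(x)           # non-negative intensity (mm/h)
model = keras.Model(inputs, outputs)
```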
The CNN model was trained to predict rainfall intensity from the image features extracted from the rain images using the Keras network learner [30]. Since the model is designed for regression rather than classification, the mean squared error (MSE) was used as the loss function. Owing to its previously demonstrated strong performance [31], the Adam optimiser [32,33] was chosen to train the CNN model; Adam is a stochastic gradient descent (SGD) variant based on adaptive estimation of first-order and second-order moments [32]. Given the massive number of CNN parameters that must be calibrated, adequate training requires a vast quantity of rainfall images, drawn here from the large dataset captured by the surveillance camera and smartphones.
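Continuing the sketch above, the training configuration implied by the text would look roughly as follows; the learning rate, batch size, and epoch count are assumptions (the paper reports tuning these by trial and error), and the image/label arrays are placeholders.

```python
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss="mse",           # regression loss, as stated in the text
              metrics=["mae"])      # MAE tracked as a secondary criterion
history = model.fit(train_images, train_intensities,
                    validation_data=(val_images, val_intensities),
                    batch_size=32, epochs=50)
```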

2.4. Model Performance Criteria

The performance criteria employed to evaluate the models' predictive capacity at the training, validation, and testing stages are presented in Table 1. These criteria are frequently used in CNN model development, hydrological modelling, and forecasting applications [34,35,36,37]. The CNN model was trained using the mean absolute error (MAE) and the mean squared error (MSE) as its loss functions, where smaller MAE and MSE values imply higher model performance. Following training, the goodness of fit between the predicted and observed rainfall intensities was assessed using the coefficient of determination (R²).
Table 1. Model performance criteria.

$R^2 = \dfrac{\left[\sum_{i=1}^{n}\left(Y_{Obs,i}-\bar{Y}_{Obs}\right)\left(Y_{Sim,i}-\bar{Y}_{Sim}\right)\right]^2}{\sum_{i=1}^{n}\left(Y_{Obs,i}-\bar{Y}_{Obs}\right)^2 \times \sum_{i=1}^{n}\left(Y_{Sim,i}-\bar{Y}_{Sim}\right)^2}$; Range: [0, 1]

$MAE = \dfrac{\sum_{i=1}^{n}\left|Y_{Sim,i}-Y_{Obs,i}\right|}{n}$; Range: [0, +∞)

$MSE = \dfrac{\sum_{i=1}^{n}\left(Y_{Sim,i}-Y_{Obs,i}\right)^2}{n}$; Range: [0, +∞)

$\bar{Y}_{Obs}$ and $\bar{Y}_{Sim}$ are the observed and simulated mean rainfall intensity, respectively; $Y_{Obs,i}$ and $Y_{Sim,i}$ are the ith observed and simulated rainfall intensity, respectively; n is the total number of data points.
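The three criteria in Table 1 can be computed directly; the NumPy sketch below is an illustration (equivalent routines exist in common libraries such as scikit-learn).

```python
import numpy as np

def r2(obs, sim):
    """Coefficient of determination as defined in Table 1."""
    num = np.sum((obs - obs.mean()) * (sim - sim.mean())) ** 2
    den = np.sum((obs - obs.mean()) ** 2) * np.sum((sim - sim.mean()) ** 2)
    return num / den

def mae(obs, sim):
    return np.mean(np.abs(sim - obs))

def mse(obs, sim):
    return np.mean((sim - obs) ** 2)
```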

3. Results and Discussion

3.1. Rainfall Data Analysis

Figure 6 shows the distribution of the rainfall intensities collected alongside the surveillance and smartphone camera images. As can be seen, the distribution is skewed toward low rainfall intensities. Low-intensity rainfall data (0–15 mm/h) account for 60% of the surveillance camera dataset (Figure 6a) and 68% of the smartphone dataset (Figure 6b).
The statistical measures for the training, validation, and testing datasets corresponding to the surveillance camera and smartphone cameras are summarised in Table 2 and Table 3, respectively. This analysis confirms that all datasets are statistically close to each other and benefit fairly from the randomisation. As can be seen, the randomised data are not biased and maintain uniformity across the training, validation, and testing phases; the model performances in each phase of development are therefore expected to be unbiased.

3.2. Image Pre-Processing Techniques

Figure 7 shows a non-raining raw image (Figure 7a) and the result of four image pre-processing techniques: sharpening (Figure 7b), pixel intensity (Figure 7c), Otsu's thresholding method (Figure 7d), and combined sharpening, pixel intensity, and Otsu's thresholding (Figure 7e).
To better understand the capabilities of these image processing techniques, Figure 8 displays the effect of Otsu's method on an image captured under three different rainfall conditions: low or no rain, moderate rain, and heavy rain. The disappearance of certain details reveals the impact of heavy rain on the image background; for example, the large buildings far from the camera are obscured by heavy rain and thus are not visible in the processed image.
Based on the capabilities of each image processing technique, the CNN models of this study are trained by pre-processed data resulting from the Otsu threshold method (Model 1), the Yen threshold method (Model 2), the Yen approach combined with image sharpening and pixel intensity methods (Model 3), and the Otsu method combined with image sharpening and pixel intensity methods (Model 4).

3.3. CNN Rainfall Estimation Model Using Surveillance Camera Images

The background frames captured by the surveillance camera were used for CNN model training. The CNN model was trained and validated using 60% and 20% of the captured images, respectively, and the calibrated model was then tested on an unseen dataset comprising the remaining 20%. The trained CNN model can recognise and separate rain features from the background information in the images and associate them with the expected rainfall coverage to estimate rainfall intensity. Table 4 shows the performance of the four CNN models (Models 1–4, using different pre-processing techniques) on the testing dataset in terms of MSE, MAE, and R². A baseline CNN model, which used the raw images as input without any pre-processing, was included to evaluate the effect of the pre-processing techniques on model performance.
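As a sketch, the randomised 60/20/20 split could be reproduced as follows; the paper performed the partitioning in KNIME, so scikit-learn is our substitute here, and the image/label arrays are placeholders.

```python
from sklearn.model_selection import train_test_split

# First carve out the 60% training portion, then split the remainder in half
# to obtain the 20% validation and 20% testing sets.
X_train, X_rest, y_train, y_rest = train_test_split(
    images, intensities, train_size=0.60, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42)
```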
The results showed a significant improvement in all statistics when image pre-processing techniques were used before feeding the images to the CNN model (Models 1–4). This was evident from a 67–72% drop in MSE, a 43–49% drop in MAE, and a 12–13% increase in R2 when comparing the baseline model with Models 1–4. Among the four proposed pre-processing techniques, a combination of Otsu’s technique with image sharpening and pixel intensity approaches (Model 4) performed the best by giving the lowest error measures and highest R2. The second-best model was Model 1, which adopted Otsu’s thresholding method on the image background using an image calculator node to cluster similar image layers. For further evaluation of the model performance, a parity plot between observed and predicted rainfall intensity (mm/h) by Model 4 is shown in Figure 9. The plot shows a varying level of underestimation at low, moderate, and high intensities; however, most underestimations occurred in the 30 to 90 mm/h range. In addition, no excessive overestimation of rainfall intensities was observed.

3.4. CNN Rainfall Estimation Model Using Smartphone Camera Images

As discussed in Section 2.1.2, the number of images (data points) captured by the surveillance camera was almost three times that captured by smartphones. Since the CNN-based rainfall estimation model is a data-driven approach, it was decided to adopt the best model developed on the surveillance camera data and build on it for smartphone data. Therefore, Model 4, developed using the CNN algorithm with the combined image pre-processing technique, was adopted. Two approaches were considered for model development. In Approach 1, Model 4 was used directly on the smartphone testing dataset without any retraining. In Approach 2, Model 4 was retrained on the smartphone training dataset using the transfer-learning method and then validated on the smartphone validation dataset for further parameter fine-tuning. The performances of the two approaches on the smartphone testing dataset are compared in Table 5.
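A sketch of Approach 2 in Keras terms is given below. The paper performed the retraining through KNIME's Keras integration; freezing the early layers, the reduced learning rate, and the file and array names are our assumptions.

```python
from tensorflow import keras

base = keras.models.load_model("model4_surveillance.h5")  # hypothetical path
for layer in base.layers[:-4]:
    layer.trainable = False        # keep early rain-feature extractors fixed
base.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
             loss="mse", metrics=["mae"])
base.fit(phone_train_images, phone_train_intensities,     # placeholder arrays
         validation_data=(phone_val_images, phone_val_intensities),
         batch_size=32, epochs=20)
```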
As can be seen, the retrained model (Approach 2) performed slightly better than Approach 1 on all performance measures. Although the amount of data used for retraining was small, the CNN model in Approach 2 demonstrated reasonably good results in estimating rainfall intensity. The model's performance also improved as the number of epochs increased; however, it remained roughly the same for any additional increases after the second run. In other words, when a sufficient variety of background images is acquired by the smartphone and used for training, the CNN model can identify rain features from the background information. The modest size of the improvement could be attributed to the smartphone training dataset being relatively small compared to the surveillance camera dataset (1190 vs. 3673 data points). In addition, images from the surveillance camera have a fixed background, while those from smartphones have various backgrounds. Therefore, more data seem to be needed to upgrade Model 4 for smartphone images.
The observed and simulated rainfall intensities by Model 4 using Approach 2 are illustrated in Figure 10 for further comparison. It is evident that the model underestimates rainfall intensities varying in the range of 30 to 90 mm/h. This could be attributed to the fact that a larger portion of the smartphone dataset belongs to the low-intensity rainfall category (0–15 mm/h) compared to the surveillance camera data (68% vs. 60%). Therefore, the model is less prepared for estimating rainfall intensities above 15 mm/h. Perhaps further data collection to enhance the diversity and size of the training dataset could improve the model performance in high-intensity rainfall estimation. Additionally, weather conditions, such as strong wind, could impact the quality of the raindrops captured in the image.
To this end, the results of both models show that CNNs can perform well on data collected from surveillance cameras or smartphones. Although CNNs are known for their ability to generalise to unseen data, degradation of image quality significantly impacts model performance, and fine-tuning (transfer learning) on smartphone data was required for the model to perform well on that source.

4. Conclusions

In this study, an image-based rainfall estimation technique using a deep-learning (CNN) algorithm was developed and examined for images captured by smartphones and surveillance cameras. The results of this study showed the following:
(1)
The performance of the image-based rainfall estimation model was assessed by comparing the estimated rainfall data with observations. In this comparison, the study's best rainfall-estimating CNN model achieved R² = 0.955 using outdoor surveillance rainfall photos as input. The same model was then used to estimate rainfall from the smartphone rain image dataset in the KNIME transfer-learning environment using two different approaches, of which Approach 2 gave the best result of R² = 0.840.
(2)
The developed CNN model demonstrated its significant potential as a rain-sensing tool by effectively estimating rainfall intensity based on images captured by smartphones and surveillance cameras (i.e., rainfall videos). This model can be implemented in citizen science applications to enhance spatial coverage of rainfall data in urban areas. In addition, this model leverages urban image-based rainfall sensors, a low-cost data collection system, to improve the spatio-temporal resolution of rainfall data.
This study is subject to certain constraints. The outdoor security camera has limited night vision, meaning it could only record rain images during the day; as a result, the surveillance camera collection includes only daytime rain-related photos. Subsequent studies should examine the model's ability to predict rainfall intensity during nighttime and low-light rainfall events, as these conditions could impact the visibility of rain features in photos. Apart from that, the quality and completeness of the training data significantly affect how well the CNN model predicts rainfall; erroneous data values and missing photos for particular rain classes are examples of data-collection issues.
The suggested rainfall estimation models could be further optimised and enhanced using more complex machine-learning tools and a larger dataset. Since accumulating various rainfall photos from widely spaced sensors can further improve the model's calibration and validation for rainfall categorisation and estimation, enriching the model with large, high-quality datasets is recommended. This is essential for fine-tuning a reliable CNN rainfall estimation algorithm and making it better equipped for practical applications. Mobile recording devices, such as smartphone cameras, can significantly aid in creating rainfall time series and in determining any time gap between image records and rain gauge data. In addition, future research can examine the uncertainties associated with different parts of the proposed model and make thorough comparisons with other image-based rainfall-intensity deep-learning models.
The developed pilot model has been designed with scalability in mind. Although the current study focused on collecting rainfall data in tropical climates using two settings and two devices, further exploration is required for the model to generalise across different urban environments, regions, and varying climatic conditions. Future research will focus on enhancing the model to ensure it can effectively capture rainfall data in diverse locations, including areas with less intense rains, thus improving its scalability.

Author Contributions

Conceptualisation, A.T. and Y.S.; methodology, Y.S., T.K.C. and A.T.; software, Y.S. and M.I.I.A.; validation, A.T., Y.S., T.K.C. and M.I.I.A.; formal analysis, Y.S.; investigation, Y.S. and M.I.I.A.; resources, A.T.; data curation, Y.S.; writing—original draft preparation, Y.S.; writing—review and editing, A.T., M.I.I.A., T.K.C., M.F.C. and V.R.N.P.; visualisation, Y.S. and M.I.I.A.; supervision, A.T., V.R.N.P. and M.F.C.; project administration, A.T. and Y.S.; funding acquisition, A.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data of this study will be available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Sun, Q.; Miao, C.; Duan, Q.; Ashouri, H.; Sorooshian, S.; Hsu, K.L. A review of global precipitation data sets: Data sources, estimation, and intercomparisons. Rev. Geophys. 2018, 56, 79–107.
2. Mishra, A.K. Effect of rain gauge density over the accuracy of rainfall: A case study over Bangalore, India. SpringerPlus 2013, 2, 311.
3. Kidd, C.; Levizzani, V. Status of satellite precipitation retrievals. Hydrol. Earth Syst. Sci. 2011, 15, 1109–1116.
4. Kathiravelu, G.; Lucke, T.; Nichols, P. Rain drop measurement techniques: A review. Water 2016, 8, 29.
5. Kidd, C.; Becker, A.; Huffman, G.J.; Muller, C.L.; Joe, P.; Skofronick-Jackson, G.; Kirschbaum, D.B. So, how much of the Earth's surface is covered by rain gauges? Bull. Am. Meteorol. Soc. 2017, 98, 69–78.
6. Michaelides, S.; Levizzani, V.; Anagnostou, E.; Bauer, P.; Kasparis, T.; Lane, J.E. Precipitation: Measurement, remote sensing, climatology and modeling. Atmos. Res. 2009, 94, 512–533.
7. He, X.; Sonnenborg, T.O.; Refsgaard, J.C.; Vejen, F.; Jensen, K.H. Evaluation of the value of radar QPE data and rain gauge data for hydrological modeling. Water Resour. Res. 2013, 49, 5989–6005.
8. Youngman, B. Why Citizen Science for Precipitation? NASA 2009. Available online: https://terra.nasa.gov/citizen-science/precipitation (accessed on 6 January 2024).
9. Starkey, E.; Parkin, G.; Birkinshaw, S.; Large, A.; Quinn, P.; Gibson, C. Demonstrating the value of community-based ('citizen science') observations for catchment modelling and characterisation. J. Hydrol. 2017, 548, 801–817.
10. Cocorahs.org. CoCoRaHS—Community Collaborative Rain, Hail & Snow Network. 2020. Available online: http://www.cocorahs.org/ (accessed on 13 January 2020).
11. Allamano, P.; Croci, A.; Laio, F. Toward the camera rain gauge. Water Resour. Res. 2015, 51, 1744–1757.
12. Dong, R.; Liao, J.; Li, B.; Zhou, H.; Crookes, D. Measurements of Rainfall Rates from Videos. In Proceedings of the 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, China, 14–16 October 2017; pp. 1–9.
13. Jiang, S.; Babovic, V.; Zheng, Y.; Xiong, J. Advancing opportunistic sensing in hydrology: A novel approach to measuring rainfall with ordinary surveillance cameras. Water Resour. Res. 2019, 55, 3004–3027.
14. Yin, H.; Zheng, F.; Duan, H.F.; Savic, D.; Kapelan, Z. Estimating rainfall intensity using an image-based deep learning model. Engineering 2023, 21, 162–174.
15. Wang, X.; Wang, M.; Liu, X.; Zhu, L.; Shi, S.; Glade, T.; Chen, M.; Xie, Y.; Wu, Y.; He, Y. Near-infrared surveillance video-based rain gauge. J. Hydrol. 2023, 618, 129173.
16. Hussain, M.; Bird, J.J.; Faria, D.R. A Study on CNN Transfer Learning for Image Classification. In Advances in Computational Intelligence Systems, Proceedings of the 18th UK Workshop on Computational Intelligence, Nottingham, UK, 5–7 September 2018; Lotfi, A., Bouchachia, H., Gegov, A., Langensiepen, C., McGinnity, M., Eds.; Springer: Cham, Switzerland, 2018; Volume 840.
17. MSMA. Urban Stormwater Management Manual for Malaysia, 2nd ed.; Department of Irrigation and Drainage Malaysia: Kuala Lumpur, Malaysia, 2012.
18. Mathivanan, N.; Ghani, M.; Janor, R. Improving classification accuracy using clustering technique. Bull. Electr. Eng. Inform. 2018, 7, 465–470.
19. Yen, J.C.; Chang, F.J.; Chang, S. A new criterion for automatic multilevel thresholding. IEEE Trans. Image Process. 1995, 4, 370–378.
20. Zheng, C.; Sun, D.-W. Image Segmentation Techniques. In Computer Vision Technology for Food Quality Evaluation; Academic Press: Cambridge, MA, USA, 2008; pp. 37–56.
21. Bangare, S.L.; Dubal, A.; Bangare, P.S.; Patil, S. Reviewing Otsu's method for image thresholding. Int. J. Appl. Eng. Res. 2015, 10, 21777–21783.
22. Fraser, B.; Schewe, J. Real World Image Sharpening with Adobe Photoshop, Camera Raw, and Lightroom, 2nd ed.; Peachpit Press: Berkeley, CA, USA, 2009.
23. Fu, X.; Huang, J.; Ding, X.; Liao, Y.; Paisley, J. Clearing the skies: A deep network architecture for single-image rain removal. IEEE Trans. Image Process. 2017, 26, 2944–2956.
24. Ren, Y.; Nie, M.; Li, S.; Li, C. Single image de-raining via improved generative adversarial nets. Sensors 2020, 20, 1591.
25. KNIME Hub. Partitioning. Available online: https://hub.knime.com/knime/extensions/org.knime.features.base/latest/org.knime.base.node.preproc.partition.PartitionNodeFactory (accessed on 3 October 2022).
26. Garg, K.; Nayar, S.K. Detection and Removal of Rain from Videos. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, Washington, DC, USA, 27 June–2 July 2004; Volume 1.
27. Kim, J.H.; Lee, C.; Sim, J.Y.; Kim, C.S. Single-Image Deraining Using an Adaptive Nonlocal Means Filter. In Proceedings of the 2013 20th IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013; pp. 914–917.
28. Močkus, J. On Bayesian Methods for Seeking the Extremum. In Optimization Techniques, Proceedings of the IFIP Technical Conference, Novosibirsk, Russia, 1–7 July 1974; Springer: Berlin/Heidelberg, Germany, 1975; pp. 400–404.
29. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. arXiv 2012, arXiv:1206.2944.
30. Praharsha, V. ReLU (Rectified Linear Unit) Activation Function. OpenGenus IQ: Computing Expertise & Legacy. 2022. Available online: https://iq.opengenus.org/relu-activation/ (accessed on 17 March 2023).
31. KNIME. Transfer Learning Made Easy with Deep Learning Keras Integration. 2022. Available online: https://www.knime.com/blog/transfer-learning-made-easy-with-deep-learning-keras-integration (accessed on 2 October 2022).
32. Bushaev, V. Adam: Latest Trends in Deep Learning Optimization. Towards Data Science. 2018. Available online: https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c (accessed on 17 March 2023).
33. Brownlee, J. Gentle Introduction to the Adam Optimization Algorithm for Deep Learning. Machine Learning Mastery. 2021. Available online: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/ (accessed on 18 October 2022).
34. Chang, T.; Talei, A.; Chua, L.; Alaghmand, S. The impact of training data sequence on the performance of neuro-fuzzy rainfall-runoff models with online learning. Water 2018, 11, 52.
35. Nguyen, P.K.T.; Chua, L.H.-C.; Talei, A.; Chai, Q.H. Water level forecasting using neuro-fuzzy models with local learning. Neural Comput. Appl. 2016, 30, 1877–1887.
36. Alkhatib, M.I.I.; Talei, A.; Chang, T.K.; Pauwels, V.R.N.; Chow, M.F. An urban acoustic rainfall estimation technique using a CNN inversion approach for potential smart city applications. Smart Cities 2023, 6, 3112–3137.
37. Alkhatib, M.I.I.; Talei, A.; Chang, T.K.; Pauwels, V.R.N.; Chow, M.F. Towards the development of a citizens' science-based acoustic rainfall sensing system. J. Hydrol. 2024, 633, 130973.
Figure 3. Schematic representation of the CNN architecture in this study.
Figure 4. The regression CNN workflow.
Figure 5. Schematic structure of the CNN model importing image data, thresholding, partitioning, training, and testing (deep learning) for rainfall intensity prediction.
Figure 6. Rainfall intensity distribution corresponding to (a) surveillance camera and (b) smartphone camera images.
Figure 7. (a) Raw rain image; (b) sharpened version of the raw image; (c) greyscale image showing pixel intensity; (d) outcome of applying Otsu's thresholding method; (e) combination of the thresholding approach with the sharpening and pixel intensity methods.
Figure 8. Samples of pre-processed images using Otsu's method under different rainfall conditions: (a) no or low rain, (b) moderate rain, and (c) heavy rain.
Figure 9. Observed vs. predicted rainfall intensity by CNN Model 4 using rainfall images captured by a surveillance camera. The blue dotted line shows the fitted line corresponding to the R².
Figure 10. Observed vs. simulated rainfall intensities by CNN Model 4 using Approach 2 on the smartphone testing dataset.
Table 2. Statistical data for training, validation, and testing datasets corresponding to the surveillance camera.

Dataset             Mean (mm/h)   Min. (mm/h)   Max. (mm/h)   Std. Dev. (mm/h)
Training (60%)      19.77         0             90.00         20.92
Validation (20%)    20.08         0             90.50         20.92
Testing (20%)       19.82         0             90.00         20.93
Table 3. Statistical data for training, validation, and testing datasets corresponding to the smartphone camera.

Dataset             Mean (mm/h)   Min. (mm/h)   Max. (mm/h)   Std. Dev. (mm/h)
Training (60%)      17.37         0             89.68         20.76
Validation (20%)    16.97         0             90.00         21.14
Testing (20%)       17.19         0             89.36         20.83
Table 4. Testing performance of models with/without image pre-processing techniques trained using surveillance camera data.

Model           Dataset   MSE (mm/h)²   MAE (mm/h)   R²
Baseline CNN    Testing   71.041        4.941        0.843
Model 1         Testing   20.662        2.818        0.953
Model 2         Testing   23.123        2.659        0.947
Model 3         Testing   22.738        2.808        0.948
Model 4 *       Testing   19.601        2.508        0.955

* Best testing results.
Table 5. Model 4 performance on the smartphone testing dataset using Approaches 1 and 2.

Approach     MSE (mm/h)²   MAE (mm/h)   R²
Approach 1   84.994        4.715        0.804
Approach 2   69.275        4.374        0.840