1. Introduction
Precipitation has been intricately linked to human life since ancient times [1]. With global climate change, predicting precipitation worldwide has become more challenging, and the various disasters caused by extreme precipitation have had a significant impact on people around the world [2]. Therefore, accurately forecasting precipitation is a matter of great concern for countries worldwide [3].
Precipitation can be categorized by time scale into short-term, mid-term [4], and long-term [5] precipitation [6]. Among these, short-term precipitation (0–2 h) has the most severe impact on residents [7]. Firstly, it can trigger natural disasters such as floods, debris flows, and landslides; secondly, it may cause flooding that leads to urban waterlogging; thirdly, it can damage transportation infrastructure, resulting in traffic accidents [8,9]; and fourthly, it disrupts residents' daily routines, including outdoor activities and flight plans [10,11]. Therefore, accurate short-term precipitation prediction is of the utmost importance [12,13].
Conventional methods for short-term precipitation forecasting rely heavily on numerical weather prediction (NWP) [14] and extrapolation of radar echo reflectivity. NWP [14] calculates precipitation forecasts on computers from physics-based prior knowledge and extensive meteorological data [15]. However, this approach has limitations. Firstly, encoding prior physical knowledge as physics models may lead to convergence issues. Secondly, processing extensive meteorological data demands significant computational resources, resulting in low computational efficiency and poor real-time performance. Lastly, physical models make limited use of historical meteorological data, which makes it challenging to integrate existing meteorological records into precipitation prediction [16,17]. Mainstream methods based on radar echo reflectivity extrapolation currently include the cross-correlation method (TREC) [18], the optical flow method [19], the monomer centroid method [18], and deep learning-based radar echo extrapolation methods [20]. TREC [18] computes the cross-correlation coefficient between echo images before and after movement, takes the displacement at which this coefficient is maximized as the motion speed of the echo field, and extrapolates the radar echo image for the next moment based on this speed. Its main drawbacks are the difficulty of capturing the growth and decay of weather systems and the uncertainty in the estimated displacement. The optical flow method [19], which originates from target-tracking methods in computer vision, calculates motion information from the correspondence between reflectivity at matching positions of adjacent echoes over time; however, it rests on two assumptions: the total intensity remains constant and the motion contains no rapid nonlinear changes. The monomer centroid method [18] treats each radar echo cell as a single point (its centroid) for recognition and tracking; however, it is sensitive to noise and prone to interference, which reduces prediction accuracy.
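To make the TREC idea above concrete, the following minimal NumPy sketch (our illustration, not an operational implementation) estimates a single whole-field displacement by maximizing the cross-correlation coefficient over a small search window and then advects the latest echo field by that displacement; operational TREC partitions the field into local blocks and treats boundaries more carefully.

```python
# Illustrative TREC-style extrapolation: estimate one whole-field displacement
# by maximizing the cross-correlation coefficient, then advect the latest echo.
# Simplified sketch; np.roll wraps around the image edges.
import numpy as np

def trec_displacement(prev_echo, curr_echo, max_shift=5):
    """Return the (dy, dx) shift of prev_echo that best matches curr_echo."""
    best_corr, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(prev_echo, dy, axis=0), dx, axis=1)
            corr = np.corrcoef(shifted.ravel(), curr_echo.ravel())[0, 1]
            if corr > best_corr:
                best_corr, best_shift = corr, (dy, dx)
    return best_shift  # interpreted as echo motion per time step

def extrapolate_next(curr_echo, shift, steps=1):
    """Advect the latest echo field by the estimated motion for `steps` steps."""
    dy, dx = shift
    return np.roll(np.roll(curr_echo, dy * steps, axis=0), dx * steps, axis=1)
```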
With the development and application of deep learning, radar echo extrapolation methods based on deep learning have been employed in precipitation forecasting [21,22,23,24]. Compared to traditional radar echo extrapolation methods, deep learning-based methods are less sensitive to noise, impose fewer constraints, and offer faster computation. The general process is to predict future radar echo image sequences from the input historical radar echo images and then convert the predicted echo distribution into precipitation estimates using the Z-R relationship [25]. The Z-R relationship [25] refers to the Marshall–Palmer relationship, where Z denotes radar reflectivity and R denotes precipitation intensity. Current deep learning-based radar echo extrapolation methods follow two technical approaches: convolutional neural networks (CNNs) and recurrent neural networks (RNNs) [26]. Owing to their ability to capture temporal correlations, RNNs are currently the more widely used approach and have spawned numerous variants that better handle spatiotemporal correlations. However, such deep learning models still suffer from deficiencies in extracting spatiotemporal features and from blurring at long lead times. Moreover, these general-purpose video prediction models still exhibit significant disparities between predicted results and observed values. Real-time precipitation forecasting poses greater challenges than typical video prediction problems. Consequently, leveraging radar echo data to improve precipitation forecasting accuracy remains a significant challenge.
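As a concrete illustration (ours, using the classic Marshall–Palmer coefficients a = 200 and b = 1.6, which in practice vary with precipitation type), converting a predicted reflectivity field to rain rate amounts to inverting Z = aR^b after mapping dBZ back to the linear reflectivity factor Z:

```python
# Convert predicted reflectivity (dBZ) to rain rate R (mm/h) via the Z-R relation
# Z = a * R**b. The Marshall-Palmer coefficients a=200, b=1.6 are assumed here;
# operational values depend on precipitation type.
import numpy as np

def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    z = 10.0 ** (np.asarray(dbz) / 10.0)   # dBZ -> linear reflectivity Z (mm^6 / m^3)
    return (z / a) ** (1.0 / b)            # invert Z = a * R**b

print(dbz_to_rain_rate(35.0))  # roughly 5.6 mm/h under these coefficients
```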
This paper proposes a novel meteorological feature-based and deep learning-based quantitative precipitation nowcasting network (METEO-DLNet), which divides radar echo images into fine-scale and coarse-scale levels, predicts radar echo images at each scale and merges the predictions, and then applies the Z-R relationship to produce the precipitation forecast. The model overcomes the spatiotemporal feature extraction deficiencies of previous deep learning-based precipitation forecasting methods and the deviation of local image trends from the global trend, and it improves the accuracy of long-lead-time forecast images.
In the proposed model, we employ downsampling to divide radar echo images into fine-scale and coarse-scale levels, which are separately fed into the recurrent parts of the encoding–decoding structure of the RNNs. Within the RNNs, meteorological factors such as terrain and airflow changes during precipitation processes are incorporated: spatial attention and differential attention are used to learn the spatial and temporal features of radar echo images, respectively. Finally, a fusion mechanism combining self-attention and gating mechanisms merges the images from the two scales to obtain the predicted radar echo images.
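A highly simplified sketch of this two-scale pipeline is given below. It is our schematic reading of the description above, not the released implementation: `fine_rnn` and `coarse_rnn` stand in for the Meteo-LSTM encoder–decoders, and a plain convolutional gate stands in for the self-attention-based fusion.

```python
# Schematic two-scale prediction with gated fusion (illustrative assumptions:
# the stand-in recurrent predictors return one future frame of shape (B, C, H, W)).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Blend the fine-scale prediction with the upsampled coarse-scale prediction."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, fine, coarse_up):
        g = torch.sigmoid(self.gate(torch.cat([fine, coarse_up], dim=1)))
        # Keep fine detail where it agrees with the coarse (global) trend.
        return g * fine + (1.0 - g) * coarse_up

def two_scale_forecast(frames, fine_rnn, coarse_rnn, fusion):
    """frames: (B, T, C, H, W) historical radar echo sequence."""
    b, t, c, h, w = frames.shape
    coarse_in = F.avg_pool2d(frames.reshape(b * t, c, h, w), 2).reshape(b, t, c, h // 2, w // 2)
    fine_pred = fine_rnn(frames)            # fine-scale future frame, (B, C, H, W)
    coarse_pred = coarse_rnn(coarse_in)     # coarse-scale future frame, (B, C, H/2, W/2)
    coarse_up = F.interpolate(coarse_pred, scale_factor=2, mode="bilinear", align_corners=False)
    return fusion(fine_pred, coarse_up)
```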
To fully utilize meteorological features and address the deviation between global and local trends, we designed a short-term precipitation prediction network based on meteorological features and deep learning. Our contributions are as follows:
We devised a radar echo image prediction framework based on Meteo-LSTM and multi-scale fusion to achieve short-term precipitation forecasting;
A novel Meteo-LSTM was designed, incorporating spatial attention [27] and differential attention to learn the spatial and temporal features of radar echo images. It fully leverages meteorological characteristics, enhancing precipitation prediction accuracy from both the temporal and spatial perspectives (an illustrative sketch of these two attention components follows this list);
A fusion mechanism combining self-attention and gating mechanisms was introduced. It guides local images to follow the overall trend, improving the accuracy of radar echo image predictions.
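The sketch below is a hypothetical illustration of the two attention ingredients named above, not the authors' Meteo-LSTM code: a commonly used form of spatial attention that reweights spatial locations, and a simple differential attention that gates the current features by the change relative to the previous time step.

```python
# Hypothetical illustration of the two attention components (not the authors'
# Meteo-LSTM implementation): a common spatial-attention block and a simple
# differential attention over adjacent time steps.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Reweight spatial locations, e.g. to emphasize terrain-affected regions."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class DifferentialAttention(nn.Module):
    """Gate current features by the difference from the previous time step."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)

    def forward(self, x_t, x_prev):                      # both: (B, C, H, W)
        diff = x_t - x_prev                              # temporal change between adjacent frames
        return x_t * torch.sigmoid(self.conv(diff))
```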
The remainder of this paper is organized as follows: Section 2 reviews related work; Section 3 presents the short-term precipitation forecasting method based on meteorological features and deep learning; Section 4 reports detailed experimental results; Section 5 discusses the findings; Section 6 concludes the paper.
2. Related Work
Currently, deep learning-based precipitation forecasting can be categorized into four types: iterative prediction models based on recurrent neural networks (RNNs), feedforward prediction models based on convolutional neural networks (CNNs), multi-scale prediction models, and attention-based prediction models. Among the CNN-based feedforward models, SimVP [28] consists of an encoder, a translator, and a decoder [29], all constructed entirely with CNN components and without additional modules. STRPM [30] proposes a spatiotemporal encoding–decoding scheme to retain more high-resolution spatiotemporal information; it focuses on the spatiotemporal residual features between consecutive model time steps and introduces a new loss function called learning perceptual loss (LP-loss). Other CNN-based prediction models primarily build on U-Net [31], such as RainNet [32], SmaAt-UNet [33], and FURENet [34]. However, CNN-based prediction models struggle to capture long-term dependencies in radar echo image sequences, which hinders the effective handling of spatiotemporal relationships in predictions.
Among iterative prediction models based on recurrent neural networks (RNNs), ConvLSTM [35], proposed by Shi et al., is a classic example: it replaces the matrix multiplications in FC-LSTM [36] with convolution operations, enabling the extraction of spatiotemporal relationships (its gate equations are given after this paragraph). Subsequently, Shi et al. introduced TrajGRU [10], which replaces the previous fixed connections with dynamic connections. Building upon this, Wang et al. proposed PredRNN [37], PredRNN++ [38], PredRNN-V2 [39], MIM [40], and E3D-LSTM [41], further improving precipitation prediction accuracy. PredRNN [37] adds long-term memory on top of ConvLSTM [35] and establishes vertical connections between layers. PredRNN++ [38] introduces a causal LSTM and a gradient highway unit (GHU), addressing the vanishing-gradient issue. PredRNN-V2 [39] uses a memory-decoupling loss so that memory cells attend to short-term and long-term memories separately. MIM [40] introduces stationary and non-stationary modules to handle stationarity and non-stationarity during the prediction process. E3D-LSTM [41] replaces the forget gate of PredRNN [37] with an attention mechanism. Wu et al. proposed MotionRNN [42] and designed MotionGRU to jointly model overall motion trends and instantaneous changes. Other iterative, RNN-based prediction models include CMS-LSTM [43], MS-LSTM [44], PrecipLSTM [45], and MM-RNN [46]. Current RNN-based precipitation prediction models, however, pay little attention to meteorological features and do not make sufficient use of the meteorological characteristics inherent in precipitation.
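For reference, the ConvLSTM cell [35] replaces the fully connected transforms of FC-LSTM with convolutions (denoted by $*$ below), while $\circ$ denotes the Hadamard product; $X_t$, $H_t$, and $C_t$ are the input, hidden state, and cell state at time $t$:

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right)\\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right)\\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right)\\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right)\\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```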
In the field of attention-based prediction models, MFCA [47] introduced a multimodal data fusion method that merges data from different sources and uses cross-attention mechanisms to enable information interaction and feature integration across modalities. Tekin [48] incorporated attention mechanisms to help the model focus on important information when dealing with spatiotemporal data; by dynamically adjusting attention weights at each time step, the model better captures correlations in spatiotemporal data and thereby improves prediction performance. Li [49] introduced an integrated model combining a spatial–temporal attention network with a multi-layer perceptron; by combining these two types of neural networks, the model leverages their respective advantages in capturing spatiotemporal information and nonlinear features, improving the accuracy and robustness of meteorological forecasts. Self-Attention ConvLSTM [35] introduces a self-attention memory module into ConvLSTM to strengthen the model's ability to model spatiotemporal data, capturing long-range dependencies and important features more effectively.
In the realm of multi-scale prediction models, Broad-UNet [50] employs an innovative multi-scale feature learning approach, adaptively learning and utilizing information from different scales. By learning and integrating multi-scale feature representations at different levels, the model can comprehensively understand the spatial and temporal structures of input data, thereby improving its ability to predict complex weather phenomena. TRU-NET [51], combining deep learning techniques with high-resolution rainfall data, introduces spatial and temporal attention mechanisms and multi-level feature learning methods to effectively capture complex spatial and temporal relationships.
5. Discussion
Through comparisons with state-of-the-art (SOTA) models and ablation experiments, the following observations can be made. Firstly, the model using Meteo-LSTM performs better in precipitation prediction than the traditional ST-LSTM [37], indicating that incorporating meteorological features into deep learning models is effective. The method utilizes spatial attention to learn terrain features during precipitation and differential attention to learn differences between adjacent time frames, fully exploiting meteorological features to improve precipitation prediction accuracy.
Secondly, the fusion mechanism proposed in this paper, based on the combination of self-attention and gating mechanisms, enhances the accuracy of precipitation prediction. This suggests that the mechanism retains the parts of the fine-scale prediction image that are consistent with the overall trend and discards the parts that deviate from it.
Finally, combining Meteo-LSTM with the fusion mechanism based on self-attention and gating mechanisms achieves more accurate precipitation forecasts.
While this study has further enhanced the accuracy of precipitation forecasting, it is important to note that the Z-R formula is an empirical relationship derived from precipitation statistics, so errors remain between the predicted and observed precipitation values. As a next step, we aim to explore deep learning methods for building end-to-end precipitation forecasting models to replace the traditional Z-R relationship. Additionally, we plan to incorporate other meteorological parameters, such as wind, into the precipitation forecasting model to improve the accuracy of radar echo variation predictions.
6. Conclusions
Currently, there is still considerable room for improvement in deep learning-based radar echo extrapolation methods, and the meteorological characteristics inherent in precipitation often receive insufficient attention. Precipitation exhibits intrinsic meteorological features. Spatially, terrain has a significant impact, triggering dynamic and thermodynamic effects in the atmosphere and leading to diverse precipitation patterns in different regions. Temporally, convective activity and airflow variations preceding precipitation may persistently influence later stages of the precipitation. In addition to overlooking meteorological features, traditional predictions struggle to handle local image regions that deviate from the overall motion trend. Therefore, to fully leverage meteorological features and address the deviation between local and global trends, we designed METEO-DLNet, a short-term precipitation forecasting network based on meteorological features and deep learning. From our experiments, the following conclusions were drawn.
Firstly, the proposed Meteo-LSTM effectively learns the spatial and temporal features of radar echo images using spatial and differential attention, fully utilizing meteorological features and enhancing precipitation prediction accuracy in both the temporal and spatial dimensions.
Secondly, the fusion mechanism based on self-attention and gating mechanisms designed in this paper guides local images to follow the overall trend and enhances the prediction accuracy of radar echo images.
Thirdly, both quantitative and qualitative experiments indicate that the proposed short-term precipitation forecasting network (METEO-DLNet) outperforms current mainstream deep learning-based precipitation forecasting models on this challenging natural spatiotemporal sequence prediction problem.