1. Introduction
Precipitation has been intricately linked to human life since ancient times [1]. With global climate change, predicting precipitation worldwide has become more challenging, and the various disasters caused by extreme precipitation have had a significant impact on people around the world [2]. Therefore, accurately forecasting precipitation is a matter of great concern for countries worldwide [3].
Precipitation can be categorized by time scale into short-term, mid-term [4], and long-term [5] precipitation [6]. Among these, short-term precipitation (0–2 h) has the most severe impact on residents [7]. Firstly, it can trigger natural disasters such as floods, debris flows, and landslides; secondly, it may cause flooding that leads to urban waterlogging; thirdly, it can damage transportation infrastructure, resulting in traffic accidents [8,9]; and fourthly, it disrupts residents' daily routines, including outdoor activities and flight plans [10,11]. Therefore, accurate short-term precipitation prediction is of the utmost importance [12,13].
Conventional methods for short-term precipitation forecasting rely heavily on numerical weather prediction (NWP) [14] and extrapolation of radar echo reflectivity. NWP [14] calculates precipitation forecasts on computers from physics-based prior knowledge and extensive meteorological data [15]. However, this approach has limitations. Firstly, encoding prior physical knowledge as physics models may lead to convergence issues. Secondly, processing extensive meteorological data demands significant computational resources, resulting in low computational efficiency and poor real-time performance. Lastly, physical models make limited use of historical meteorological data, which makes it challenging to integrate existing meteorological records into precipitation prediction [16,17]. Mainstream methods based on radar echo reflectivity extrapolation currently include the cross-correlation method (TREC) [18], the optical flow method [19], the monomer centroid method [18], and deep learning-based radar echo extrapolation methods [20]. TREC [18] computes the cross-correlation coefficient between echo images before and after movement, takes the displacement at which this coefficient is maximized as the motion speed of the echo field, and extrapolates the radar echo image for the next moment based on this speed. Its main drawbacks are the difficulty of capturing the growth and decay of weather systems and the uncertainty in the estimated displacement. The optical flow method [19], which originates from target-tracking methods in computer vision, calculates motion information from the correspondence between reflectivity at matching positions of adjacent echoes over time; however, it rests on two assumptions: the total intensity remains constant and the motion contains no rapid nonlinear changes. The monomer centroid method [18] treats each radar echo cell as a single point (its centroid) for recognition and tracking; however, it is sensitive to noise and prone to interference, which reduces prediction accuracy.
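To make the TREC idea above concrete, the following minimal NumPy sketch (our illustration, not an operational implementation) estimates a single whole-field displacement by maximizing the cross-correlation coefficient over a small search window and then advects the latest echo field by that displacement; operational TREC partitions the field into local blocks and treats boundaries more carefully.

```python
# Illustrative TREC-style extrapolation: estimate one whole-field displacement
# by maximizing the cross-correlation coefficient, then advect the latest echo.
# Simplified sketch; np.roll wraps around the image edges.
import numpy as np

def trec_displacement(prev_echo, curr_echo, max_shift=5):
    """Return the (dy, dx) shift of prev_echo that best matches curr_echo."""
    best_corr, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(prev_echo, dy, axis=0), dx, axis=1)
            corr = np.corrcoef(shifted.ravel(), curr_echo.ravel())[0, 1]
            if corr > best_corr:
                best_corr, best_shift = corr, (dy, dx)
    return best_shift  # interpreted as echo motion per time step

def extrapolate_next(curr_echo, shift, steps=1):
    """Advect the latest echo field by the estimated motion for `steps` steps."""
    dy, dx = shift
    return np.roll(np.roll(curr_echo, dy * steps, axis=0), dx * steps, axis=1)
```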
With the development and application of deep learning, radar echo extrapolation methods based on deep learning have been employed in precipitation forecasting [21,22,23,24]. Compared to traditional radar echo extrapolation methods, deep learning-based methods are less sensitive to noise, impose fewer constraints, and offer faster computation. The general process is to predict future radar echo image sequences from the input historical radar echo images and then convert the predicted echo distribution into precipitation estimates using the Z-R relationship [25]. The Z-R relationship [25] refers to the Marshall–Palmer relationship, where Z denotes radar reflectivity and R denotes precipitation intensity. Current deep learning-based radar echo extrapolation methods follow two technical approaches: convolutional neural networks (CNNs) and recurrent neural networks (RNNs) [26]. Owing to their ability to capture temporal correlations, RNNs are currently the more widely used approach and have spawned numerous variants that better handle spatiotemporal correlations. However, such deep learning models still suffer from deficiencies in extracting spatiotemporal features and from blurring at long lead times. Moreover, these general-purpose video prediction models still exhibit significant disparities between predicted results and observed values. Real-time precipitation forecasting poses greater challenges than typical video prediction problems. Consequently, leveraging radar echo data to improve precipitation forecasting accuracy remains a significant challenge.
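As a concrete illustration (ours, using the classic Marshall–Palmer coefficients a = 200 and b = 1.6, which in practice vary with precipitation type), converting a predicted reflectivity field to rain rate amounts to inverting Z = aR^b after mapping dBZ back to the linear reflectivity factor Z:

```python
# Convert predicted reflectivity (dBZ) to rain rate R (mm/h) via the Z-R relation
# Z = a * R**b. The Marshall-Palmer coefficients a=200, b=1.6 are assumed here;
# operational values depend on precipitation type.
import numpy as np

def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    z = 10.0 ** (np.asarray(dbz) / 10.0)   # dBZ -> linear reflectivity Z (mm^6 / m^3)
    return (z / a) ** (1.0 / b)            # invert Z = a * R**b

print(dbz_to_rain_rate(35.0))  # roughly 5.6 mm/h under these coefficients
```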
This paper proposes a novel meteorological feature-based and deep learning-based quantitative precipitation nowcasting network (METEO-DLNet), which divides radar echo images into fine-scale and coarse-scale levels, predicts radar echo images at each scale and merges the predictions, and then applies the Z-R relationship to produce the precipitation forecast. The model overcomes the spatiotemporal feature extraction deficiencies of previous deep learning-based precipitation forecasting methods and the deviation of local image trends from the global trend, and it improves the accuracy of long-lead-time forecast images.
In the proposed model, we employ downsampling to divide radar echo images into fine-scale and coarse-scale levels, which are separately fed into the recurrent parts of the encoding–decoding structure of the RNNs. Within the RNNs, meteorological factors such as terrain and airflow changes during precipitation processes are incorporated: spatial attention and differential attention are used to learn the spatial and temporal features of radar echo images, respectively. Finally, a fusion mechanism combining self-attention and gating mechanisms merges the images from the two scales to obtain the predicted radar echo images.
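A highly simplified sketch of this two-scale pipeline is given below. It is our schematic reading of the description above, not the released implementation: `fine_rnn` and `coarse_rnn` stand in for the Meteo-LSTM encoder–decoders, and a plain convolutional gate stands in for the self-attention-based fusion.

```python
# Schematic two-scale prediction with gated fusion (illustrative assumptions:
# the stand-in recurrent predictors return one future frame of shape (B, C, H, W)).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Blend the fine-scale prediction with the upsampled coarse-scale prediction."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, fine, coarse_up):
        g = torch.sigmoid(self.gate(torch.cat([fine, coarse_up], dim=1)))
        # Keep fine detail where it agrees with the coarse (global) trend.
        return g * fine + (1.0 - g) * coarse_up

def two_scale_forecast(frames, fine_rnn, coarse_rnn, fusion):
    """frames: (B, T, C, H, W) historical radar echo sequence."""
    b, t, c, h, w = frames.shape
    coarse_in = F.avg_pool2d(frames.reshape(b * t, c, h, w), 2).reshape(b, t, c, h // 2, w // 2)
    fine_pred = fine_rnn(frames)            # fine-scale future frame, (B, C, H, W)
    coarse_pred = coarse_rnn(coarse_in)     # coarse-scale future frame, (B, C, H/2, W/2)
    coarse_up = F.interpolate(coarse_pred, scale_factor=2, mode="bilinear", align_corners=False)
    return fusion(fine_pred, coarse_up)
```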
To fully utilize meteorological features and address the deviation between global and local trends, we designed a short-term precipitation prediction network based on meteorological features and deep learning. Our contributions are as follows:
We devised a radar echo image prediction framework based on Meteo-LSTM and multi-scale fusion to achieve short-term precipitation forecasting;
A novel Meteo-LSTM was designed, incorporating spatial attention [27] and differential attention to learn the spatial and temporal features of radar echo images. It fully leverages meteorological characteristics, enhancing precipitation prediction accuracy from both the temporal and spatial perspectives (an illustrative sketch of these two attention components follows this list);
A fusion mechanism combining self-attention and gating mechanisms was introduced. It guides local images to follow the overall trend, improving the accuracy of radar echo image predictions.
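The sketch below is a hypothetical illustration of the two attention ingredients named above, not the authors' Meteo-LSTM code: a commonly used form of spatial attention that reweights spatial locations, and a simple differential attention that gates the current features by the change relative to the previous time step.

```python
# Hypothetical illustration of the two attention components (not the authors'
# Meteo-LSTM implementation): a common spatial-attention block and a simple
# differential attention over adjacent time steps.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Reweight spatial locations, e.g. to emphasize terrain-affected regions."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class DifferentialAttention(nn.Module):
    """Gate current features by the difference from the previous time step."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)

    def forward(self, x_t, x_prev):                      # both: (B, C, H, W)
        diff = x_t - x_prev                              # temporal change between adjacent frames
        return x_t * torch.sigmoid(self.conv(diff))
```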
The remainder of this paper is organized as follows: Section 2 reviews related work; Section 3 presents the short-term precipitation forecasting method based on meteorological features and deep learning; Section 4 reports detailed experimental results; Section 5 discusses the findings; Section 6 concludes the paper.
2. Related Work
Currently, deep learning-based precipitation forecasting can be categorized into four types: iterative prediction models based on recurrent neural networks (RNNs), feedforward prediction models based on convolutional neural networks (CNNs), multi-scale prediction models, and attention-based prediction models. Among the CNN-based feedforward models, SimVP [28] consists of an encoder, a translator, and a decoder [29], all constructed entirely with CNN components and without additional modules. STRPM [30] proposes a spatiotemporal encoding–decoding scheme to retain more high-resolution spatiotemporal information; it focuses on the spatiotemporal residual features between consecutive model time steps and introduces a new loss function called learning perceptual loss (LP-loss). Other CNN-based prediction models primarily build on U-Net [31], such as RainNet [32], SmaAt-UNet [33], and FURENet [34]. However, CNN-based prediction models struggle to capture long-term dependencies in radar echo image sequences, which hinders the effective handling of spatiotemporal relationships in predictions.
Among iterative prediction models based on recurrent neural networks (RNNs), ConvLSTM [35], proposed by Shi et al., is a classic example: it replaces the matrix multiplications in FC-LSTM [36] with convolution operations, enabling the extraction of spatiotemporal relationships (its gate equations are given after this paragraph). Subsequently, Shi et al. introduced TrajGRU [10], which replaces the previous fixed connections with dynamic connections. Building upon this, Wang et al. proposed PredRNN [37], PredRNN++ [38], PredRNN-V2 [39], MIM [40], and E3D-LSTM [41], further improving precipitation prediction accuracy. PredRNN [37] adds long-term memory on top of ConvLSTM [35] and establishes vertical connections between layers. PredRNN++ [38] introduces a causal LSTM and a gradient highway unit (GHU), addressing the vanishing-gradient issue. PredRNN-V2 [39] uses a memory-decoupling loss so that memory cells attend to short-term and long-term memories separately. MIM [40] introduces stationary and non-stationary modules to handle stationarity and non-stationarity during the prediction process. E3D-LSTM [41] replaces the forget gate of PredRNN [37] with an attention mechanism. Wu et al. proposed MotionRNN [42] and designed MotionGRU to jointly model overall motion trends and instantaneous changes. Other iterative, RNN-based prediction models include CMS-LSTM [43], MS-LSTM [44], PrecipLSTM [45], and MM-RNN [46]. Current RNN-based precipitation prediction models, however, pay little attention to meteorological features and do not make sufficient use of the meteorological characteristics inherent in precipitation.
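For reference, the ConvLSTM cell [35] replaces the fully connected transforms of FC-LSTM with convolutions (denoted by $*$ below), while $\circ$ denotes the Hadamard product; $X_t$, $H_t$, and $C_t$ are the input, hidden state, and cell state at time $t$:

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right)\\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right)\\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right)\\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right)\\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```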
In the field of attention-based prediction models, MFCA [47] introduced a multimodal data fusion method that merges data from different sources and uses cross-attention mechanisms to enable information interaction and feature integration across modalities. Tekin [48] incorporated attention mechanisms to help the model focus on important information when dealing with spatiotemporal data; by dynamically adjusting attention weights at each time step, the model better captures correlations in spatiotemporal data and thereby improves prediction performance. Li [49] introduced an integrated model combining a spatial–temporal attention network with a multi-layer perceptron; by combining these two types of neural networks, the model leverages their respective advantages in capturing spatiotemporal information and nonlinear features, improving the accuracy and robustness of meteorological forecasts. Self-Attention ConvLSTM [35] introduces a self-attention memory module into ConvLSTM to strengthen the model's ability to model spatiotemporal data, capturing long-range dependencies and important features more effectively.
In the realm of multi-scale prediction models, Broad-UNet [50] employs an innovative multi-scale feature learning approach, adaptively learning and utilizing information from different scales. By learning and integrating multi-scale feature representations at different levels, the model can comprehensively understand the spatial and temporal structures of input data, thereby improving its ability to predict complex weather phenomena. TRU-NET [51], combining deep learning techniques with high-resolution rainfall data, introduces spatial and temporal attention mechanisms and multi-level feature learning methods to effectively capture complex spatial and temporal relationships.
5. Discussion
Through comparisons with state-of-the-art (SOTA) models and ablation experiments, the following observations can be made. Firstly, the model using Meteo-LSTM performs better in precipitation prediction than the traditional ST-LSTM [37], indicating that incorporating meteorological features into deep learning models is effective. The method utilizes spatial attention to learn terrain features during precipitation and differential attention to learn differences between adjacent time frames, fully exploiting meteorological features to improve precipitation prediction accuracy.
Secondly, the fusion mechanism proposed in this paper, based on the combination of self-attention and gating mechanisms, enhances the accuracy of precipitation prediction. This suggests that the mechanism retains the parts of the fine-scale prediction image that are consistent with the overall trend and discards the parts that deviate from it.
Finally, combining Meteo-LSTM with the fusion mechanism based on self-attention and gating mechanisms achieves more accurate precipitation forecasts.
While this study has further enhanced the accuracy of precipitation forecasting, it is important to note that the Z-R formula is an empirical relationship derived from precipitation statistics, so errors remain between the predicted and observed precipitation values. As a next step, we aim to explore deep learning methods for building end-to-end precipitation forecasting models to replace the traditional Z-R relationship. Additionally, we plan to incorporate other meteorological parameters, such as wind, into the precipitation forecasting model to improve the accuracy of radar echo variation predictions.
6. Conclusions
Currently, there is still considerable room for improvement in deep learning-based radar echo extrapolation methods, and the meteorological characteristics inherent in precipitation often receive insufficient attention. Precipitation exhibits intrinsic meteorological features. Spatially, terrain has a significant impact, triggering dynamic and thermodynamic effects in the atmosphere and leading to diverse precipitation patterns in different regions. Temporally, convective activity and airflow variations preceding precipitation may persistently influence later stages of the precipitation. In addition to overlooking meteorological features, traditional predictions struggle to handle local image regions that deviate from the overall motion trend. Therefore, to fully leverage meteorological features and address the deviation between local and global trends, we designed METEO-DLNet, a short-term precipitation forecasting network based on meteorological features and deep learning. From our experiments, the following conclusions were drawn.
Firstly, the proposed Meteo-LSTM effectively learns the spatial and temporal features of radar echo images using spatial and differential attention, fully utilizing meteorological features and enhancing precipitation prediction accuracy in both the temporal and spatial dimensions.
Secondly, the fusion mechanism based on self-attention and gating mechanisms designed in this paper guides local images to follow the overall trend and enhances the prediction accuracy of radar echo images.
Thirdly, both quantitative and qualitative experiments indicate that the proposed short-term precipitation forecasting network (METEO-DLNet) outperforms current mainstream deep learning-based precipitation forecasting models on this challenging natural spatiotemporal sequence prediction problem.