1. Introduction
Sea Surface Temperature (SST), usually referring to the temperature of the water from 1 mm to 20 m below the sea surface, is closely related to Extreme Hydrological Events (EHEs) [1], extreme rainfall [2], Tropical Cyclones (TCs) [3] and many other ecological and climatic changes. Accurate SST prediction can help us monitor global climate, forecast SST anomalies such as the El Niño phenomenon, and forecast extreme hydrological events such as extreme rainfall and tropical cyclones. Therefore, SST prediction is an important fundamental research problem in marine science. Over the past decade, numerous methods have been proposed for SST prediction, and they can generally be divided into physical models, traditional machine learning methods and deep learning models. Physical models use mathematical formulas to model the variations of SST. For example, Peng et al. [4] predict the seasonal SST by using a physical model based on CMIP6 [5], the sixth phase of the Coupled Model Intercomparison Project. Traditional machine learning methods analyze the statistical properties of historical SST data to obtain a mapping between historical SST data and the SST to be predicted. For example, the linear regression model [6] and the Support Vector Machine (SVM) model [7] have been used to predict SST. In recent years, deep spatio–temporal prediction models have also been widely used for SST prediction and achieve higher prediction accuracy than physical models and traditional machine learning methods. For example, Wei et al. [8] used the Multi-Layer Perceptron (MLP) model to predict SST in the South China Sea, and Xiao et al. [9] used the ConvLSTM model to capture the spatial and temporal dependencies of SST by combining the Convolutional Neural Network (CNN) model [10] and the Long Short-Term Memory (LSTM) model [11]. Qiao et al. [12] proposed the 3DCNN-LSTM-AT model, combining 3D CNN, LSTM and an attention mechanism to predict the SST in the Bohai Sea and the South China Sea, where the attention mechanism could reduce the large error in long-term SST prediction. Zhang et al. designed the Memory Graph Convolutional Network (MGCN) [13], composed of alternating graph convolutional layers and memory layers, to solve the problem that there are no valid SST observation values in land and island regions. In addition, Hou et al. designed the D2CL model [14], which uses a dilated convolutional network and an LSTM network to learn the spatial and temporal features of SST at different scales.
Most existing methods for SST prediction focus on one sea area of interest, such as the Indian Ocean [6], tropical Atlantic [7], South China Sea [8,12,14] and East China Sea [9,13,14], to conduct the prediction. However, SST is usually affected by geographic location, ocean currents and sea depth, solar radiation, land climate change, wind direction [15,16], etc., and the SST changing pattern differs from region to region.
Figure 1 illustrates the global ocean currents (https://beachapedia.org/File:Ocean_currents.gif (accessed on 12 March 2023)), which greatly affect SST. Ocean currents have strong regional characteristics, which in turn give SST certain regional characteristics. Moreover, SST correlations across diverse ocean regions are substantial, suggesting that a shared predictive model could harness these relationships to improve the prediction accuracy. Such a model not only benefits from learning inter-region correlations but also accommodates distinct SST patterns shaped by geographical features. Thus, a wide-area SST prediction method holds promise for both accurate SST prediction and capturing region-specific variations, and it is an interesting research topic to explore the regionality of SST and apply the results to further learn the dynamics of SST and improve SST prediction. In practice, unfortunately, such regionality information is not available for global SST. Existing information regarding ocean currents is only a qualitative representation of the regionality and cannot be directly embedded into data-driven models for SST-related downstream tasks such as SST prediction, SST anomaly detection and SST causal analysis. A quantitative analysis of the regionality of SST is therefore necessary.
There are already some studies exploring the regionality of SST, ocean climate and other ocean factors. Kumar et al. [17] introduced K-means [18] to cluster long-term global SST, and analyzed the relationship between clustering results and meteorological indices. Steinbach et al. [19] used the Shared Nearest Neighbor (SNN) clustering algorithm to discover Ocean Climate Indices (OCIs), which are important tools for analyzing the ocean's impacts on land climate. To automatically identify coastal upwelling from SST data, Nascimento et al. [20] proposed a new clustering algorithm based on the Seed Region Growing (SRG) method. Zahraie et al. [21] used a Genetic Algorithm (GA) [22] and the K-means clustering method [18] to cluster SST data and found the most relevant geographical zones for precipitation prediction. Considering the uncertainty of the global SST, Qin et al. [23] used an improved type-2 fuzzy clustering method based on fuzzy theory to cluster SST data. All the methods above use the original SST observations for clustering. However, due to the complex periodicity, disturbing noise and redundant information of SST data, direct clustering based on original SST data cannot well capture the deep spatio–temporal dependencies in SST to extract precise regionality information.
To better uncover the regionality of global SST, we propose the Multi-Stage Spatio–Temporal Clustering (MuSTC) method to quantitatively identify the sea areas with similar SST patterns.
2. Study Areas and Datasets
There are already many studies on SST prediction, such as the Convolutional Model [24], LSTM [25], the B-spline interpolation and spatiotemporal attention mechanism [26], the Memory Graph Convolutional Network (MGCN) [13] and D2CL [14]. Most of these studies aim to predict the future daily or weekly mean SST on a large spatial scale. In this work, we follow the setting of these related studies.
The data used in this work are from the Version 2 daily Optimum Interpolation Sea Surface Temperature (OISST V2) analysis (https://www.ncei.noaa.gov/products/optimum-interpolation-sst (accessed on 20 June 2022)) of the National Oceanic and Atmospheric Administration (NOAA). By combining bias-adjusted observations from different platforms such as satellites, ships and buoys, and filling in gaps through interpolation, this dataset offers a complete ocean temperature field on a regular global grid. Among many data sources, the satellite data of the Advanced Very-High Resolution Radiometer (AVHRR) is the main source used in the OISST V2 dataset, and the high spatio–temporal coverage of AVHRR lays the foundation for the data integrity of the dataset. To be specific, this dataset provides global daily SST data with a spatial resolution of 0.25° × 0.25° from 1981 to the present. The spatial coverage is 0.125° E–0.125° W, 89.875° S–89.875° N.
As shown in Figure 2, we set three large study areas, i.e., the North Pacific Ocean (NPO, 150° E–130° W, 40° N–66° N), the South Atlantic Ocean (SAO, 50° W–20° E, 0°–45° S) and the North Atlantic Ocean (NAO, 90° W–0°, 0°–75° N). The SST data from 2002 to 2021 were selected for analysis.
Considering the large number of grid ocean regions of size 0.25° × 0.25° in each study area and the high computation cost, we reduce the spatial resolution to 1° × 1°. This will not greatly affect the results of regionality analysis since the SST of adjacent locations usually does not change much in a small range. The SST of each coarse-grained grid region of size 1° × 1° is obtained by averaging all the SST records of the 16 (4 × 4) grid regions of size 0.25° × 0.25° covered by it. After reducing the resolution and removing the land regions, the datasets of the North Pacific Ocean, South Atlantic Ocean and North Atlantic Ocean contain 1708, 2711 and 5238 grid regions of size 1° × 1°, respectively.
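The coarsening step described above can be sketched as follows. This is a minimal illustration rather than the processing code used in this work: the function name `coarsen` is hypothetical, and marking land cells with NaN and averaging only the ocean cells of a partly covered block are assumptions, since the handling of mixed land/ocean blocks is not specified here.

```python
import math

def coarsen(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    Assumption of this sketch: land cells are marked with NaN and skipped;
    a coarse cell that contains no ocean cell at all stays NaN.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            # collect the valid (ocean) values of the current block
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)
                     if not math.isnan(grid[i][j])]
            out_row.append(sum(block) / len(block) if block else math.nan)
        coarse.append(out_row)
    return coarse
```

Applied to one daily 0.25° × 0.25° field, this yields the corresponding 1° × 1° field; coarse cells that remain NaN would then be dropped as land regions.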
3. Methods
MuSTC first learns the representation of long-term SST with a deep temporal encoder. Then, with the learned representation, MuSTC calculates the spatial correlation scores between grid ocean regions with self-attention. Finally, MuSTC clusters grid ocean regions based on the original SST data, encoded long-term SST representation and spatial correlation scores, respectively, to obtain the sea areas with similar SST patterns from different perspectives. Since there are no explicit targets for training MuSTC, we reconstruct SST data using the outputs of the self-attention and minimize the difference between the reconstructed SST data and the original SST data during the training.
To evaluate the effectiveness of the proposed method, we applied MuSTC in three ocean areas, i.e., the North Pacific Ocean, the South Atlantic Ocean and the North Atlantic Ocean, and the clustering results, especially the results based on spatial correlation scores, generally match the distribution of global ocean currents. In addition, we integrate the learned regionality information into two representative spatio–temporal prediction models, and the notable improvement in SST prediction accuracy also indicates that our MuSTC method can truly capture the regionality of SST.
3.1. Problem Statement
To discover the regionality of SST, we aim to cluster the $N$ grid ocean regions in each study area into multiple groups such that the regions in each group have similar SST patterns. In other words, with the SST data $X \in \mathbb{R}^{N \times T}$ for $N$ grid regions, where $T$ is the length of time (in days), we wish to find a method $f$ that generates clustering labels for each grid region, i.e.,

$$Y = f(X), \quad Y \in \{1, \dots, C\}^{N},$$

where $C$ is the specified number of clusters and $Y$ represents the clustering results of the $N$ grid regions.
Table 1 lists the notations that will be frequently used in this work.
3.2. Multi-Stage Spatio–Temporal Clustering Method
Figure 3 shows the structure of the MuSTC method, which consists of a temporal encoder module, a self-attention module, two fully connected layers (FC) and an SST cluster generation module. First, the temporal encoder module encodes the original SST data to obtain the encoded long-term SST representation, and the self-attention module uses this representation to learn the correlations between different grid ocean regions. The temporal encoder module and self-attention module are connected by an FC layer. Then, another FC layer is introduced to obtain the reconstructed SST data with the same shape as the original SST data. Finally, the SST cluster generation module uses the original SST data, encoded SST representation and attention matrix to perform cluster analysis for extracting the regionality information of SST. In addition, we integrate the learned regionality information into SST prediction models to verify the effectiveness of our MuSTC method. The technical details of the modules in MuSTC and the methods for verifying its validity are elaborated below.
3.3. Temporal Encoder Module
To capture the temporal characteristics of each grid ocean region, we design a temporal encoder $e$ based on Time2Vec [27] to encode the SST data $X$, i.e.,

$$H = e(X),$$

where $H \in \mathbb{R}^{N \times d_e}$ represents the encoded SST representation and $d_e$ is the size of the encoded feature dimension.
The temporal encoder can well capture the linear variation trend of SST data and automatically capture the periodicity of different granularities.
Figure 4 presents the structure of the temporal encoder, which contains $k$ units.
Concretely, the $i$th unit is calculated as follows:

$$H_i = \begin{cases} X\omega_i + \varphi_i, & i = 1, \\ \sin(X\omega_i + \varphi_i), & 2 \le i \le k, \end{cases}$$

where $H_i$ represents the encoded data of the $i$th unit, $X$ is the input SST data, and $\omega_i$ and $\varphi_i$ are learnable parameters.
Concatenating the outputs of all $k$ units generates the encoded SST representation $H$, i.e.,

$$H = [H_1, H_2, \dots, H_k],$$

where $d_e$ is the number of encoded features in $H$ and $T$ is the length of the historical SST data in $X$.
From the above formula, it can be seen that a linear transformation unit (i.e., the 1st unit) is used in the temporal encoder module to capture the linear trend in SST data, and the other sine transformation units (i.e., the 2nd unit to the $k$th unit) with different $\omega_i$ and $\varphi_i$ are used to capture the periodic changes of SST data at different granularities and learn the long-term non-linear dependencies in the SST data.
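To make the role of the linear and sine units concrete, the following is a minimal sketch of a Time2Vec-style encoder for a single grid region. It is one possible reading, not the implementation used in this work: the function name `time2vec_encode`, the choice that each unit maps a length-T series to one feature, and the sizes (365 days, 16 units) are assumptions, and random parameters stand in for trained ones.

```python
import math
import random

def time2vec_encode(series, weights, biases):
    """Encode one region's SST series with k Time2Vec-style units.

    Unit 0 is linear; units 1..k-1 apply a sine to the same affine map.
    weights: k lists of length T, biases: k floats (learned in practice).
    """
    features = []
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = sum(wi * xi for wi, xi in zip(w, series)) + b
        features.append(z if i == 0 else math.sin(z))
    return features

# Hypothetical sizes: T = 365 days and k = 16 units.
rng = random.Random(0)
T, k = 365, 16
weights = [[rng.gauss(0.0, 0.01) for _ in range(T)] for _ in range(k)]
biases = [rng.gauss(0.0, 0.1) for _ in range(k)]
# Synthetic series with an annual cycle, used only as example input.
series = [15.0 + 5.0 * math.sin(2 * math.pi * t / 365.0) for t in range(T)]
encoded = time2vec_encode(series, weights, biases)  # k features for one region
```

The first feature is unbounded and can follow a trend, while the remaining features are bounded in [-1, 1] and respond to periodic structure at the frequencies set by their weights.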
3.4. Self-Attention Module
We designed a self-attention module to learn the underlying correlations between grid ocean regions. Self-attention is commonly used in Natural Language Processing (NLP) models to capture the relationship between words in a sentence and determine the importance of each word [28]. In our MuSTC method, the self-attention module tries to reconstruct SST from the encoded SST representation and decides how the output of each grid region is affected by the other grid regions in terms of attention scores. Then, the attention scores can be regarded as the degree of correlation between grid regions and provide abundant information for the subsequent regionality analysis of SST.
Figure 5 illustrates the structure of the self-attention module. First, according to Figure 3, the encoded SST representation $H$ (i.e., the output of the temporal encoder module) is processed by an FC layer to generate the input $Z \in \mathbb{R}^{N \times d_z}$ of the self-attention module, where $d_z$ is the feature dimension. Then, the output $O$ of the self-attention module is produced by the self-attention algorithm $r$:

$$O = r(Z), \quad O \in \mathbb{R}^{N \times d_o},$$

where $d_o$ is the feature dimension of the output of the self-attention module.
The self-attention module uses the query matrix $Q$ to represent the features that need to be matched, the key matrix $K$ to provide the features that can be matched and the value matrix $V$ to keep the features of the input $Z$ of the self-attention module. Meanwhile, $Q$ and $K$ are multiplied to obtain the attention matrix $A$, which captures the correlations between grid regions. The calculation process of the self-attention module is formally defined as follows:

$$Q = ZW_Q, \quad K = ZW_K, \quad V = ZW_V,$$

$$A = \frac{QK^{\top}}{\sqrt{d_k}}, \quad \tilde{A} = \mathrm{softmax}(A), \quad O = \tilde{A}V,$$

where $W_Q$, $W_K$ and $W_V$ are learnable parameters; $Q$, $K$ and $V$ are the query matrix, key matrix and value matrix, respectively, and $d_k$ is the feature dimension of the query matrix and key matrix. The attention matrix $A$ is normalized by a softmax operation, and the normalized attention matrix $\tilde{A} \in \mathbb{R}^{N \times N}$ will be used for the subsequent regionality analysis of SST.
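The computation of the normalized attention matrix can be sketched in plain Python as follows. This is an illustrative sketch assuming standard scaled dot-product attention [28]; the function names and the use of nested lists instead of a tensor library are choices made only to keep the example self-contained.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def self_attention(z, w_q, w_k, w_v):
    """Return (output, normalized attention matrix) for N grid regions.

    z is N x d_in; w_q and w_k are d_in x d_k; w_v is d_in x d_out.
    """
    q, k, v = matmul(z, w_q), matmul(z, w_k), matmul(z, w_v)
    d_k = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of K
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    attn = softmax_rows(scores)                   # N x N, each row sums to 1
    return matmul(attn, v), attn
```

Row n of the returned attention matrix describes how strongly the output of region n depends on every other region, which is the quantity later used as spatial correlation scores for clustering.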
Finally, according to Figure 3, another FC layer transforms the output $O$ of the self-attention module into the reconstructed SST data $\hat{X}$, which has the same shape as the original SST data $X$. MuSTC is trained by minimizing the difference between the original SST data $X$ and the reconstructed SST data $\hat{X}$, which will be discussed in Section 3.6.
3.5. SST Cluster Generation Module
To discover the regionality of SST, we aim to cluster the N grid ocean regions into multiple groups, such that each group of regions has similar SST patterns. To this end, we designed an SST cluster generation module to cluster grid ocean regions based on the original SST data, the encoded SST representation and the normalized attention matrix, respectively, and this generates the cluster labels for all N grid regions.
Two clustering algorithms, i.e., K-means clustering [18] and Agglomerative clustering [29], are introduced to achieve the clustering operation. K-means clustering and Agglomerative clustering are two different types of clustering algorithms, and using them together can make the analysis results more convincing.
The goal of K-means clustering is to divide the input data samples into the specified number of categories, so that the data samples within the same category are as similar as possible, while the data samples in different categories are as different as possible. K-means first randomly initializes the clustering centers. Then, it calculates the distance between each data sample and each clustering center and assigns each data sample to the nearest cluster. After that, it recalculates the center of each new cluster and repeats the assignment operation until there are no more updates. K-means is relatively simple to implement and can scale to large datasets with guaranteed convergence (to a local optimum).
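The K-means procedure described above can be sketched as follows. This is a minimal illustration of Lloyd's algorithm in plain Python, not the implementation used in the experiments; the function name `kmeans`, the seed and the iteration cap are assumptions of the sketch.

```python
import random

def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Cluster points (lists of floats) with Lloyd's algorithm; return labels."""
    rng = random.Random(seed)
    # random initialization: pick n_clusters distinct samples as centers
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # assignment step: nearest center by squared Euclidean distance
        new_labels = [
            min(range(n_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:      # no assignment changed: converged
            break
        labels = new_labels
        # update step: move each center to the mean of its members
        for c in range(n_clusters):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:               # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In the setting of this work, each point would be the feature vector of one grid region, e.g., its original SST series, its encoded representation or its row of the normalized attention matrix.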
Agglomerative clustering is a hierarchical clustering algorithm that uses a tree structure to gradually aggregate data into the specified number of categories. Concretely, Agglomerative clustering first sets each data sample to be a separate cluster. Then, it calculates the distance between each pair of clusters and merges the two clusters with the shortest distance to form a new, bigger cluster, where the distance between two clusters is defined as the minimum distance between the samples in the two clusters. The merge operation is repeated until exactly the specified number of clusters is left. Agglomerative clustering is suitable for clusters of different shapes and sizes.
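The merge procedure described above, with the cluster distance defined as the minimum sample distance (single linkage), can be sketched as follows. This is a deliberately simple illustration with a hypothetical function name; it recomputes all pairwise distances at every merge and is therefore only suitable for small examples.

```python
def agglomerative_single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain; return labels."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    clusters = [[i] for i in range(len(points))]   # each sample starts alone
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance of the closest pair of members
                d = min(dist2(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))        # b > a, so index a is stable
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

Unlike K-means, this procedure has no random initialization, so repeated runs on the same input give the same partition.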
3.6. Optimizing the MuSTC
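Since there are no explicit targets for MuSTC to conduct model training, we adopt the idea of data reconstruction to optimize MuSTC. Given the original SST data $X$ and the reconstructed SST data $\hat{X}$, the loss function is:

$$\mathcal{L} = \mathrm{MAE}(X, \hat{X}) = \frac{1}{NT}\sum_{n=1}^{N}\sum_{t=1}^{T}\left|X_{n,t} - \hat{X}_{n,t}\right|,$$

where $\mathrm{MAE}(\cdot, \cdot)$ calculates the mean absolute error (MAE) between $X$ and $\hat{X}$.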
The closer the SST data reconstructed by the self-attention module from the encoded long-term SST representation is to the original SST data, the more comprehensive the SST information contained in the encoded SST representation and the attention matrix, and therefore the better the cluster analysis based on them reflects the regionality of SST.
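For reference, the reconstruction objective can be written as a small function. This is a plain-Python sketch with a hypothetical name; in practice the loss would be evaluated by the training framework on batches of samples.

```python
def mae(original, reconstructed):
    """Mean absolute error over all regions and time steps (lists of rows)."""
    total = 0.0
    count = 0
    for row_x, row_r in zip(original, reconstructed):
        for x, r in zip(row_x, row_r):
            total += abs(x - r)
            count += 1
    return total / count
```

Because the error is averaged over every region and every day, no region-specific target is needed, which is what allows MuSTC to be trained without labels.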
3.7. Application of SST Regionality Information
We integrated the learned regionality information into SST prediction models to improve the prediction accuracy, which further verifies the effectiveness of the MuSTC method. We chose two advanced spatio–temporal prediction models, i.e., the Spatio–Temporal Graph Convolutional Network (STGCN) [30] and the Adaptive Graph Convolutional Recurrent Network (AGCRN) [31], to conduct the SST prediction.
3.7.1. Spatio–Temporal Graph Convolutional Network
The STGCN model was initially proposed for predicting traffic flow by combining graph convolution and gated temporal convolution. The STGCN model consists of two ST-Conv blocks and one output block. Each ST-Conv block consists of two Gated Temporal Convolution layers with a Graph Convolution layer sandwiched between them. The Graph Convolution layer is implemented by Graph Convolution and Residual Connection. In this work, the graph structure data is obtained based on the spatial adjacency relation between grid ocean regions. The Gated Temporal Convolution layer is implemented by a Gated Linear Unit (GLU) and uses Causal Convolution. The original output block uses a Gated Temporal Convolution layer and two fully connected layers with an activation function between them to achieve single-step prediction.
In this work, we modified the structure of the output block to incorporate the SST regionality information and enable multi-step SST prediction. The results of the Gated Temporal Convolution layer can be represented by a matrix $S \in \mathbb{R}^{N \times d_s}$, where $d_s$ is the number of encoded feature dimensions. The STGCN model uses a linear transformation $g_1$, an activation function $\sigma$ (sigmoid is used in our experiments) and another linear transformation $g_2$ for multi-step prediction, and outputs the prediction results $P \in \mathbb{R}^{N \times T_p}$, where $T_p$ is the length of time (in days) for prediction. The formula is as follows:

$$P = g_2(\sigma(g_1(S))).$$

We use one-hot encoding to encode the clustering results of each grid region to obtain $E \in \{0, 1\}^{N \times C}$, where $C$ denotes the number of clusters. Then, we concatenate $S$ and the one-hot encoding data to obtain $S'$:

$$S' = [S, E] \in \mathbb{R}^{N \times (d_s + C)}.$$

Finally, the output of the multi-step prediction is $P'$, i.e.,

$$P' = g_2(\sigma(g_1(S'))).$$
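The concatenation of encoded features with one-hot cluster codes can be sketched as follows. The function names are hypothetical and the features are plain nested lists; in the prediction models the same operation would be a tensor concatenation along the feature axis.

```python
def one_hot(labels, n_clusters):
    """Turn integer cluster labels into one-hot rows."""
    return [[1.0 if c == label else 0.0 for c in range(n_clusters)]
            for label in labels]

def append_regionality(features, labels, n_clusters):
    """Concatenate each region's features with its one-hot cluster code."""
    codes = one_hot(labels, n_clusters)
    return [list(f) + code for f, code in zip(features, codes)]
```

For example, two regions with two features each and labels 1 and 0 out of three clusters become rows of length five, so the first linear transformation of the output block must accept the enlarged input width.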
3.7.2. Adaptive Graph Convolutional Recurrent Network
Different from the previous graph convolution models, AGCRN proposes Node Adaptive Parameter Learning to solve the problem that all nodes share parameters and cannot learn the unique patterns of each node. Since each node has a set of special feature transformation patterns, the model is difficult to train and easy to overfit due to the large number of parameters. The authors of the AGCRN model use matrix factorization to reduce the number of parameters. In addition, AGCRN also introduces the idea of Data Adaptive Graph Generation to construct a dynamic graph, and it constantly updates the connection relationship between nodes in a dynamic way.
AGCRN uses the graph convolutions modified in the above way as an encoder to capture the spatio–temporal dependencies in historical sequences, and it then performs a convolution to make multi-step predictions. In this work, however, the spatial dependence of historical sequences is described by both the representation learned by the encoder and the SST regionality information. Therefore, we concatenate the two types of representation to perform multi-step SST prediction.
The AGCRN model uses a modified graph convolution module to encode the SST data and obtains the original encoded information matrix $G \in \mathbb{R}^{T_h \times N \times d_g}$, where $T_h$ represents the length of the time series input to the prediction model and $d_g$ is the number of encoded feature dimensions. Then, AGCRN uses the data at the last time step of the encoded information matrix, i.e., $G_{T_h} \in \mathbb{R}^{N \times d_g}$, to perform a convolution for multi-step prediction and outputs the prediction results $P \in \mathbb{R}^{N \times T_p}$ for a prediction length of $T_p$ days, i.e.,

$$P = \mathrm{Conv}(G_{T_h}).$$

Similar to the STGCN model, we concatenate $G_{T_h}$ and the one-hot encoded SST regionality information $E \in \{0, 1\}^{N \times C}$ for $C$ clusters to obtain $G'_{T_h}$, i.e.,

$$G'_{T_h} = [G_{T_h}, E].$$

Finally, a convolution operation is used to perform the multi-step SST prediction, and the prediction results are $P'$, i.e.,

$$P' = \mathrm{Conv}(G'_{T_h}).$$
4. Experiments and Results
4.1. The Settings of Experiments
For learning long-term SST representation and generating spatial correlation scores, we sample historical SST data with a sliding window size of 365 days and a step size of 28 days to generate data samples. In this case, each data sample has 365 consecutive daily SST records. The generated data samples are then divided into training dataset, validation dataset and test dataset, according to the ratio of 3:1:1. The training dataset is used to learn the correlations between grid ocean regions and the validation dataset is used to validate such correlations. The test dataset is used to prove that the MuSTC method can also be applied for unknown data samples and has high generality. In addition, the encoded SST representation and spatial correlation scores for clustering analysis are from the validation dataset. In the experiments, the hyperparameters , , and are set to 16, 32, 16 and 16, respectively. For each hyperparameter, we used a grid search (i.e., enumeration) to try out values of 4, 8, 16 and 32 and found the best combination that balances performance and accuracy based on the experiment results. Meanwhile, the learning rate is set to , the batch size is set to 4 and the MAE is selected as the loss function.
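The sample generation and splitting described above can be sketched as follows. The function names are hypothetical, and splitting the samples in chronological order is an assumption of this sketch, since only the 3:1:1 ratio is stated.

```python
def window_starts(n_days, window=365, step=28):
    """Start indices of all full windows that fit into n_days of data."""
    return list(range(0, n_days - window + 1, step))

def split_3_1_1(samples):
    """Split samples 3:1:1 in their given order (ordering is an assumption)."""
    n_train = len(samples) * 3 // 5
    n_val = len(samples) // 5
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```

Each start index `s` defines one sample covering days `s` to `s + 364`. If the 2002 to 2021 record is taken as 7305 consecutive days, `window_starts(7305)` yields 248 overlapping windows; this count follows from the stated window and step sizes and is not a figure reported in this work.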
For SST cluster generation, the North Pacific Ocean, the South Atlantic Ocean and the North Atlantic Ocean are clustered into 4, 5 and 7 clusters, respectively, using K-means clustering and Agglomerative clustering. The number of clusters is initially determined by the number of ocean currents in the area and then fine-tuned according to the quality of the clustering results.
For SST prediction, we also divide the SST data into training set, validation set and test set, with a ratio of 3:1:1, and predict the SST of the next 3 days with the historical SST data of 7 days. The STGCN model uses all the data and the AGCRN model samples a set of data every 10 days, since AGCRN is more complex than STGCN and requires higher cost to conduct the training. In the experiments, the learning rate of the AGCRN model is set to 0.003, the batch size is set to 4, and MAE is selected as the loss function. Meanwhile, the learning rate of the STGCN model is set to 0.001 and the batch size is set to 32.
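The construction of prediction samples (7 days of history, 3 days to predict) can be sketched in the same way. The function name and the `stride` parameter are hypothetical; a stride of 1 corresponds to using all the data, and a stride of 10 is one way to read the sub-sampling applied for AGCRN.

```python
def make_samples(series, n_in=7, n_out=3, stride=1):
    """Pair each n_in-day history with the following n_out days."""
    samples = []
    for start in range(0, len(series) - n_in - n_out + 1, stride):
        history = series[start:start + n_in]
        target = series[start + n_in:start + n_in + n_out]
        samples.append((history, target))
    return samples
```

Here `series` stands for the daily sequence of one region; for gridded data each element would be the full SST field of one day, and the slicing logic stays the same.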
We use four GeForce RTX 3090 graphics cards for training the MuSTC and SST prediction models.
4.2. Results
4.2.1. Results of Regionality Analysis
Figure 6 and Figure 7 illustrate the clustering results for the North Pacific Ocean using K-means clustering and Agglomerative clustering, respectively. According to the illustration, the clustering results are generally consistent with the flow directions of the Oyashio Cold Current, N. Pacific Current and Alaska Warm Current. Compared with K-means clustering, the boundary of the Agglomerative clustering results is smoother.
Figure 8 and Figure 9 illustrate the clustering results for the South Atlantic Ocean using K-means clustering and Agglomerative clustering, respectively. The clustering results match the flow directions of the S. Equatorial Current, Brazil Warm Current, Benguela Cold Current and South Atlantic Current. Compared with K-means clustering, the region division of the Agglomerative clustering results is more reasonable and there are fewer outliers.
Figure 10 and Figure 11 illustrate the clustering results for the North Atlantic Ocean using K-means clustering and Agglomerative clustering, respectively. According to the illustration, the clustering results are generally consistent with the flow directions of the E. Greenland Cold Current, Norwegian Warm Current, Labrador Cold Current, Gulf Stream Warm Current, N. Atlantic Drift Current, Canary Cold Current and N. Equatorial Current.
According to the clustering results in three sea areas, Agglomerative clustering has better performance than K-means clustering, e.g., smoother boundary of clustering results and fewer misclassified grid regions.
Meanwhile, the clustering results based on long-term SST representation and spatial correlation scores are better than the clustering results based on the original SST data. The clustering results based on the original SST data are basically similar to the division of latitudes. In contrast, the clustering results based on spatial correlation scores can well capture deeper ocean features such as currents. In addition, the clustering results based on spatial correlation scores are smoother between clusters and can well match the ocean currents that the other two data cannot match.
4.2.2. Results of SST Prediction
We integrate the regionality information learned by the MuSTC method into two spatio–temporal prediction models, i.e., STGCN and AGCRN, to conduct SST prediction.
Table 2 and Table 3 present the prediction results in the three ocean areas. After integrating regionality information, the error of the STGCN model is reduced by 3.14%, 1.61% and 1.80% in the North Pacific Ocean, the South Atlantic Ocean and the North Atlantic Ocean, respectively, in terms of MAE. In terms of RMSE, the reduction is 1.95%, 1.39% and 1.28% in the three oceans. For the AGCRN model, the MAE is reduced by 1.63% and 1.07% in the North Pacific Ocean and the North Atlantic Ocean, respectively, and increases slightly by 0.05% in the South Atlantic Ocean. The reduction of RMSE is larger, i.e., 4.94%, 0.74% and 1.43% in the three oceans, respectively.
According to the results of SST prediction, the integration of regionality information improves the SST prediction accuracy of both the STGCN and AGCRN models in almost all cases, the only exception being the slight MAE increase of the AGCRN model in the South Atlantic Ocean. This indirectly indicates that the MuSTC method can well capture the regionality information of SST.
Figure 12 visualizes the prediction results of the AGCRN model.
5. Discussion
According to the results of the experiments, there are obvious regionality characteristics of global SST, and such regionality characteristics are highly related to latitude and ocean currents. We studied three ocean areas, i.e., the North Pacific Ocean, the South Atlantic Ocean and the North Atlantic Ocean. In the North Pacific Ocean, the regionality information learned by the MuSTC method successfully captures the geographic information, such as the Aleutian Islands, and the ocean current information, such as the Oyashio Cold Current, the N. Pacific Current and the Alaska Warm Current. As shown in Figure 13, in the South Atlantic Ocean, the learned regionality information correctly matches the Benguela Cold Current. As shown in Figure 14, in the North Atlantic Ocean, the learned regionality information captures the deep influence of the Labrador Cold Current on SST.
In the experiments, we integrated the regionality characteristics into the SST prediction models, which improved the prediction accuracy. The inclusion of regionality features may likewise benefit other downstream tasks such as SST anomaly detection and chlorophyll concentration prediction. In addition, due to the high similarity of SST within the same cluster region, the biological systems in the same region are also likely to be correlated, and the results of regionality analysis can benefit biological protection, fishery resource utilization, etc.
6. Conclusions
In this work, we proposed the MuSTC method to achieve quantitative analysis of the regionality of SST. MuSTC consists of a temporal encoder module, a self-attention module and an SST cluster generation module, where the temporal encoder module learns the long-term SST representation, the self-attention module learns the spatial correlation scores between grid regions and the SST cluster generation module performs cluster analysis based on original SST data, encoded long-term SST representation and spatial correlation scores, respectively. According to the results of experiments, the clustering results of the MuSTC method generally match the distribution of the ocean currents, and the clustering results based on spatial correlation scores achieve the best performance.
We also demonstrate the validity of the MuSTC method by integrating the learned regionality information into two spatio–temporal prediction models, i.e., STGCN and AGCRN. For the STGCN, the integration reduces the RMSE by 1.95%, 1.39% and 1.28% in the North Pacific Ocean, the South Atlantic Ocean and the North Atlantic Ocean, respectively, and for the AGCRN, the reductions are 4.94%, 0.74% and 1.43%, which indicates that our quantitative analysis of SST regionality is effective.
Our method is general and could be used to enhance most data-driven SST prediction models. Considering that the original SST prediction models have already achieved a very high accuracy, the further improvement brought by our method is still valuable. In addition, our proposed MuSTC method is not limited to enhancing SST prediction models and is also applicable to the spatio–temporal prediction models of other oceanographic parameters. For example, it can be used to improve data-driven salinity prediction and chlorophyll-a prediction models.
Overall, the contributions of this work are threefold. First, we proposed the MuSTC method, which learns the representation of long-term SST and spatial correlation scores between grid regions with a deep temporal encoder module and a self-attention module, sequentially, and clusters grid ocean regions based on the original SST data, encoded SST representation and spatial correlation scores, respectively, to uncover the quantitative regionality of SST. Second, we integrated the regionality information of SST into the spatio–temporal SST prediction models to enhance the prediction performance. Third, we conducted extensive experiments over multiple datasets, and the results indicate that the learned regionality information of SST matches the distribution of global ocean currents and can be used to improve the accuracy of existing SST prediction models.
In this work, SST regionality analysis and SST prediction are conducted separately, which may hinder the transmission of information between the two tasks. Therefore, in future work we plan to develop a unified model to combine the two tasks and achieve collaborative optimization. In addition, applying the regionality information of SST to help address other ocean issues, e.g., climate analysis and global warming prevention, is also a promising direction.