1. Introduction
Solar wind is a continuous, high-speed stream of charged particles emanating from the Sun that travels through the solar system and causes significant and widespread effects on Earth. In addition to endangering astronaut safety, these effects include disruptions in power grids, satellite communications, and navigation systems. Therefore, it is essential to forecast near-Earth solar wind conditions precisely in order to mitigate the adverse impact of space weather on various human activities and daily life. Reliable solar wind forecasts play a vital role in establishing a comprehensive space weather prediction system, enhancing the ability to provide timely alerts and effective responses to sudden changes in the space environment. This capability is essential for safeguarding technological infrastructure and ensuring the smooth functioning of modern society.
Researchers have worked for many years to develop predictive models of solar wind variations. Contemporary predictive modeling methodologies fall into the following principal groups: (1) Physics-based modeling based on magnetohydrodynamics (MHD): Zhou et al. [1] used a three-dimensional MHD model to explore the propagation characteristics of coronal mass ejections (CMEs); Shen et al. [2] improved the total variation diminishing MHD model for the solar corona–interplanetary medium by enhancing new boundary conditions; Guo et al. [3] conducted numerical MHD simulations to study the relationship between interstellar shocks and solar wind events; the National Oceanic and Atmospheric Administration (NOAA) of the United States of America introduced the ENLIL model [4], while the Space Environment Modeling Centre (SEMC) released the Space Weather Modeling Framework (SWMF) [5]; Wu et al. proposed the 3-D MHD model [6], while the Hybrid Heliospheric Modeling System (HAMS) combines the 3-D MHD model [7] with the Hakamada–Akasofu–Fry (HAF) model. The COrona-INterplanetary (COIN) model developed by the SIGMA group at the Chinese Academy of Sciences (CAS) [8] represents an improvement on the Interplanetary Total Variation Diminishing (IN-TVD) model. (2) Empirical or semi-empirical modeling based on statistics: for example, Bussy et al. [9] established probability distribution functions linking the current solar wind and slope with the solar wind one solar rotation period in the future; the Wang–Sheeley–Arge (WSA) model proposed by Wang et al. [10] is based on the negative correlation between the observed solar wind speed and the coronal magnetic field expansion factor at the source surface; Arge et al. [11] improved the WSA model by introducing the correlation of the continuum function. Riley et al. [12] further refined the model by introducing an angular distance from the coronal hole boundary. In addition, the Potential Field Source Surface (PFSS) model [13] is another semi-empirical model that can be used to compute the coronal magnetic field and derive the fs and b parameters. Owens et al. [14] proposed a simple method based on the Sun’s 27-day rotational period for solar wind parameter prediction. In addition, Innocenti et al. [15] used a Kalman filter for data assimilation, which significantly improved the model performance compared to the baseline model, and Liu et al. [16] used a support vector machine algorithm to predict solar wind speed.
However, the irregular and intense solar wind features are shaped by irregular and vigorous solar activities and geomagnetic storms [17]. These pronounced irregularities lead to unpredictable oscillations and intermittent gaps, making the temporal structure complex and challenging to model using conventional methods. The introduction of deep learning models facilitates the identification of patterns in irregular solar wind properties and the production of precise forecasts.
The development of machine learning techniques has led to further optimization of solar wind predictions. Machine learning models can adaptively learn and make inferences about new or future data and play a role in long-term prediction. Machine learning algorithms such as regression algorithms, Support Vector Machines (SVM) [18], Random Forest (RF) [19], Bayesian Additive Regression Trees (BART) [20], and K-Nearest Neighbor (KNN) [21] are widely used in the solar wind output forecasting and wind speed prediction fields. Support vector machines are widely used in solar wind speed [22] prediction, and researchers have improved them to adapt to the characteristics of solar wind data. Parallel SVM (PSVM) [23] and Least Square SVM (LSSVM) [24] are models that improve the robustness of SVM and the accuracy of solar wind output prediction. Lahouar et al. [25] used RF for hour-ahead solar wind output prediction, an approach that has the advantage of not requiring extensive tuning. Shi et al. [26] proposed a two-stage feature selection and decision tree restructuring method to improve the prediction accuracy, efficiency, and robustness of the RF model. Comparisons were made with the decision tree [27]. Wang et al. [28] used the RF algorithm for wind speed input feature selection, which simplified the model structure, reduced the training time, and improved the accuracy and generalization ability.
Common statistical time series prediction models include autoregressive (AR) [29], moving average (MA) [30], autoregressive moving average (ARMA) [31], and autoregressive integrated moving average (ARIMA) [32]. Poggi et al. [33] used the AR model in wind speed prediction. The study by Liu et al. [34] used ARMA for wind speed prediction, while Magadum et al. [35] used a calibrated ARMA model to improve the accuracy of short- and medium-term wind power prediction. However, these methods are mainly applicable to linear and stationary time series, making it difficult to deal with non-linear and non-stationary series. To overcome this problem, recurrent neural networks (RNNs) [36], especially popular variants such as long short-term memory (LSTM) [37] and gated recurrent unit (GRU) [38], have been introduced and perform well in time series forecasting. Backhus et al. [39] predicted the power output of different wind turbines using techniques such as LSTM. Additionally, RNN models based on attention mechanisms are widely used in time series forecasting tasks.
As a classic deep learning model, Transformer plays an indispensable role in long-term sequence prediction. Self-attention blocks, a cornerstone of Transformer [40], hold substantial innovative significance in natural language processing. Recent models, including Sparseformer [41] for sparse attention, Switchformer [42] for multimodal data, Compressive Transformer [43] based on Hadamard transform [44], and Linformer [45] with low-rank approximation, contribute to handling diverse data efficiently. These models advance natural language processing and broaden practical applications. Self-attention mechanisms [46] also play a crucial role in time series forecasting, and Transformer performs well in various prediction tasks. However, its limitations in handling long sequences led to the development of enhanced models. LogSparse Transformer [47] employs sparse attention mechanisms, Pyraformer [48] reduces complexity with a pyramid attention approach, Informer [49] uses ProbSparse self-attention with attention distilling, Autoformer [50] introduces series decomposition blocks and an auto-correlation mechanism, FEDformer [51] combines seasonal-trend decomposition with frequency-enhanced attention, and InParformer [52] incorporates interactive attention for enhanced temporal pattern extraction. These models advance time series forecasting, offering diverse solutions for practical applications.
To address the challenge of significant temporal irregularities in solar wind sequences, this study proposes a Transformer-based approach, MoCoformer, built on two components: the Multi-Mode Decomp Block and Mode Independence Attention. First, the Multi-Mode Decomp Block adapts to the dynamic features of solar wind data, extracting inherent patterns for adaptive decomposition and modeling. Mode Independence Attention then strengthens predictions by capturing relationships between time series. Together, these components form a long-term forecasting model that mitigates the impact of irregular features on solar wind prediction.
In summary, this study presents the following contributions:
Multi-Mode Decomp Block: Introducing a block named Multi-Mode Decomp for Multi-Mode decomposition of time series. This innovative method enhances the extraction of in-depth correlated information within the sequence, revealing latent patterns and structures in the solar wind data.
Mode Independence Attention: A self-attention module, Mode Independence Attention, is proposed that computes attention independently for each subsequence. This module is designed to focus on capturing the correlations in time series, enhancing the influence of valid features on prediction, and reducing the adverse effects of irregular features on overall forecasting accuracy.
Experimental Evaluations: Conducting extensive experiments on the solar wind dataset, the results showcase MoCoformer’s state-of-the-art performance in the realm of time series solar wind prediction.
3. Methods
This section provides a detailed explanation of the following: (1) the overall structure of the MoCoformer model, as illustrated in Figure 2; (2) the Multi-Mode Decomp Block; (3) the Mode Independence Attention, as illustrated in Figure 3.
As illustrated in
Figure 2, the Encoder in the MoCoformer model employs a multi-layer structure:
, where
represents the output of the second-layer Encoder. Each Encoder layer consists of a Regularity Correction Enhancement and Feed-Forward.
denotes the embedded historical sequence. The specification of the Encoder can be expressed as follows:
where
,
,
represents the n-th decomposed sequence after the i-th module in the first layer.
The Decoder also employs a multi-layer structure, for example,
, where
, indicating the output of the l-th Decoder layer. Each Decoder layer includes Regularity Correction Enhancement, Regularity Calibration Prediction, and Feed-Forward. The specifications of the Decoder can be represented as follows:
where
represents the m-th decomposed sequence after the i-th module in the first layer.
denotes the weight for the i-th decomposed component of
.
The final prediction result is obtained by summing all the decomposed components. Specifically, it is obtained by multiplying each decomposed component by its corresponding weight and then summing them.
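The aggregation step described above can be illustrated with a minimal Python sketch. The function name `combine_components` and the example numbers are hypothetical; in the model the weights are learned and the components are produced by the Decoder, neither of which is reproduced here.

```python
from typing import List, Sequence


def combine_components(components: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weighted sum of decomposed components (one weight per component)."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component")
    horizon = len(components[0])
    if any(len(c) != horizon for c in components):
        raise ValueError("all components must share the same length")
    # For every time step, multiply each component by its weight and sum.
    return [sum(w * c[t] for w, c in zip(weights, components))
            for t in range(horizon)]


# Two hypothetical components over a three-step horizon.
prediction = combine_components([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]], [0.5, 2.0])
```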
3.1. Multi-Mode Decomp Block
To explore the correlation information in solar wind time series, we introduce a time series decomposition module based on variational mode decomposition (VMD). The module is optimized with the Whale Optimization Algorithm (WOA) proposed by Mirjalili et al. [54], which simulates the foraging behavior of whale pods. It decomposes the historical time series into multiple component sequences, with each component sequence representing a unique temporal oscillation pattern. This Multi-Mode decomposition technique is an effective tool for revealing the complex internal structure of data, thereby enhancing the accuracy of feature interpretation and analysis.
Assume that the initial optimal position is the current prey’s location. Other predators initially move towards this target position and continuously update their positions:
\[ X(t+1) = X^{*}(t) - A \cdot D, \qquad D = \left| C \cdot X^{*}(t) - X(t) \right| \]
where $t$ represents the current iteration number, $X^{*}(t)$ represents the current optimal position, $X(t)$ represents the current position of the whale, and $A \cdot D$ represents the movement step length when enclosing prey during each iteration:
\[ A = 2a \cdot rand - a, \qquad C = 2 \cdot rand \]
where $rand$ represents a random number generated between [0, 1], and as the loop iteration increases, $a$, acting as the contraction factor, will linearly decrease from 2 to 0. Its expression is as follows:
\[ a = 2 - \frac{2t}{T_{max}} \]
where $T_{max}$ represents the maximum number of iterations set for the run.
Whales achieve continuous shrinking of the surrounding area during the prey search process through Equations (19) and (20). At the same time, each whale updates its distance from the target position through a spiral pattern, and the mathematical model simulating the spiral pattern update can be expressed as follows:
\[ X(t+1) = D' \cdot e^{bl} \cdot \cos(2\pi l) + X^{*}(t), \qquad D' = \left| X^{*}(t) - X(t) \right| \]
where $D'$ represents the distance between the i-th whale and the current optimal position, $b$ is a constant coefficient, and $l$ is a random number in the range of [−1, 1].
To synchronize the contraction of the surrounding and the spiral update, the selection update of the two methods is achieved through probability. The mathematical expression for this is as follows:
\[ X(t+1) = \begin{cases} X^{*}(t) - A \cdot D, & p < 0.5 \\ D' \cdot e^{bl} \cdot \cos(2\pi l) + X^{*}(t), & p \geq 0.5 \end{cases} \]
where $p$ is a random number in [0, 1].
When $|A| \geq 1$, randomly selecting whales to move away from the current optimal whale enhances the algorithm’s global search capability. The mathematical expression for this is as follows:
\[ D = \left| C \cdot X_{rand}(t) - X(t) \right|, \qquad X(t+1) = X_{rand}(t) - A \cdot D \]
where $X_{rand}$ represents the vector of the randomly selected whale’s position.
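The update rules described above (encircling, spiral update, and random exploration) can be sketched in Python as follows. This is a minimal illustration of the standard WOA formulation, not the implementation used in this study; the function name `woa_minimize`, the population size, the spiral constant `b = 1`, the boundary clipping, and the fixed seed are all assumptions made for the example.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def woa_minimize(fitness: Callable[[List[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_whales: int = 20,
                 max_iter: int = 200,
                 b: float = 1.0,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimal Whale Optimization Algorithm for minimization (sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(value: float, d: int) -> float:
        low, high = bounds[d]
        return min(max(value, low), high)

    whales = [[rng.uniform(low, high) for low, high in bounds]
              for _ in range(n_whales)]
    best = list(min(whales, key=fitness))
    best_fit = fitness(best)

    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter          # contraction factor, 2 -> 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore
                # around a randomly selected whale.
                target = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [target[d] - A * abs(C * target[d] - x[d])
                       for d in range(dim)]
            else:
                # Spiral update around the best position, l in [-1, 1].
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(best[d] - x[d]) * spiral + best[d]
                       for d in range(dim)]
            whales[i] = [clip(v, d) for d, v in enumerate(new)]
        # Update the best position after every whale has moved.
        for x in whales:
            f = fitness(x)
            if f < best_fit:
                best, best_fit = list(x), f
    return best, best_fit


def sphere(x: List[float]) -> float:
    """Toy objective used only to exercise the optimizer."""
    return sum(v * v for v in x)


best, best_fit = woa_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In the decomposition setting, the toy objective would be replaced by a fitness computed from a VMD run, and the two search dimensions would correspond to the VMD parameters being tuned.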
Multi-Mode Decomposition Steps The combined optimization of the VMD parameters k (the number of modes) and α (the penalty factor) is carried out through the following steps:
Initialize the parameters k and α as k = 8 and α = 2000, respectively, to avoid incomplete signal decomposition.
Perform VMD decomposition on the oscillation signal.
Calculate the fitness value for each combination of parameters and update the best fitness value whenever a better value is found.
Determine whether to terminate the iteration. If t is less than the maximum number of iterations, increment t by 1 and update the position of the whales. Otherwise, terminate the iteration and save the best results.
In summary, the Multi-Mode Decomp Block is improved using the WOA method to extract IMF components with minimum envelope entropy, enhancing its ability to capture deep-level intrinsic correlations in sequences. This enhancement boosts global search capabilities and robustness, aiding in better time series analysis and decomposition.
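The minimum-envelope-entropy criterion mentioned above can be sketched as follows. The sketch assumes that the amplitude envelope of each mode has already been computed (for example by Hilbert demodulation, which is not shown), and it uses the common definition in which the envelope is normalized to a probability distribution before the Shannon entropy is taken. The function names are hypothetical, and the exact fitness used in this study may differ.

```python
import math
from typing import Sequence


def envelope_entropy(envelope: Sequence[float]) -> float:
    """Shannon entropy of a non-negative envelope normalized to sum to 1."""
    total = sum(envelope)
    if total <= 0:
        raise ValueError("envelope must have positive total amplitude")
    probs = [a / total for a in envelope if a > 0]
    return -sum(p * math.log(p) for p in probs)


def min_envelope_entropy(envelopes: Sequence[Sequence[float]]) -> float:
    """Fitness of one decomposition: the smallest entropy over its modes."""
    return min(envelope_entropy(e) for e in envelopes)


# A flat envelope is maximally disordered; a single spike has zero entropy.
flat = envelope_entropy([1.0, 1.0, 1.0, 1.0])
spike = envelope_entropy([0.0, 0.0, 1.0, 0.0])
score = min_envelope_entropy([[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0]])
```

A lower value indicates a sparser, more regular mode, which is why the optimizer searches for the parameter combination that minimizes this quantity.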
3.2. Mode Independence Attention
Figure 3 illustrates the self-attention mechanism, which employs parallel connections to systematically reduce the impact of feature irregularities on predictions. This attention mechanism operates based on individual modes, calculating attention distribution independently. It assesses correlations among distinct sequences and assigns varying weights according to the correlation strength. Finally, we combine these weighted sequences to generate the final predicted sequence.
This model adopts the representation scheme of the classical Transformer model. The inputs consist of queries, keys, and values, denoted as
, respectively. In the cross-attention, the Decoder obtains queries through
, where
. The Encoder obtains keys and values through
and
, respectively, where
. The classical attention can be formalized as a dot product and softmax operation based on queries and keys. It calculates the weights between queries and keys, and then applies these weights to the corresponding values for weighted summation, resulting in the final context embedding.
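The classical attention described above, a scaled dot product between queries and keys followed by a softmax and a weighted sum over the values, can be written in a few lines of dependency-free Python. This is a sketch of the textbook operation only; the function names are hypothetical and the projections that produce the queries, keys, and values are omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    out: Matrix = []
    for qi in q:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        # Weighted sum of the value rows gives the context embedding.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


# A zero query attends equally to both keys, so the output is the mean value.
context = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```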
Query–Key Interaction In the computation of query and key, a linear projection matrix
is employed. Initially, the original key
undergoes projection to a key of dimension (
) through linear mapping, facilitating dimension transformation. Next, FFT transformations are performed on k and q to bring them into the frequency domain for further computation. The final linear self-attention operation, illustrated in Figure 3, incorporates a sequence of linear projections and scaled dot-product attention calculations.
SelectSeg Initially, m linear projection matrices
(where j = 1, 2, ..., m) are introduced for projecting the original value
to a projection value layer of dimension (
) through linear mapping. This process achieves dimension transformation, generating m-value matrices that have undergone diverse filtering.
where i represents the layer number, and j = 1, 2, ..., m represents the different projection matrices formed.
The m value matrices are input into the SelectSeg layer, where each matrix is partitioned into n subsequences of length l. The SelectSeg layer is a matrix of length l. For each subsequence, correlations with the other (m − 1) sets of values are computed using a correlation formula.
where
—where i, j = 1, 2, ..., m—the calculation is based on the similarity between two values, with a larger value indicating lower similarity. Similarly, for
, where q = 1, 2, ..., m and p = 1, 2, ..., n, the total similarity between each value matrix
in the n-th subsequence and others is computed. As the numerical value increases, it indicates a decrease in irregularity and a higher weightage. Weighted sums are calculated for all value matrices within the same subsequence, resulting in the final subsequence matrix. Finally, the concatenation of all prediction subsequences forms the ultimate value matrix
.
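The segment-wise weighting idea can be sketched as below. Because the correlation formula and the weighting rule are not recoverable from the text, the sketch makes two explicit assumptions: similarity is measured with the Pearson correlation, and the per-segment weights are obtained with a softmax over each series' mean correlation with the others. The function names `pearson` and `select_seg` are hypothetical, and this should be read as one plausible instance of correlation-weighted segment combination rather than the SelectSeg layer itself.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_seg(values: Sequence[Sequence[float]], seg_len: int) -> List[float]:
    """Combine m series segment by segment, weighting by mutual correlation."""
    m, length = len(values), len(values[0])
    if m < 2 or length % seg_len != 0:
        raise ValueError("need >= 2 series and a length divisible by seg_len")
    out: List[float] = []
    for start in range(0, length, seg_len):
        segs = [list(v[start:start + seg_len]) for v in values]
        # Mean correlation of each segment with the other (m - 1) segments.
        scores = [sum(pearson(segs[q], segs[r]) for r in range(m) if r != q) / (m - 1)
                  for q in range(m)]
        # Softmax turns scores into weights (assumed weighting rule).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.extend(sum(w * s[t] for w, s in zip(weights, segs))
                   for t in range(seg_len))
    return out


# The third series disagrees with the other two and is down-weighted.
mixed = select_seg([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]], 2)
same = select_seg([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]], 2)
```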
Formulating the Formula Following the outlined procedure, we derive the computational outcomes for query–key interactions and a value matrix with reduced irregularity. Subsequently, we employ an effective self-attention mechanism through attentive calculations on the sequence data, facilitating the capture of crucial contextual information.
where
represents the activation function.
In summary, the Mode Independence Attention employs independent distribution computation, random subsequence extraction, correlation comparison, and enhancement to capture relationships between sequences. This design aims to reduce the negative impact of irregular temporal patterns on prediction accuracy and enhance the accuracy of solar wind time series prediction.
3.3. Complexity Analysis
The time and space complexity of VMD and WOA in the Multi-Mode Decomp Block in MoCoformer is O(L). In the Mode Independence Attention, although the FFT transformation complexity is O(L log(L)), our model achieves fast execution by using pre-selected Fourier basis sets, thus reducing the complexity of query–key interactions to O(L). For the SelectSeg computation, the complexity grows linearly with the length of the prediction due to the fixed number and length of randomly selected patterns, so the overall complexity is O(L).
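The linear-cost argument for the frequency-domain step can be illustrated with a small sketch. If only a fixed set of M frequency modes is kept, each retained coefficient can be evaluated directly in O(L) time, so the total cost is O(M·L), which is linear in L for constant M. The function name `selected_dft` and the chosen modes are hypothetical, and whether the model evaluates its pre-selected basis in exactly this way is an assumption.

```python
import cmath
import math
from typing import List, Sequence


def selected_dft(x: Sequence[float], modes: Sequence[int]) -> List[complex]:
    """Evaluate only the requested DFT coefficients: O(len(modes) * len(x))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in modes]


# A constant signal has all of its energy in mode 0.
const_coeffs = selected_dft([1.0, 1.0, 1.0, 1.0], [0, 1])

# One cosine cycle over 8 samples puts N/2 = 4 into mode 1.
wave = [math.cos(2.0 * math.pi * t / 8) for t in range(8)]
wave_coeffs = selected_dft(wave, [1, 2])
```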
4. Results
4.1. Datasets
The dataset utilized in this study was provided by the Space Environment Center of the National Aeronautics and Space Administration (NASA) and comprises a multitude of parameters pertaining to the solar wind and the Earth’s magnetic field. Solar wind data from the OMNI dataset were chosen, and correlation analyses were carried out between the IMF and Field Magnitude data for the time frame spanning from 1 January 2006 to 31 December 2015. The omni-Field-Magnitude dataset is more concerned with changes in the Earth’s magnetic field, which is important for geophysical research and navigation. In contrast, the omni-IMF dataset is more closely related to solar activity and can be used to predict the effects of solar activity on the Earth, such as geomagnetic storms.
The elements included in the omni-Field-Magnitude dataset are DST Index, Bulk flow longitude, Flow Pressure, MAC, Proton density, IMF, and Proton temperature. A visual presentation of the dataset is illustrated in
Figure 4. By analyzing the variations in mean magnetic field strength, scientists can gain insight into the behavior of the solar wind’s magnetic field, including the occurrence of magnetic storms, the presence of magnetic structures, and the effect of solar activity on Earth’s magnetosphere. The features exhibit significant fluctuations and extreme instability, including significant temporal irregularities.
The elements included in the omni-IMF dataset are Plasma beta, Field Magnitude, MAC, Proton density, Alfven mach number, Kp*10, and Proton temperature. A visual presentation of the dataset is illustrated in
Figure 4. The omni-IMF dataset represents a valuable resource for the study of the behavior and properties of the solar wind, with a particular focus on the interplanetary magnetic field. The dataset provides an opportunity to explore the complex relationship between the solar wind and Earth’s magnetic environment, contributing to the advancement of understanding of space weather and its impact on Earth. It can be observed that these features exhibit significant volatility, and their time series show clear instability and strong temporal irregularities.
To more effectively illustrate the irregularity of the solar wind sequence, it is compared to common time series, as illustrated in Figure 5. The Electricity dataset contains hourly electricity consumption data from 321 customers from 2012 to 2014. The Exchange_rate dataset collects daily exchange rate data from 1990 to 2016 for eight countries. The ETT dataset contains load characteristic data for seven types of oil and power transformers from July 2016 to July 2018. A comparison of these datasets reveals that the solar wind sequence exhibits a higher temporal irregularity than other common sequences.
4.2. Evaluation Metrics
Three key evaluation metrics for regression problems, namely MAE, MAPE, and RMSE, are used to provide a comprehensive assessment of the model’s fit performance. The mathematical definitions of these metrics are provided below:
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \]
\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \]
where $y_i$ represents the observed values, $\hat{y}_i$ represents the predicted values, and $n$ is the number of samples. RMSE and MAE measure the magnitude of the error between the predicted series and the actually observed series, and MAPE measures the percentage of error relative to the observed values. RMSE focuses on the overall size of the prediction error and is more sensitive to large errors, while MAE reflects the average level of prediction error, is less sensitive to outliers than RMSE, and places more emphasis on the overall average performance. MAPE emphasizes the average level of relative error, expresses error as a percentage, and is suitable for comparing data on different scales. For a given model, the smaller the values of these metrics, the higher the predictive accuracy of the model and the better it fits the real data.
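For concreteness, the three metrics can be computed as in the following sketch, which uses the standard definitions of MAE, RMSE, and MAPE. The function names and the example values are hypothetical, and MAPE is reported in percent and rejects zero observations, which is a choice made for this example.

```python
import math
from typing import Sequence


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean squared error; squaring penalizes large errors more."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mape(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in y):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)


observed = [100.0, 200.0]      # hypothetical observations
predicted = [110.0, 180.0]     # hypothetical predictions
scores = (mae(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

In this toy case the errors are 10 and 20, so RMSE exceeds MAE because the larger error is weighted more heavily.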
4.3. Multivariate Results
Table 1 compiles the evaluation outcomes for multivariate prediction on the two solar wind time series datasets. The baseline methods included in the evaluation are the empirical solar wind method WSA [10]; the common deep learning algorithms LSTM [37] and GRU [38]; and the Transformer-based long-term series prediction methods Transformer [40], Autoformer [50], and FEDformer [51]. They are compared with MoCoformer, the method proposed in this study. The experimental configuration maintains a consistent input length of 96, with fixed prediction lengths of 96, 192, 336, and 720 for both training and evaluation, enhancing comparability across the methods.
In the context of varying prediction durations, it is crucial to acknowledge that the training and test datasets will differ. For instance, the randomized time window of Case96 (eight days) and of Case192 (twelve days) could serve as reasonable baselines for training various quarterly cases. It is crucial to address the issue of continuity, as time series are inherently distinctive. The sliding window method is employed, whereby the final 30% of a continuous data segment is designated as the test set while the initial 70% is utilized as the training set. This method ensures temporal continuity between the training and test sets.
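The split and windowing scheme can be sketched as follows: the series is divided chronologically into the first 70% for training and the final 30% for testing, and fixed-length input and target windows are then slid over each part. The function names are hypothetical, the one-step stride is an assumption, and the integer series stands in for real hourly data.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float],
                        train_frac: float = 0.7) -> Tuple[Sequence[float], Sequence[float]]:
    """Keep temporal order: first part for training, remainder for testing."""
    cut = round(len(series) * train_frac)
    return series[:cut], series[cut:]


def sliding_windows(series: Sequence[float],
                    input_len: int = 96,
                    pred_len: int = 96) -> List[Tuple[Sequence[float], Sequence[float]]]:
    """All (input, target) pairs obtained by sliding one step at a time."""
    total = input_len + pred_len
    return [(series[i:i + input_len], series[i + input_len:i + total])
            for i in range(len(series) - total + 1)]


hours = list(range(1000))                 # stand-in for an hourly series
train, test = chronological_split(hours)
train_pairs = sliding_windows(train, input_len=96, pred_len=96)
test_pairs = sliding_windows(test, input_len=96, pred_len=96)
```

With an input length of 96 and a prediction length of 96, each sample spans 192 hourly points, which corresponds to the eight days quoted for Case96.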
The investigation presents an analysis of MAE, RMSE, and MAPE metrics in different models across varying prediction intervals. Across all metrics, MoCoformer demonstrated the highest level of performance. Remarkably, in the omni-Field-Magnitude dataset, MoCoformer surpassed the second-ranked model with an average improvement of roughly 0.76% in MAE, 1.51% in RMSE, and 3.59% in MAPE. Likewise, within the omni-IMF dataset, MoCoformer displayed an average enhancement of about 1.40% in MAE, 1.19% in RMSE, and 1.37% in MAPE compared to its closest competitor. These findings underscore the exceptional predictive power of MoCoformer in long-term sequence forecasting, showcasing its superior adaptability to temporal irregularities in solar wind sequences and resulting in more precise predictions compared to alternative models.
4.4. Univariate Results
Table 2 compiles univariate prediction evaluation results for various methods on the two solar wind time series datasets. Experimental settings maintain a fixed input length of 96, with prediction lengths at 96, 192, 336, and 720 for training and evaluation.
The research presents the evaluation results of MAE, RMSE, and MAPE metrics across different models and prediction horizons. MoCoformer demonstrated superior performance across a range of prediction horizons. Specifically, on the omni-Field-Magnitude dataset, MoCoformer exhibited a performance boost of around 2.16% in MAE, 1.92% in RMSE, and 2.30% in MAPE compared to the second-ranked model. Likewise, on the omni-IMF dataset, MoCoformer showcased a significant improvement of about 13.54% in MAE, 4.27% in RMSE, and 10.36% in MAPE over the second-ranked model. These results highlight the exceptional predictive capabilities of MoCoformer in long-term sequence forecasting.
To validate the effectiveness of the proposed model in long-term solar wind sequence prediction applications, a comprehensive case study is conducted.
Figure 6 illustrates the visual results of univariate prediction experiments conducted on the omni-Field-Magnitude dataset, showcasing comparisons between the outputs of multiple models and ground truth curves. This experiment aims to evaluate the predictive capability and accuracy of the proposed model in real-world applications.
Figure 6 illustrates that MoCoformer accurately reflects the trends and fluctuations in the true values, exhibiting superior performance in capturing the peaks and troughs of the time series compared to other models. In particular, in Cases 96 and 192, MoCoformer demonstrates enhanced sensitivity in capturing changes in the true values compared to other models, enabling accurate predictions of increases and decreases in the corresponding periods. However, with regard to numerical values, it appears that MoCoformer tends to be relatively conservative in its predictions. For example, in Case 720, although MoCoformer is able to identify changes in true values, it still encounters difficulties in modeling large fluctuations in them. This observation is consistent with the results of the three evaluation metrics, which indicate that while there have been notable improvements in MAE and MAPE, RMSE values remain substantial.
The results of a univariate prediction experiment conducted on the omni-IMF dataset are presented in Figure 7. The figure depicts comparative curves between the ground truth and the outputs of various models.
Figure 7 illustrates that the MoCoformer model outperforms the other models in capturing trends and changes in the true values, which is particularly evident in Cases 192 and 720. Although the MoCoformer model excels in trend capture, its numerical predictions tend to be conservative. This becomes evident when modeling prominent exceptional values, as illustrated in Case 336. Moreover, the omni-IMF dataset is limited in size while exceptional values are relatively prominent in it, and no model demonstrates superior performance in predicting these exceptional values. This observation is consistent with the trend observed in the three assessment metrics: the MAE and MAPE demonstrate a more pronounced improvement, while the RMSE shows a comparatively slower improvement. Overall, there is considerable scope for further improvement of the models’ performance on the omni-IMF dataset, particularly in the prediction of exceptional values.
In comparison to other models, MoCoformer demonstrates superior performance in identifying trends and fluctuations in actual values. It is able to recognize trends with exceptional precision. Nevertheless, its numerical predictions frequently exhibit a tendency towards conservatism, as evidenced by the table, which shows that there are notable improvements in MAE and MAPE, yet RMSE values remain relatively high. In conclusion, it can be stated that MoCoformer makes a significant contribution to long-term sequence prediction in the solar wind domain. However, the complexity of solar wind data presents numerous challenges for long-term solar wind sequence prediction research.
4.5. Ablation Studies
For module validation, univariate prediction experiments were conducted on the omni-Field-Magnitude dataset. The proposed models and their variants are summarized below for comparative analysis.
MoCo-rM: removed the Multi-Mode Decomp Block from the model.
MoCo-rMrSTL: replaced the Multi-Mode Decomp Block with the STL decomposition module.
MoCo-rCrT: substituted the Mode Independence Attention proposed in this study with the auto-correlation module from the Transformer model.
MoCoformer: The original model proposed in this study.
The experimental results of each module combined with the backbone network structure are shown in Table 3.
The table illustrates the significance of the Multi-Mode Decomp Block in enhancing model performance across various prediction lengths, thereby underscoring its positive influence on long-term solar wind forecasting. Higher error values were observed when this block was replaced with the STL module, indicating that conventional trend–seasonal decomposition struggles to capture temporal irregularities. Furthermore, the substitution of Mode Independence Attention with the Transformer’s auto-correlation module resulted in inferior outcomes in shorter prediction lengths, thereby underscoring the beneficial impact of the proposed Mode Independence Attention in solar wind prediction tasks.
However, as the length of the predicted time series increases, the predictions of the two models converge. This indicates that as the complexity of the solar wind increases in longer time series predictions, the predictions of both modules gradually become more similar.
4.6. Multi-Mode Decomp Block Decomposition Experiment
The number of layers in the Multi-Mode Decomp Block is of significant importance in influencing the prediction process and the resulting outcomes. An increase in the number of layers allows for a more precise capture of diverse frequency components and modal features, thereby enhancing the modal information extracted from the original signal. The selection of the optimal layer count necessitates a delicate balance in order to ensure optimal model performance. A deficiency in the number of layers may result in the loss of information, whereas an excess could introduce noise or irrelevant details. The results of the experiments conducted on the layer count in the Multi-Mode Decomp Block are presented in
Figure 8, which illustrates the impact of this variable on the performance of the model.
The experimental results, corroborated by
Figure 9, highlight the significance of choosing an optimal number of decomposition layers. The figure illustrates that when the number of IMF layers is set to six, optimal outcomes are achieved, with the essential information captured while avoiding noise and unnecessary details. The acquisition of information may be inadequate with five or fewer layers, resulting in less accurate predictions. Conversely, the introduction of seven or more layers may result in the inclusion of excessive details or noise, which could have a detrimental impact on the performance of the model. This underscores the pivotal importance of determining an optimal number of layers for specific conditions.
4.7. Subsequence Length Experiment
The accuracy of the predicted time series in the SelectSeg module of Mode Independence Attention is closely linked to the subdivision size used for the prediction length. This size determines the proportion of irregular segments in the final forecast and is crucial for overall accuracy. An overly short subdivision length may fail to reduce the proportion of highly uncertain segments effectively, while an excessively long one could wrongly diminish the weight of effective segments. Experiments were conducted on the impact of the subdivision length in the SelectSeg module, and the findings are presented in Figure 10.
The results show that careful parameter selection noticeably improves model performance. Although the choice of subsequence length in SelectSeg has a relatively minor effect in short-term sequence prediction, its effect becomes more pronounced as the prediction length increases. This underscores the importance of selecting the subsequence length carefully to improve prediction accuracy.
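The trade-off between short and long subsequences can be illustrated with a toy diagnostic. The sketch below is not the SelectSeg module. It assumes a hypothetical rule that marks a subsequence as irregular when its standard deviation exceeds a threshold, and it reports the fraction of the prediction horizon that ends up marked for a given subsequence length. The function name, the threshold, and the synthetic horizon are all assumptions made for illustration.

```python
import numpy as np

def irregular_fraction(series, seg_len, threshold):
    """Fraction of time steps that fall in subsequences marked as irregular.

    Hypothetical rule (assumption): a subsequence is irregular when its
    standard deviation exceeds `threshold`.
    """
    series = np.asarray(series, dtype=float)
    flagged_steps = 0
    for start in range(0, len(series), seg_len):
        seg = series[start:start + seg_len]
        if seg.std() > threshold:
            flagged_steps += len(seg)
    return flagged_steps / len(series)

# Synthetic 96-step horizon: flat except for a 6-step burst.
horizon = np.zeros(96)
horizon[48:54] = [5, -5, 5, -5, 5, -5]

short = irregular_fraction(horizon, seg_len=6, threshold=1.0)   # 0.0625
long = irregular_fraction(horizon, seg_len=48, threshold=1.0)   # 0.5
```

In this toy case, a 6-step subsequence isolates the burst, so only 6.25% of the horizon is marked. A 48-step subsequence marks half of the horizon, including 42 quiet steps, which mirrors the concern that an excessively long subsequence diminishes the weight of effective segments.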
4.8. Limitations
In this study, we conducted a preliminary exploration of prediction methods for long-term solar wind series, taking the irregularity and uncertainty of time into account. However, the optimal approach has not yet been identified, and further research is required to address the following remaining issues.
(1) The datasets used in this study encompass seven feature variables. The solar wind, however, is affected by many factors beyond the scope of this study. Moreover, the datasets have an hourly temporal resolution, whereas more finely resolved information on solar wind features is currently available. Future studies may therefore consider incorporating this higher-resolution information as additional input to the forecasts.
(2) During the experiments, the long-term series predictions showed reduced sensitivity to outliers. This indicates that the dataset contains a certain number of outliers that the proposed model is unable to predict effectively, an issue that requires further in-depth research. Future research should therefore focus on differentiating between erroneous values and genuine outliers in solar wind data, and on improving the model's ability to detect and handle outliers when the dataset contains a considerable number of them.
(3) Solar wind features encompass a diverse array of information types, including data, images, and frequencies. Future research will explore the potential of multimodal studies to fully leverage the diverse feature resources available and facilitate information sharing across different modalities. Integrating multiple information sources can lead to more accurate prediction and analysis of the solar wind sequence, thereby enhancing the performance and reliability of prediction models.
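For limitation (2), one possible starting point for separating erroneous values from genuine outliers is sketched below. It is an assumption-laden illustration, not part of the proposed model: erroneous values are taken to be missing entries or a known sentinel (the `fill_value` argument is hypothetical and depends on the data source), and outliers are flagged with the modified z-score based on the median absolute deviation, using the commonly cited constant 0.6745 and threshold 3.5.

```python
import numpy as np

def modified_z_scores(values):
    """Modified z-score based on the median and median absolute deviation (MAD)."""
    values = np.asarray(values, dtype=float)
    median = np.nanmedian(values)
    mad = np.nanmedian(np.abs(values - median))
    if mad == 0:
        return np.zeros_like(values)  # no spread, so nothing can be scored
    return 0.6745 * (values - median) / mad

def flag_points(values, fill_value=None, z_threshold=3.5):
    """Return boolean masks (erroneous, outlier) for a 1-D series.

    erroneous: NaN entries or entries equal to fill_value (assumed sentinel).
    outlier:   valid entries whose |modified z-score| exceeds z_threshold.
    """
    values = np.asarray(values, dtype=float)
    erroneous = np.isnan(values)
    if fill_value is not None:
        erroneous |= values == fill_value
    clean = np.where(erroneous, np.nan, values)  # exclude bad entries from statistics
    z = modified_z_scores(clean)
    outlier = ~erroneous & (np.abs(z) > z_threshold)
    return erroneous, outlier

# Made-up speeds (km/s); 9999.9 plays the role of a hypothetical fill value.
speeds = [400, 410, 395, 405, 9999.9, 402, 850, 398]
erroneous, outlier = flag_points(speeds, fill_value=9999.9)
```

Here the sentinel at index 4 is reported as erroneous and excluded from the statistics, while the 850 km/s sample at index 6 is flagged as an outlier. Keeping the two masks separate lets erroneous values be imputed or dropped while genuine extreme events are retained for the model to learn from.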
5. Conclusions
In this study, a long-term solar wind correlation coefficient prediction method based on the Multi-Mode Decomp Block and Mode Independence Attention is used to construct a deep learning model for long-term series prediction. The Multi-Mode Decomp Block performs well in extracting information from multiple time modes, providing a more comprehensive feature analysis that aids in understanding the complexity of the features. Mode Independence Attention computes attention on a per-mode basis, which further strengthens this feature analysis.
The experimental section predicts solar wind correlation coefficients for two datasets, covering both multivariate and univariate sequences with prediction lengths of 96, 192, 336, and 720 steps. The comparison with existing models shows that the proposed model, MoCoformer, outperforms the others in all cases and on all metrics, which validates its effectiveness in predicting long-duration solar wind sequences. Furthermore, ablation experiments on the model components demonstrate the efficacy of the two novel modules proposed in this paper for solar wind dataset prediction. In addition, the model’s key parameters were adjusted and compared experimentally to optimize their fit with solar wind predictions. The results of this series of experiments provide additional evidence of the feasibility and superior performance of the methods proposed in this paper.
Although MoCoformer reduces temporal irregularities in forecasts compared with state-of-the-art models, a degree of uncertainty persists in long-term solar wind series forecasts owing to the complexity of solar wind data, among other challenges. This holds regardless of the accuracy of the measurements and models used for forecasting, and it calls for a more comprehensive investigation of the solar wind.