1. Introduction
The study of the effects of space weather on the environment is one of the prioritized areas of geophysics [
1]. One of the components of the environment is the near outer space of the Earth. Among the parts of this space is the ionosphere, which affects the operation of technological systems such as global navigation satellite systems (GNSS), satellite communications, and other space communications applications [
2]. The state of the ionosphere is described by such parameters as critical frequency foF2, maximum height hmF2 of the layer F2, total electron content TEC, and others. The study and prediction of TEC attracts special attention, since its values determine the accuracy of positioning [
3,
4]. Currently, among the methods of forecasting the ionospheric parameters, methods using neural networks are distinguished as one of the most diverse and undergoing active development [
5]. However, since each region of the globe responds differently to space weather, it is necessary to choose the method that provides the most accurate TEC forecast for each region. This paper attempts to select, from a variety of methods, the most appropriate one for the European region. For this purpose, both conventional architectures, long short-term memory (LSTM) [6] and gated recurrent unit (GRU) [7], and newer ones, including bidirectional networks [8] and temporal convolutional networks (TCN) [9], are used. From the extensive literature, we selected those papers that include quantitative estimates of forecast accuracy in different regions.
An example of using the LSTM architecture for TEC prediction is a paper that proposed a method with a lead time of 24 h [10]. The authors used CODE (Center for Orbit Determination in Europe) map data with a 1 h step for the Beijing station, together with the solar activity index F10.7 (solar radio flux at 10.7 cm) and the geomagnetic activity index ap, as input parameters. The dataset covers the period from 1 January 1999 to 1 September 2016. The root mean square error (RMSE) was 3.5 TECU. It is interesting to note that the test years were 2001 and 2015, i.e., years of high solar activity. Sun et al. [11] developed the bidirectional long short-term memory (BiLSTM) model and applied it to TEC prediction on the same database, using the F10.7 and ap indices as in [10]. The results obtained were compared with other methods. The RMSE was 3.35 TECU for BiLSTM, 3.48 TECU for single-layer LSTM, and 3.67 TECU for double-layer LSTM. The feature that improved on the LSTM result is the ability to utilize both past and future data to make predictions.
Sivakrishna et al. [12] applied the BiLSTM algorithm to forecast the Indian regional TEC map one hour ahead. The training data included 261 days (from November 2015 to September 2016). The validation dataset included 24 days in September 2016, and the testing data were 31 days in October 2016. The maps were plotted against data from 26 Global Positioning System stations and compared with Artificial Neural Network (ANN) and LSTM model results during both geomagnetically quiet (20 October 2016) and disturbed (14 October 2016) periods. For BiLSTM, the difference between the experimental and predicted values was from 1 TECU to 2.5 TECU for quiet conditions and from 1 TECU to 3.5 TECU for disturbed conditions; these results were better than those for ANN and LSTM. The LSTM model performed better than the ANN model. Adding solar and geomagnetic indices (F10.7, Kp) as inputs to BiLSTM brought little improvement to the Indian regional TEC map forecast during either quiet or disturbed periods.
The LSTM architecture has become the reference model against which the results of new approaches are compared, and most often, the results of this comparison are given when they are better than for LSTM. However, the LSTM method continues to be modified. In Chen et al. [
13], several architectures, single-step self-prediction model, single-step auxiliary prediction model, multistep self-prediction model, and multistep auxiliary prediction (MSAP) model, were used for TEC prediction based on International GNSS Service (IGS) maps with an advance time of 24 h, 48 h, and 144 h. Their algorithms are presented in the appendix of the paper [
13]. The database covered a set of TEC values from January 2011 to December 2019 and indices included the following: the sunspot number R, F10.7, ap, and Dst. In contrast to traditional data processing methods, the cycle consisted of 90 days. Among these, the first 30 days are considered as the training set, the middle 30 days are considered as the validation set, and the last 30 days are considered as the test set.
Going from a 24 h to a 48 h prediction interval, the averaged MAE (RMSE) values for the single-step self-prediction model increase from 2.721 (3.806) TECU to 3.259 (4.483) TECU. For the MSAP model, these values were 2.116 (3.033) TECU for 24 h and 2.225 (3.175) TECU for 48 h. As the prediction interval increases, the differences become larger: at 144 h, the values were 5.53 (7.427) TECU for the single-step self-prediction model and 2.485 (3.511) TECU for MSAP.
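The 90-day cycling scheme described for Chen et al. [13] can be illustrated with a minimal Python sketch. It assumes that the cycles are consecutive and non-overlapping and that an incomplete trailing cycle is dropped; neither detail is stated in the text above, and the function name `split_cycles` is hypothetical.

```python
def split_cycles(n_days, cycle_len=90, part_len=30):
    """Split day indices 0..n_days-1 into consecutive cycles.

    Within each cycle, the first part_len days go to training, the next
    part_len days to validation, and the remaining days to testing.
    Assumption: cycles do not overlap; an incomplete last cycle is dropped.
    """
    train, val, test = [], [], []
    for start in range(0, n_days - cycle_len + 1, cycle_len):
        train.extend(range(start, start + part_len))
        val.extend(range(start + part_len, start + 2 * part_len))
        test.extend(range(start + 2 * part_len, start + cycle_len))
    return train, val, test


# Four full 90-day cycles (360 days of hypothetical daily samples).
train, val, test = split_cycles(360)
print(len(train), len(val), len(test))
```

Unlike a single chronological split, this scheme places training, validation, and test days in every season and at every level of solar activity covered by the database.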
A TEC forecasting model based on deep learning was proposed by Tang et al. [
14], which consists of a convolutional neural network (CNN), a long short-term memory (LSTM) neural network, and an attention mechanism. The attention mechanism is added to the pooling layer and the fully connected layer to assign weights to improve the model. The dataset included TEC from 24 GNSS stations of China Network over the 9-year period (from 2010 to 2018) to predict the value of the TEC 24 h ahead. The data from 2010 to 2017 is used as the training set and the data from 2018 is used as the test set. Bz, Kp, Dst, and F10.7 indices are used to characterize solar and geomagnetic activity. The results were compared to the accuracy of models such as NeQuick, LSTM, and CNN-LSTM. The statistical results were obtained from the data of 24 GNSS stations, 24 h a day and 365 days a year in the test set. For all stations, the MAE was 2.6 TECU for the NeQuick, 1.53 TECU for the LSTM, 1.36 TECU for the CNN-LSTM, and 1.17 TECU for the CNN-LSTM-Attention model. The RMSE for the NeQuick was 3.59 TECU, 2.25 TECU for the LSTM, 2.07 TECU for the CNN-LSTM, and 1.87 TECU for the CNN-LSTM-Attention model. The latitudinal and longitudinal dependencies of forecast accuracy were also determined from the data of six stations with similar latitudes and longitudes. RMSE decreases with the increase in latitude near the same longitude, which is about 1 TECU in the mid-latitude region and between 1.6 and 3 TECU in six stations around 30°E. Accuracy estimates depending on the level of disturbance were performed separately for days with Kp < 3 and Kp ≥ 3. For a quiet day on 21 August 2018, the statistics showed RMSE = 4.14 TECU, MAE = 3.16 TECU, and the mean error ME = −0.12 TECU for the NeQuick model. For the LSTM model, statistical estimations have provided the following results: RMSE = 4.29 TECU, MAE = 3.0 TECU, and ME = −1.5 TECU. For the other architecture, CNN-LSTM, the following error values were obtained: RMSE = 4.14 TECU, MAE = 2.98 TECU, and ME = −1.62 TECU. 
For the proposed CNN-LSTM-Attention architecture, the results were better: RMSE = 3.99 TECU, MAE = 2.81 TECU, and ME = −1.46 TECU. On the disturbed day of 26 August 2018, a magnetic storm led to large changes in TEC, and the accuracy of all methods decreased. For the NeQuick model, RMSE = 4.99 TECU, MAE = 3.67 TECU, and ME = 1.71 TECU. The forecast for the LSTM method provided the following values: RMSE = 5.19 TECU, MAE = 3.76 TECU, and ME = −2.07 TECU. These values decreased to RMSE = 4.43 TECU, MAE = 3.06 TECU, and ME = 0.21 TECU for CNN-LSTM and to RMSE = 4.11 TECU, MAE = 2.81 TECU, and ME = −0.64 TECU for the CNN-LSTM-Attention architecture.
Boulch et al. [
4] proposed the ConvRNN method and compared the results with the data presented in the papers [
15,
16,
17] for specific Chinese stations and on a global scale, with a lead time of 2 and 48 h. In the range 22°N–39°N, an increase in the forecast accuracy with increasing latitude was obtained. On a global scale, the method in [
17] provided RMSE = 3.1 TECU and the method in [
4] gave 0.89 TECU for a lead time of 48 h and 0.38 TECU for 2 h ahead.
Iluore and Lu [18] analyzed the data from the equatorial station MAL2 (Kenya) and determined that GRU was more accurate than LSTM, Multilayer Perceptron (MLP), GIM_TEC, and IRI-Plas 2017. The data covered 9 years, from 1 January 2010 to 31 December 2018. The data from 2010 to 2016 were used for training, the data from 2017 for validation, and the data from 2018 to estimate the performance of the models. The GRU model showed a prediction error of 2.004 TECU, while the LSTM, MLP, GIM_TEC, and IRI-Plas 2017 models showed prediction errors of 2.055 TECU, 2.336 TECU, 5.913 TECU, and 16.183 TECU, respectively.
Kaselimi et al. [
19] proposed a spatiotemporal deep learning architecture, CNN-GRU, combining a convolutional neural network to capture the spatial variability of TEC and a gated recurrent unit for temporal variability modeling. Real TEC measurements were used by determining the slant STEC and its conversion to TEC for six stations in different regions of the globe, with F10.7, solar sunspot number SSN, Kp, ap, and Dst as the input parameters. The TEC values obtained by the proposed method were compared not only with the results of the neural network methods, LSTM [
20], BiLSTM [
11], and RNN [
21], but also with the CODE GIM and IRI model values. The dataset covered 2014 to 2018, with the second half of 2018 as the test period. The MAE estimates were 0.7–1.8 TECU for CNN-GRU, 0.9–1.8 TECU for the Recurrent Neural Network (RNN), and 0.8–2.2 TECU for LSTM and BiLSTM.
Unfortunately, many papers do not provide Mean Absolute Percentage Error (MAPE) values, while a comparison of MAE and RMSE does not always give the full picture. As shown in Table III in Kaselimi et al. [
19], different methods can give the best results for different stations. As for the dependence of the results on latitude, they were better for the mid-latitude station Graz than for the high-latitude station Tixi. That paper also compares the processing time in minutes and the number of trainable parameters required by each method. The most efficient was the RNN architecture, which requires 23 min (for 200 training epochs), while the heaviest is the BiLSTM model, which requires 280 min (for the same 200 epochs). The proposed CNN-GRU architecture requires 67 min for training.
However, other approaches are also possible. Combining the advantages of neural networks with the Global Ionosphere Map (GIM), Cesaroni et al. [22] developed an empirical model to predict TEC 24 h in advance at the global scale. A nonlinear autoregressive neural network with eXternal input (NARX) was used as the neural network. The GIM products of the Polytechnic University of Catalonia (UPC) were used for single-point forecasting. To extend the forecast to a global scale, a NeQuick2 model adapted to an effective sunspot number R12eff was used. TEC and Kp data for 11 years (2005–2015) with a time resolution of 2 h, divided into training (70%), validation (15%), and testing (15%) datasets, were used. Examples for four points and two quiet days (30 May 2016 and 21 August 2016) yielded RMSE values of 1.5 TECU, 0.32 TECU, 2.3 TECU, and 3.0 TECU. Accumulated statistics showed that the RMSE was 3.5 TECU for the period from June 2017 to May 2018. For the disturbed period of 7–11 September 2017, the RMSE was 3.8 TECU. The RMSE for six disturbed periods of different intensities (from 1 to −142 nT) was in the range 3.4–5.1 TECU and showed no dependence on Dst. This 24 h empirical approach was implemented on the Ionosphere Prediction Service (IPS), a prototype platform to support different classes of GNSS users.
The work of Natras et al. [23] provides an overview of most of the methods used for TEC forecasting (ANN, LSTM, LSTM-CNN, Encoder-Decoder LSTM, Extended ED-LSTME, NARX, conditional Generative Adversarial Network (cGAN), and others) and gives statistical characteristics of these methods for forecast horizons from 1 h up to 1 and 2 days. In particular, the RMSE for a 1 h TEC forecast in low latitudes ranges from 2 to 5 TECU for different learning algorithms and different levels of solar activity. For the mid-latitude 1 h VTEC forecast, the RMSE is about 1.5 TECU. For a 1-day TEC forecast, the RMSE was 4 TECU in high solar activity and 2 TECU in low solar activity. However, noting the high complexity of deep learning methods as a drawback, the authors used the following algorithms for 1 h and 24 h VTEC forecasts: Decision Tree and the ensemble learning methods Random Forest, Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost). According to the authors, the advantages of these methods are that they are simple, non-parametric, fast to optimize, computationally efficient, and usable on a limited dataset. The dataset for training and cross-validation covered the period January 2015–December 2016. The test dataset covered the period January–December 2017. Test results of six methods are given for three latitudes, 70°N, 40°N, and 10°N (longitude not shown), and two periods: the year 2017 and 7–10 September 2017. The results for the method showing the highest accuracy (Random Forest) are as follows. For 1 h ahead, the RMSE was 0.54 TECU at 70°N, 0.92 TECU at 40°N, and 1.2 TECU at 10°N. For 24 h ahead, the RMSE was 1.06 TECU at 70°N, 1.86 TECU at 40°N, and 2.2 TECU at 10°N. For the disturbed period of 7–10 September 2017 and 1 h ahead, the RMSE values were 0.73 TECU, 1.31 TECU, and 1.29 TECU for 70°N, 40°N, and 10°N, respectively. For 24 h ahead, these values were 1.77 TECU, 3.95 TECU, and 3.95 TECU.
Thus, an analysis of the literature shows a certain progression in the use of neural network methods, from LSTM to GRU to bidirectional networks to TCN, which has gradually increased the accuracy of the TEC forecast. However, the results can depend strongly on the combination of architectures, the region, and space weather conditions.
In a previous work [24], the following results for single-hidden-layer LSTM and GRU neural networks were obtained for the Juliusruh station in 2015: MAE = 1.5 TECU, RMSE = 1.9 TECU, and MAPE = 17% for GRU; MAE = 1.39 TECU, RMSE = 1.85 TECU, and MAPE = 14% for LSTM.
In this paper, a set of multilayer neural network architectures, including LSTM and GRU neural networks and temporal convolutional networks, is proposed and considered. The main aim of this paper is to complement them with bidirectional architectures and to determine the method that provides the highest prediction accuracy in the European region.
In Section 2, experimental data is described and the behavior of the basic indices of solar and geomagnetic activity and the total electron content is illustrated for the chosen years. Additionally, Section 2 presents the neural networks-based forecast methods. The results of the proposed methods are given in Section 3. Section 4 contains a discussion of the results. Section 5 provides the conclusions.
3. Results
The statistical results on the accuracy of TEC forecasting for the set of 10 architectures considered and the two sets of indices (including and excluding Vsw) are presented in Table 1. The metrics considered were MAE, MAPE, RMSE, and ρ, averaged over half a year of single-step 2 h ahead TEC forecasting, separately for each of three years and four selected stations. The results for station Juliusruh are presented in Table 1 for 2015.
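For reference, the four metrics can be computed as in the following Python sketch. It assumes that ρ denotes the Pearson correlation coefficient between observed and predicted TEC and that MAPE is normalized by the observed values; these definitions are assumptions, since they are not spelled out in this section. The function name and the sample values are hypothetical.

```python
import math


def forecast_metrics(observed, predicted):
    """Return MAE and RMSE (in TECU), MAPE (in %), and rho.

    Assumptions: rho is the Pearson correlation coefficient, and MAPE is
    normalized by the observed values, which must therefore be nonzero.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    rho = cov / math.sqrt(var_o * var_p)

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "rho": rho}


# Hypothetical TEC values in TECU, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 32.0, 38.0]
metrics = forecast_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE carry the units of TEC, whereas MAPE is dimensionless, so MAPE remains comparable between stations with very different background TEC levels.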
The results for every station considered are shown in Figure 11 as a dependence of MAE, MAPE, and RMSE on the architecture’s variant number. The architectures are indicated by the following numbers: (1) LSTM-FRF, (2) LSTM-RFF, (3) GRU-FRF, (4) GRU-RFF, (5) TCN, (6) BiLSTM-FRF, (7) BiLSTM-RFF, (8) BiGRU-FRF, (9) BiGRU-RFF, and (10) BiTCN. Because there is no unambiguous trend in the influence of the Vsw index on the accuracy of the forecast, only the better of the two options (with and without Vsw) was chosen for each architecture.
From Figure 11, it can be seen that three sections with similar characteristics can be distinguished in the prediction accuracy metrics, depending on the architectures used. The first, flat section (numbers 1–5), with the highest error values, refers to the conventional LSTM and GRU architectures with FRF and RFF modifications and the conventional TCN. The second section (numbers 6–8), which includes results for bidirectional networks, has a minor slope, and its error values are 1.5–2 times smaller than in the first section. Finally, the third section (numbers 9–10), which includes the BiGRU-RFF and BiTCN architectures, provides the lowest error values, i.e., the highest forecasting accuracy. It should be noted that although there is a certain latitude dependence of the statistical characteristics in the first two groups, for the BiTCN architecture, it practically disappears.
It can be seen that MAE and RMSE are related to the absolute values of TEC, which depend on the level of solar activity. Conventional LSTM and GRU with FRF and RFF modifications and conventional TCN provide results at the same level of accuracy. The use of a bidirectional architecture substantially improves the accuracy of the forecast for almost all architectures and across all stations. It should be noted that during high solar activity, bidirectional architectures of the FRF line (BiLSTM-FRF and BiGRU-FRF) do not improve the accuracy of the forecast for a low-latitude station, even compared to conventional methods. The highest accuracy is achieved using the BiGRU-RFF and BiTCN architectures: MAE values are less than 0.6 TECU, RMSE values are less than 0.8 TECU, and MAPE values are less than 7.5%. The best forecasting accuracy is achieved using the BiTCN architecture, with less than 0.3 TECU for MAE, less than 0.6 TECU for RMSE, and less than 5% for MAPE.
The forecast errors during the second half of each year for the Moscow station are shown in
Figure 12. The panels for 2015, 2020, and 2022 show the variations of TEC between DoY 185 and 365 along with the Dst index and compare the prediction accuracy for two architectures: LSTM-FRF and BiTCN. In all cases, the Vsw index is included in the training data (10 Kp, Np, F10.7, Dst, and Vsw).
Figure 12 shows the instantaneous TEC and δTEC values for two architectures, one unidirectional and one bidirectional, using the Moscow station as an example. Unlike the median values, the TEC values (panels a, d, and g on the left) show the response to disturbed conditions. In 2015, the average δTEC values in panels (b,c) are 1.15 TECU and 0.22 TECU, an improvement of ~5 times. In 2020 (Figure 12e,f), these values were 0.52 TECU and 0.2 TECU, i.e., an improvement of ~2.5 times, and for 2022 (Figure 12h,i), higher values of 1.32 TECU and 0.27 TECU were obtained, again an improvement of ~5 times. Regardless of the year, the accuracy for BiTCN is 2–5 times higher than for LSTM-FRF: the error is in the range of ±1 TECU. The δTEC outlier in the last panel refers to 14 October 2022, when a positive disturbance occurred during the main phase of the magnetic storm (minimum Dst = −62 nT) due to a sharp increase in Vsw, which was at its minimum of 300 km/s at UT0 on 14 October and gradually increased to 600 km/s at UT0 on 16 October. It is interesting to note that during the main phase of the second magnetic storm (minimum Dst = 76 nT), there was also a positive δTEC deviation, but of lower intensity, since the velocity Vsw was constant at 355 km/s.
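The improvement factors quoted above follow directly from the ratios of the average δTEC values. The short Python check below assumes that, in each pair, the first value belongs to LSTM-FRF and the second to BiTCN, as the comparison in the text implies; the variable and function names are hypothetical.

```python
# Average δTEC values (TECU) quoted in the text for the Moscow station.
# Assumption: each pair is (LSTM-FRF, BiTCN).
mean_dtec = {2015: (1.15, 0.22), 2020: (0.52, 0.20), 2022: (1.32, 0.27)}


def improvement_factor(baseline, improved):
    """Ratio of the baseline error to the improved error."""
    return baseline / improved


factors = {
    year: round(improvement_factor(lstm_frf, bitcn), 1)
    for year, (lstm_frf, bitcn) in mean_dtec.items()
}
print(factors)  # {2015: 5.2, 2020: 2.6, 2022: 4.9}
```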
4. Discussion
In this paper, a set of architectures of multilayer neural networks based on GRU, LSTM, and temporal convolution is proposed and considered. The implementation of the RFF and FRF architectures improves the results compared to [
24]. The further modification of those architectures using bidirectional processing led to a 2-fold improvement in the forecast accuracy. The greatest improvement in accuracy of the forecast was achieved with the employment of a bidirectional temporal convolutional network architecture, though its initial variant provides results at a level comparable to traditional methods. Its bidirectional enhancement allows one to increase the accuracy of the forecast by 5–10 times. The observed dependence of the results on solar activity, a combination of indices, and the latitude shows that to determine the best method in each region, it is necessary to examine not only a set of architectures but also indices, as evidenced by the example with the Vsw index.
It is important to note that the number of cells in bidirectional layers is twice that in unidirectional ones; at the same time, doubling the number of cells in unidirectional networks did not increase the prediction accuracy but, on the contrary, reduced it. This result is due to overfitting. Training a bidirectional neural network includes two separate backpropagation passes: one for the forward direction of the inputs and one for the reverse direction. During the forward pass, the forward layer processes the input sequence in the original input direction and makes predictions for the output sequence. During the backward pass, the backward layer processes the input sequence in reverse order and predicts the output sequence. When both passes are completed, their predictions are compared to the target output sequences, and the error is propagated back through the network to each respective layer to update the weights. The forward and backward layer weights are updated based on the errors computed during the forward and backward passes, respectively. This process allows the weights to be optimized on a double amount of data: the original inputs and the reversed inputs. For each layer of a bidirectional network, optimization is carried out on its own subspace of parameters.
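The two-direction processing described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: a one-unit tanh recurrent cell stands in for the LSTM and GRU cells, the weights are hypothetical fixed numbers rather than trained values, and only the forward computation is shown, not the weight updates.

```python
import math


def rnn_pass(sequence, w_in, w_rec):
    """Run a one-unit tanh recurrent cell over the sequence.

    Returns the final hidden state. This simple cell is a stand-in for
    an LSTM or GRU cell (assumption made to keep the sketch short).
    """
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def bidirectional_features(sequence, fwd_weights, bwd_weights):
    """Process the same input window in both directions.

    The forward layer reads the window as given; the backward layer reads
    it in reverse order. Each direction has its own weights, and the two
    final states are concatenated into one feature vector.
    """
    h_fwd = rnn_pass(sequence, *fwd_weights)
    h_bwd = rnn_pass(list(reversed(sequence)), *bwd_weights)
    return [h_fwd, h_bwd]


# Hypothetical normalized TEC history and hypothetical weights.
window = [0.2, 0.5, 0.9, 0.4]
features = bidirectional_features(window, (0.8, 0.3), (0.8, 0.3))
print(features)
```

Even with identical weights in both directions, the two states differ for a non-symmetric window, because each layer sees the samples in a different order. Since each direction has its own parameters, the two layers can be optimized separately.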
As for the comparison with the results of other works, as noted in the introduction, many works report only MAE and RMSE as statistical characteristics, without MAPE. The results obtained show that this is not sufficient. This is especially evident from the results for the Nicosia station, which has the highest TEC values. This can be demonstrated by the example of [23], where RMSE values of ~0.5 TECU were obtained without specifying MAPE. In this work, even smaller RMSE values were obtained regardless of latitude, indicating that the capabilities of the LSTM, GRU, and TCN methods for TEC prediction have not been exhausted.
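A short numerical illustration shows why absolute errors alone can be insufficient. In the Python sketch below, the same absolute error of 0.5 TECU is converted into a relative error at several background TEC levels; the levels are hypothetical and are not taken from the stations considered here.

```python
def mape_for_constant_error(tec_level, abs_error):
    """Relative error (%) when every sample deviates by abs_error TECU
    from a constant background of tec_level TECU."""
    return 100.0 * abs_error / tec_level


# Hypothetical background TEC levels in TECU.
for tec_level in (5.0, 20.0, 40.0):
    print(tec_level, mape_for_constant_error(tec_level, 0.5))
```

The same 0.5 TECU error corresponds to 10% at a background of 5 TECU but only 1.25% at 40 TECU, so results from regions with different TEC levels are easier to compare when MAPE is reported alongside MAE and RMSE.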
Apart from the differences in accuracy, significant discrepancies between the architectures in terms of computational cost are also observed. For bidirectional architectures, the computational costs are significantly higher than for unidirectional ones. In particular, among unidirectional architectures, the training time was about 11 s for the FRF line, about 9 s for the RFF line, and 2 s on average for the TCN. The difference between the LSTM and GRU variants did not exceed 2 s on average, and the GRU-based architectures converged faster. With the transition to bidirectional architectures, the training time for the FRF and RFF architectures increased by more than 2 times and amounted to 27 and 22 s on average, respectively. For BiTCN, the training time increased to 4 s. The training was performed using a single RTX 4080 GPU and an AMD Ryzen 9 7900X CPU.
It is important to note that the training was carried out on a relatively small amount of data that did not contain complete information about the annual behavior of TEC and the indices used; nevertheless, the proposed architectures of bidirectional networks completed the forecasting task with consistently high accuracy.