Article

Top-Oil Temperature Prediction of Power Transformer Based on Long Short-Term Memory Neural Network with Self-Attention Mechanism Optimized by Improved Whale Optimization Algorithm

1
School of Electrical Engineering, Chongqing University, Chongqing 400044, China
2
Electric Power Research Institute, China Southern Power Grid Yunnan Power Grid Co., Ltd., Kunming 650217, China
3
School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(10), 1382; https://doi.org/10.3390/sym16101382
Submission received: 7 August 2024 / Revised: 30 September 2024 / Accepted: 13 October 2024 / Published: 17 October 2024
(This article belongs to the Special Issue Symmetry/Asymmetry Studies in Modern Power Systems)

Abstract

The operational stability of power transformers is essential for maintaining the symmetry, balance, and security of power systems. A transformer failure heightens instability in grid operations, so accurate prediction of oil temperature is crucial for efficient transformer operation. To address the difficulty of selecting model hyperparameters and the incomplete consideration of temporal information in transformer oil temperature prediction, a novel model is constructed based on the improved whale optimization algorithm (IWOA) and a long short-term memory (LSTM) neural network with a self-attention (SA) mechanism. To incorporate both global and local information, SA is integrated with the LSTM model. Furthermore, the IWOA is employed to optimize the hyper-parameters of the LSTM-SA model. The standard whale optimization algorithm (WOA) is improved by incorporating adaptive parameters, an adaptive selection threshold, and a Latin hypercube sampling initialization strategy. The proposed method was applied and tested using real operational data from two transformers within a practical power grid. The results of the single-step prediction experiments demonstrate that the proposed method improves the accuracy of oil temperature prediction for power transformers, with RMSE reductions ranging from 1.06% to 18.85% compared to benchmark models. Additionally, the proposed model performs effectively across various prediction steps, consistently outperforming benchmark models.

1. Introduction

Power transformers play a vital role in the symmetrical operation of power systems [1]. They serve as critical infrastructure for power transmission and distribution and are widely applied in other fields, such as transportation [2]. A transformer failure can severely disrupt normal power system operation, potentially causing widespread power outages and significant economic losses [3]. The stable operation of the transformer is therefore fundamental to maintaining the symmetry and balance of the power system [4,5].
Top oil temperature is a key indicator of whether a transformer can maintain normal operation. In practice, internal transformer faults are judged from the trend of the oil temperature [6,7]. Accurate oil temperature prediction therefore helps professionals detect problems promptly during the transformer’s daily operation and maintenance. By reliably forecasting oil temperature, operators can not only prevent unexpected failures but also optimize maintenance schedules, reduce operational risks, and extend the transformer’s lifespan. Effective oil temperature prediction thus enhances the overall reliability and efficiency of the power system and supports the symmetrical operation of the electrical grid.
Researchers generally study the prediction of transformer oil temperatures through mathematical and data-driven models [8,9,10]. Zhao et al. used the least squares method to establish a parameter identification algorithm [11]; this mathematical model can effectively predict the top oil temperature but lacks strong generalization ability. Wang et al. established a thermal circuit model to simulate the changes in the transformer temperature over time, but it requires a lengthy computation time [12].
With the development of intelligent algorithms, artificial intelligence technologies have been applied to power system forecasting, for example in load forecasting [13], vehicle-to-grid (V2G) scheduling prediction [14], and solar irradiance forecasting [15]. Several studies have used these algorithms to predict transformer oil temperature. Qing et al. developed a model based on artificial neural networks for forecasting the top oil temperature of transformers [16]; this model significantly reduces the computational time but ignores the selection of optimal hyperparameters. Tan et al. proposed a forecast model that considers path analysis and similar moments [17], but the validation dataset is small, so its adaptability is difficult to confirm. Li et al. introduced a regression model with enhanced particle swarm optimization (PSO) for transformer top oil temperature forecasting [18]; however, the large sampling interval of the data led to substandard performance. Tan et al. introduced a method that predicts top oil temperature based on a similar day [19]; because it relies solely on single-day similarity, its prediction performance deteriorates. To sum up, these studies do not fully consider the temporal information of different input features and thus fail to combine global and local information within transformer operational data. In addition, the optimal hyper-parameters of the models are difficult to determine.
To tackle these issues, this paper introduces a novel method: a long short-term memory (LSTM) neural network with a self-attention (SA) mechanism, optimized by an improved whale optimization algorithm (IWOA). The proposed method addresses both the difficulty of selecting hyperparameters for the oil temperature prediction model and the insufficient consideration of temporal information. It integrates SA with LSTM and uses the IWOA to obtain the optimal hyper-parameters for the LSTM-SA model, resulting in high prediction accuracy. Finally, the proposed method is tested with actual operating data from a practical power grid. The results demonstrate that the proposed method achieves better forecasting performance than the benchmark models.
The remainder of this paper is organized as follows: Section 2 discusses the power transformer and top-oil temperature. Section 3 introduces the LSTM-SA model and the IWOA. Section 4 presents a case study that shows the superiority of the IWOA for optimization and the effectiveness of the proposed method for predicting top-oil temperature. Finally, conclusions and discussions are presented in Section 5.

2. Power Transformer and Top-Oil Temperature

The top oil temperature of a transformer is a crucial indicator of operational reliability and of the internal insulation status. Accurately predicting the top oil temperature of the power transformer is of great significance for analyzing potential faults, carrying out transformer operation and maintenance, maintaining the symmetry and balance of the power system, and achieving early warning of transformer failures. It is also a key factor in limiting the transformer’s load capacity and assessing its operational lifespan.
There are two merits to considering top oil temperature as the subject of study. First, researchers can easily access real-time monitoring data for the transformer’s top oil temperature, thanks to advanced sensor technologies and the widespread implementation of smart grids. This accessibility facilitates continuous monitoring and data collection, which are essential for accurate prediction and timely intervention. Second, the hot spot temperature that is difficult to obtain can be calculated from the transformer top oil temperature. Hot spot temperature is crucial, as it represents the highest temperature within the transformer and is a direct indicator of the condition of the transformer’s insulation. Accurate estimation of this temperature is vital for predicting the remaining life of the insulation and planning maintenance activities.
The above advantages have made the top oil temperature highly favored by researchers, and it has now become a hot research topic [20]. The basic construction of an oil-immersed transformer is graphically represented in Figure 1. This paper focuses on improving the accuracy of oil temperature prediction, particularly in addressing the challenges posed by the nonlinearity and time-series characteristics of the data.

3. The Proposed IWOA-LSTM-SA Method for Top-Oil Temperature Prediction

3.1. Framework

In this study, the IWOA-LSTM-SA method is developed for transformer oil temperature forecasting: the IWOA searches for the optimal hyper-parameters, and the LSTM-SA serves as the forecasting model that combines global and local information. The flowchart is presented in Figure 2.
The main phases of the IWOA-LSTM-SA will be detailed in the following sections.

3.2. LSTM Integrated by SA

LSTM is a specialized type of recurrent neural network (RNN) designed to process temporal data sequences. On the basis of the traditional RNN, LSTM introduces the concept of “gating”, which not only mitigates the vanishing gradient problem but also selects which information to retain. Therefore, LSTM is more suitable for solving nonlinear temporal structure problems. Each memory block of an LSTM comprises one or more self-connected memory cells and three gating units: the input gate, the output gate, and the forget gate. The specific structure of the gates is shown in Figure 3. The forget gate decides which information should be discarded from the cell state, effectively determining the extent to which the previous cell state is preserved within the current cell state. The calculation equation is as below:
$$m_t = \sigma\left(W_m \times [r_{t-1}, x_t] + p_m\right) \tag{1}$$
The input gate controls how much of the current input is stored in the cell state. The formula for the input gate is as below:
$$s_t = \sigma\left(W_s \times [r_{t-1}, x_t] + p_s\right) \tag{2}$$
The output gate regulates the current output and decides the output information. The formulas for calculation are given below:
$$g_t = \sigma\left(W_g \times [r_{t-1}, x_t] + p_g\right) \tag{3}$$
$$r_t = g_t \odot \tanh(C_t) \tag{4}$$
The formulas for calculating the candidate cell state and the cell state are as below:
$$\tilde{C}_t = \tanh\left(W_C \times [r_{t-1}, x_t] + p_C\right) \tag{5}$$
$$C_t = m_t \odot C_{t-1} + s_t \odot \tilde{C}_t \tag{6}$$
Here, $\sigma$ is the sigmoid function, $x_t$ is the current input, $r_{t-1}$ is the output of the previous time step, $W$ and $p$ denote the weight matrices and bias vectors of the corresponding gates, and $\odot$ denotes the element-wise product.
In summary, LSTM is suitable for processing time series data, so this paper uses LSTM to establish the temperature prediction model. However, the LSTM model has difficulty processing long sequences, so SA is introduced to solve this problem. The resulting model considers both local and global information.
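The gate equations of this section can be traced in a few lines of code. The following is a minimal pure-Python sketch of one LSTM step in the paper's notation (m for the forget gate, s for the input gate, g for the output gate, r for the output, p for the biases). The `params` dictionary layout, the helper names, and the element-wise products are assumptions made for illustration; this is not the authors' implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, p):
    """Return W @ v + p for a list-of-rows matrix W and a bias vector p."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, p)]

def lstm_step(x_t, r_prev, C_prev, params):
    """One LSTM step; params holds W_m, W_s, W_g, W_C and p_m, p_s, p_g, p_C
    (a hypothetical layout chosen for this sketch)."""
    z = r_prev + x_t  # list concatenation, i.e. [r_{t-1}, x_t]
    m_t = [sigmoid(u) for u in affine(params["W_m"], z, params["p_m"])]  # forget gate
    s_t = [sigmoid(u) for u in affine(params["W_s"], z, params["p_s"])]  # input gate
    g_t = [sigmoid(u) for u in affine(params["W_g"], z, params["p_g"])]  # output gate
    C_tilde = [math.tanh(u) for u in affine(params["W_C"], z, params["p_C"])]
    # New cell state: keep part of the old state, add part of the candidate state
    C_t = [m * c + s * ct for m, c, s, ct in zip(m_t, C_prev, s_t, C_tilde)]
    # Output: output gate applied to the squashed cell state
    r_t = [g * math.tanh(c) for g, c in zip(g_t, C_t)]
    return r_t, C_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is exactly half of the previous one, which makes a convenient sanity check.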
The SA layer works in three steps. Firstly, the output of the LSTM model is used as the input of the SA layer. Secondly, the matrices $q$, $k$, and $v$ are calculated using the weight matrices $W_q$, $W_k$, and $W_v$. Thirdly, the attention scores are computed as dot products between queries and keys; for example, $a_{1,2}$ is the dot product between $q_1$ and $k_2$, and $a_{2,2}$ is the dot product between $q_2$ and $k_2$. The attention matrix $M$ represents the correlation between different time steps. The structure is shown in Figure 4.
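As a hedged illustration of the SA layer, the sketch below computes q, k, and v from the LSTM outputs, forms the scores as dot products between queries and keys, and normalizes them row by row into the attention matrix M. The scaling by the square root of the key dimension and the softmax normalization follow the common scaled dot-product formulation; both are assumptions, since the text does not state them, and all function names are hypothetical.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    top = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(H, W_q, W_k, W_v):
    """H is a T x d matrix of LSTM outputs (one row per time step).
    Returns the attended outputs and the T x T attention matrix M."""
    q, k, v = matmul(H, W_q), matmul(H, W_k), matmul(H, W_v)
    scale = math.sqrt(len(k[0]))  # assumption: scaled dot-product attention
    # scores[i][j] is the dot product between q_i and k_j
    scores = [[sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k] for q_i in q]
    M = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(M, v), M
```

Each row of M weights all time steps for one query position, which is how the layer mixes global context into the local LSTM output.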

3.3. Hyper-Parameters Optimization by IWOA

The whale optimization algorithm (WOA) was introduced by Mirjalili et al. to deal with intricate optimization problems [21,22]. The WOA consists of three steps: encircling prey, the bubble-net attacking method, and the search for prey.

3.3.1. Encircling Prey

Humpback whales can identify and encircle their prey. The remaining whales in the population adjust their positions towards the best search agent, as defined by the equation:
$$G(t+1) = G^*(t) - A\left|C\,G^*(t) - G(t)\right| \tag{7}$$
where $t$ denotes the current iteration; $G$ is the position vector; $G^*$ is the position vector of the best solution obtained so far; and $A$ and $C$ are calculated as follows:
$$A = 2 a r_1 - a \tag{8}$$
$$C = 2 r_2 \tag{9}$$
where $a$ is an adjustment vector whose value decreases linearly from 2 to 0, and $r_1$ and $r_2$ are random vectors within the range [0, 1].
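The encircling update can be sketched as follows. This is a minimal pure-Python illustration that assumes the random vectors are drawn independently for every dimension; the function name `encircle` and the optional `rng` argument are hypothetical.

```python
import random

def encircle(G, G_best, a, rng=random):
    """Move position G towards the best position G_best.
    G and G_best are lists of floats; a is the current control value."""
    new_G = []
    for g, g_best in zip(G, G_best):
        A = 2.0 * a * rng.random() - a   # A = 2 a r1 - a, with r1 in [0, 1)
        C = 2.0 * rng.random()           # C = 2 r2, with r2 in [0, 1)
        new_G.append(g_best - A * abs(C * g_best - g))
    return new_G
```

When a reaches 0, A is 0 in every dimension and the whale lands exactly on the best position, which is the limiting case of the shrinking behaviour.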

3.3.2. Bubble-Net Attacking Method

Humpback whale predation consists of two main mechanisms: the shrinking encircling mechanism and the spiral updating of position.
(1)
Shrinking encircling mechanism: as $a$ decreases, $A$ takes a random value within the range [−1, 1]. The new position then lies between the whale’s original position and the position of the best whale found so far. The parameter $a$ is calculated as below:
$$a = 2 \times \left(1 - \frac{t}{t_{\max}}\right) \tag{10}$$
(2)
Spiral updating of position: the WOA uses a spiral path to launch attacks on prey, and the spiral hunting equation is as below:
$$G(t+1) = e^{bl} \cos(2\pi l)\left|G^*(t) - G(t)\right| + G^*(t) \tag{11}$$
where $l$ is a random number within the interval [−1, 1] and $b$ is a constant. The whales approach the prey using both mechanisms, a shrinking circle and a spiral-shaped path, each chosen with a probability of 50%. The updated equation is as follows:
$$G(t+1) = \begin{cases} G^*(t) - A\left|C\,G^*(t) - G(t)\right|, & p < 0.5 \\ e^{bl} \cos(2\pi l)\left|G^*(t) - G(t)\right| + G^*(t), & p \ge 0.5 \end{cases} \tag{12}$$
where $p$ is a random number within the range [0, 1].
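A minimal sketch of the bubble-net step is given below: a spiral move around the best whale, and a switch that picks the shrinking-circle move or the spiral move with equal probability. Drawing a single l per whale and fresh random numbers per dimension in the shrinking branch are assumptions; the function names are hypothetical.

```python
import math
import random

def spiral_update(G, G_best, b, rng=random):
    """Spiral move around the best whale: e^(b l) cos(2 pi l) |G* - G| + G*."""
    l = rng.uniform(-1.0, 1.0)  # one random number l per whale (assumption)
    factor = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    return [factor * abs(g_best - g) + g_best for g, g_best in zip(G, G_best)]

def bubble_net_update(G, G_best, a, b, rng=random):
    """Pick the shrinking-circle or the spiral move with equal probability."""
    if rng.random() < 0.5:
        moved = []
        for g, g_best in zip(G, G_best):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            moved.append(g_best - A * abs(C * g_best - g))
        return moved
    return spiral_update(G, G_best, b, rng)
```

Because the spiral factor is bounded in magnitude by e^b, the constant b directly limits how far a whale can overshoot the best position along the spiral.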

3.3.3. Search for Prey

Humpback whales search for their prey randomly, with their locations varying relative to each other. In this stage, the position of a searching whale is modified according to the position of a randomly selected whale, as opposed to being updated based on the current best whale. The calculation formula is as listed below:
$$G(t+1) = G_{\mathrm{rand}}(t) - A\left|C\,G_{\mathrm{rand}}(t) - G(t)\right| \tag{13}$$
where $G_{\mathrm{rand}}$ denotes the position of a randomly selected whale.
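Putting the three phases together, a compact WOA loop might look like the sketch below. Several details are assumptions not stated in this section: the rule that a whale follows a random whale when |A| ≥ 1 and the best whale otherwise is taken from the original WOA formulation, A and C are drawn once per whale rather than per dimension, positions are clipped to the search bounds, and the best whale is updated immediately after each move. The function name and default values are hypothetical.

```python
import math
import random

def woa_minimize(f, dim, lower, upper, n_whales=20, t_max=200, b=1.0, seed=0):
    """Minimize f over [lower, upper]^dim with a basic WOA loop."""
    rng = random.Random(seed)
    whales = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_whales)]
    best = list(min(whales, key=f))
    best_val = f(best)
    for t in range(t_max):
        a = 2.0 * (1.0 - t / t_max)  # linear decrease from 2 towards 0
        for i, G in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise search around a random whale
                ref = best if abs(A) < 1.0 else rng.choice(whales)
                new = [r - A * abs(C * r - g) for g, r in zip(G, ref)]
            else:
                l = rng.uniform(-1.0, 1.0)
                factor = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [factor * abs(r - g) + r for g, r in zip(G, best)]
            whales[i] = [min(max(v, lower), upper) for v in new]  # keep inside bounds
            val = f(whales[i])
            if val < best_val:
                best, best_val = list(whales[i]), val
    return best, best_val
```

A call such as `woa_minimize(lambda x: sum(v * v for v in x), dim=5, lower=-100.0, upper=100.0)` returns the best position found and its objective value; the best value never gets worse from one iteration to the next because it is only replaced by strictly better candidates.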

3.3.4. Improved Whale Optimization Algorithm

The original WOA faces certain limitations, particularly in terms of inadequate local search capabilities and insufficient population diversity. Therefore, it is necessary to further improve the strategy and adjust the algorithm [23]. For example, Naderi et al. proposed a whale optimization algorithm enhanced by wavelet mutation, aimed at improving the algorithm’s convergence characteristics to address the complex trade-off between generation costs and water consumption [24]. This study takes a different direction by introducing three key improvements: Latin hypercube sampling for more diverse and uniform population initialization, an adaptive selection threshold to dynamically adjust the whale’s movement strategy, and a nonlinear parameter adjustment to enhance local search capabilities. These modifications are designed to address different aspects of the original WOA’s limitations. The specific improvements are as follows:
(1)
Latin hypercube sampling (LHS) initialization of the population: as stated in [25], population initialization plays a crucial role in swarm intelligence optimization algorithms. In the WOA, the population is initialized randomly, which can lead to uneven population distribution and individual overlap [26]. Therefore, it is necessary to optimize the population initialization. The IWOA incorporates LHS to increase the diversity of the initial population, as LHS initializes the population more uniformly and efficiently.
(2)
Adaptive selection threshold: in the WOA, the whales choose either the encircling movement or the spiral movement with 50% probability. However, this fixed probability prevents the whale population from choosing the movement best suited to the current state of the population [27,28]. In this paper, an adaptive selection threshold replaces the fixed threshold. The method automatically adjusts the threshold throughout the search process according to the following formula:
$$p_a = 1 - \left[\frac{t}{(L+f)\,t_{\max}} \times \left(L \times \frac{e^{t}}{e^{t_{\max}}} + f \times \frac{t^{f}}{t_{\max}^{f}}\right)\right] \tag{14}$$
where $t$ denotes the current iteration and $t_{\max}$ denotes the maximum iteration count; $L$ and $f$ are control parameters, set to 2 and 4, respectively.
In our method, the threshold is large in the initial stage, so the whales preferentially choose the encircling movement strategy. As the iterations increase, the threshold decreases, and the whales become more likely to choose the spiral movement strategy. Equation (12) is updated to Equation (15).
$$G(t+1) = \begin{cases} G^*(t) - A\left|C\,G^*(t) - G(t)\right|, & p < p_a \\ e^{bl} \cos(2\pi l)\left|G^*(t) - G(t)\right| + G^*(t), & p \ge p_a \end{cases} \tag{15}$$
(3)
Adaptive parameters: in the traditional method, $a$ decreases linearly from 2 to 0. To enhance the local search ability, this study adjusts $a$ with a nonlinear strategy and also adapts $b$, which influences the shape of the logarithmic spiral. This can significantly improve the effectiveness of local search and the speed of global search, thereby enhancing overall accuracy [29]. To this end, a relationship between $b$ and $t$ is established to achieve adaptive adjustment. Equation (10) is updated to Equation (16).
$$\begin{cases} a(t) = 2 \times \left(1 - \tanh\left(\dfrac{t}{t_{\max}}\,k\right)\right) \\ b(t) = v - \left(\dfrac{v}{t_{\max}}\right) \times t \end{cases} \tag{16}$$
where $k$ and $v$ are control parameters, set to 4 and 10, respectively.
The IWOA flowchart is illustrated in Figure 5.
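The three improvements can be sketched as small helper functions. The code below is an illustration under stated assumptions, not the authors' implementation: the LHS routine is a plain stratified-and-shuffled sampler; the threshold is read as p_a = 1 − [t / ((L + f) t_max)] × [L e^t / e^(t_max) + f t^f / t_max^f]; and the argument of tanh is read as k·t/t_max. All function names are hypothetical.

```python
import math
import random

def lhs_population(n_whales, bounds, rng=random):
    """Latin hypercube initialization. bounds is a list of (low, high) per dimension.
    Each dimension is cut into n_whales equal strata, one sample is drawn per
    stratum, and the strata are shuffled independently for every dimension."""
    columns = []
    for low, high in bounds:
        strata = [(i + rng.random()) / n_whales for i in range(n_whales)]
        rng.shuffle(strata)
        columns.append([low + u * (high - low) for u in strata])
    return [list(point) for point in zip(*columns)]

def adaptive_threshold(t, t_max, L=2.0, f=4.0):
    """Selection threshold p_a: 1 at t = 0, falling to 0 at t = t_max."""
    # e^t / e^t_max is computed as e^(t - t_max) to avoid overflow for large t
    growth = L * math.exp(t - t_max) + f * (t / t_max) ** f
    return 1.0 - t / ((L + f) * t_max) * growth

def adaptive_a(t, t_max, k=4.0):
    """Nonlinear control parameter a(t), assuming the tanh argument is k * t / t_max."""
    return 2.0 * (1.0 - math.tanh(k * t / t_max))

def adaptive_b(t, t_max, v=10.0):
    """Spiral shape parameter b(t), decreasing from v to 0."""
    return v - (v / t_max) * t
```

Under these readings, p_a starts at 1 (encircling is always chosen) and ends at 0 (the spiral move is always chosen), which matches the behaviour described for the adaptive threshold, while a(t) starts at 2 and is close to 0 at the final iteration.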

4. Case Studies and Results Analysis

4.1. Data Source

This study includes two datasets. Dataset 1 consists of transformer operation data collected from a 500 kV substation from 1 April to 30 June in 2022, with a sampling period of half an hour. In total, there are 4368 samples. The characteristic parameters include high-voltage-side three-phase current (AI, BI, CI), active and reactive power (P, Q), high-voltage-side three-phase voltage (AU, BU, CU), and top-oil temperature (T). This paper used the Pearson correlation coefficient method to select features, and the results are shown in Table 1. Dataset 2 consists of transformer operation data collected from a 220 kV substation from 10 February 2021 to 10 February 2022, with a sampling period of half an hour. In total, there are 17,518 samples.
As shown in Table 1, the correlation coefficient between the top-oil temperature and the high-voltage side three-phase current is 0.371, and the correlation coefficients with active power and reactive power are 0.369 and 0.372, respectively, indicating a positive correlation. The correlation coefficients between the top-oil temperature and the high-voltage side three-phase voltage are −0.346, −0.342, and −0.339, respectively, indicating a negative correlation with the top-oil temperature. This also suggests that the high-voltage side three-phase voltage, current, and active and reactive power have some influence on the transformer oil temperature. Similarly, a correlation analysis of the input features of Dataset 2 based on the Pearson correlation coefficient method is conducted. Ultimately, this paper selects high-voltage-side current, active and reactive power, voltage, and top-oil temperature as input features. The dataset is split into training and test sets, in which 80% is used for training and 20% for testing.
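The feature screening and the data split can be illustrated with a short sketch. It computes the Pearson correlation coefficient between one feature series and the top-oil temperature and splits the samples 80/20. Splitting in chronological order (earlier samples for training) is an assumption, since the text does not say how the split is drawn, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def chronological_split(samples, train_fraction=0.8):
    """Split time-ordered samples into a training part and a test part."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]
```

A coefficient near +1 or −1 indicates a strong linear relationship, while the values of roughly ±0.35 reported in Table 1 indicate a moderate one.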

4.2. Comparison of Algorithm Optimization Results

This paper compares the performance of the IWOA with that of traditional methods, namely the genetic algorithm (GA), PSO, and the original WOA. Table A1 in Appendix A presents the ten test functions employed for evaluation, which are taken from [30,31].
Each function in Table A1 has a dimension of 30 and a minimum value of 0. To ensure a fair comparison, the maximum number of iterations is set to 500 for every algorithm. The crossover probability of the GA is set to 1, and the mutation probability to 0.1. The learning factors of PSO are c1 = c2 = 2, and b is 10 for the WOA. Each algorithm is run independently 30 times. The average and the best results are used for comparison, as shown in Table 2. The average convergence curve of each algorithm is shown in Figure 6.
In Table 2, the optimal value reaches 0 in the F5, F6 and F8 functions, and the average values also show significant improvement. As shown in Figure 6, IWOA exhibits better convergence performance compared to traditional algorithms. These findings confirm the effectiveness of the enhancement strategies for WOA.

4.3. One-Step Prediction

Single-step oil temperature prediction involves forecasting the transformer’s top oil temperature for the next time step using historical data. In this experiment, the prediction is for 30 min into the future. To balance the training and testing errors, we introduced L2 regularization and dropout during the model training. Specifically, a dropout rate of 0.1 was applied, along with L2 regularization using a factor of 0.01. The prediction results for Dataset 1, demonstrating the effectiveness of the method, are presented in Figure 7. To further illustrate the trade-off between training and testing errors, Figure 8 provides a comparison of the training and testing errors.
Theoretically, when there is a significant gap between training and test errors, it usually indicates over-fitting, where the model performs well on the training data but struggles to generalize to unseen data. As illustrated in Figure 8, both the training and test losses decrease rapidly during the initial epochs and then converge to similar values as training progresses. This suggests that we have achieved a well-balanced trade-off between training and testing errors. This balance was successfully attained by applying regularization techniques, such as L2 regularization and dropout, which helped control model complexity, mitigate over-fitting, and enhance the model’s generalization capabilities.
To assess the performance of this method, this paper compared it with benchmark methods, including the back-propagation (BP) neural network, gated recurrent unit (GRU), convolutional neural network (CNN), LSTM, LSTM-SA, and WOA-LSTM-SA models. To reduce random error, each experiment was repeated 10 times and the results were averaged. Figure 9 displays the prediction results for each model on Dataset 1. It is evident that the proposed model shows the best prediction result compared to all benchmark models. The reason is that the proposed approach not only combines both local and global information but also utilizes the IWOA to determine the optimal hyper-parameters. Table 3 presents the comparative results.
From Table 3, it is evident that our method does not have an advantage in terms of computation time compared to traditional machine learning models. Therefore, in scenarios where prediction accuracy is not a primary concern, traditional machine learning models can still be considered for top oil temperature prediction of transformers. The prediction model proposed in this paper, however, places a greater emphasis on improving prediction accuracy. To analyze and compare each model more comprehensively, this paper includes a residual plot. Using Dataset 1 as an example, in the residual plot (Figure 10), the true values are shown on the horizontal axis, while the vertical axis represents the residual values (percentage).
The residual percentage is relatively higher for true values in the ranges of 30 to 43 °C and 55 to 60 °C. The reason is that about 4000 sample points fall within the temperature range of 43 to 55 °C, whereas the ranges of 30 to 43 °C and 55 to 60 °C each contain only approximately 200 sample points. This unbalanced distribution leads to lower accuracy on the sparse samples.
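The evaluation quantities used in this section can be written down explicitly. The paper reports RMSE, MAPE, residual percentages, and percentage reductions but does not restate their formulas, so the sketch below assumes the standard definitions; the sign convention of the residual (prediction minus true value) and the function names are assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def residual_percent(y_true, y_pred):
    """Per-sample residual as a percentage of the true value."""
    return [100.0 * (p - t) / t for t, p in zip(y_true, y_pred)]

def reduction_percent(baseline, proposed):
    """Relative reduction of an error metric with respect to a baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline
```

For example, with hypothetical values, an RMSE that drops from 2.00 for a benchmark to 1.80 for the proposed model corresponds to a 10% reduction.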

4.4. Ablation Experiment

To comprehensively validate the effectiveness of each component of the proposed method (IWOA-LSTM-SA), ablation experiments were conducted. Specifically, the experiments compared the following models: LSTM, LSTM-SA, WOA-LSTM, IWOA-LSTM, and WOA-LSTM-SA, with the LSTM model serving as the benchmark for comparison and analysis. Results are shown in Table 4.
As shown in Table 4, the proposed model demonstrates higher prediction accuracy compared to the baseline model LSTM and other comparative models. Compared to LSTM, the RMSE of LSTM-SA decreased by 5.88% on Dataset 1 and by 7.44% on Dataset 2; the MAPE increased by 3.59% on Dataset 1 but decreased by 11.23% on Dataset 2. This validates the effectiveness of combining the SA algorithm with LSTM. Compared to LSTM-SA, the RMSE of WOA-LSTM-SA and IWOA-LSTM-SA decreased by 4.88% and 6.44% on Dataset 1, and by 6.43% and 7.42% on Dataset 2, respectively. The MAPE decreased by 6.66% and 7.28% on Dataset 1, and by 7.99% and 9.89% on Dataset 2, respectively. This validates the effectiveness of the optimization algorithms proposed in the models. Additionally, compared to WOA-LSTM and IWOA-LSTM, the RMSE of the proposed model decreased by 9.89% and 5.21% on Dataset 1, and by 10.51% and 4.22% on Dataset 2, respectively. The MAPE decreased by 2.43% and 0.81% on Dataset 1, and by 16.60% and 6.12% on Dataset 2, respectively.
In summary, compared to using optimization algorithms or SA individually, combining them results in a greater improvement in the performance of the prediction model.

4.5. Multi-Step Forecasting

The multi-step prediction model refers to a model that predicts a series of values rather than a single value. Multi-step prediction is more important in real-world power system operations because it provides longer-term temperature trend forecasts, which help to identify potential issues in advance. Therefore, this section conducts a multi-step prediction analysis, where the prediction steps are set to 3 steps (90 min) and 5 steps (150 min). The evaluation metrics are shown in Table 5, and the prediction results (for one week) are presented in Figure 11.
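To make the multi-step setting concrete, the sketch below turns a time series into supervised samples in which a window of past values is mapped to the next `horizon` values. With the 30 min sampling period, a horizon of 3 corresponds to 90 min and a horizon of 5 to 150 min. The window length `n_lags` is a hypothetical parameter, since the input window used in the experiments is not stated here.

```python
def make_windows(series, n_lags, horizon):
    """Build (inputs, targets) pairs for multi-step forecasting.
    Each input holds n_lags consecutive values; each target holds the
    following `horizon` values."""
    inputs, targets = [], []
    for end in range(n_lags, len(series) - horizon + 1):
        inputs.append(series[end - n_lags:end])
        targets.append(series[end:end + horizon])
    return inputs, targets
```

For a series of ten values with `n_lags=4` and `horizon=3`, this yields four samples, the first mapping values 0 to 3 onto values 4 to 6.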
From Table 5, it can be seen that the error increases as the prediction step increases across all models. By comparing the RMSE metric, it can be concluded that the proposed model exhibits better accuracy across different prediction steps compared to the benchmark models. Specifically, for the 3-step prediction, the RMSE of the proposed model is 1.537 on Dataset 1 and 1.015 on Dataset 2. This represents reductions of 12.83% and 38.65% compared to the BP model, 6.98% and 20.89% compared to the CNN model, 3.75% and 13.62% compared to the GRU model, 4.24% and 27.16% compared to the LSTM model, 1.60% and 17.93% compared to the LSTM-SA model, and 1.16% and 4.34% compared to the WOA-LSTM-SA model. For the 5-step prediction, the RMSE of the proposed model is 1.714 and 1.634, representing reductions of 12.60% and 11.11% compared to the BP model, 7.61% and 15.89% compared to the CNN model, 6.49% and 17.30% compared to the GRU model, 5.19% and 14.14% compared to the LSTM model, 4.56% and 12.82% compared to the LSTM-SA model, and 3.06% and 1.80% compared to the WOA-LSTM-SA model. By analyzing the multi-step prediction metrics, we conclude that the proposed model demonstrates good performance across different prediction steps compared to traditional models.

5. Conclusions

Oil temperature prediction can help prevent symmetrical and asymmetrical faults in transformers. This paper adopts a novel approach to improve the performance of top-oil temperature prediction during transformer operation. The proposed model has been tested using actual data, and the following conclusions can be drawn:
(1)
To verify the efficacy of the IWOA, this paper conducts tests with ten test functions. The findings demonstrate that the IWOA outperforms GA, PSO, and WOA in terms of convergence speed and accuracy.
(2)
To verify the effectiveness of the proposed model, extensive experiments were conducted using actual operating data. The experimental results indicate that the proposed approach outperforms current state-of-the-art methods. On Dataset 1, the model achieved reductions in RMSE of 15.31%, 12.64%, 7.41%, 11.94%, 6.44%, and 1.98% compared to the BP, CNN, GRU, LSTM, LSTM-SA, and WOA-LSTM-SA methods, respectively. Similarly, on Dataset 2, the model demonstrated significant improvements, with RMSE reductions of 18.85%, 9.09%, 1.19%, 14.29%, 7.42%, and 1.06% compared to the same benchmark methods.
(3)
The proposed model performs effectively across various prediction steps compared to benchmark models. Specifically, for the 3-step prediction, the RMSE of the proposed model is 1.537 and 1.015 for Dataset 1 and Dataset 2, respectively, reflecting reductions of 12.83% and 38.65% compared to the BP model, 6.98% and 20.89% compared to the CNN model, 3.75% and 13.62% compared to the GRU model, 4.24% and 27.16% compared to the LSTM model, 1.60% and 17.93% compared to the LSTM-SA model, and 1.16% and 4.34% compared to the WOA-LSTM-SA model. For the 5-step prediction, the RMSE of the proposed model is 1.714 and 1.634, representing reductions of 12.60% and 11.11% compared to the BP model, 7.61% and 15.89% compared to the CNN model, 6.49% and 17.30% compared to the GRU model, 5.19% and 14.14% compared to the LSTM model, 4.56% and 12.82% compared to the LSTM-SA model, and 3.06% and 1.80% compared to the WOA-LSTM-SA model.

Author Contributions

D.Z. led the conceptualization, methodology, software development, and original draft preparation. Validation was carried out by D.Z., H.X. and H.Q., while H.X. and D.Z. handled formal analysis. H.Q. managed the investigation, and Z.H. and W.D. provided resources. S.W. was responsible for data curation. Writing—review and editing involved D.Z., H.X., H.Q., Q.P. and J.Y., with visualization by D.Z., H.X. and J.Y. Supervision was provided by D.Z. and H.Q., project administration by Q.P. and S.W., and funding acquisition by D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Electric Power Research Institute of Yunnan Power Grid Co., Ltd., Kunming, Yunnan, China (No. YNKJXM20220009).

Data Availability Statement

Data are contained in the article.

Conflicts of Interest

Authors Dexu Zou, Qingjun Peng, Shan Wang, Weiju Dai, and Zhihu Hong were employed by the company China Southern Power Grid Yunnan Power Grid Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Table A1 displays the ten test functions used in this study.
Table A1. Test functions.
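| Function | Range |
|---|---|
| $F_1(x) = \sum_{n=1}^{k} x_n^2$ | $[-100, 100]$ |
| $F_2(x) = \sum_{n=1}^{k} \lvert x_n \rvert + \prod_{n=1}^{k} \lvert x_n \rvert$ | $[-10, 10]$ |
| $F_3(x) = \sum_{n=1}^{k} \left( \sum_{i=1}^{n} x_i \right)^2$ | $[-100, 100]$ |
| $F_4(x) = \sum_{n=1}^{k} n x_n^4 + \mathrm{random}[0, 1)$ | $[-1.28, 1.28]$ |
| $F_5(x) = 1 + \frac{1}{4000} \sum_{n=1}^{k} x_n^2 - \prod_{n=1}^{k} \cos\left( \frac{x_n}{\sqrt{n}} \right)$ | $[-600, 600]$ |
| $F_6(x) = \sum_{n=1}^{k} \left[ x_n^2 - 10 \cos(2 \pi x_n) + 10 \right]$ | $[-5.12, 5.12]$ |
| $F_7(x) = 20 - 20 \exp\left( -0.2 \sqrt{\frac{1}{k} \sum_{n=1}^{k} x_n^2} \right) - \exp\left[ \frac{1}{k} \sum_{n=1}^{k} \cos(2 \pi x_n) \right] + e$ | $[-32, 32]$ |
| $F_8(x) = \frac{\pi}{k} \left\{ 10 \sin(\pi y_1) + \sum_{n=1}^{k-1} (y_n - 1)^2 \left[ 1 + 10 \sin^2(\pi y_{n+1}) \right] + (y_k - 1)^2 \right\} + \sum_{n=1}^{k} \mu(x_n, 10, 100, 4)$ | $[-50, 50]$ |
| $F_9(x) = -\sum_{i=1}^{d} \left( x_i \times \sin\left( \sqrt{\lvert x_i \rvert} \right) \right) + 418.98288727243369 \times d$ | $[-500, 500]$ |
| $F_{10}(x) = \sum_{i=1}^{d} \left( \left( \ln(x_i - 2) \right)^2 + \left( \ln(10 - x_i) \right)^2 \right) - \left( \prod_{i=1}^{10} x_i \right)^{0.2}$ | $[2, 10]$ |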
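For readers who want to reproduce the optimizer comparison, four of the functions in Table A1 are sketched below in pure Python. The forms used are the standard Sphere (F1), Griewank (F5), Rastrigin (F6), and Ackley (F7) benchmarks, which is an assumption about how the table entries read; the function names are hypothetical. Each function accepts a list of any length, so the 30-dimensional setting of Section 4.2 corresponds to passing a list of 30 values.

```python
import math

def sphere(x):
    """F1: sum of squares."""
    return sum(v * v for v in x)

def griewank(x):
    """F5: 1 + (1/4000) * sum(x_n^2) - prod(cos(x_n / sqrt(n)))."""
    total = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(n)) for n, v in enumerate(x, start=1))
    return 1.0 + total - product

def rastrigin(x):
    """F6: sum(x_n^2 - 10 cos(2 pi x_n) + 10)."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)

def ackley(x):
    """F7: 20 - 20 exp(-0.2 sqrt(mean(x_n^2))) - exp(mean(cos(2 pi x_n))) + e."""
    k = len(x)
    mean_sq = sum(v * v for v in x) / k
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / k
    return 20.0 - 20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + math.e
```

All four functions attain their minimum value of 0 at the origin, which is consistent with the minimum stated for the test functions in Section 4.2.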

References

  1. Xu, X.; He, Y.; Li, X.; Peng, F.; Xu, Y. Overload Capacity for Distribution Transformers with Natural-Ester Immersed High-Temperature Resistant Insulating Paper. Power Sys. Technol. 2018, 42, 1001–1006.
  2. Wang, S.; Gao, M.; Zhuo, R. Research on high efficient order reduction algorithm for temperature coupling simulation model of transformer. High Volt. Appar. 2023, 59, 115–126.
  3. Liu, X.; Xie, J.; Luo, Y. A novel power transformer fault diagnosis method based on data augmentation for KPCA and deep residual network. Energy Rep. 2023, 9, 620–627.
  4. Chen, T.; Chen, Y.; Li, X. Prediction for dissolved gas concentration in power transformer oil based on CEEMDAN-SG-BiLSTM. High Volt. Appar. 2023, 59, 168–175.
  5. Zang, C.; Zeng, J.; Li, P. Intelligent diagnosis model of mechanical fault for power transformer based on SVM algorithm. High Volt. Appar. 2023, 59, 216–222.
  6. Ji, H.; Wu, X.; Wang, H. A New Prediction Method of Transformer Oil Temperature Based on C-Prophet. Adv. Power Syst. Hyd. Eng. 2023, 39, 48–55.
  7. Tan, F.; Xu, G.; Zhang, P. Research on Top Oil Temperature Prediction Method of Similar Day Transformer Based on Topsis and Entropy Method. Elect. Power Sci. Eng. 2021, 37, 62–69.
  8. Amoda, O.A.; Tylavsky, D.J.; McCulla, G.A.; Knuth, W.A. Acceptability of three transformer hottest-spot temperature models. IEEE Trans. Power Deliv. 2011, 27, 13–22.
  9. Zhou, L.; Wang, J.; Wang, L.; Yuan, S.; Huang, L.; Wand, D.; Guo, L. A Method for Hot-Spot Temperature Prediction and Thermal Capacity Estimation for Traction Transformers in High-Speed Railway Based on Genetic Programming. IEEE Trans. Transp. Electrif. 2019, 5, 1319–1328.
  10. Deng, Y.; Ruan, J.; Quan, Y.; Gong, R.; Huang, D.; Duan, C.; Xie, Y. A Method for Hot Spot Temperature Prediction of a 10 kV Oil-Immersed Transformer. IEEE Access 2019, 7, 107380.
  11. Zhao, B.; Zhang, X. Parameter Identification of Transformer Top Oil Temperature Model and Prediction of Top Oil Temperature. High Volt. Eng. 2004, 30, 9–10.
  12. Wang, H.; Su, P.; Wang, X. Prediction of Surface Temperatures of Large Oil-Immersed Power Transformers. J. Tsinghua Univ. Sci. Technol. 2005, 45, 569–572.
  13. Tan, M.; Hu, C.; Chen, J.; Wang, L.; Li, Z. Multi-node load forecasting based on multi-task learning with modal feature extraction. Eng. Appl. Artif. Intell. 2022, 112, 104856.
  14. Shang, Y.; Li, S. FedPT-V2G: Security enhanced federated transformer learning for real-time V2G dispatch with non-IID data. Appl. Energy 2024, 358, 122626.
  15. Bai, M.; Yao, P.; Dong, H.; Fang, Z.; Jin, W.; Yang, X.; Liu, J.; Yu, D. Spatial-temporal characteristics analysis of solar irradiance forecast errors in Europe and North America. Energy 2024, 297, 131187.
  16. Qing, H.; Jennie, S.; Daniel, J. Prediction of top-oil temperature for transformers using neural network. IEEE Trans. Power Deliv. 2000, 15, 1205–1211.
  17. Tan, F.; Chen, H.; He, J. Top oil temperature forecasting of UHV transformer based on path analysis and similar time. Elect. Power Autom. Equip. 2021, 41, 217–224.
  18. Li, S.; Xue, J.; Wu, M.; Xie, R.; Jin, B.; Zhang, H.; Li, Q. Prediction of Transformer Top-oil Temperature with the Improved Weighted Support Vector Regression Based on Particle Swarm Optimization. High Volt. Appar. 2021, 57, 103–109.
  19. Tan, F.L.; Xu, G.; Li, Y.F.; Chen, H.; He, J.H. A method of transformer top oil temperature forecasting based on similar day and similar hour. Elect. Power Eng. Tech. 2022, 41, 193–200.
  20. Yi, Y. Research on Prediction Method of Transformer Top-Oil Temperature Based on Assisting Dispatchers in Decision-Making. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2017.
  21. Gharehchopogh, F.S.; Gholizadeh, H. A comprehensive survey: Whale Optimization Algorithm and its applications. Swarm Evol. Comput. 2019, 48, 1–24.
  22. Brodzicki, A.; Piekarski, M.; Jaworek-Korjakowska, J. The whale optimization algorithm approach for deep neural networks. Sensors 2021, 21, 8003.
  23. Mostafa Bozorgi, S.; Yazdani, S. IWOA: An improved whale optimization algorithm for optimization problems. J. Comput. Des. Eng. 2019, 6, 243–259.
  24. Naderi, E.; Azizivahed, A.; Asrari, A. A step toward cleaner energy production: A water saving-based optimization approach for economic dispatch in modern power systems. Electr. Power Syst. Res. 2022, 204, 107689.
  25. Gao, W.; Liu, S.; Huang, L. Inspired artificial bee colony algorithm for global optimization problems. Acta Electron. Sin. 2012, 40, 2396.
  26. Shi, X.; Li, M.; Wei, Q. Application of Quadratic Interpolation Whale Optimization Algorithm in Cylindricity Error evaluation. Metrol. Meas. Tech. 2019, 46, 58–60.
  27. He, Q.; Wei, K.; Xu, Q. Mixed strategy based improved whale optimization algorithm. Appl. Res. Comput. 2019, 36, 3647–3651.
  28. Qiu, X.; Wang, R.; Zhang, W.; Zhang, Z.; Zhang, Q. Improved Whale Optimizer Algorithm Based on Hybrid Strategy. Comput. Eng. Appl. 2022, 58, 70–78.
  29. Chen, Y.; Han, B.; Xu, G.; Kan, Y.; Zhao, Z. Spatial Straightness Error Evaluation with Improved Whale Optimization Algorithm. Mech. Sci. Technol. Aero. Eng. 2022, 41, 1102–1111.
  30. Xu, J.; Yan, F. The Application of Improved Whale Optimization Algorithm in Power Load Dispatching. Oper. Res. Manag. Sci. 2020, 29, 149–159.
  31. Naderi, E.; Mirzaei, L.; Pourakbari-Kasmaei, M.; Cerna, F.V.; Lehtonen, M. Optimization of active power dispatch considering unified power flow controller: Application of evolutionary algorithms in a fuzzy framework. Evol. Intell. 2024, 17, 1357–1387.
Figure 1. The basic construction of an oil-immersed transformer.
Figure 2. Flow chart of IWOA-LSTM-SA.
Figure 3. LSTM structure diagram.
Figure 4. LSTM-SA structure.
Figure 5. Flow chart of the IWOA.
Figure 6. Average convergence curves for each algorithm.
Figure 7. The prediction results of IWOA-LSTM-SA.
Figure 8. Training and testing errors over iterations.
Figure 9. Performance comparison across models.
Figure 10. Model residuals.
Figure 11. Multi-step prediction performance comparison across models (one week).
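Table 1. Correlation matrix.

| | AI | BI | CI | P | Q | AU | BU | CU | T |
|---|---|---|---|---|---|---|---|---|---|
| AI | 1.000 | 0.999 | 0.999 | 0.999 | 0.925 | −0.862 | −0.866 | −0.835 | 0.371 |
| BI | 0.999 | 1.000 | 0.999 | 0.999 | 0.924 | −0.863 | −0.866 | −0.835 | 0.371 |
| CI | 0.999 | 0.999 | 1.000 | 0.999 | 0.925 | −0.862 | −0.866 | −0.835 | 0.371 |
| P | 0.999 | 0.999 | 0.999 | 1.000 | 0.925 | −0.857 | −0.859 | −0.828 | 0.369 |
| Q | 0.925 | 0.924 | 0.925 | 0.925 | 1.000 | −0.842 | −0.844 | −0.823 | 0.372 |
| AU | −0.862 | −0.863 | −0.862 | −0.857 | −0.842 | 1.000 | 0.979 | 0.964 | −0.346 |
| BU | −0.866 | −0.866 | −0.866 | −0.859 | −0.844 | 0.979 | 1.000 | 0.981 | −0.342 |
| CU | −0.835 | −0.835 | −0.835 | −0.828 | −0.823 | 0.964 | 0.981 | 1.000 | −0.339 |
| T | 0.371 | 0.371 | 0.371 | 0.369 | 0.372 | −0.346 | −0.342 | −0.339 | 1.000 |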
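Table 2. Comparison of test results for each algorithm.

| Function | Evaluation Index | GA | PSO | WOA | IWOA |
|---|---|---|---|---|---|
| $F_1$ | Mean | 3602.311 | 0.035 | $7.21 \times 10^{-10}$ | $1.46 \times 10^{-19}$ |
| | Best | 1454.955 | 0.001 | $3.32 \times 10^{-13}$ | $1.17 \times 10^{-24}$ |
| $F_2$ | Mean | 21.197 | 32.013 | $5.16 \times 10^{-9}$ | $1.73 \times 10^{-13}$ |
| | Best | 13.936 | 0.081 | $5.12 \times 10^{-9}$ | $2.24 \times 10^{-15}$ |
| $F_3$ | Mean | 3477.958 | 0.047 | $8.98 \times 10^{-10}$ | $4.16 \times 10^{-20}$ |
| | Best | 1771.241 | 0.001 | $1.68 \times 10^{-12}$ | $1.42 \times 10^{-22}$ |
| $F_4$ | Mean | 1.432 | 5.176 | 0.015 | 0.00075 |
| | Best | 0.413 | 0.065 | 0.003 | 0.00014 |
| $F_5$ | Mean | 28.474 | 51.152 | 0 | 0 |
| | Best | 5.522 | 0 | 0 | 0 |
| $F_6$ | Mean | 91.831 | 127.257 | 0.462 | $1.78 \times 10^{-16}$ |
| | Best | 64.795 | 69.170 | $6.78 \times 10^{-11}$ | 0 |
| $F_7$ | Mean | 11.337 | 2.028 | 3.936 | $1.49 \times 10^{-11}$ |
| | Best | 9.197 | 0.023 | $8.06 \times 10^{-7}$ | $1.35 \times 10^{-12}$ |
| $F_8$ | Mean | 77.000 | 551.976 | 0.988 | 0 |
| | Best | 35.494 | 185.625 | 0 | 0 |
| $F_9$ | Mean | 75.910 | 727.867 | −0.898 | −0.829 |
| | Best | 28.593 | 479.302 | −0.967 | −0.986 |
| $F_{10}$ | Mean | 73.449 | 596.665 | −0.890 | −0.796 |
| | Best | 26.910 | 332.989 | −0.980 | −0.899 |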
Table 3. Model prediction evaluation indexes.

| Dataset | Model | RMSE | MAE | MAPE (%) | R² | Time (s) |
|---|---|---|---|---|---|---|
| Dataset 1 | BP | 1.698 | 1.228 | 2.581 | 0.825 | 13.287 |
| | CNN | 1.646 | 1.170 | 2.462 | 0.836 | 32.317 |
| | GRU | 1.553 | 1.011 | 2.144 | 0.854 | 96.109 |
| | LSTM | 1.633 | 1.022 | 2.175 | 0.838 | 129.666 |
| | LSTM-SA | 1.537 | 1.031 | 2.253 | 0.861 | 174.497 |
| | WOA-LSTM-SA | 1.462 | 0.998 | 2.103 | 0.870 | 11,058.906 |
| | IWOA-LSTM-SA | 1.438 | 0.989 | 2.089 | 0.873 | 10,083.375 |
| Dataset 2 | BP | 0.923 | 0.715 | 2.428 | 0.974 | 38.216 |
| | CNN | 0.824 | 0.596 | 1.929 | 0.979 | 80.746 |
| | GRU | 0.758 | 0.544 | 1.772 | 0.982 | 165.984 |
| | LSTM | 0.874 | 0.643 | 2.129 | 0.977 | 234.946 |
| | LSTM-SA | 0.809 | 0.576 | 1.890 | 0.980 | 383.995 |
| | WOA-LSTM-SA | 0.757 | 0.535 | 1.739 | 0.982 | 13,016.477 |
| | IWOA-LSTM-SA | 0.749 | 0.524 | 1.703 | 0.983 | 11,075.689 |
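
To make the comparison in Table 3 concrete, the short Python sketch below computes the relative RMSE reduction of IWOA-LSTM-SA against two benchmark models on Dataset 2. The helper name `relative_reduction` is hypothetical, and using RMSE on Dataset 2 as the basis is an assumption; under that assumption the two values appear to match the endpoints of the 1.06% to 18.85% improvement range quoted in the abstract.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` lowers an error metric relative to `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# RMSE values for Dataset 2, copied from Table 3.
rmse_dataset2 = {
    "BP": 0.923,
    "WOA-LSTM-SA": 0.757,
    "IWOA-LSTM-SA": 0.749,
}

proposed = rmse_dataset2["IWOA-LSTM-SA"]
for model in ("BP", "WOA-LSTM-SA"):
    gain = relative_reduction(rmse_dataset2[model], proposed)
    print(f"IWOA-LSTM-SA vs {model}: {gain:.2f}% lower RMSE")
# IWOA-LSTM-SA vs BP: 18.85% lower RMSE
# IWOA-LSTM-SA vs WOA-LSTM-SA: 1.06% lower RMSE
```

The same helper can be applied to the MAE or MAPE columns, or to the Dataset 1 rows, to compare the other models.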
Table 4. Ablation experiment evaluation metrics.

| Dataset | Metric | LSTM | LSTM-SA | WOA-LSTM | IWOA-LSTM | WOA-LSTM-SA | IWOA-LSTM-SA |
|---|---|---|---|---|---|---|---|
| Dataset 1 | RMSE | 1.633 | 1.537 | 1.596 | 1.517 | 1.462 | 1.438 |
| | MAPE | 2.175 | 2.253 | 2.141 | 2.106 | 2.103 | 2.089 |
| Dataset 2 | RMSE | 0.874 | 0.809 | 0.837 | 0.782 | 0.757 | 0.749 |
| | MAPE | 2.129 | 1.890 | 2.042 | 1.814 | 1.739 | 1.703 |
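Table 5. Multi-step prediction evaluation metrics.

| Dataset | Step | Model | RMSE | MAE | MAPE (%) | Time (s) |
|---|---|---|---|---|---|---|
| Dataset 1 | 1 (30 min) | BP | 1.698 | 1.228 | 2.581 | 13.287 |
| | | CNN | 1.646 | 1.170 | 2.462 | 32.317 |
| | | GRU | 1.553 | 1.011 | 2.144 | 96.109 |
| | | LSTM | 1.633 | 1.022 | 2.175 | 129.666 |
| | | LSTM-SA | 1.537 | 1.031 | 2.253 | 174.497 |
| | | WOA-LSTM-SA | 1.462 | 0.998 | 2.103 | 11,058.906 |
| | | IWOA-LSTM-SA | 1.438 | 0.989 | 2.089 | 10,083.375 |
| | 3 (90 min) | BP | 1.763 | 1.382 | 2.873 | 14.082 |
| | | CNN | 1.652 | 1.221 | 2.557 | 22.572 |
| | | GRU | 1.597 | 1.133 | 2.409 | 95.775 |
| | | LSTM | 1.605 | 1.164 | 2.453 | 179.898 |
| | | LSTM-SA | 1.562 | 1.162 | 2.448 | 229.012 |
| | | WOA-LSTM-SA | 1.555 | 1.102 | 2.311 | 11,746.135 |
| | | IWOA-LSTM-SA | 1.537 | 1.088 | 2.308 | 10,149.217 |
| | 5 (150 min) | BP | 1.961 | 1.611 | 3.351 | 13.617 |
| | | CNN | 1.855 | 1.411 | 2.973 | 21.579 |
| | | GRU | 1.833 | 1.387 | 2.943 | 98.763 |
| | | LSTM | 1.808 | 1.367 | 2.878 | 197.507 |
| | | LSTM-SA | 1.796 | 1.345 | 2.832 | 240.519 |
| | | WOA-LSTM-SA | 1.768 | 1.352 | 2.859 | 12,212.086 |
| | | IWOA-LSTM-SA | 1.714 | 1.294 | 2.702 | 10,778.976 |
| Dataset 2 | 1 (30 min) | BP | 0.923 | 0.715 | 2.428 | 38.216 |
| | | CNN | 0.824 | 0.596 | 1.929 | 80.746 |
| | | GRU | 0.758 | 0.544 | 1.772 | 165.984 |
| | | LSTM | 0.874 | 0.643 | 2.129 | 234.946 |
| | | LSTM-SA | 0.809 | 0.576 | 1.890 | 383.995 |
| | | WOA-LSTM-SA | 0.757 | 0.535 | 1.739 | 13,016.477 |
| | | IWOA-LSTM-SA | 0.749 | 0.524 | 1.703 | 11,075.689 |
| | 3 (90 min) | BP | 1.654 | 1.124 | 4.225 | 37.313 |
| | | CNN | 1.283 | 1.012 | 3.166 | 79.190 |
| | | GRU | 1.175 | 0.831 | 2.821 | 229.788 |
| | | LSTM | 1.394 | 1.080 | 3.674 | 320.336 |
| | | LSTM-SA | 1.237 | 0.923 | 3.111 | 433.645 |
| | | WOA-LSTM-SA | 1.061 | 0.833 | 2.746 | 13,623.563 |
| | | IWOA-LSTM-SA | 1.015 | 0.750 | 2.537 | 11,284.158 |
| | 5 (150 min) | BP | 1.838 | 1.568 | 4.854 | 37.081 |
| | | CNN | 1.943 | 1.403 | 4.933 | 77.883 |
| | | GRU | 1.976 | 1.387 | 4.801 | 264.860 |
| | | LSTM | 1.903 | 1.414 | 4.765 | 171.239 |
| | | LSTM-SA | 1.874 | 1.365 | 4.810 | 414.213 |
| | | WOA-LSTM-SA | 1.664 | 1.249 | 4.298 | 12,823.645 |
| | | IWOA-LSTM-SA | 1.634 | 1.229 | 4.162 | 10,984.776 |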
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zou, D.; Xu, H.; Quan, H.; Yin, J.; Peng, Q.; Wang, S.; Dai, W.; Hong, Z. Top-Oil Temperature Prediction of Power Transformer Based on Long Short-Term Memory Neural Network with Self-Attention Mechanism Optimized by Improved Whale Optimization Algorithm. Symmetry 2024, 16, 1382. https://doi.org/10.3390/sym16101382
