Anomaly Identification of Wind Turbine Yaw System Based on Two-Stage Attention–Informer Algorithm

Shen, Xu; Wang, Haiyun; Huang, Xiaofang; Chen, Yang

doi:10.3390/app14198746

by Xu Shen ¹, Haiyun Wang ¹,*, Xiaofang Huang ² and Yang Chen ³

¹ Engineering Research Center of Education Ministry for Renewable Energy Power Generation and Grid Connection, College of Electrical Engineering, Xinjiang University, Urumqi 830017, China
² Beijing Gold Wind Science and Creation Wind Power Equipment Company Ltd., Beijing 100176, China
³ Xinjiang Science and Technology Project Service Center, Urumqi 830017, China
* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(19), 8746; https://doi.org/10.3390/app14198746

Submission received: 27 August 2024 / Revised: 20 September 2024 / Accepted: 24 September 2024 / Published: 27 September 2024

(This article belongs to the Topic Advances in Wind Energy Technology)


Abstract

Abnormal yaw position during the yawing process causes two kinds of problems. On the one hand, yaw position errors accumulate, which degrades the accuracy of yawing to the wind or threatens safety through excessive cable twisting. On the other hand, frequent position jumps or frequent short-term position holds generate yaw errors that affect the stability of yaw control. Both lead to a high frequency of yaw system failures and high operation and maintenance costs. To address these problems, a data-driven fault diagnosis method is proposed to give early warnings for abnormal conditions of the yaw position of the wind turbine unit. Firstly, for the massive data in the SCADA (Supervisory Control and Data Acquisition) system, the ReliefF feature algorithm based on standardized interaction gain (Standardized Interaction Gain and ReliefF, SIG–ReliefF) is used to identify and screen the characteristic parameters that have a greater impact on yaw system failures of wind turbines. The advantage of this method lies in its ability to consider the correlation between features and to retain, to the greatest extent, the relevant features and interaction features of yaw system failures. Then, an Informer yaw position prediction model combined with the two-stage attention mechanism (two-stage attention and Informer, TSA–Informer) is established, and the distribution of residuals is statistically analyzed through the sliding window method to determine the fault threshold. Finally, the validity and accuracy of the proposed method are verified through examples, and comparison with other algorithms shows that it has better early warning performance for anomalies. The conclusions can provide a reference for fault diagnosis of actual yaw systems.

Keywords:

wind turbine; yaw system; interactive information; ReliefF; Informer; abnormal diagnosis

1. Introduction

According to the latest statistics from the National Energy Administration of China, as of the end of October 2022, the installed capacity of wind power in China reached approximately 350 million kilowatts, representing a year-on-year growth of 16.6% compared to 2021. Wood Mackenzie stated in “2022 in review for offshore wind” in January 2023 that the global offshore wind industry is expected to achieve a five-fold growth by 2030 [1]. With the steady development of renewable energy, offshore wind power will play a crucial role in the field of clean energy and demonstrate vast development prospects. The main components of a wind turbine unit include the blades, gearbox, generator, pitch system, and yaw system [2]. Among these, the yaw system is one of the key executive mechanisms for quickly and accurately aligning the wind turbine unit with the wind direction to capture the maximum wind energy [3]. The operational status of the yaw system directly affects the performance and power generation of the wind turbine unit. If the yaw system malfunctions, it may lead to loss of control or unnecessary resource wastage, and even jeopardize the stable operation of the unit. Additionally, an unstable yaw system can reduce the safety and reliability of the wind turbine unit, while increasing operation and maintenance costs.

Wind turbine units often operate in harsh natural environments, enduring prolonged exposure to extreme and complex conditions such as temperature variations, snow accumulation, salt spray, and sand and dust particles. Moreover, the constantly changing wind conditions require frequent adjustments of the yaw system between different operational states. Consequently, the turbine units unavoidably experience multiple loads, leading to a gradual decline in insulation strength, fatigue resistance, and operational performance over time, ultimately resulting in system failures [4]. According to incomplete statistics, the operating and maintenance costs account for 25% to 50% of the total generation cost of wind turbine units [5,6]. Among different component failure rates, the yaw system failure rate is about 6.7%, and the downtime caused by it accounts for 13.3% [7,8]. Therefore, accurately identifying the abnormal status of the yaw system of wind turbine units and issuing early warnings to ensure safe and stable operation of the units is of great practical significance [9,10,11].

Some methods have been proposed for fault diagnosis and optimization control of the yaw system in the current stage. Reference [12] extracts the wind direction characteristics as elite factors and combines them with an optimized ant colony intelligence algorithm. This approach overcomes the limitations of the backpropagation (BP) neural network becoming stuck in local optima and improves the accuracy of processing valid data, significantly reducing the diagnosis time for yaw gearbox faults. Reference [13] conducted aerodynamic analysis of the blades and combined it with simulation methods to analyze the loads on wind turbine units under different yaw angles. The study concluded that a positive yaw deviation angle would result in larger hub loads for wind turbine units. This provides an effective reference for yaw control and protection of large-scale wind turbine units. Reference [14] utilized the ReliefF algorithm and kernel density-mean method to extract seven SCADA parameters that can reflect the operational conditions of the yaw gearbox. By combining these parameters with a neural network, the study diagnosed three operating states of the yaw gearbox: normal state, wear state, and broken tooth fault. Experimental results demonstrated that this method can effectively differentiate between different types of faults. Reference [15] analyzed the yaw data of wind turbine units using the SCADA system and proposed an economic model-based predictive control (EMPC) yaw strategy based on laser ranging. This strategy aims to evaluate the power generation and cumulative damage-equivalent load of the yaw bearing during the yaw process. The effectiveness of the EMPC strategy was ultimately validated using the XE112-2000 wind turbine model in the simulation software Bladed. Reference [16] utilized the GEZ gearbox fault diagnosis experimental platform to extract vibration experimental data. 
Through time-domain and frequency-domain analysis in MATLAB, the signals were analyzed based on feature values and waveform-related information. Faulty signals were identified by comparing them with the waveform and numerical parameters of normal signals, and the fault result was then confirmed experimentally. Some new diagnostic methods presented in the literature [7,17,18,19,20,21] have also provided new research directions for the diagnosis of faults in the yaw system.

Most of the prevailing fault diagnosis approaches for yaw systems predominantly concentrate on the identification and categorization of yaw gear-related faults, yet there is a notable scarcity of relevant research regarding the identification of yaw system abnormalities through the anomalous jump of the yaw position. Undoubtedly, the precision of the yaw position serves as the bedrock for the unit to attain effective yaw control. In the actual operational trajectory of the wind turbine unit, there have been occurrences of unit cable twisting faults induced by abnormal yaw positions (cam gear leaping). Furthermore, the abnormality of the yaw position also mirrors certain yaw drive mechanical glitches, such as imprecise meshing due to gear wear, yaw motor malfunctions, and gear tooth breakage resulting from wear and tear.

Drawing on the transient data from SCADA systems, this paper introduces a cutting-edge, data-driven diagnostic method specifically tailored for the abnormal yaw positions in wind turbines. This innovative approach not only identifies operational anomalies with high precision but also establishes an early warning system designed to predict and mitigate potential yaw system failures. By integrating historical data into a dynamic big data platform, the method undergoes continuous iteration and optimization, enhancing its diagnostic capabilities over time. The diagnostic process is streamlined into a seamless, closed-loop system that includes Data Collection, Anomaly Diagnosis and Early Warning, Verification and Investigation, Feedback and Model Refinement, and Event Handling.

This method has proven to significantly reduce operational risks, enhance the safety and reliability of wind turbines, and lower maintenance costs. The model’s exceptional performance is evidenced by a 92% accuracy rate in identifying anomalies over a three-month period, as demonstrated by the platform’s operational results.

The structure of this paper is as follows. In Section 2, the basic mechanism of the yaw system and its common fault types are briefly described, and we also explain the principle of the SIG–ReliefF algorithm, with yaw position being considered as the prediction parameter of the yaw system. In Section 3, the validity of SIG–ReliefF algorithm is verified. Additionally, based on the principle of the two-stage attention (TSA) mechanism, we optimize the structure of the Informer model and build the Adam–TSA–Informer anomaly identification model. In Section 4, the effectiveness of the model is validated through practical case studies. Finally, Section 5 provides a summary of the work.

2. Preparatory Knowledge

2.1. Basic Mechanism of Yaw System

Yaw is the process by which the nacelle of a wind turbine rotates horizontally so that the rotor always faces the wind. From a control perspective, yaw systems are divided into passive and active types. A passive yaw system uses the torque generated by the wind acting on the turbine blades to align the turbine with the wind direction automatically through aerodynamic effects: when the wind direction changes, the wind force on the blades produces a rotating torque that drives the entire turbine to change direction. An active yaw system uses an electric or hydraulic drive to turn the nacelle into the wind. At present, most wind turbines are horizontal-axis machines with active, electrically driven rolling yaw systems [22,23]. The yaw system is usually located in the nacelle of the wind turbine, near the rotor and the tower, and is mainly composed of yaw gear rings, yaw motors including electromagnetic brakes, yaw calipers, yaw pinions, yaw reducers, and other parts. During power generation, the yaw system coordinates with the wind turbine’s control system to keep the rotor constantly facing the wind, maximizing the utilization of wind energy and improving the power generation efficiency of the wind turbine. When the wind speed exceeds the cut-out wind speed, the unit stops, and the load on the unit is reduced by yawing to ensure its safety.

2.2. Common Fault Types of the Yaw System

During the long-term operation of wind turbines, coupled with prolonged exposure to various harsh environmental conditions, the key components of the yaw system are prone to fatigue damage, inevitably leading to certain mechanical failures. The fault types of the yaw system are as follows:

(1) Yaw drive mechanical failures: The yaw drive motor is responsible for steering the nacelle to align with the wind, and its failure can lead to inaccurate adjustment of the yaw position. Examples of such failures include yaw gearbox gear tooth damage, yaw motor malfunctions, yaw bearing gear tooth breakage and detachment, etc.

(2) Yaw brake mechanical failures: excessive wear of friction pads, brake oil leakage, excessive noise during yawing, insufficient yaw braking force, etc.

(3) Yaw counter mechanical failures: loose connecting bolts, foreign object intrusion, damaged connecting cables, etc.

(4) Anemoscope failure.

According to the actual operation of the wind turbine, it can be observed that the failure and shutdown of the yaw system components is an accumulative process, which is particularly evident when analyzing the SCADA data. Initial damage to the yaw components is typically manifested as gear wear, electromagnetic brake pad wear, abnormal motor temperature, etc. This is followed by intermediate-level wear, such as bearing cracking, cam tooth skipping, etc. If timely inspection and maintenance of the yaw system components are not carried out at this stage, it will lead to more serious system failures and even jeopardize the overall operational safety of the turbine [24,25].

In the event of primary damage to the yaw system, the change in yaw position is more sensitive. Therefore, this paper considers the yaw position as a prediction parameter of the yaw system. By analyzing the patterns of its variations, it is possible to identify anomalies in the yaw system.

2.3. Feature Parameter Selection Based on the Improved ReliefF Algorithm

2.3.1. ReliefF Feature Selection Theory

The ReliefF algorithm is a widely used feature selection technique aimed at selecting features with high correlation and significance from a given feature set. Its core concept involves assessing feature importance by computing distances between features and samples. In essence, it assigns weights based on the capacity of feature variables to distinguish between various samples: boosting the weight for features with strong discriminatory power and reducing it for those with weaker discriminatory capability. The process entails randomly selecting a sample A from the training set S, computing the nearest neighbors of different classes (NearMiss) and the same class (NearHit) for A, and then iteratively updating the weight of each feature according to predefined rules [26].

w (X) = w (X) - \frac{\sum_{j = 1}^{k} diff (X, A_{i}, H_{j})}{n k} + \sum_{C \neq Class (A_{i})} \frac{P (C)}{1 - P (Class (A_{i}))} \cdot \frac{\sum_{j = 1}^{k} diff (X, A_{i}, S_{j})}{n k}

(1)

In the formula, X represents a certain feature parameter; w (X) represents the weight of feature X, typically initialized to 0; n represents the number of iterations; A_{i} represents the randomly selected sample in the i-th iteration; k represents the number of nearest neighbors selected; H_{j} represents the nearest neighbor sample in the same class as sample A_{i}; S_{j} represents the nearest neighbor sample in a different class from sample A_{i}; Class (A_{i}) represents the class of sample A_{i}; P (C) represents the proportion of samples in class C; diff (X, A_{i}, H_{j}) represents the difference in feature parameter X between sample A_{i} and its same-class neighbor H_{j}, and diff (X, A_{i}, S_{j}) is the corresponding difference for the different-class neighbor S_{j}.
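The update in Equation (1) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `relieff_weights`, the range-normalized absolute difference used for diff, the Manhattan-style neighbor distance, and the toy data are all assumptions made for illustration.

```python
import random


def relieff_weights(samples, labels, n_iter=20, k=3, seed=0):
    """Sketch of the ReliefF update of Equation (1) for numeric features."""
    rng = random.Random(seed)
    n_feat = len(samples[0])

    # Assumption: diff() is the absolute difference scaled by the feature range.
    spans = []
    for f in range(n_feat):
        column = [row[f] for row in samples]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(n_feat))

    classes = sorted(set(labels))
    prior = {c: labels.count(c) / len(labels) for c in classes}  # P(C)
    weights = [0.0] * n_feat  # w(X) initialized to 0

    for _ in range(n_iter):
        i = rng.randrange(len(samples))  # randomly selected sample A_i
        a_i, class_a = samples[i], labels[i]
        for c in classes:
            pool = [samples[j] for j in range(len(samples))
                    if labels[j] == c and j != i]
            neighbours = sorted(pool, key=lambda s: distance(a_i, s))[:k]
            # Same-class neighbours (H_j) lower the weight; other-class
            # neighbours (S_j) raise it, scaled by P(C) / (1 - P(Class(A_i))).
            scale = -1.0 if c == class_a else prior[c] / (1.0 - prior[class_a])
            for f in range(n_feat):
                total = sum(diff(f, a_i, s) for s in neighbours)
                weights[f] += scale * total / (n_iter * k)
    return weights


# Toy data: feature 0 separates the two classes, feature 1 does not.
toy_samples = [[0.00, 0.2], [0.10, 0.9], [0.05, 0.5], [0.02, 0.7],
               [1.00, 0.3], [0.90, 0.8], [0.95, 0.4], [0.98, 0.6]]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_weights = relieff_weights(toy_samples, toy_labels)
```

On this toy data the discriminative feature 0 receives the larger weight, which is the behavior the weight rule is designed to produce.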

2.3.2. Feature Interaction

Interaction Information is a concept used to measure the degree of interaction between random variables. It is based on the principles of information theory and is used to describe the amount of information contained when variables occur together, as well as the gain in information relative to when they occur separately. Therefore, Interaction Information is also known as Interaction Gain (IG). In this article, IG refers to the measure used in feature selection to assess the extent to which interactions among multiple features improve the predictive performance of the target variable. The definition of three-way Interaction Gain is given as follows:

IG (f_{i}; f_{j}; C) = I (f_{i}; f_{j}; C) - I (f_{i}; C) - I (f_{j}; C)

(2)

In the formula, I (f_{i}; f_{j}; C) represents the interaction information between feature variables f_{i} and f_{j} within category C. When IG (f_{i}; f_{j}; C) > 0, it indicates that the information provided by the joint occurrence of feature variables f_{i} and f_{j} is greater than the sum of the information provided by the two individual feature variables. This suggests the presence of interaction between the two variables. Otherwise, it indicates that the information provided by the joint occurrence of the two variables is independent of category C, or there is redundant information.

2.3.3. SIG–ReliefF

By using Standardized Interaction Gain (SIG), the interaction gains between different features can be standardized to enhance comparability and better reflect their contributions to the improvement of model performance. The definition is as follows:

S I G (f_{i}; f_{j}; C) = \frac{I G (f_{i}; f_{j}; C)}{H (f_{i}) + H (f_{j})}

(3)
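As a rough illustration of Equations (2) and (3), the sketch below computes IG and SIG for discrete features from empirical entropies. It assumes that the first term of Equation (2) is the mutual information between the feature pair taken jointly and the category, which matches the "joint occurrence" reading in the text; the function names are hypothetical, and continuous SCADA signals would need to be discretized first.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    total = len(rows)
    return -sum(n / total * log2(n / total) for n in Counter(rows).values())


def mutual_info(feature_cols, target):
    """I(X; C) = H(X) + H(C) - H(X, C), with X a list of feature columns."""
    return entropy(*feature_cols) + entropy(target) - entropy(*feature_cols, target)


def interaction_gain(fi, fj, c):
    """Equation (2), reading the first term as the joint information I(fi, fj; C)."""
    return mutual_info([fi, fj], c) - mutual_info([fi], c) - mutual_info([fj], c)


def standardized_interaction_gain(fi, fj, c):
    """Equation (3): IG divided by H(fi) + H(fj) (assumed non-zero here)."""
    return interaction_gain(fi, fj, c) / (entropy(fi) + entropy(fj))


# XOR-style example: neither feature alone predicts the class, both together do.
xor_fi = [0, 0, 1, 1]
xor_fj = [0, 1, 0, 1]
xor_c = [0, 1, 1, 0]
xor_ig = interaction_gain(xor_fi, xor_fj, xor_c)
xor_sig = standardized_interaction_gain(xor_fi, xor_fj, xor_c)

# Redundant example: the second feature duplicates the first.
dup_ig = interaction_gain(xor_fi, xor_fi, xor_fi)
```

In the XOR case the gain is positive (the pair carries information that neither feature carries alone), while the duplicated feature yields a negative gain, corresponding to the redundant case described above.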

By combining Equations (1) and (3), we can derive the improved weight formula for feature X_{i} as follows:

\bar{w} (X_{i}) = \frac{1}{n} \sum_{j = 1}^{N} S I G (X_{i}; X_{j}; C) + w (X_{i})

(4)

By using Equation (4) as the basis for weight updates, we define it as the SIG–ReliefF algorithm. Compared to the traditional ReliefF algorithm, the SIG–ReliefF algorithm (Algorithm 1) can eliminate the influence of different scales among features, improve the accuracy of feature weight calculation, and take into account the interaction between features more comprehensively. This leads to improved feature selection and ultimately enhances the performance of the model.

Algorithm 1. SIG–ReliefF algorithm flow

Input: feature data set S = \{f_{1}, f_{2}, f_{3}, \dots, f_{n}\}, category set C = \{C_{1}, C_{2}, C_{3}, \dots, C_{m}\}, and threshold k

Output: target set G

1. Initialize the target set
2. For i = 0 to m
3.     Calculate the SIG weights among all feature parameters in Equation (3)
4.     Calculate the weights of all feature parameters in Equation (1)
5. End for
6. For i = 0 to m
7.     The total weight values of the feature parameters are retained according to Equation (4)
8. End for
9. The weights of all the calculated feature parameters in the target set were sorted in descending order and combined with expert experience. The feature parameters with greater influence on the target state and greater weight were selected to build a new sample set
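The final weighting and ranking steps of Algorithm 1 can be sketched as follows. This is only an illustration under stated assumptions: the names `combined_weights` and `select_top` are hypothetical, n in Equation (4) is taken to be the number of features, the self-pair j = i is skipped, and the threshold k is interpreted as the number of features to keep. The expert-review part of step 9 is not modeled.

```python
def combined_weights(relief_w, sig_matrix):
    """Equation (4): averaged pairwise SIG of each feature plus its ReliefF weight.

    relief_w[i] is w(X_i); sig_matrix[i][j] is SIG(X_i; X_j; C).
    """
    n = len(relief_w)
    total = []
    for i in range(n):
        # Assumption: a feature is not paired with itself.
        sig_sum = sum(sig_matrix[i][j] for j in range(n) if j != i)
        total.append(sig_sum / n + relief_w[i])
    return total


def select_top(names, weights, top_k):
    """Step 9 without the expert review: sort descending and keep top_k names."""
    ranked = sorted(zip(names, weights), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Illustrative numbers only; they are not taken from the paper.
demo_relief = [0.2, 0.1, 0.4]
demo_sig = [[0.0, 0.3, 0.0],
            [0.3, 0.0, 0.0],
            [0.0, 0.0, 0.0]]
demo_total = combined_weights(demo_relief, demo_sig)
demo_selected = select_top(["wind_speed", "yaw_speed", "power"], demo_total, 2)
```

The example shows how a feature with a modest ReliefF weight can move up the ranking when it interacts strongly with another feature.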

3. Yaw Position Anomaly Recognition Model

3.1. Parameter Selection Effect of Model SIG–ReliefF

Data from direct-drive turbines of an in-service wind farm are selected. The turbines are labeled F1–F6, and their basic parameters are as follows: rated power of 5200 kW, blade diameter of 165 m, cut-in wind speed of 3 m/s, cut-out wind speed of 24 m/s, with a data sampling interval of 7 s. The recorded operational parameters include time, wind speed, power output, yaw position, wind direction angle, and yaw speed.

To verify the effectiveness of the feature-mining algorithm, this paper compares the performance of four feature selection methods: ReliefF, KPCA, AE–ReliefF, and SIG–ReliefF. The experimental results are evaluated mainly in terms of the prediction accuracy and F1 index of the Informer model. The number of iterations of the model is set to six; the learning rate is set to 0.0001; the batch size is set to 64; the number of attention heads is set to eight; the activation function is GELU; and dropout is set to 0.05.

As can be seen from Table 1 and Table 2, compared with the other three feature selection algorithms, the SIG–ReliefF algorithm has a significant improvement in the average accuracy and F1 index on the selected ten data sets. This indicates that the proposed method has better feature selection performance compared to traditional single-feature selection algorithms and is more suitable for the prediction task of the Informer model.

3.2. Model TSA–Informer

The Informer model is a deep-learning architecture designed for time series prediction [27]. It was proposed by Zhou et al., Beijing University of Aeronautics and Astronautics in 2021, aiming to address the limitations of Transformer-based deep-learning models in long-sequence time prediction tasks. In the analysis of time series data, capturing long-term dependencies is the key to improving prediction accuracy [28]. The Informer effectively solves this problem by introducing a distillation mechanism and a multi-head probabilistic sparse attention mechanism; especially in scenarios where multiple future time points need to be predicted, the Informer model shows its superiority [29].

The two-stage attention (TSA) mechanism is an advanced attention allocation strategy designed to improve the performance of the model when processing multi-dimensional time series data [30]. The TSA mechanism enhances the model’s understanding of complex relationships in the data through two consecutive stages.

First stage: Cross-time stage. In this stage, the TSA mechanism focuses on capturing the interdependencies between time series. It is given a two-dimensional input vector Z \in R, which contains the feature information of the time series. The model applies the multi-head probabilistic sparse attention (MPSA) mechanism along the time dimension. MPSA identifies and strengthens the correlations between time series by analyzing the data on different time scales. The time complexity of this process is related to the sequence length and feature dimension, and is mainly determined by the number of segments in each feature dimension, which is specifically expressed as follows:

\bar{w} (X_{i}) = \frac{1}{n} \sum_{j = 1}^{N} S I G (X_{i}; X_{j}; C) + w (X_{i})

(5)

{\hat{Z}}_{:, d}^{t} = LayerNorm (Z_{:, d} + M_{PSA}^{t} ({\bar{Z}}_{:, d}, Z_{:, d}, Z_{:, d}))

(6)

In the formula, {\hat{Z}}_{:, d}^{t} is the output sequence of the MPSA layer at a certain moment; LayerNorm is the normalization operation; Z_{:, d} is the vector of all time steps in the feature dimension d; M_{PSA}^{t} is the operation of the MPSA layer at time t; {\bar{Z}}_{:, d} is a random vector of all time steps in the feature dimension d; MLP is a two-layer feedforward network; Z^{t} and Z^{t - 1} are the output sequences of the MPSA layer at times t and t - 1, respectively.

Second stage: Cross-feature stage. The purpose of this stage is to reveal the intrinsic connections between different feature sequences. The TSA mechanism introduces an intermediate routing mechanism: at each time step θ, the model uses a learnable vector λ_{θ, :} of fixed dimension as the intermediate routing vector. This vector is responsible for summarizing the information of each feature dimension. In this way, the model can focus on those features that are most critical for the current prediction. Subsequently, the model separates the information of each feature sequence from the intermediate routing and further analyzes and mines the mutual influences between features, thus achieving the cross and fusion of information at the feature level, specifically expressed as follows:

λ_{θ, :} = M_{PSA1}^{d} (R_{θ, :}, Z_{θ, :}, Z_{θ, :}), 1 \leq θ \leq \frac{T}{L}

(7)

{\bar{Z}}_{θ, :}^{d} = M_{PSA 2}^{d} (Z_{θ, :}, λ_{θ, :}, λ_{θ, :}), 1 \leq θ \leq \frac{T}{L}

(8)

In the formula, R_{θ, :} \in R^{(T / L) \times c \times d} is the learnable vector matrix of the intermediate routing; Z_{θ, :} is the two-dimensional vector at time step θ; {\bar{Z}}_{θ, :}^{d} is the output of the intermediate routing mechanism at time step θ; T is the time sliding window; L is the sequence length of feature dimension d.

Through the interaction of the above two stages, the TSA mechanism not only improves the model’s ability to capture time dependencies in time series data, but also enhances the recognition of complex relationships between features, thereby achieving more accurate and comprehensive predictions in multi-dimensional time series analysis tasks. This hierarchical attention allocation strategy enables the model to process high-dimensional time series data more flexibly and efficiently, providing strong support for improving the performance of diagnostic models.

To enhance the Informer model’s ability to capture the interrelationships between feature dimensions in time series prediction tasks, an Informer model based on the Adam optimization algorithm and the TSA mechanism (hereinafter referred to as Adam–TSA–Informer) is proposed. The TSA mechanism improves the model’s sensitivity to complex data by capturing short-term dependencies within time series and aggregating long-term relationships between different features. The adaptive learning rate adjustment strategy of the Adam optimizer makes model training more efficient and shows good stability, especially when dealing with long-term dependencies. This method enhances the generalization ability of the Informer model, enabling it to provide more accurate prediction results when facing variable time series data. Figure 1 shows the structure of the TSA–Informer model.

The algorithm flow of the TSA–Informer model based on Adam optimization can be summarized as the following steps:

(1) Initialization: for each parameter w in the model, set the first-moment estimate m_{0} and the second-moment estimate v_{0} to 0, and configure the initial learning rate α as well as the decay rates β_{1} and β_{2}.

(2) Forward propagation: perform a forward propagation to obtain the model output and calculate the loss function L.

(3) Gradient calculation: determine the gradient \nabla_{w} L of each parameter through backpropagation.

(4) First-moment update: update the first-moment estimate m_{t} to be the exponential moving average of the gradient.

m_{t} = β_{1} \cdot m_{t - 1} + (1 - β_{1}) \cdot \nabla_{w} L

(9)

(5) Second-moment update: update the second-moment estimate v_{t} to be the exponential moving average of the squared gradient.

v_{t} = β_{2} \cdot v_{t - 1} + (1 - β_{2}) \cdot {(\nabla_{w} L)}^{2}

(10)

(6) Bias correction: calculate the corrected first and second moments.

{\hat{m}}_{t} = \frac{m_{t}}{1 - β_{1}^{t}}

(11)

{\hat{v}}_{t} = \frac{v_{t}}{1 - β_{2}^{t}}

(12)

(7) Adaptive learning rate: scale the learning rate by the corrected second moment; the corrected first moment enters in the parameter update of step (8).

α_{t} = \frac{α}{\sqrt{{\hat{v}}_{t} + ε}}

(13)

In the formula, ε is a constant used to ensure numerical stability.

(8) Parameter update: apply the adaptive learning rate to update the model parameters.

w_{t + 1} = w_{t} - α_{t} \cdot {\hat{m}}_{t}

(14)

(9) Iteration: repeat steps (2) to (8) until the model performance converges or a predetermined number of iterations is reached.
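Steps (1) to (9) can be sketched for a single scalar parameter as shown below. This is a minimal illustration rather than the training code used in the paper: it follows the standard Adam update, in which the corrected first moment is applied once and scaled by the root of the corrected second moment, and the function name `adam_minimize`, the default hyperparameters, and the quadratic test loss are assumptions.

```python
from math import sqrt


def adam_minimize(grad, w0, alpha=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Scalar Adam loop following steps (1) to (9)."""
    w, m, v = w0, 0.0, 0.0                       # (1) m_0 = v_0 = 0
    for t in range(1, steps + 1):
        g = grad(w)                              # (2)-(3) gradient of the loss
        m = beta1 * m + (1 - beta1) * g          # (4) first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # (5) second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # (6) bias correction
        v_hat = v / (1 - beta2 ** t)
        alpha_t = alpha / sqrt(v_hat + eps)      # (7) adaptive learning rate
        w = w - alpha_t * m_hat                  # (8) parameter update
    return w                                     # (9) stop after `steps` iterations


# Hypothetical loss L(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

In a PyTorch model the same update is normally delegated to the framework's built-in optimizer; the loop above only makes the individual steps visible.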

Data from a direct-drive unit of an in-service wind farm are selected. Its basic parameters are as follows: rated power 4.2 MW; impeller diameter 136 m; cut-in wind speed 2.5 m/s; cut-out wind speed 25 m/s. The model is built using the deep-learning framework PyTorch based on Python 3.10. Some of its parameter settings are shown in Table 3.

To verify the effectiveness of the model’s prediction results, this paper uses the evaluation indicators MAE, RMSE, and R² and the SGD and Adam optimization algorithms to compare the performance of TSA–Informer with six baseline models: BPNN, RNN, GRU, CNN, Transformer, and Informer (with unified parameter settings). The performance comparison results are shown in Table 4.

As can be seen from Table 3 and Table 4, when the prediction step length (pred_len) of the TSA–Informer model is 300, its MAE, RMSE, and R² are all better than the other six basic models, and the optimization performance of the Adam algorithm is significantly higher than that of the SGD algorithm, indicating that the Adam optimization algorithm is more suitable for the construction of the diagnostic model in this paper. Specifically, compared with the basic Informer model, the MAE of TSA–Informer is reduced by 26.02%, RMSE is reduced by 25.2%, and R² is increased by 4.9%.

3.3. Sliding Window Residual Analysis

The residual distribution characteristics of the state parameters of the yaw system under a normal operating state reflect its normal features. When the yaw system is in normal operation, the residuals of these state parameters should fluctuate within a small range [31]. However, if an abnormality occurs in the yaw system, resulting in a larger difference between the actual and expected values, the residuals will become larger. Therefore, by monitoring the residuals of the yaw system’s state parameters, abnormal characteristics can be identified. This paper primarily focuses on the residual distribution characteristics of the yaw position and combines them with an auxiliary analysis of other state parameters to identify abnormal patterns in the yaw system.

As shown in Figure 2, the residual range is divided using the 0.025 and 0.975 quantiles. These quantiles cut off the lower and upper 2.5% of cumulative probability, matching the two-sided critical values commonly used with the normal distribution. This approach allows for a more comprehensive consideration of outlier situations and ensures the robustness and effectiveness of the model under different conditions.

Compared to other residual analysis methods such as the sequential probability ratio test, time series analysis, and neural network residual analysis, the sliding window method offers advantages in terms of real-time capability, adaptability, simplicity in principle, and practical engineering applicability. This paper uses the sliding window method to achieve early warning of yaw system anomalies through the analysis of residuals.

We assume that the residual sequence of the model over a period of time is as follows:

ε = [\begin{matrix} ε_{1} & ε_{2} & \dots & ε_{n} & \dots \end{matrix}]

(15)

Assuming the length of the window is N, calculate the average of the consecutive N residuals within the window:

\bar{ε} = \frac{1}{N} \sum_{i = 1}^{N} ε_{i}

(16)

Based on this, the window has a width of 10 min and moves in steps of 30 s. Define the anomaly degree formula:

e_{AI} = \frac{N_{1} + N_{3}}{N_{1} + N_{2} + N_{3}}

(17)

In the formula, N_{1} represents the number of residuals falling into area 1; N_{2} and N_{3} are defined in the same way for areas 2 and 3.

The degree of abnormality in yaw position is directly proportional to e_{AI} (AI: Anomaly Index). Based on expert experience, this paper defines yaw position as abnormal when e_{AI} exceeds 0.6.
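The residual analysis can be sketched as follows. The code is illustrative only and rests on several assumptions: areas 1 and 3 are taken to be the regions below the 0.025 quantile and above the 0.975 quantile of normal-operation residuals, with area 2 in between; the window width and step are given in samples (with a 7 s sampling interval, 10 min is roughly 86 samples and 30 s roughly 4 samples); and all function names are hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    low = int(pos)
    high = min(low + 1, len(sorted_values) - 1)
    return sorted_values[low] + (pos - low) * (sorted_values[high] - sorted_values[low])


def anomaly_index(window, lower, upper):
    """Equation (17): share of residuals outside the [lower, upper] band."""
    n1 = sum(1 for r in window if r < lower)   # area 1 (assumed: below the band)
    n3 = sum(1 for r in window if r > upper)   # area 3 (assumed: above the band)
    n2 = len(window) - n1 - n3                 # area 2: inside the band
    return (n1 + n3) / (n1 + n2 + n3)


def sliding_anomaly_index(residuals, lower, upper, width, step):
    """Evaluate e_AI on every window of `width` samples, moved by `step`."""
    return [anomaly_index(residuals[start:start + width], lower, upper)
            for start in range(0, len(residuals) - width + 1, step)]


# Thresholds from residuals recorded during normal operation (synthetic here).
normal_residuals = sorted(i / 100 - 0.5 for i in range(101))
lower_bound = quantile(normal_residuals, 0.025)
upper_bound = quantile(normal_residuals, 0.975)

# Synthetic test sequence: ten in-band residuals followed by ten large ones.
test_residuals = [0.0] * 10 + [2.0] * 10
e_ai = sliding_anomaly_index(test_residuals, lower_bound, upper_bound,
                             width=10, step=5)
flags = [value > 0.6 for value in e_ai]        # 0.6 threshold from the text
```

In this synthetic sequence only the last window, which consists entirely of out-of-band residuals, exceeds the 0.6 threshold.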

3.4. Anomaly Detection Algorithm Process

The specific process for identifying abnormal yaw position is as follows:

(1) Data Acquisition: Retrieve the yaw position data from the SCADA system, along with the determined parameters through the SIG–ReliefF algorithm. These will serve as the input for the TSA–Informer model.

(2) Predicting State Parameters: utilize the input parameters to predict the yaw position based on the TSA–Informer model.

(3) Calculate e_{AI} and identify the abnormal timing of the yaw position.

(4) Secondary feature processing for constructing true and specific abnormal features.

(5) Based on the final model, train and obtain anomaly diagnosis results through the big data platform.

(6) Verification and Analysis of Anomalous Results: if anomalies are confirmed, prompt on-site alerts should be issued, accompanied by corresponding solutions in a timely manner.

4. Case Analysis

4.1. Prediction Effect Analysis Based on TSA–Informer Model

The prediction performance of the TSA–Informer model was analyzed using SCADA data, with time segments whose residuals exceed the threshold marked as abnormal. Figure 3 shows the time series of actual and predicted values for a specific day (00:00–24:00) of normal operation of a wind turbine unit. Most data points have residuals within 1.8°, indicating that the TSA–Informer anomaly detection model can effectively track the actual operating state. Based on the data from six units operating normally for two months, the F1 score (a comprehensive evaluation metric combining precision and recall) of the model was 0.921, indicating a high level of recognition accuracy.
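For reference, the F1 score is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name `f1_score` and the counts in the usage note are hypothetical and are not taken from the paper's data.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall)."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 8 correctly flagged segments, 2 false alarms, and 2 missed anomalies, precision and recall are both 0.8, so the F1 score is also 0.8.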

4.2. Abnormal Case Analysis and Effectiveness Verification

According to the historical fault records of the wind turbines, the yaw gear ring of Unit 3 of a wind farm was damaged on 6 October 2022. To verify the effectiveness of the model, the anomaly index e_{AI} of the yaw position was calculated for this unit: e_{AI} exceeded 0.6 four times from September 2022 onward and reached its maximum value of 0.841 on 5 October. It can therefore be preliminarily judged that the yaw position had shown an abnormal trend since September 2022. However, the initially screened abnormal time series did not fully match the actual anomalies: verification against the field working conditions showed that some "anomalies" were caused only by manual operation, which produced data inconsistent with the actual situation. This paper calls such cases "abnormal false reporting". The abnormal patterns of the yaw position were therefore obtained through secondary feature construction for specific abnormalities. Figure 4 shows the yaw position under normal conditions. The vertical axis of Figure 4 displays the yaw position and the fluctuation of the yaw change rate, while the horizontal axis represents the length of the data segment.

Figure 5 illustrates three distinct abnormal yaw position patterns identified by our model following extensive training on the big data platform. These patterns were corroborated through rigorous on-site inspections, revealing specific damage to various components of the yaw system under the conditions depicted by the time series data.

Abnormal Pattern (a): This pattern is characterized by the yaw position being persistently fixed around −435°, with the yaw speed experiencing a sudden leap from 0°/s to approximately 0.2°/s. The diagnostic logic for this anomaly is based on the yaw position being static for an extended period, recurring more than twice within a month, each episode lasting over 60 s. Upon verification, it was determined that the yaw gear and the torsion cable switch small gear were excessively tight in their mesh, leading to an undue force on the yaw gear. This tight meshing not only increases friction and stress but also accelerates the wear and tear of the yaw gear, potentially leading to a complete failure over time.
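The diagnostic rule described for pattern (a) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' rule set: timestamps are in seconds, the tolerance `tol` that decides when the position counts as "static" is hypothetical, the caller is assumed to pass one month of data, and the yaw speed condition is not modeled.

```python
def static_episodes(timestamps, positions, tol=0.01, min_duration=60.0):
    """Find runs where the yaw position stays within `tol` of the run's first value.

    Returns (start_time, end_time) for each run lasting longer than
    `min_duration` seconds.
    """
    episodes = []
    run_start = 0
    for i in range(1, len(positions) + 1):
        run_ended = (
            i == len(positions)
            or abs(positions[i] - positions[run_start]) > tol
        )
        if run_ended:
            duration = timestamps[i - 1] - timestamps[run_start]
            if duration > min_duration:
                episodes.append((timestamps[run_start], timestamps[i - 1]))
            run_start = i
    return episodes


def matches_pattern_a(episodes, min_count=3):
    """'More than twice within a month' is read as at least three episodes."""
    return len(episodes) >= min_count
```

Because a healthy turbine also holds its yaw position while the wind direction is steady, a rule like this would normally be combined with further conditions, such as the yaw speed signal, before an alert is raised.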

Abnormal Patterns (b) and (c): These patterns suggest issues such as potentiometer sticking or the wear of the resistance wire. The sticking of the potentiometer can cause the yaw drive to become unresponsive, while the wear of the resistance wire can lead to inaccurate position readings, both of which can disrupt the yaw system’s alignment capabilities.

The analysis of these patterns carries significant practical benefits for the operational management of wind turbines. Early detection and prompt resolution of these anomalies are crucial in preventing minor issues from developing into severe faults (Table 5). This proactive approach helps to avoid not only costly downtime but also expensive repair costs. Furthermore, the diagnostic strategy facilitates a more precise and efficient allocation of maintenance resources. With a clear understanding of when anomalies in the yaw system occur, maintenance crews can be better equipped to address these issues at the optimal times, ensuring that corrective actions are timely and effective. This enhances the overall reliability and longevity of wind turbine operations while minimizing the financial impact of maintenance activities.

5. Conclusions

Currently, most diagnostic methods for the yaw system focus on faults related to the yaw gearbox, while anomaly diagnosis is lacking for other faults such as insufficient yaw braking force, severe brake pad wear, and counter malfunctions leading to excessive torsion cable tension. Considering that abnormal data accumulate in the relevant state parameters of the yaw system when its components experience abnormalities, this paper proposes a data-driven anomaly diagnosis method for the wind turbine yaw position based on the SCADA big data platform. The following conclusions can be drawn.

(1) Compared with traditional data dimensionality reduction methods, using the improved SIG–ReliefF algorithm for feature selection can effectively reduce the impact of redundant features on model performance, thereby improving the accuracy and reliability of the model.

(2) The Adam optimizer performs parameter optimization on the Informer model combined with the two-stage attention mechanism (TSA), effectively improving the performance of the model when processing time series data. It shows significant advantages in practical applications of unit fault diagnosis and helps to achieve more accurate and timely fault prediction and diagnosis.

(3) In the future, data-driven anomaly detection technology is expected to play a critical role in wind turbine fault diagnosis. Firstly, there is a continuous effort to enhance the accuracy and reliability of anomaly detection models. While machine learning and deep-learning-based methods have been utilized in diagnosing wind turbine faults, there remains potential for improving model performance through algorithm integration and model architecture optimization. Future research endeavors can focus on bolstering the precision and robustness of anomaly detection models. Secondly, there is a growing emphasis on integrating these models into practical applications and formulating comprehensive warning and maintenance strategies. By analyzing and identifying abnormal data in wind turbines, potential faults can be proactively identified, leading to the development of corresponding maintenance plans. Furthermore, leveraging accumulated fault data and experiences can optimize operational and maintenance processes, curtail operational costs, and enhance the reliability and efficiency of wind turbines. In summary, data-driven wind turbine anomaly diagnostic models hold substantial promise for practical application. Subsequent research will continue to refine algorithm and model performance, fortify data processing and analysis capabilities, and furnish more precise and reliable fault diagnosis and maintenance support for the wind energy industry.

Author Contributions

Validation, X.H.; Writing—original draft, X.S.; Writing—review & editing, H.W.; Visualization, Y.C.; Supervision, H.W.; Project administration, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Tianshan Talent Program, grant number 2022TSYCJC0030; the Key Research and Development Program of the Autonomous Region, grant number 2022B03031; the Science and Technology Project of the Hami High-tech Zone, grant number HGX2023KJXM008; and the 2024 program of central government guidance for local science and technology development, grant number ZYYD2024CG13.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Xiaofang Huang was employed by the company Beijing Gold Wind Science and Creation Wind Power Equipment Company Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Figure 1. TSA–Informer model structure.

Figure 2. The interval division of state parameter prediction residuals.

Figure 3. The prediction effect of the model.

Figure 4. Yaw position under normal conditions.

Figure 5. Yaw position under abnormal conditions.

Table 1. Prediction accuracy of different feature selection algorithms based on the Informer model.

Data Set	Number of Features	ReliefF	KPCA	AE–ReliefF	SIG–ReliefF
S1	12	82.31	83.22	92.60	94.76
S2	17	83.53	89.31	89.01	92.44
S3	6	73.26	79.13	92.54	90.37
S4	11	76.77	81.54	90.36	91.06
S5	24	85.12	82.35	94.72	93.83
S6	9	77.20	77.98	89.10	92.11
S7	31	86.44	82.14	93.86	97.28
S8	5	72.97	78.60	94.32	96.04
S9	46	87.05	89.01	91.92	93.29
S10	82	88.73	85.39	93.45	97.88

Table 2. F1 index based on different feature selection algorithms of the Informer model.

Data Set	Number of Features	ReliefF	KPCA	AE–ReliefF	SIG–ReliefF
S1	12	0.832	0.732	0.801	0.898
S2	17	0.841	0.814	0.866	0.902
S3	6	0.794	0.803	0.849	0.914
S4	11	0.812	0.846	0.890	0.942
S5	24	0.867	0.811	0.877	0.926
S6	9	0.831	0.791	0.896	0.837
S7	31	0.857	0.894	0.896	0.863
S8	5	0.790	0.726	0.887	0.927
S9	46	0.794	0.703	0.825	0.874
S10	82	0.767	0.681	0.891	0.946

Table 3. Hyperparameter settings of the TSA–Informer model.

Hyperparameter	Parameter Value	Hyperparameter	Parameter Value
freq	7 s	e_layers	2
seq_len	900	d_layers	1
pred_len	300	enc_in	3
d_model	468	dec_in	3
n_heads	8	d_ff	2632
activation	Gelu	dropout	0.05
batch_size	64	itr	6
learning_rate	0.0001	train_epochs	200

Table 4. Performance evaluation of different models.

Prediction Model	MAE (SGD)	RMSE (SGD)	R² (SGD)	MAE (Adam)	RMSE (Adam)	R² (Adam)
BPNN	432.06	487.51	0.693	411.26	426.60	0.728
RNN	379.21	413.89	0.714	363.08	387.59	0.753
GRU	274.46	306.70	0.829	234.53	281.02	0.851
CNN	296.30	320.47	0.803	259.01	299.74	0.839
Transformer	244.77	267.91	0.837	221.60	239.83	0.874
Informer	201.92	232.44	0.859	170.92	201.07	0.903
TSA–Informer	165.38	198.62	0.894	126.45	150.36	0.947
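
The three metrics reported in Table 4 can be computed as sketched below, assuming the standard definitions of mean absolute error, root mean square error, and the coefficient of determination. The function name `regression_metrics` and the sample values in the usage note are hypothetical and are not the paper's data.

```python
import math


def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) for paired actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return mae, rmse, r2
```

For actual values `[1, 2, 3, 4]` and predictions `[1, 2, 3, 5]`, this gives an MAE of 0.25, an RMSE of 0.5, and an R² of 0.8. Lower MAE and RMSE and higher R² indicate better predictions, which is the direction of improvement from BPNN to TSA–Informer in Table 4.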

Table 5. Example of abnormal diagnostic results.

Wind Turbine Number	Date	Start Time	End Time	Start Position	End Position	Abnormal Pattern
12#	2022 November 4	06:48:33	06:51:07	−8.62	24.12	a
51#	2022 December 21	09:47:04	09:49:04	−7.69	−13.04	c
80#	2022 December 31	20:02:20	20:02:34	10.1	60.7	b

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
