1. Introduction
Driven by the goals of “carbon peak and carbon neutrality”, the share of wind power in power systems is increasing, while the share of traditional fossil-fueled synchronous generators is gradually decreasing. The integration of wind turbines reduces the equivalent inertia of a power system and threatens its frequency security. To maintain frequency stability, wind turbines are given a degree of frequency regulation capability through control methods such as virtual inertia control, overspeed control, and variable pitch control [
1,
2,
3,
4]. However, relying on wind turbines alone for frequency regulation can still lead to frequency instability because their regulation capacity is insufficient [
5,
6,
7,
8]. The rapid development of energy storage technology and the advantages of energy storage itself have created new ideas for frequency control of power systems, such as compensation of virtual inertia [
9] and synchronous generator control [
10]. Energy storage offers fast response, high accuracy, and flexible control, so it can meet frequency regulation dispatch requirements on the scale of minutes or even seconds. Compared with traditional synchronous generator sets, it is better suited to grids whose frequency regulation sources must provide high ramp rates but relatively little power, and it can better satisfy the frequency regulation indicators of the system [
11]. Therefore, on the basis of exploring the frequency regulation capability of wind power, relevant research has proposed to connect energy storage to grid-connected wind power systems to provide active power support so as to increase the moment of inertia of the system and improve the frequency regulation performance of the grid [
12,
13]. Wang T and Xiang YW [
14] used an energy storage system to compensate for the virtual inertia of a wind farm, determined the output power of the wind storage system using fuzzy control logic, and formulated a virtual inertia compensation strategy to raise the inertia level of the wind storage system. However, this method only increased the participation of energy storage in frequency regulation and did not make full use of the frequency support capability of the wind turbines. Fuzzy control has also been used to design and implement frequency regulation control for a wind turbine in de-loaded operation [
15]. Based on the frequency change rate in different scenarios and the maximum output power of the wind turbine, a strategy was developed in which battery energy storage coordinates with wind turbines to provide frequency regulation response. The control strategies proposed in existing research enable wind turbines to respond to system frequency changes, but they do not effectively coordinate the two frequency regulation resources, wind power and energy storage, which results in a lack of dynamic coordination within the system and wasted frequency regulation potential. Therefore, it is necessary to further study how to establish a flexible, coordinated control strategy for combined wind storage systems.
Wind power and energy storage are connected to the receiving-end power network through power electronic converters, and controlling the DC transmission power can reduce the frequency regulation burden of the system. DC frequency regulation controls the active power exchanged with the connected alternating current (AC) system and, owing to its fast adjustability and good short-term overload capacity, effectively improves the frequency stability of the system. It has been widely used in existing asynchronously interconnected DC projects, mainly in the form of DC frequency limiting control, to improve the frequency of the connected AC system. DC frequency limiting control has been adopted in a power grid in combination with traditional measures [
16], which can effectively solve the problem of frequency stability in a network. Zhang Q, Mccalley J D, et al. [
17] analyzed the frequency characteristics of the Yunnan power grid after asynchronization by combining the control principle of the DC frequency limiter and established a frequency regulation strategy for the AC system in which the DC frequency limiter participates. Considering complex operating conditions, the back-to-back DC frequency limiting controller of Chongqing and Hubei was optimized to improve the controller adjustment effect [
18]. A DC frequency control system was proposed in [
19], which applied a multi-variable control to the frequency control of the DC system. The frequency of the AC system can be better adjusted while maintaining the stability of the two AC systems. DC frequency regulation also plays an important role in absorbing new energy power fluctuations and the stability of multi-terminal DC cross-region systems. Papalexopoulos A D and Andrianesis P E [
20] investigated the dynamic frequency coupling between the AC systems on the rectifier and inverter sides of a high-voltage direct current (HVDC) link fed by large wind farms and proposed a firing angle correction strategy that decouples frequency and voltage to improve the frequency dynamics at the rectifier end. However, the wind farms and the DC link did not form an effective coordination relationship in the proposed method. Zhang Yi, et al. [
21] used a fuzzy logic controller to allocate the reference values of DC and wind farm frequency regulation power and proposed a frequency regulation power distribution method for units in wind farms so that grid-connected wind turbines can allocate frequency regulation power according to their own rotational kinetic energy and adjustable capacity.
Many scholars have studied parameter optimization for DC frequency regulation but have not considered its synergy with wind power and energy storage participating in frequency regulation. In a power system where wind power is transmitted through DC, it is therefore important to study the frequency control strategy after energy storage devices are installed, and coordinating the various frequency regulation resources is crucial for improving frequency stability under high wind power penetration. This paper studies a primary frequency regulation optimization strategy for the combined wind storage system. The frequency regulation requirements of the system are considered from the perspective of inertia response and primary frequency regulation, and the operating mechanism and cooperative control strategy of the combined wind storage system under DC transmission are constructed. Only the frequency response characteristics of each part of the system are considered. Compared with existing frequency regulation methods, this study considers the participation of multiple frequency regulation resources, including energy storage, DC frequency limit control (FLC), and wind turbines. The energy storage adjusts its output power adaptively through fuzzy logic control, and the proportional-integral (PI) parameters of the DC FLC are optimized using the double deep Q-network (DDQN) algorithm to avoid the randomness of heuristic algorithms.
This paper is organized into seven main sections. Section 2 presents the control strategies for wind power, DC transmission, and energy storage. Section 3 gives the simplified system model, which considers various frequency regulation resources such as wind power, energy storage, and DC. Section 4 and Section 5 present the methodology of the optimization strategy, including the DDQN algorithm and the entropy method. In Section 6, the simulation results are presented and analyzed. The conclusion is drawn in Section 7.
2. Control Strategy for Wind Storage System and DC Transmission
2.1. Control Strategy for Wind Power
At present, wind turbines mainly participate in frequency regulation through rotor virtual inertia control, droop control, virtual synchronous generator control, overspeed control, and variable pitch control. Overspeed and variable pitch control work by de-loading the turbine, so it no longer operates under maximum power point tracking (MPPT) and the utilization of wind energy is reduced. Rotor virtual inertia control and droop control act on the rotor kinetic energy. Because the wind turbine is connected to the grid through a power electronic converter, its rotor speed is decoupled from the system frequency and cannot respond to system frequency changes on its own. However, the blades are connected to the generator rotor through the gearbox and weigh several times to dozens of times as much as the generator rotor, so the rotating kinetic energy stored in the blades during normal operation is considerable. When the system frequency changes, frequency support is provided by converting between this rotor kinetic energy and electrical energy. To maximize the utilization of wind energy, a comprehensive inertia control method combining virtual inertia control and droop control is adopted so that wind power participates in frequency regulation.
When the system frequency changes, virtual inertia control adjusts the active power reference of the wind turbine according to the rate and magnitude of the frequency change and thereby changes the electromagnetic power of the rotor, providing additional power support. Rotor virtual inertia control requires no reserve power from the wind turbine, so maximum utilization of wind energy is retained. However, at the end of the inertia response the rotor speed returns to its initial value, which may cause a secondary drop or rise in the system frequency, and once the system has settled again the control can no longer provide frequency support.
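The principle of virtual inertia control is as follows [22]:
$$\Delta P_{\mathrm{in}} = -T_{J}\,\frac{\mathrm{d}\Delta f}{\mathrm{d}t} \qquad (1)$$
where $\Delta P_{\mathrm{in}}$ is the compensated active power of virtual inertia control; $\Delta f$ is the system frequency deviation; $T_{J}$ is the inertial response time constant.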
Droop control is a steady-state support process that is mainly used to reduce the system frequency deviation. It emulates the droop characteristic of a synchronous generator's primary frequency regulation, which helps maintain frequency stability. The root cause of frequency fluctuation in a power system is a power mismatch, that is, an imbalance between the generated active power and the load power. When generation falls short of the load, the system frequency decreases; when generation exceeds the load, the frequency increases. A traditional synchronous generator set responds to such changes through its governor, which adjusts the speed and changes the output power of the machine so that the system gradually reaches a new power balance. A typical droop control is shown in
Figure 1.
By emulating the droop characteristic of a synchronous generator, frequency deviation control can be added to the wind turbine control strategy so that the turbine exhibits a similar droop characteristic and responds to system frequency changes. Droop control is regulation with a steady-state error: after the adjustment, the system settles at a new steady state whose frequency is not the rated frequency. The principle of droop control is as follows [23]:
$$\Delta P_{\mathrm{d}} = -K_{\mathrm{d}}\,\Delta f \qquad (2)$$
where $\Delta P_{\mathrm{d}}$ is the droop control active power participating in frequency regulation; $\Delta f$ is the system frequency deviation; $K_{\mathrm{d}}$ is the droop control factor.
The principle of comprehensive inertia control is shown in Figure 2. The controller derives a virtual inertia compensation power from the frequency change rate and a droop compensation power from the frequency deviation. Together with the output power of the wind turbine under MPPT control, these determine the active power reference, through which the turbine participates in system frequency regulation. Adding droop control helps avoid a secondary frequency drop and improves the stability of the system frequency. The high-pass filter prevents the virtual inertia control from acting in the steady state, so that it responds only to the frequency change rate [24].
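The way the two compensation terms combine with the MPPT power can be sketched as follows. This is a minimal illustration, not the controller used in the paper: the function name, the gains `k_inertia` and `k_droop`, the finite-difference estimate of the frequency change rate, and the omission of the high-pass filter are all assumptions made for brevity.

```python
def comprehensive_inertia_power(delta_f, delta_f_prev, dt, p_mppt,
                                k_inertia=0.5, k_droop=2.0):
    """Return the wind turbine active power reference in p.u.

    delta_f, delta_f_prev: current and previous frequency deviation (p.u.)
    dt: sampling interval in seconds
    p_mppt: power delivered by MPPT control (p.u.)
    k_inertia, k_droop: hypothetical gains, not values from the paper
    """
    # Frequency change rate estimated by a backward difference (assumption).
    rocof = (delta_f - delta_f_prev) / dt
    # Virtual inertia term: opposes the rate of change of frequency.
    dp_inertia = -k_inertia * rocof
    # Droop term: opposes the frequency deviation itself.
    dp_droop = -k_droop * delta_f
    return p_mppt + dp_inertia + dp_droop


# Frequency falling: both terms add power on top of MPPT.
p_falling = comprehensive_inertia_power(-0.01, 0.0, 0.1, 0.8)
# Frequency settled at a deviation: only the droop term remains.
p_settled = comprehensive_inertia_power(-0.01, -0.01, 0.1, 0.8)
```

With the frequency still falling, both terms contribute; once the deviation is constant, only the droop term is left, which is the behaviour that motivates combining the two controls.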
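The output power $P_{\mathrm{MPPT}}$ of wind power MPPT control is calculated as follows:
$$P_{\mathrm{MPPT}} = \frac{1}{2}\,\rho A v^{3} C_{p} \qquad (3)$$
where $\rho$ is the air density; $A$ is the area swept by the blades, through which the wind flows perpendicularly; $v$ is the wind speed; $C_{p}$ is the rotor power coefficient.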
The rotor power coefficient $C_{p}$ is a function of the blade tip speed ratio $\lambda$ and the pitch angle $\beta$. The tip speed ratio is defined as the ratio of the linear speed of the blade tip to the incoming wind speed $v$, as follows:
$$\lambda = \frac{2\pi R n}{v} = \frac{\omega R}{v} \qquad (4)$$
where $n$ is the rotor speed; $R$ is the radius of the blade tip; $\omega$ is the angular velocity of the wind turbine blades.
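As a numerical illustration of the two quantities above, the sketch below assumes the standard aerodynamic expressions, power = 0.5 · air density · swept area · wind speed cubed · power coefficient, and tip speed ratio = angular velocity · radius / wind speed. The function names and the sample values (a 40 m rotor, 10 m/s wind, a power coefficient of 0.4) are hypothetical and are not taken from the paper.

```python
import math


def tip_speed_ratio(omega, radius, wind_speed):
    """Blade tip linear speed (omega * radius) divided by wind speed."""
    return omega * radius / wind_speed


def wind_power(rho, radius, wind_speed, cp):
    """Mechanical power in watts captured from the wind.

    rho: air density (kg/m^3), radius: blade radius (m),
    wind_speed: m/s, cp: rotor power coefficient (dimensionless).
    """
    swept_area = math.pi * radius ** 2  # circular area swept by the blades
    return 0.5 * rho * swept_area * wind_speed ** 3 * cp


lam = tip_speed_ratio(omega=2.0, radius=40.0, wind_speed=10.0)
p_10 = wind_power(rho=1.225, radius=40.0, wind_speed=10.0, cp=0.4)
p_20 = wind_power(rho=1.225, radius=40.0, wind_speed=20.0, cp=0.4)
```

Doubling the wind speed at a fixed power coefficient multiplies the captured power by eight, which is why leaving the MPPT operating point to create a regulation reserve is costly.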
2.2. Control Strategy for DC Transmission
The DC link participates in system frequency regulation through the DC frequency limit controller (FLC), whose control link adopts PI control. The input signal is the frequency deviation $\Delta f$ of the AC system, which passes through a frequency-difference limiting link and a filtering link. The control structure is shown in Figure 3. When the system frequency difference exceeds the dead zone of the FLC, the deviation signal is converted by the PI controller into an active power deviation signal, which is then limited in amplitude to form the active power regulation instruction. When the frequency of the sending system decreases, the FLC reduces the DC transmission power according to the deviation signal; when the frequency of the sending system increases, the DC transmission power is increased to maintain the active power balance of the system. The parameters in the figure are the upper and lower limits of the frequency-difference limiting link; the low-pass filter time constant; the maximum and minimum values of the FLC control dead zone; the proportional and integral coefficients of the PI controller; and the upper and lower limits of the DC power regulation instruction.
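The dead zone, PI, and power-limiting chain described above can be sketched in discrete time as follows. This is a simplified illustration under stated assumptions: the class name, the symmetric dead band, the clamped integrator, and all numeric values are hypothetical, and the frequency-difference limiter and low-pass filter of Figure 3 are omitted.

```python
class FrequencyLimitController:
    """Minimal discrete-time sketch of a DC frequency limit controller."""

    def __init__(self, kp, ki, dead_band, p_min, p_max):
        self.kp = kp                # proportional coefficient
        self.ki = ki                # integral coefficient
        self.dead_band = dead_band  # symmetric dead zone (assumption)
        self.p_min = p_min          # lower limit of the power instruction
        self.p_max = p_max          # upper limit of the power instruction
        self.integral = 0.0

    def step(self, delta_f, dt):
        """Return the DC power regulation instruction for one time step."""
        # Dead zone: ignore deviations inside the band, shift the rest.
        if abs(delta_f) <= self.dead_band:
            error = 0.0
        elif delta_f > 0:
            error = delta_f - self.dead_band
        else:
            error = delta_f + self.dead_band
        # Integrator with clamping to avoid wind-up (assumption).
        self.integral += self.ki * error * dt
        self.integral = min(max(self.integral, self.p_min), self.p_max)
        # PI output followed by the power limiter.
        output = self.kp * error + self.integral
        return min(max(output, self.p_min), self.p_max)


flc = FrequencyLimitController(kp=5.0, ki=10.0, dead_band=0.002,
                               p_min=-0.2, p_max=0.2)
inside_band = flc.step(0.001, dt=0.01)    # inside the dead zone
under_freq = flc.step(-0.012, dt=0.01)    # sending-end frequency drop
saturated = flc.step(1.0, dt=0.01)        # large deviation hits the limit
```

The output has the same sign as the frequency deviation, so a frequency drop at the sending end reduces the DC transmission power, as described in the text.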
2.3. Control Strategy for Energy Storage
Fuzzy control draws on expert systems, fuzzy set theory, and control theory. When the mathematical model of the controlled process is complicated, rules based on human experience are used to simplify the controller design. In a computer implementation, membership functions and approximate inference turn these qualitative relationships into numerical computations.
The diagram of fuzzy logic control is shown in
Figure 4. The input variables are fuzzified by membership functions, the output fuzzy set is then obtained by fuzzy logic reasoning, and the crisp output value is obtained by defuzzification.
Mamdani fuzzy reasoning was among the first fuzzy inference methods to be proposed; Mamdani established a control system by synthesizing a set of linguistic control rules [25]. The design of a fuzzy logic controller includes determining the fuzzy subsets, the membership functions, the logic operation method, the fuzzy reasoning rules, and the defuzzification method.
Common defuzzification methods are the centroid, bisector, smallest of maximum, middle of maximum, and largest of maximum methods. In this study, the centroid (center of gravity) method is used to defuzzify the output fuzzy set.
The fuzzy logic inference result is obtained by calculating the center of gravity of the area enclosed by the output membership function and the x-axis. The calculation is as follows [26]:
$$z^{*} = \frac{\int z\,\mu(z)\,\mathrm{d}z}{\int \mu(z)\,\mathrm{d}z} \qquad (5)$$
where $z$ ranges over the domain of the membership function; $\mu(z)$ is the value of the output membership function; $z^{*}$ is the defuzzified output.
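On a sampled output domain, the centroid calculation reduces to a weighted average of the domain points, with the membership values as weights. The sketch below assumes a uniform grid; the function name and the triangular test set centred at 0.1 p.u. are hypothetical.

```python
def centroid_defuzzify(z_values, mu_values):
    """Centre of gravity of a sampled membership function.

    z_values: uniformly spaced points of the output domain
    mu_values: membership value at each point
    """
    total = sum(mu_values)
    if total == 0:
        return 0.0  # no rule fired; returning zero output is an assumption
    return sum(z * mu for z, mu in zip(z_values, mu_values)) / total


# Output domain from -0.3 to 0.3 p.u. sampled at 601 points.
z_grid = [-0.3 + 0.001 * k for k in range(601)]
# Hypothetical triangular fuzzy set centred at 0.1 with half-width 0.1.
mu_grid = [max(0.0, 1.0 - abs(z - 0.1) / 0.1) for z in z_grid]
crisp = centroid_defuzzify(z_grid, mu_grid)
```

For a symmetric set the centroid coincides with the peak, so `crisp` is approximately 0.1 here; asymmetric aggregated sets shift the result toward the side with more area.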
The structure of the fuzzy logic controller is shown in
Figure 5. The inputs of the fuzzy logic controller designed in this study are the frequency deviation $\Delta f$ and the frequency change rate $\mathrm{d}f/\mathrm{d}t$ of the AC system, which range from −0.01 to 0.004 p.u. and from −0.016 to 0.016 p.u., respectively. Because over-frequency deviations are smaller than under-frequency deviations in practical engineering applications, the upper limit of the frequency deviation is smaller in magnitude than the lower limit. The output is the change in the active power of the battery energy storage system (BESS), $\Delta P_{\mathrm{BESS}}$, which ranges from −0.3 to 0.3 p.u.
Membership functions of input and output variables are shown in
Figure 6,
Figure 7 and
Figure 8. The input and output variables share the same seven fuzzy subsets: NL (negative large), NM (negative medium), NS (negative small), Z (zero), PS (positive small), PM (positive medium), and PL (positive large) [27].
The basic criterion of fuzzy logic reasoning is as follows. When the frequency deviation or the frequency change rate is large, the system needs a larger controller output to participate in frequency regulation. When the absolute value of the frequency change rate is close to 0 and the frequency deviation is small, the system is relatively stable, and the controller output should be as small as possible. The specific fuzzy logic reasoning table is shown in Table 1.
The output of fuzzy logic reasoning is obtained by using the “max-min” principle of the Mamdani algorithm, and the fuzzy logic reasoning results obtained by using the centroid de-fuzzification process are shown in
Figure 9.
3. System Simplification Model
The system frequency response (SFR) model is a general method for analyzing the frequency response of a power system. To investigate the control strategy, the SFR model is used to analyze and calculate the system frequency after a disturbance. It serves as a simplified equivalent of the actual physical system, from which a feasible solution to the frequency response problem can be obtained. To account for the frequency regulation effects of wind storage and DC, models of the wind power, the battery energy storage system (BESS), and the DC FLC are added to the traditional SFR model, as shown in
Figure 10. Furthermore, the BESS uses lithium-ion batteries.
The parameters of the model are the inertia response time constant of the system; the frequency regulation powers of the hydro power, the thermal power, the wind power, the FLC, and the BESS; and five proportional gains.
In this research, a fuzzy logic controller is added to BESS, and a comprehensive inertia control is added to wind power. The frequency response characteristics of the system under load disturbance are investigated by formulating an optimization strategy.
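To show the kind of calculation an SFR model performs, the sketch below integrates a heavily reduced version of it: a single swing equation with load damping and one aggregated droop response behind a first-order lag. It is not the model of Figure 10. The lumping of all regulation resources into one gain `k_total`, the lag `t_lag`, and every numeric value are assumptions chosen only to produce a plausible response.

```python
def simulate_sfr(dp_load, h=4.0, d=1.0, k_total=20.0, t_lag=0.5,
                 dt=0.01, t_end=30.0):
    """Simulate frequency deviation (p.u.) after a step load increase.

    dp_load: step increase in load (p.u.)
    h: inertia constant (s), d: load damping coefficient
    k_total: aggregated droop gain of all regulation resources (assumption)
    t_lag: first-order lag of the aggregated response (assumption)
    """
    delta_f = 0.0
    p_reg = 0.0
    trace = []
    for _ in range(int(round(t_end / dt))):
        # Aggregated regulation power follows the droop command with a lag.
        p_reg += dt * (-k_total * delta_f - p_reg) / t_lag
        # Swing equation: 2H * d(delta_f)/dt = p_reg - dp_load - D * delta_f
        delta_f += dt * (p_reg - dp_load - d * delta_f) / (2.0 * h)
        trace.append(delta_f)
    return trace


trace = simulate_sfr(dp_load=0.1)
nadir = min(trace)          # lowest frequency deviation reached
steady_state = trace[-1]    # settles near -dp_load / (k_total + d)
```

The nadir and the steady-state deviation read from such a trace are the quantities that the coordinated control of wind power, BESS, and DC FLC aims to improve.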
4. DDQN Optimization Strategy
4.1. DDQN Algorithm
Reinforcement learning is usually formulated as a Markov decision process (MDP), in which the outcome of the interaction between the agent and the environment at the next moment depends only on the current state of the environment, not on earlier states. The traditional MDP consists of four elements, given by the quadruple $(S, A, R, \pi)$, where $S$ represents the set of environmental states; $A$ represents the set of actions that the agent can take; $R$ represents the reward function, that is, the reward the agent receives after taking a certain action in a certain state; $\pi$ represents the policy set, which maps states to actions.
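Q-learning is a typical reinforcement learning algorithm that records the learned experience in the form of a Q-value table. The algorithm uses Formula (6) to update the Q value [28]. When the update converges, the optimal action strategy is obtained.
$$Q(s,a) \leftarrow Q(s,a) + \alpha\left[r + \gamma \max_{a'} Q(s',a') - Q(s,a)\right] \qquad (6)$$
where $\alpha$ is the learning rate; $\gamma$ is the discount factor; $Q(s,a)$ is the Q value for state $s$ and action $a$; $r$ is the reward received; $s'$ is the next state and $a'$ is a candidate action in that state.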
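A single tabular Q-learning update can be sketched as follows, assuming the standard temporal-difference form in which the learning rate scales the gap between the bootstrapped target and the current estimate. The nested-dictionary table, the state and action names, and the numeric values are hypothetical.

```python
def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q value."""
    # Greedy estimate of the value of the next state.
    best_next = max(q_table[next_state].values())
    # Temporal-difference target and update.
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical two-state, two-action table.
q = {"s0": {"a0": 0.0, "a1": 0.0},
     "s1": {"a0": 1.0, "a1": 2.0}}
new_q = q_update(q, "s0", "a1", reward=1.0, next_state="s1")
```

Here the target is 1.0 + 0.9 × 2.0 = 2.8, so the entry moves one tenth of the way from 0 to 2.8. A table like this becomes impractical when states and actions are continuous or high-dimensional, such as the frequency trajectories and PI gains considered here.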
To overcome the difficulty that Q-learning has with high-dimensional state and control action sets, the deep Q-network (DQN) method uses a neural network to predict the value of the Q function directly, replacing the Q table. The current target Q value is calculated according to Formula (6), and the mean square error between this target and the Q value predicted by the neural network is used to update the network parameters.
The DDQN algorithm makes two improvements to DQN. First, two networks are introduced: a target network that calculates the target Q value and an update network that updates the Q value, which reduces the dependency between the target Q value and the update network parameters. The target network has exactly the same structure as the Q-value (update) network and periodically synchronizes its parameters from it.
Second, the selection of the action and the evaluation of the target Q value in Formula (6) are decoupled to reduce the overestimation caused by the greedy maximization. When the target Q value is calculated, the maximum Q value is not taken directly from the target network; instead, the action with the maximum Q value is first found in the update network, and this action is then used to evaluate the Q value on the target network.
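The difference between the DQN and DDQN targets can be made concrete with plain lists standing in for the outputs of the two networks at the next state. The function names and numbers are hypothetical; in practice the two lists would come from forward passes of the update network and the target network.

```python
def dqn_target(reward, q_target_next, gamma=0.9, done=False):
    """DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward, q_update_next, q_target_next, gamma=0.9, done=False):
    """DDQN target: update network selects, target network evaluates."""
    if done:
        return reward
    # Action chosen greedily from the update network's estimates.
    best_action = max(range(len(q_update_next)),
                      key=lambda a: q_update_next[a])
    # The chosen action is evaluated with the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q estimates for three actions at the next state.
q_update_next = [1.0, 3.0, 2.0]
q_target_next = [5.0, 0.5, 4.0]
y_dqn = dqn_target(1.0, q_target_next)
y_ddqn = ddqn_target(1.0, q_update_next, q_target_next)
```

In this example the target network's largest estimate (5.0) belongs to an action the update network does not prefer, so the DQN target is 5.5 while the DDQN target is 1.45, which illustrates how decoupling selection from evaluation curbs overestimation.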
4.2. Selection of Objective Function
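In this section, the integral time squared error (ITSE) indices of the system frequency and of the DC FLC power regulation under a high-power disturbance are minimized, and the objective function is established as follows:
$$\min_{K} J = w_{1}\int_{0}^{T} t\,\Delta f^{2}(t)\,\mathrm{d}t + w_{2}\int_{0}^{T} t\,\Delta P_{\mathrm{FLC}}^{2}(t)\,\mathrm{d}t \qquad (7)$$
where $K$ is the controller parameter of the DC FLC; $w_{1}$ and $w_{2}$ are the weights of the two ITSE indices; $T$ is the simulation duration; $\Delta f$ is the system frequency deviation; $\Delta P_{\mathrm{FLC}}$ is the regulation power of the DC FLC.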
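A discrete approximation of this objective can be computed directly from sampled simulation traces. The sketch below assumes that each ITSE index is the integral over the simulation of time multiplied by the squared signal, approximated by a rectangle rule; the function names and the equal default weights are hypothetical placeholders for weights chosen elsewhere in the method.

```python
def itse(samples, dt):
    """Rectangle-rule approximation of the integral of t * e(t)^2."""
    return sum((k * dt) * e * e for k, e in enumerate(samples)) * dt


def objective(delta_f, delta_p_flc, dt, w1=0.5, w2=0.5):
    """Weighted sum of the two ITSE indices (weights are placeholders)."""
    return w1 * itse(delta_f, dt) + w2 * itse(delta_p_flc, dt)


# Constant unit error over 1 s: the exact integral of t is 0.5.
unit_error = [1.0] * 100
itse_unit = itse(unit_error, dt=0.01)
j_value = objective(unit_error, [0.0] * 100, dt=0.01)
```

Because the squared error is multiplied by time, deviations that persist late in the response are penalised more heavily than the unavoidable initial excursion, which favours controller parameters that settle quickly.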
4.3. Constraints
The proportional and integral gains of the DC FLC control link must satisfy the following inequality constraints:
$$a_{\min} K_{p0} \le K_{p} \le a_{\max} K_{p0}, \qquad b_{\min} K_{i0} \le K_{i} \le b_{\max} K_{i0} \qquad (8)$$
where $K_{p0}$ and $K_{i0}$ are the initial values of the proportional gain and the integral gain, respectively, which are selected according to how the system is normalized; $a_{\min}$ and $a_{\max}$ are the minimum and maximum values of the proportional gain coefficient; $b_{\min}$ and $b_{\max}$ are the minimum and maximum values of the integral gain coefficient.
Because the rated capacity and transmission capacity of the HVDC transmission system are limited, the DC FLC power regulation must stay within its upper and lower limits:
$$\Delta P_{\mathrm{FLC}}^{\min} \le \Delta P_{\mathrm{FLC}} \le \Delta P_{\mathrm{FLC}}^{\max} \qquad (9)$$
where $\Delta P_{\mathrm{FLC}}^{\max}$ and $\Delta P_{\mathrm{FLC}}^{\min}$ are the upper and lower limits of the DC FLC power regulation, respectively.