Advanced Primary Frequency Regulation Optimization in Wind Storage Systems with DC Integration Using Double Deep Q-Networks

Liu, Xiaojiang; Zou, Peng; You, Jin; Wang, Yuhong; Wu, Jiabao; Zheng, Zongsheng; Gao, Shilin; Hao, Wei

doi:10.3390/electronics13122249

by Xiaojiang Liu ¹, Peng Zou ¹, Jin You ¹, Yuhong Wang ²,*, Jiabao Wu ², Zongsheng Zheng ², Shilin Gao ² and Wei Hao ³

¹ Southwest Electric Power Design Institute Co., Ltd. of China Power Engineering Consulting Group, Chengdu 610056, China

² School of Electrical Engineering, Sichuan University, Chengdu 610065, China

³ SPIC Yunnan International Power Investment Co., Ltd., Kunming 650228, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(12), 2249; https://doi.org/10.3390/electronics13122249

Submission received: 8 May 2024 / Revised: 3 June 2024 / Accepted: 4 June 2024 / Published: 7 June 2024

(This article belongs to the Special Issue Advances in Modeling, Control and Protection of Power System Containing a High Proportion of Power Electronics)

Abstract:

With the gradual increase in installed wind power capacity, the proportion of traditional synchronous generators driven by fossil fuels is gradually declining. Because wind turbines are connected to the grid through power electronic converters, which decouple rotor speed from the system frequency, the system inertia is reduced, and inadequate inertia can threaten frequency stability when disturbances occur. To address this issue, this paper proposes a frequency regulation optimization strategy for the direct current (DC) transmission of a wind storage system. The strategy incorporates virtual inertia control and virtual droop control to adjust wind power output based on the frequency deviation and its rate of change. Fuzzy logic control is employed for energy storage, adaptively adjusting active power based on the same two signals. Additionally, in the context of multi-DC transmission in renewable energy systems, an optimization strategy for the proportional-integral (PI) parameters of the frequency limit controller (FLC) is proposed. Considering frequency deviation and DC regulation power simultaneously, the double deep Q-network (DDQN) algorithm is adopted in the simulation model to obtain the optimal FLC parameters. Simulations in MATLAB/Simulink 2022a indicate that this strategy raises the lowest frequency by 0.28 Hz and shortens the response time by 1.04 s compared with the non-optimized strategy.

Keywords:

virtual inertia control; virtual droop control; fuzzy logic control; DDQN algorithm

1. Introduction

Against the background of the “carbon peak and carbon neutrality” goals, the proportion of wind power in power systems is increasing, and the proportion of traditional synchronous generators driven by fossil fuels is gradually decreasing. The addition of wind turbines reduces the equivalent inertia of a power system and threatens its frequency security. To ensure frequency stability, wind turbines are given certain frequency regulation capabilities through various control means, such as virtual inertia control, overspeed control, and variable pitch control [1,2,3,4]. However, relying on wind turbines alone for frequency regulation still risks frequency instability because their regulation capacity is insufficient [5,6,7,8]. The rapid development of energy storage technology has created new options for frequency control of power systems, such as compensation of virtual inertia [9] and synchronous generator control [10]. Energy storage offers fast response, high response accuracy, and flexible control, so it can meet frequency regulation dispatch requirements at the minute or even second level. Compared with traditional synchronous generator sets, it is better suited to power grids that require high ramp rates but relatively little regulation power, and it can better meet the frequency regulation indicators of the system [11]. Therefore, building on the frequency regulation capability of wind power, related research has proposed connecting energy storage to grid-connected wind power systems to provide active power support so as to increase the moment of inertia of the system and improve the frequency regulation performance of the grid [12,13]. 
Wang T and Xiang YW [14] used an energy storage system to compensate for the virtual inertia of a wind farm, determined the output power of the wind storage system based on fuzzy control logic, and formulated a virtual inertia compensation strategy to improve the inertia level of the wind storage system. However, this method only improved the participation of energy storage in frequency regulation and did not give full play to the frequency support capability of wind turbines. Fuzzy control was used to design and implement the frequency regulation control of a wind turbine operating without load [15]. According to the frequency change rate of different scenarios and the maximum output power of the wind turbine, a strategy of battery energy storage coordination for wind turbines to provide frequency regulation response was developed. The control strategy proposed by the existing research enabled the wind turbine to respond to the frequency change of the system, but it did not effectively coordinate the two frequency regulation resources of wind power and energy storage, resulting in a lack of dynamic coordination within the system and a waste of frequency regulation potential. Therefore, it is necessary to further study how to establish a flexible, coordinated control strategy for combined wind storage systems.

Wind power and energy storage are connected to the receiving power network through a power electronic converter, and relevant strategies have been adopted to control the DC power transmission, which can reduce the burden of the frequency regulation system. DC frequency regulation controls the active power of the connected alternating current (AC) system and effectively improves the frequency stability of the system due to its fast adjustability and good short-term overload capacity. DC frequency regulation has been widely used in existing asynchronous networking DC projects. It mainly adopts the form of DC frequency limiting control to improve the frequency of the connected AC system. A power grid adopts DC frequency limiting control in combination with traditional measures [16], which can effectively solve the problem of frequency stability in a network. Zhang Q, Mccalley J D, et al. [17] analyzed the frequency characteristics of the Yunnan power grid after asynchronization by combining the control principle of the DC frequency limiter and established a frequency regulation strategy for the AC system in which the DC frequency limiter participates. Considering complex operating conditions, the back-to-back DC frequency limiting controller of Chongqing and Hubei was optimized to improve the controller adjustment effect [18]. A DC frequency control system was proposed in [19], which applied a multi-variable control to the frequency control of the DC system. The frequency of the AC system can be better adjusted while maintaining the stability of the two AC systems. DC frequency regulation also plays an important role in absorbing new energy power fluctuations and the stability of multi-terminal DC cross-region systems. 
Papalexopoulos A D and Andrianesis P E [20] investigated the frequency dynamic coupling relationship of AC systems on both sides of a high-voltage direct current (HVDC) rectification and inverter based on large wind farms and proposed an emission angle correction strategy to decouple frequency–voltage to improve the frequency dynamics of the rectifier end. However, wind farms and DC did not form an effective coordination relationship in the proposed method. Zhang Yi, et al. [21] used a fuzzy logic controller to allocate the reference values of DC and wind farm frequency regulation power and proposed a frequency regulation power distribution method for units in wind farms so that grid-connected wind turbines can allocate frequency regulation power according to their own rotational kinetic energy and adjustable capacity.

A large number of scholars have studied the parameter optimization of DC frequency regulation but have not considered its synergy with wind power and energy storage participating in frequency regulation. Therefore, in a power system where wind power is transmitted through DC, it is of great significance to study the frequency control strategy after the installation of energy storage devices. To improve the frequency stability of power systems with a high proportion of wind power, it is crucial to coordinate the various frequency regulation resources. This paper studies the primary frequency regulation optimization strategy of a combined wind storage system. The frequency regulation requirements of the system are considered from the perspective of inertia response and primary frequency regulation, and the operating mechanism and cooperative control strategy of the combined wind storage system under DC transmission are constructed. Only the frequency response characteristics of each part of the system are considered. Compared with existing frequency regulation methods, this study considers the participation of multiple frequency regulation resources, including energy storage, the DC FLC, and wind turbines. The energy storage adjusts its output power adaptively through fuzzy logic control. Furthermore, the PI parameters of the DC FLC are optimized using the DDQN algorithm in order to avoid the randomness of heuristic algorithms.

This paper is divided into seven sections. Section 2 gives the control strategies for wind power, DC transmission, and energy storage. Section 3 gives the simplified system model, which considers multiple frequency regulation resources such as wind power, energy storage, and DC. Section 4 and Section 5 present the methodology for the optimization strategy, including the DDQN algorithm and the entropy method. In Section 6, the simulation results are presented and analyzed. The conclusion is drawn in Section 7.

2. Control Strategy for Wind Storage System and DC Transmission

2.1. Control Strategy for Wind Power

At present, wind turbines participate in frequency regulation mainly through rotor virtual inertia control, droop control, virtual synchronous generator control, overspeed control, and variable pitch control. Overspeed and variable pitch control work by load reduction, so the wind turbine no longer operates under maximum power point tracking (MPPT) control and the utilization rate of wind energy is reduced. Rotor virtual inertia control and droop control act on the rotor kinetic energy. Because the wind turbine is connected to the grid through a power electronic converter, its rotor speed is decoupled from the system frequency and cannot respond to system frequency changes on its own. However, the wind turbine blades are connected to the generator rotor through the gearbox, the blades are several to dozens of times heavier than the generator rotor, and the rotating kinetic energy stored in the blades is considerable when the unit is running normally. When the system frequency changes, frequency support is provided by converting between the kinetic energy of the rotor and electric energy. To ensure the maximum utilization of wind energy, a comprehensive inertia control method combining virtual inertia control and droop control is adopted so that wind power participates in frequency regulation.

When the system frequency changes, the virtual inertia control changes the reference value of the active power of the wind turbine according to the rate and magnitude of the frequency change and then changes the electromagnetic power of the rotor to provide additional power support. The rotor virtual inertia control method does not require reserve power from the wind turbine and can still realize the maximum utilization of wind energy. However, at the end of the inertia response process, the rotor speed returns to its initial value, which may cause a secondary drop or rise in the system frequency, and once the system is stable again, virtual inertia control can no longer provide frequency support.

The principle of virtual inertia control is as follows [22]:

$Δ P_{1} = 2 H \frac{d f}{d t},$

(1)

where $Δ P_{1}$ is the compensated active power of virtual inertia control; $Δ f$ is the system frequency deviation; $H$ is the inertial response time constant.

Droop control is a steady-state support process that is mainly used to reduce the frequency deviation of the system. It can emulate the differential adjustment of a synchronous generator’s primary frequency regulation, which helps to maintain the stability of the system frequency. The root cause of frequency fluctuation in a power system is a power mismatch, that is, a mismatch between the active power output of the generators and the load power. When the active power output of the generation falls short of or exceeds the load, the system frequency decreases or increases, respectively. A traditional synchronous generator set can respond to the change in system frequency: the governor lowers or raises the speed and changes the output power of the synchronous machine so that the power in the system gradually reaches a new equilibrium state. A typical droop control is shown in Figure 1.

To emulate the droop characteristic of a synchronous generator, frequency deviation control can be added to the control strategy of the wind turbine so that it has a similar droop characteristic and responds to frequency changes in the system. Droop control is differential regulation: after the adjustment, the system enters a new steady state in which the frequency is not the rated frequency. The principle of droop control is as follows [23]:

$Δ P_{2} = K_{d} Δ f,$

(2)

where $Δ P_{2}$ is the active power contributed by droop control to frequency regulation; $K_{d}$ is the droop control factor.

The principle of comprehensive inertia control is shown in Figure 2. The comprehensive inertia control obtains the compensation powers $Δ P_{1}$ and $Δ P_{2}$ according to the frequency change rate and the frequency deviation, respectively. Together with the output power $P_{m}$ from wind power MPPT control, they determine the reference value of active power, which then participates in the frequency adjustment of the system. The addition of droop control can avoid the secondary drop in the frequency and improve the stability of the system frequency. The high-pass filter prevents the virtual inertia control from acting in the steady state of the system, so that it responds only to the frequency change rate [24].
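As a minimal sketch of how Equations (1) and (2) combine with the MPPT power into one reference, the function below adds the two compensation terms to $P_{m}$ for a single sample. The function name, the per-unit inputs, and the numerical values are illustrative assumptions, not the authors' implementation. The sign convention of $Δ f$ cannot be recovered from this excerpt, so the equations are applied literally, and the high-pass filter of Figure 2 is omitted.

```python
def comprehensive_inertia_reference(p_mppt, delta_f, df_dt, h, k_d):
    """Return (P_ref, dP1, dP2) for one sample of the frequency signal.

    All quantities are assumed to be in per unit. Signs follow
    Equations (1) and (2) literally; no filter is modelled.
    """
    delta_p1 = 2.0 * h * df_dt   # Equation (1): virtual inertia term
    delta_p2 = k_d * delta_f     # Equation (2): droop term
    p_ref = p_mppt + delta_p1 + delta_p2
    return p_ref, delta_p1, delta_p2


# Hypothetical operating point, chosen only to exercise the function.
p_ref, dp1, dp2 = comprehensive_inertia_reference(
    p_mppt=0.8, delta_f=0.004, df_dt=0.002, h=5.0, k_d=20.0
)
print(p_ref, dp1, dp2)
```

Keeping the two terms separate in the return value makes it easy to see that the inertia term vanishes once the frequency stops changing, while the droop term persists as long as a deviation remains.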

The output power $P_{m}$ of wind power MPPT control is calculated as follows:

$P_{m} = \frac{1}{2} ρ S_{w} V^{3} C_{P} (λ, θ),$

(3)

where $ρ$ is the air density; $S_{w}$ is the projected area of the wind flowing vertically through the blade; $V$ is the wind speed; $C_{P}$ is the rotor power coefficient.

The rotor power coefficient $C_{P}$ is a function of the blade tip speed ratio $λ$ and the pitch angle $θ$. The blade tip speed ratio is defined as the ratio of the blade tip linear speed of the wind turbine blades to the incoming wind speed, as follows:

$λ = \frac{2 π R n}{V} = \frac{ω R}{V},$

(4)

where $n$ is the rotor speed; $R$ is the radius of the blade tip; $ω$ is the angular velocity of the wind turbine blades.
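Equations (3) and (4) can be evaluated directly once a value of $C_{P}$ is known. The sketch below assumes SI units, takes $S_{w}$ as the swept disc area $π R^{2}$, and passes $C_{P}$ in as a number because the excerpt gives no $C_{P}(λ, θ)$ curve. The turbine data are hypothetical.

```python
import math


def tip_speed_ratio(omega, radius, wind_speed):
    """Equation (4): lambda = omega * R / V (omega in rad/s)."""
    return omega * radius / wind_speed


def mppt_power(rho, swept_area, wind_speed, c_p):
    """Equation (3): P_m = 0.5 * rho * S_w * V^3 * C_P, in watts."""
    return 0.5 * rho * swept_area * wind_speed ** 3 * c_p


# Hypothetical turbine: 40 m blades, 10 m/s wind, rotor at 2 rad/s.
radius = 40.0
wind_speed = 10.0
area = math.pi * radius ** 2          # assumption: S_w is the swept disc
lam = tip_speed_ratio(omega=2.0, radius=radius, wind_speed=wind_speed)
p_m = mppt_power(rho=1.225, swept_area=area, wind_speed=wind_speed, c_p=0.45)
print(lam, p_m)
```

The two forms of Equation (4) agree when $ω = 2 π n$ with $n$ in revolutions per second, so passing `omega=2 * math.pi * n` gives the same ratio.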

2.2. Control Strategy for DC Transmission

DC participates in the system frequency adjustment through the DC frequency limit controller (FLC), whose control link adopts PI control. The input signal is the frequency deviation $Δ f$ of the AC system, and the controller includes a frequency difference limiting link and a filtering link. The specific control structure is shown in Figure 3. When the system frequency difference exceeds the dead zone limit of the FLC, the signal $Δ f$ is converted by the PI control into an active power deviation signal, which then passes through power limiting and finally forms the active power regulation instruction. When the frequency of the sending system decreases, the FLC reduces the DC transmission power according to the deviation signal. When the frequency of the sending system increases, the DC transmission power is increased to maintain the active power balance of the system. In the figure, $D_{Max}$ and $D_{Min}$ are the upper and lower limits of the frequency difference limiting link, respectively; $T_{r}$ is the low-pass filter time constant; $f_{d}^{+}$ and $f_{d}^{-}$ are the maximum and minimum values of the FLC control dead zone, respectively; $K_{P}$ and $K_{I}$ are the proportional and integral coefficients of the PI controller; $Δ P_{FLC, Max}$ and $Δ P_{FLC, Min}$ are the upper and lower limits of the DC power regulation instruction.
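The signal path described above can be sketched as a discrete-time controller. Figure 3 is not reproduced in this text, so the ordering of the blocks (limit, low-pass filter, dead zone, PI, power limit), the backward-Euler filter, the clamped integrator, and all numerical values except $K_{P} = 0.2$ and $K_{I} = 0.3$ are assumptions made for illustration.

```python
from dataclasses import dataclass


def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


@dataclass
class FrequencyLimitController:
    k_p: float        # proportional coefficient K_P
    k_i: float        # integral coefficient K_I
    t_r: float        # low-pass filter time constant T_r (s)
    d_min: float      # lower limit of the frequency difference
    d_max: float      # upper limit of the frequency difference
    fd_minus: float   # lower edge of the dead zone
    fd_plus: float    # upper edge of the dead zone
    dp_min: float     # lower limit of the power instruction
    dp_max: float     # upper limit of the power instruction
    filtered: float = 0.0
    integral: float = 0.0

    def step(self, delta_f, dt):
        """Advance one time step and return the power instruction."""
        limited = clamp(delta_f, self.d_min, self.d_max)
        # First-order low-pass filter, backward Euler discretization.
        self.filtered += dt / (self.t_r + dt) * (limited - self.filtered)
        # Dead zone: only the part outside [fd_minus, fd_plus] passes.
        if self.filtered > self.fd_plus:
            err = self.filtered - self.fd_plus
        elif self.filtered < self.fd_minus:
            err = self.filtered - self.fd_minus
        else:
            err = 0.0
        # PI control; clamping the integrator is an assumed anti-windup.
        self.integral = clamp(self.integral + self.k_i * err * dt,
                              self.dp_min, self.dp_max)
        return clamp(self.k_p * err + self.integral,
                     self.dp_min, self.dp_max)


flc = FrequencyLimitController(k_p=0.2, k_i=0.3, t_r=0.05,
                               d_min=-1.0, d_max=1.0,
                               fd_minus=-0.1, fd_plus=0.1,
                               dp_min=-0.2, dp_max=0.2)
# A small deviation stays inside the dead zone.
for _ in range(100):
    out_inside = flc.step(0.05, dt=0.01)
# A large, sustained frequency drop drives the output to its lower limit.
for _ in range(2000):
    out_saturated = flc.step(-1.0, dt=0.01)
print(out_inside, out_saturated)
```

With this sign convention a negative $Δ f$ yields a negative power instruction, which matches the statement that the FLC reduces DC transmission power when the sending-end frequency falls.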

2.3. Control Strategy for Energy Storage

Fuzzy control grew out of expert systems, fuzzy set theory, and control theory. When the mathematical model of the controlled process is complicated, rules based on human experience are used to simplify the controller design. When implemented on a computer, membership functions and approximate inference turn these qualitative relationships into numerical processing.

The diagram of fuzzy logic control is shown in Figure 4. The input variables are fuzzified by membership functions, the output fuzzy set is then obtained by fuzzy logic reasoning, and the specific output value is obtained by defuzzification.

Mamdani fuzzy reasoning was the first to be proposed; Mamdani established a control system by synthesizing different linguistic control rules [25]. The design of fuzzy logic control includes the determination of the fuzzy subsets, the membership functions, the logic operation method, the fuzzy logic reasoning criterion, and the defuzzification method.

Common defuzzification methods are the centroid, bisector, smallest of maximum, middle of maximum, and largest of maximum methods. In this study, the centroid (center of gravity) method is used to defuzzify the output fuzzy set.

The fuzzy logic inference result is obtained by calculating the center of gravity of the area enclosed by the output membership function and the x-axis. The calculation method is as follows [26]:

$x_{Centroid} = \frac{\sum_{i} μ (x_{i}) x_{i}}{\sum_{i} μ (x_{i})},$

(5)

where $x_{i}$ is the $i$th sample point in the domain of the output membership function; $μ (x_{i})$ is the value of the output membership function at $x_{i}$.
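Equation (5) is a weighted average and can be checked numerically. The sketch below is a plain-Python illustration; the sample grid and the triangular membership values are hypothetical and only chosen so that the expected centroid is obvious by symmetry.

```python
def centroid(xs, mus):
    """Equation (5): membership-weighted mean of the sample points."""
    total = sum(mus)
    if total == 0:
        raise ValueError("output fuzzy set is empty (all memberships are 0)")
    return sum(mu * x for x, mu in zip(xs, mus)) / total


# Hypothetical aggregated output set: a triangle centred on 0.1 p.u.
xs = [-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3]
mus = [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0]
delta_p_e = centroid(xs, mus)
print(delta_p_e)
```

Because the membership values are symmetric about 0.1, the crisp output lands on that point; an asymmetric set would pull the result toward the heavier side.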

The structure of the fuzzy logic controller is shown in Figure 5. The inputs of the fuzzy logic controller designed in this study are the frequency deviation $Δ f$ and the frequency change rate $d f / d t$ of the AC system, whose values range from −0.01 to 0.004 p.u. and from −0.016 to 0.016 p.u., respectively. Because high-frequency deviations are smaller than low-frequency deviations in practical engineering applications, the upper limit of the frequency deviation is smaller in magnitude than the lower limit. The output is the change in the active power of the battery energy storage system (BESS), $Δ P_{E}$, whose value ranges from −0.3 to 0.3 p.u.

Membership functions of the input and output variables are shown in Figure 6, Figure 7 and Figure 8. The input and output variables use the same seven fuzzy subsets: NL (negative large), NM (negative medium), NS (negative small), Z (zero), PS (positive small), PM (positive medium), and PL (positive large) [27].

The basic criterion of fuzzy logic reasoning is the following: When the frequency deviation $Δ f$ or the frequency change rate $d f / d t$ is large, the system needs a larger controller output $Δ P_{E}$ to participate in frequency regulation. When the absolute value of the frequency change rate $d f / d t$ is close to 0 and the frequency deviation $Δ f$ is small, the system is relatively stable, and the controller output $Δ P_{E}$ should be as small as possible. The specific fuzzy logic reasoning table is shown in Table 1.

The output of fuzzy logic reasoning is obtained by using the “max-min” principle of the Mamdani algorithm, and the fuzzy logic reasoning results obtained by using the centroid de-fuzzification process are shown in Figure 9.

3. System Simplification Model

The system frequency response (SFR) model is a general method for analyzing the frequency response of a power system. To investigate the control strategy, the SFR model is used to analyze and calculate the system frequency after a disturbance. The SFR model serves as an equivalent of the actual physical system, which makes the frequency response problem tractable. To account for the frequency regulation effects of wind storage and DC, wind power, battery energy storage system (BESS), and DC FLC models are added to the traditional SFR model, as shown in Figure 10. The BESS uses lithium-ion batteries.

In the model, $T_{sys}$ is the inertia response time constant of the system; $Δ P_{H}$ is the frequency regulation power of the hydro power; $Δ P_{S}$ is the frequency regulation power of the thermal power; $Δ P_{W}$ is the frequency regulation power of the wind power; $Δ P_{FLC}$ is the frequency regulation power of the FLC; $Δ P_{E}$ is the frequency regulation power of the BESS; $K_{Hydro}$, $K_{Steam}$, $K_{wind}$, $K_{FLC}$, and $K_{Ess}$ are proportional gains.

In this research, a fuzzy logic controller is added to BESS, and a comprehensive inertia control is added to wind power. The frequency response characteristics of the system under load disturbance are investigated by formulating an optimization strategy.

4. DDQN Optimization Strategy

4.1. DDQN Algorithm

Reinforcement learning is usually based on a Markov decision process (MDP), in which the outcome of an interaction between an agent and the environment at the next moment depends only on the current state of the environment, not on earlier states. The traditional MDP consists of four elements, given by the quadruple $(S, A, R, μ)$, where $S$ represents the set of environmental states; $A$ represents the set of actions that the agent can take; $R$ represents the reward function, which is the reward that the agent gets after taking a certain action in a certain state; $μ$ represents the policy, which maps states to actions.

Q-learning is a typical reinforcement learning algorithm that records the learned experience in the form of a Q-value table. The algorithm uses Formula (6) to update the Q value [28]. When the update converges, the optimal action strategy can be obtained.

$Q^{k + 1} (S, A) = Q^{k} (S, A) + α [R^{k} + γ \max_{A^{'}} Q^{k} (S^{'}, A^{'}) - Q^{k} (S, A)],$

(6)

where $α$ is the learning rate; $γ$ is the discount factor; $Q^{k} (S, A)$ is the Q value for state $S$ and action $A$ at iteration $k$; $S^{'}$ is the next state, and $A^{'}$ ranges over the actions available in it.
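A minimal tabular sketch of the standard Q-learning update may make the roles of $α$ and $γ$ concrete. Every term inside the bracket uses the current estimate $Q^{k}$. The state and action labels, the dictionary-based table, and the numbers are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha, gamma):
    """Apply one Q-learning update in place and return the new Q(s, a)."""
    # Greedy value of the next state under the current table.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)          # unseen (state, action) pairs start at 0
actions = ["a0", "a1"]
first = q_update(q_table, "s0", "a0", reward=1.0, next_state="s1",
                 actions=actions, alpha=0.5, gamma=0.9)
second = q_update(q_table, "s1", "a1", reward=2.0, next_state="s0",
                  actions=actions, alpha=0.5, gamma=0.9)
print(first, second)
```

The second update already benefits from the first: the value learned for `("s0", "a0")` enters the target through the `max` over next-state actions.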

To address the difficulty that Q-learning has with high-dimensional state and action sets, the DQN method adopts a neural network model $Q (S, A | ω)$ to predict the value of the Q function directly, replacing the Q table. The current target Q value is calculated according to Formula (6), and the mean square error between this target and the Q value predicted by the neural network is then used to update the parameter $ω$ of the neural network.

The DDQN algorithm makes two improvements on the basis of DQN. Firstly, two networks are introduced: the target network $Q^{'} (S, A | ω^{'})$, which calculates the target Q value, and the update network $Q (S, A | ω)$, which updates the Q value. This reduces the dependency between the target Q value and the update network parameters. The target network has exactly the same structure as the update network and periodically synchronizes its parameters from it.

Secondly, the action selection and the value calculation for the target Q value in Equation (6) are decoupled so as to reduce the overestimation caused by the greedy maximization. When calculating the target Q value, the maximum Q value is not taken directly from the target network. Instead, the action corresponding to the maximum Q value is first found in the update network, and this action is then used to calculate the Q value on the target network.
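The difference between the two target calculations is small enough to show side by side. In the sketch below the networks are replaced by plain lists of next-state Q values, one entry per action; the function names and numbers are illustrative assumptions.

```python
def argmax(values):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward, gamma, q_target_next):
    """DQN: the target network both selects and evaluates the action."""
    return reward + gamma * max(q_target_next)


def ddqn_target(reward, gamma, q_online_next, q_target_next):
    """DDQN: the update network selects, the target network evaluates."""
    best_action = argmax(q_online_next)
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q values for three actions.
q_online_next = [1.0, 2.0, 0.5]   # update network Q(S', . | w)
q_target_next = [3.0, 1.5, 0.2]   # target network Q'(S', . | w')
y_dqn = dqn_target(1.0, 0.9, q_target_next)
y_ddqn = ddqn_target(1.0, 0.9, q_online_next, q_target_next)
print(y_dqn, y_ddqn)
```

Here the target network's largest entry belongs to an action the update network does not prefer, so the DDQN target is lower than the DQN target. That is the mechanism by which the overestimation is reduced.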

4.2. Selection of Objective Function

In this section, the objective is to minimize the integral time squared error (ITSE) indices of the system frequency and of the DC FLC power regulation under a high-power disturbance. The objective function is established as follows:

$\min F (x) = ω_{1} \int_{0}^{t_{s}} t {(Δ f)}^{2} d t + ω_{2} \int_{0}^{t_{s}} t {(Δ P_{FLC})}^{2} d t,$

(7)

where $x = \{K_{P}, K_{I}\}$ is the set of controller parameters of the DC FLC; $ω_{1}$ and $ω_{2}$ are the weights of the two ITSE indicators, respectively; $t_{s}$ is the simulation duration; $Δ P_{FLC}$ is the regulation power of the DC FLC.
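For sampled simulation output, the integrals in Equation (7) can be approximated with the trapezoidal rule. The sketch below is one such approximation; the function names, the sample grid, and the constant test signals are assumptions used only to make the result easy to verify by hand.

```python
def itse(times, signal):
    """Trapezoidal estimate of the integral of t * signal(t)^2."""
    total = 0.0
    for k in range(1, len(times)):
        left = times[k - 1] * signal[k - 1] ** 2
        right = times[k] * signal[k] ** 2
        total += 0.5 * (left + right) * (times[k] - times[k - 1])
    return total


def objective(times, delta_f, delta_p_flc, w1, w2):
    """Equation (7): weighted sum of the two ITSE indices."""
    return w1 * itse(times, delta_f) + w2 * itse(times, delta_p_flc)


# Constant signals over 0..2 s: the integrand t * c^2 is linear in t,
# so the trapezoidal rule is exact here.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
delta_f = [1.0] * 5
delta_p_flc = [2.0] * 5
f_value = objective(times, delta_f, delta_p_flc, w1=0.5, w2=0.5)
print(itse(times, delta_f), itse(times, delta_p_flc), f_value)
```

The time weighting means that an error of the same size costs more the later it occurs, which penalizes slow settling more heavily than the initial transient.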

4.3. Constraints

The proportional and integral gains of the DC FLC control link must satisfy the following inequality constraints:

$\begin{array}{l} K_{P \min} < K_{P 0} < K_{P \max} \\ K_{I \min} < K_{I 0} < K_{I \max} \end{array},$

(8)

where $K_{P 0}$ and $K_{I 0}$ are the initial values of the proportional gain and the integral gain, respectively, and their sizes need to be selected according to the standardization of the system; $K_{P \min}$ and $K_{P \max}$ are the minimum and maximum values of the proportional gain, respectively; $K_{I \min}$ and $K_{I \max}$ are the minimum and maximum values of the integral gain, respectively.

Due to the limited rated capacity and transmission capacity of the HVDC transmission system, the DC FLC power regulation should stay within its lower and upper limits:

$Δ P_{FLC \min} < Δ P_{FLC} < Δ P_{FLC \max},$

(9)

where $Δ P_{FLC \min}$ and $Δ P_{FLC \max}$ are the lower and upper limits of the DC FLC power regulation, respectively.

5. Solution Process Based on Entropy Method

5.1. Entropy Method

The entropy method is a weighting method based on the amount of information contained in each index; it is objective and avoids excessive interference from human factors [29]. If the values of an index differ little across samples, that is, the probability distribution is more uniform and the information entropy is larger, the index is assigned a lower weight. Conversely, if the values of an index differ greatly, its weight is higher. Let the decision matrix $D$ be:

$D = \begin{bmatrix} z_{11} & z_{12} \\ ⋮ & ⋮ \\ z_{m 1} & z_{m 2} \end{bmatrix},$

(10)

where $z_{i j}$ is the value of the $j$th ITSE of the $i$th simulation result; $m$ is the number of simulations.

The steps to determine the weight of indicators by using the entropy method are as follows:

1. The simulation results corresponding to the $j$th ITSE are normalized:

$p_{i j} = \frac{z_{i j}}{\sum_{i = 1}^{m} z_{i j}},$

(11)

2. The entropy of the $j$th ITSE is calculated as follows:

$e_{j} = - \frac{1}{\ln m} \sum_{i = 1}^{m} p_{i j} \ln p_{i j},$

(12)

3. The deviation degree of the $j$th ITSE is calculated as follows:

$d_{j} = 1 - e_{j},$

(13)

4. The deviation degree of the $j$th ITSE is normalized, and the index weight is as follows:

$ω_{j} = \frac{d_{j}}{\sum_{k = 1}^{2} d_{k}},$

(14)
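The four steps above can be sketched in a few lines. The function below follows Equations (11) to (14) for a decision matrix given as a list of rows. It assumes strictly positive entries and at least two rows, and the example matrix is hypothetical.

```python
import math


def entropy_weights(decision):
    """Entropy-method weights for the columns of a decision matrix."""
    m = len(decision)
    n = len(decision[0])
    deviations = []
    for j in range(n):
        column = [row[j] for row in decision]
        total = sum(column)
        p = [z / total for z in column]                       # Equation (11)
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(m)  # (12)
        deviations.append(1.0 - e)                            # Equation (13)
    total_dev = sum(deviations)
    if total_dev == 0:
        raise ValueError("all indices are uniform; weights are undefined")
    return [d / total_dev for d in deviations]                # Equation (14)


# Hypothetical results of three simulations: the first ITSE index never
# changes, the second varies strongly.
decision = [[1.0, 1.0],
            [1.0, 3.0],
            [1.0, 8.0]]
weights = entropy_weights(decision)
print(weights)
```

The constant first column has maximal entropy and therefore receives essentially zero weight, while the varying second column receives nearly all of it, which is the behavior described in Section 5.1.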

5.2. Solution Process

When using the DDQN algorithm, it is necessary to transform the control problem into a process of reinforcement learning. The frequency response model established in Section 3 acts as the environment of the reinforcement learning problem, and the FLC controller acts as the agent. The goal of reinforcement learning is to obtain an optimized control strategy through the continuous interaction between the agent and the environment. The Markov decision process corresponding to this optimization problem is as follows:

The state variable is $S = (Δ f, Δ P_{FLC}, Δ P_{E}, Δ P_{W}, Δ P_{G})$. These variables are the dynamic results given in real time by the simulation environment;
The action variable is $A = (K_{P}, K_{I})$, which contains the decision variables of the FLC;
The construction of the reward function needs to combine the objective function (7) and the constraints (8) and (9). The reward function from the current state $S_{t}$ to the next state $S_{t + 1}$ is defined as:

$R (S_{t}, S_{t + 1}) = - [ω_{1} \int_{t}^{t + 1} t {(Δ f)}^{2} d t + ω_{2} \int_{t}^{t + 1} t {(Δ P_{FLC})}^{2} d t + β (t)],$

(15)

where the first two terms are the value of the objective function for this time period, and $β (t)$ is the penalty term, which is set to reflect the constraints. The goal of the optimization process is to maximize the reward, which is equivalent to minimizing the objective function;
The policy function $μ$ is the mapping from the state variables of the combined wind storage system to the parameters of the DC FLC, which is realized by the deep neural network required in the DDQN algorithm.
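A per-step reward in the spirit of Equation (15) could be sketched as follows. The text does not give the form of $β (t)$, so the constant penalty applied when $Δ P_{FLC}$ leaves its limits is purely an assumption, as are the single-interval trapezoidal rule, the function name, and the numbers.

```python
def step_reward(t0, t1, df0, df1, dp0, dp1, w1, w2,
                dp_min, dp_max, penalty=10.0):
    """Negative weighted ITSE over [t0, t1] minus a constraint penalty."""
    dt = t1 - t0
    # One trapezoid per integral, with the integrand t * x(t)^2.
    itse_f = 0.5 * (t0 * df0 ** 2 + t1 * df1 ** 2) * dt
    itse_p = 0.5 * (t0 * dp0 ** 2 + t1 * dp1 ** 2) * dt
    # Assumed penalty: a constant whenever constraint (9) is violated.
    beta = 0.0 if dp_min < dp1 < dp_max else penalty
    return -(w1 * itse_f + w2 * itse_p + beta)


# Hypothetical step from t = 1 s to t = 2 s.
r_ok = step_reward(1.0, 2.0, 0.1, 0.1, 0.05, 0.05,
                   w1=0.5, w2=0.5, dp_min=-0.2, dp_max=0.2)
r_violation = step_reward(1.0, 2.0, 0.1, 0.1, 0.05, 0.3,
                          w1=0.5, w2=0.5, dp_min=-0.2, dp_max=0.2)
print(r_ok, r_violation)
```

Because the reward is the negative of the cost, an agent that maximizes the accumulated reward is minimizing the objective, and the penalty steers it away from gain choices that drive the FLC output past its limits.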

6. Analysis of Simulation Results

6.1. Simulation Model

In this study, a wind farm (WF), a BESS, and voltage source converter (VSC) stations are added to the four-machine two-area model on the MATLAB/Simulink simulation platform, and an active power control link is added to the converter. The topology of the system is shown in Figure 11. The neural network was trained on a computer with an AMD Ryzen 7 5800H CPU and a 16 G integrated graphics card.

In Figure 11, $G_{1}$ and $G_{2}$ are steam turbines with a rated power of 900 MW; $G_{3}$ and $G_{4}$ are hydro turbines with a rated power of 900 MW; the rated power of the wind farm is 300 MW; the rated power of the DC transmission is 1000 MW; the rated capacity of the BESS is 350 Ah.

6.2. Model Training

The training samples are input into the neural network successively, and the reward obtained by the agent in each 50 s simulation cycle is recorded. The reward curve obtained is shown in Figure 12. At the beginning of training, the reward increases as the agent searches for actions. As training progresses, the reward obtained by the agent converges, and the performance of the model steadily improves. Figure 13 shows the curve of the objective function during training. The objective function also converges as training progresses, reaching a final value of 80.07.

6.3. Comparative Analysis

The results of different optimization methods are shown in Table 2. In contrast to PI controller parameters chosen by experience, the PI controller parameters obtained by DDQN are $K_{P} = 0.2$ and $K_{I} = 0.3$, and the objective function value is 80.07. The objective function values of the DQN algorithm and the particle swarm optimization (PSO) algorithm are both larger than 80.07, which supports the effectiveness of the proposed DDQN optimization strategy. The PSO algorithm, as a typical heuristic algorithm, enables the whole swarm to obtain feasible solutions in the solution space through information sharing among individuals, but its results are random. Although the DQN algorithm calculates the Q value through a deep neural network, the target Q value depends strongly on the parameters of the network being updated, which leads to an overestimation problem.

A load disturbance of 0.08 p.u. is applied to the system after 10 s, and the wind penetration level of the system is 10%. The frequency response characteristics of the different optimization methods are shown in Figure 14. With DDQN optimization, the lowest frequency is $f_{Nadir}$ = 49.27 Hz, the inertia response time is $t$ = 14.37 s, and the steady-state frequency is $f_{Steady}$ = 49.88 Hz. With DQN optimization, $f_{Nadir}$ = 49.21 Hz, $t$ = 15 s, and $f_{Steady}$ = 49.79 Hz. With PSO, $f_{Nadir}$ = 49.21 Hz, $t$ = 14.6 s, and $f_{Steady}$ = 49.89 Hz. Without optimization, $f_{Nadir}$ = 48.99 Hz, $t$ = 15.41 s, $f_{Steady}$ = 49.88 Hz, and the overshoot is B = 0.16%. The optimization strategy proposed in this paper can significantly improve the inertia response of the system. Compared with the non-optimized strategy, the lowest frequency increases by 0.28 Hz, which reduces the undershoot and the oscillatory transients, and the response time decreases by 1.04 s.
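The quoted improvements follow directly from the values reported for Figure 14. As a quick check, the short sketch below recomputes them; the dictionary layout and the helper name `improvement_over_baseline` are illustrative choices, while the numbers are taken from the text above.

```python
# Values reported in the text for Figure 14:
# (lowest frequency in Hz, inertia response time in s, steady-state frequency in Hz)
results = {
    "DDQN": (49.27, 14.37, 49.88),
    "DQN": (49.21, 15.00, 49.79),
    "PSO": (49.21, 14.60, 49.89),
    "Non-optimized": (48.99, 15.41, 49.88),
}

def improvement_over_baseline(method, baseline="Non-optimized"):
    """Return (nadir increase in Hz, response-time reduction in s) relative to the baseline."""
    f_m, t_m, _ = results[method]
    f_b, t_b, _ = results[baseline]
    return round(f_m - f_b, 2), round(t_b - t_m, 2)

ddqn_gain = improvement_over_baseline("DDQN")
dqn_gain = improvement_over_baseline("DQN")
pso_gain = improvement_over_baseline("PSO")
```

For DDQN this reproduces the 0.28 Hz and 1.04 s figures stated in the text; the same helper gives the corresponding gains for DQN and PSO.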

To investigate the effect of wind power penetration, the wind penetration level of the system is gradually increased from an initial 10% to 40%. A load disturbance of 0.08 p.u. is applied to the system after 10 s. The control strategy of the system adopts the DDQN algorithm.

The frequency response characteristics for the different wind power penetration levels are shown in Figure 15. At a wind power penetration level of 10%, the lowest frequency is $f_{Nadir}$ = 49.27 Hz and the inertia response time is $t$ = 14.37 s. At 20%, $f_{Nadir}$ = 49.24 Hz and $t$ = 14.71 s. At 30%, $f_{Nadir}$ = 49.22 Hz and $t$ = 14.72 s. At 40%, $f_{Nadir}$ = 49.2 Hz and $t$ = 14.73 s. As wind power penetration increases, the inertia level of the system is reduced. Hence, the lowest frequency decreases and the inertia response time becomes longer.
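The trend can be checked directly from the four reported operating points. The sketch below is only a consistency check on the numbers quoted from Figure 15; the variable and function names are illustrative.

```python
# Values reported in the text for Figure 15.
penetration = [10, 20, 30, 40]          # wind power penetration level, %
f_nadir = [49.27, 49.24, 49.22, 49.20]  # lowest frequency, Hz
t_resp = [14.37, 14.71, 14.72, 14.73]   # inertia response time, s

def strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(b < a for a, b in zip(values, values[1:]))

def strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(b > a for a, b in zip(values, values[1:]))

nadir_falls = strictly_decreasing(f_nadir)
time_grows = strictly_increasing(t_resp)

# Drop in the lowest frequency for each 10-point step in penetration, in Hz.
nadir_drop_per_step = [round(a - b, 2) for a, b in zip(f_nadir, f_nadir[1:])]
```

Between 10% and 40% penetration the lowest frequency falls by 0.07 Hz in total, while most of the increase in response time occurs in the first step from 10% to 20%.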

7. Conclusions

To address the frequency stability problem caused by the increasing proportion of wind power, this paper studied the frequency regulation strategy of a combined wind storage system with DC feed-in. In addition, an optimization strategy that takes into account both the frequency deviation and the frequency regulation power of the DC FLC was put forward. The frequency regulation characteristics of hydro power and thermal power were considered in the simulation model, which gives the results a certain reference value for engineering practice. Because the DDQN algorithm does not need an explicit mathematical relationship between the control variable and the objective function, the Q function obtained by training can find the optimal control variable for the objective function directly from the data produced by the simulation model. In simulations of a load disturbance, the proposed optimization strategy gave the best frequency response characteristic among the optimization methods compared. Moreover, compared with the PI parameters chosen by experience, the PI parameters obtained by the DDQN algorithm yield a smaller objective function value. However, this study did not consider the change in the state of charge (SOC) of the energy storage. The control strategy for energy storage will be optimized in a further study.

Author Contributions

Conceptualization, X.L. and P.Z.; methodology, X.L. and S.G.; software, P.Z. and Y.W.; validation, X.L., P.Z. and J.Y.; formal analysis, X.L.; investigation, X.L.; resources, P.Z. and S.G.; data curation, X.L. and J.Y.; writing—original draft preparation, P.Z.; writing—review and editing, J.W.; visualization, W.H.; supervision, Y.W.; project administration, Z.Z.; funding acquisition, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the National Natural Science Foundation of China (62101362, 52307127), the Project of State Key Laboratory of Power System Operation and Control (SKLD23KZ07), the Fundamental Research Funds for the Central Universities (YJ202141, YJ202316), and the Key Scientific and Technological Program of Southwest Electric Power Design Institute (KC0029).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Xiaojiang Liu, Peng Zou and Jin You were employed by the company Southwest Electric Power Design Institute Co., Ltd. of China Power Engineering Consulting Group. Author Wei Hao was employed by the company SPIC Yunnan International Power Investment Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Droop control principle.

Figure 2. Principle of integrated inertia control.

Figure 3. DC frequency limit controller model.

Figure 4. Fuzzy logic control diagram.

Figure 5. Fuzzy logic controller structure.

Figure 6. Frequency deviation membership function.

Figure 7. Frequency change rate membership function.

Figure 8. ESS power variation membership function.

Figure 9. Fuzzy logic inference results.

Figure 10. SFR model.

Figure 11. Topology of the system.

Figure 12. Reward value during training.

Figure 13. The value of the objective function during training.

Figure 14. Frequency response characteristics of different optimization methods.

Figure 15. Frequency response characteristics of different wind power penetration levels.

Table 1. Fuzzy logic reasoning table.

Entries give the BESS power; rows are the rate of frequency change and columns are the frequency deviation.
Rate of Frequency Change	NL	NM	NS	Z	PS	PM	PL
NL	PL	PL	PL	PL	PS	NS	NM
NM	PL	PL	PL	PM	Z	NS	NM
NS	PL	PL	PM	PS	NS	NM	NM
Z	PL	PM	PS	Z	NS	NM	NL
PS	PM	PM	PS	NS	NM	NL	NL
PM	PM	PS	Z	NM	NL	NL	NL
PL	PM	PS	NS	NL	NL	NL	NL
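The rule base in Table 1 is a two-input lookup, which can be sketched as a small data structure. This is an illustration only: the names `LABELS`, `RULES`, and `bess_power_label` are hypothetical, and the sketch covers just the rule lookup, not the fuzzification and defuzzification steps defined by the membership functions in Figures 6 to 8.

```python
# Linguistic labels in the column order of Table 1 (frequency deviation).
LABELS = ["NL", "NM", "NS", "Z", "PS", "PM", "PL"]

# Rows are keyed by the rate of frequency change; each list follows the LABELS order.
RULES = {
    "NL": ["PL", "PL", "PL", "PL", "PS", "NS", "NM"],
    "NM": ["PL", "PL", "PL", "PM", "Z", "NS", "NM"],
    "NS": ["PL", "PL", "PM", "PS", "NS", "NM", "NM"],
    "Z":  ["PL", "PM", "PS", "Z", "NS", "NM", "NL"],
    "PS": ["PM", "PM", "PS", "NS", "NM", "NL", "NL"],
    "PM": ["PM", "PS", "Z", "NM", "NL", "NL", "NL"],
    "PL": ["PM", "PS", "NS", "NL", "NL", "NL", "NL"],
}

def bess_power_label(rate_label, deviation_label):
    """Look up the BESS power label for a (rate of change, deviation) pair."""
    return RULES[rate_label][LABELS.index(deviation_label)]

no_action = bess_power_label("Z", "Z")       # no deviation, no change
max_support = bess_power_label("NL", "NL")   # large, fast frequency drop
max_absorb = bess_power_label("PL", "PL")    # large, fast frequency rise
```

Reading the table this way shows its structure: the output moves from PL toward NL as either input moves from negative to positive, so the BESS power opposes the frequency excursion.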

Table 2. Comparison results of different optimization methods.

Optimization Method	Controller Parameters	Objective Function Value
Non-optimization	$K_{P} = 0.1$, $K_{L} = 0.2$	103.45
PSO	$K_{P} = 0.21$, $K_{L} = 0.29$	90.12
DQN	$K_{P} = 0.19$, $K_{L} = 0.28$	82.23
DDQN	$K_{P} = 0.2$, $K_{L} = 0.3$	80.07
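To put the objective function values of Table 2 on a common scale, the relative reduction with respect to the non-optimized case can be computed. The sketch below uses only the values in Table 2; the helper name `reduction_pct` is illustrative, and a smaller objective function value is taken to be better, as stated in the conclusions.

```python
# Objective function values from Table 2.
objective = {
    "Non-optimization": 103.45,
    "PSO": 90.12,
    "DQN": 82.23,
    "DDQN": 80.07,
}

def reduction_pct(method, baseline="Non-optimization"):
    """Percentage reduction of the objective function value relative to the baseline."""
    base = objective[baseline]
    return 100.0 * (base - objective[method]) / base

best_method = min(objective, key=objective.get)
reductions = {m: round(reduction_pct(m), 2) for m in ("PSO", "DQN", "DDQN")}
```

On these numbers DDQN lowers the objective function value by roughly 22.6% relative to the non-optimized parameters, compared with roughly 20.5% for DQN and 12.9% for PSO.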

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
