Dual-Layer Distributed Optimal Operation Method for Island Microgrid Based on Adaptive Consensus Control and Two-Stage MATD3 Algorithm
Abstract
1. Introduction
- A dual-layer distributed optimal operation framework is established for island microgrids, built upon an operating environment model of the island microgrid. The lower layer is a distributed control layer that uses consensus control to unify the operating states of the distributed MTs. The upper layer is the optimal scheduling layer, which maximizes the economic benefit of the island microgrid using a two-stage MATD3 algorithm.
- A novel adaptive consensus control method is proposed for the lower layer. It allocates the output power of the distributed MTs in proportion to their capacities while ensuring that their total output power tracks a reference signal provided by the upper layer (a minimal numerical sketch of this idea follows this list). The method also guarantees the plug-and-play capability of the distributed MTs, i.e., these functions are maintained even when MTs are connected to or disconnected from the system.
- A two-stage MATD3 algorithm is proposed for the upper layer, which maximizes the operational economy of the island microgrid by adjusting the reference signals of the distributed MTs and the energy storage. The method adds an imitation-learning pre-training stage that improves training effectiveness, mitigates the sensitivity of MATD3 to its hyperparameters, and thereby reduces the effort of parameter tuning.
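To make the lower-layer mechanism concrete, the sketch below (Python, written for this summary rather than taken from the paper) runs a discrete-time consensus protocol on the MTs' capacity-utilization ratios, with a pinning term on one MT that steers the total output toward the upper-layer reference. The MT capacities (160, 120, 100 kW) come from the simulation parameters; the line communication topology, the gains, and the choice of pinned leader are illustrative assumptions.

```python
import numpy as np

# Capacities of the three distributed MTs (kW), from the simulation parameters
cap = np.array([160.0, 120.0, 100.0])

# Assumed communication graph: line topology MT1 - MT2 - MT3 (adjacency matrix)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
deg = A.sum(axis=1)

P_ref = 300.0        # total power reference from the upper layer (assumed value)
u = np.zeros(3)      # consensus variable: utilization ratio u_i = P_i / cap_i
eps, k_pin = 0.05, 0.002   # consensus step size and pinning gain (assumed)

for _ in range(5000):
    cons = A @ u - deg * u         # each MT moves toward its neighbors' ratios
    err = P_ref - np.sum(u * cap)  # total-power tracking error
    u[0] += k_pin * err            # pinning term on the assumed leader MT1
    u += eps * cons

P = u * cap
print(np.round(P, 1), round(P.sum(), 1))
# -> roughly [126.3  94.7  78.9] 300.0
```

At convergence all utilization ratios are equal, so each MT supplies a share of P_ref proportional to its capacity, which is the capacity-proportional allocation described above.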
2. Island Microgrid Model
- Distributed MTs
- Energy storage
- Island microgrid busbar
3. Dual-Layer Distributed Optimal Operation Method for Island Microgrids
3.1. Lower Layer Control Method
3.1.1. Fundamental Theory of Multi-Agent Consensus Control
3.1.2. Lower Layer Adaptive Consensus Control for Island Microgrids
3.1.3. Plug-and-Play Improvements for Adaptive Consensus Control
3.1.4. Stability Analysis of Adaptive Consensus Control in the Lower Layer of the Island Microgrid
3.2. Upper Layer Optimal Scheduling Method for the Island Microgrid Based on Two-Stage MATD3 Algorithm
3.2.1. Markov Model for Optimal Scheduling of the Island Microgrid
- State space
- Action space (an assumed Gym-style encoding of the state and action bounds is sketched after this list)
- Reward function
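As an illustration of how these spaces can be encoded, the sketch below writes the state and action bounds from the variables table later in this excerpt as Gym-style Box spaces. The use of the gymnasium library and the ordering of the state components are assumptions for this summary, not the authors' implementation.

```python
import numpy as np
from gymnasium import spaces  # assumed library choice

# MT agent state: total MT power, PV, WT, load, purchase price, sale price
mt_state = spaces.Box(
    low=np.array([0, 0, 0, 0, 0, 0], dtype=np.float32),
    high=np.array([380, 200, 300, 500, 2, 2], dtype=np.float32))
# MT agent action: total MT power reference P_MT,ref
mt_action = spaces.Box(low=0.0, high=380.0, shape=(1,), dtype=np.float32)

# ES agent state: hour, ES power, SOC, grid exchange, tidal power, wave power
es_state = spaces.Box(
    low=np.array([0, -100, 0, -1000, 0, 0], dtype=np.float32),
    high=np.array([24, 100, 1, 1000, 200, 200], dtype=np.float32))
# ES agent action: energy storage power reference P_ES,ref
es_action = spaces.Box(low=-100.0, high=100.0, shape=(1,), dtype=np.float32)
```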
3.2.2. Twin Delayed Deep Deterministic Policy Gradient Algorithm
3.2.3. MATD3 Method for Optimal Scheduling of the Island Microgrid
Algorithm 1: MATD3-Based Optimal Scheduling Method for Island Microgrids
1 Initialize the actor and critic network parameters of the MT agent and the ES agent, and the experience replay buffer D
2 for episode = 1 to M do
3  Initialize random process N for action exploration
4  for t = 1 to T do
5   The MT agent and the ES agent observe their respective states $s_t^{MT}$ and $s_t^{ES}$ from their own environments
6   Choose the power actions $a_t^{MT}$ and $a_t^{ES}$ of distributed MTs and energy storage, respectively
7   The island microgrid operates according to actions $a_t^{MT}$ and $a_t^{ES}$, and it gets the real island microgrid environmental reward $r_t$ via (50)
8   The MT agent and the ES agent observe the new states $s_{t+1}^{MT}$ and $s_{t+1}^{ES}$ from the island microgrid environment, respectively
9   Combine $(s_t, a_t, r_t, s_{t+1})$ for each agent and store it in D
10   $s_t \leftarrow s_{t+1}$
11   for MT agent and ES agent do
12    Sample a random mini-batch of size H from D
13    Update the parameters of the twin critic networks by minimizing their respective losses
14    Update the actor parameters every two critic updates by maximizing the policy objective
15   end
16   Update target network parameters for MT agent and ES agent via (53)–(55)
17  end
18 end
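For readers who want to connect lines 13, 14, and 16 of Algorithm 1 to code, the following PyTorch sketch shows one MATD3 update for a single agent (MT or ES) with centralized twin critics. The `agent` container, the `replace_own_action` helper, and the hyperparameter values are assumptions made for illustration; only the clipped double-Q target, the delayed actor update, and the soft target update are the TD3 ingredients the algorithm names.

```python
import torch
import torch.nn.functional as F

def matd3_update(agent, batch, step, gamma=0.99, tau=0.005,
                 policy_noise=0.2, noise_clip=0.5):
    """One MATD3 update for a single agent (MT or ES). A sketch under assumed
    interfaces: `agent` bundles actor/critic networks, their targets and
    optimizers; `batch` holds centralized tensors sampled from buffer D."""
    s, a_all, r, s_next, a_next_others = batch

    with torch.no_grad():
        # Target policy smoothing: add clipped noise to the target action
        a_next = agent.actor_target(s_next)
        noise = (torch.randn_like(a_next) * policy_noise).clamp(-noise_clip,
                                                                noise_clip)
        a_next = (a_next + noise).clamp(agent.a_low, agent.a_high)
        a_next_all = torch.cat([a_next, a_next_others], dim=-1)
        # Clipped double-Q target: minimum of the twin target critics
        q_next = torch.min(agent.critic1_target(s_next, a_next_all),
                           agent.critic2_target(s_next, a_next_all))
        q_target = r + gamma * q_next

    # Line 13 of Algorithm 1: update both critics by minimizing their TD losses
    critic_loss = (F.mse_loss(agent.critic1(s, a_all), q_target) +
                   F.mse_loss(agent.critic2(s, a_all), q_target))
    agent.critic_opt.zero_grad(); critic_loss.backward(); agent.critic_opt.step()

    # Line 14: delayed actor update, once every two critic updates
    if step % 2 == 0:
        a_all_pi = agent.replace_own_action(a_all, agent.actor(s))  # hypothetical helper
        actor_loss = -agent.critic1(s, a_all_pi).mean()
        agent.actor_opt.zero_grad(); actor_loss.backward(); agent.actor_opt.step()

        # Line 16: soft (Polyak) update of all target networks
        for net, tgt in ((agent.actor, agent.actor_target),
                         (agent.critic1, agent.critic1_target),
                         (agent.critic2, agent.critic2_target)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.data.mul_(1.0 - tau).add_(tau * p.data)
```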
3.2.4. Two-Stage Deep Reinforcement Learning Agent Training Method
- Stage 1: Imitation learning pre-training stage (a minimal behavior-cloning sketch follows this list)
- Stage 2: Reinforcement learning training stage
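A minimal sketch of the two-stage idea, under the assumption that Stage 1 is behavior cloning on expert scheduling trajectories (e.g., produced offline by a conventional optimizer); the function and data names are hypothetical.

```python
import torch.nn.functional as F

def pretrain_actor(actor, optimizer, expert_loader, epochs=50):
    """Stage 1 sketch: behavior cloning of the actor on expert (state, action)
    pairs, e.g. schedules produced offline by a conventional optimizer. The
    data source, loss, and epoch count are assumptions for illustration."""
    for _ in range(epochs):
        for s, a_expert in expert_loader:
            loss = F.mse_loss(actor(s), a_expert)   # imitate the expert action
            optimizer.zero_grad(); loss.backward(); optimizer.step()

# Stage 2 then runs the MATD3 loop of Algorithm 1, starting from the
# pre-trained actor weights instead of a randomly initialized policy.
```

Starting Stage 2 from imitated weights narrows early exploration, consistent with the reduced parameter sensitivity examined in Section 4.2.3.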
4. Numerical Simulation Analysis
4.1. Simulation Analysis of Lower Layer Adaptive Consensus Control
4.1.1. Assessment of Control Performance when the Reference Signal Changes
4.1.2. Plug-and-Play Performance Assessment of Distributed MTs
4.2. Simulation Analysis of Upper Layer Optimization Scheduling
4.2.1. Analysis of Simulation Results of Upper Layer Optimal Scheduling
4.2.2. Assessment of Plug-and-Play Control Performance of MTs during Optimal Scheduling
4.2.3. Performance Assessment of the Two-Stage Deep Reinforcement Learning Agent Training Method
4.2.4. Comparative Analysis of Algorithms
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Groppi, D.; Pfeifer, A.; Garcia, D.A.; Krajačić, G.; Duić, N. A review on energy storage and demand side management solutions in smart energy islands. Renew. Sustain. Energy Rev. 2021, 135, 110183.
2. Wu, Y.; Hu, M.; Liao, M.; Liu, F.; Xu, C. Risk assessment of renewable energy-based island microgrid using the HFLTS-cloud model method. J. Clean. Prod. 2021, 284, 125362.
3. Mimica, M.; De Urtasun, L.G.; Krajačić, G. A robust risk assessment method for energy planning scenarios on smart islands under the demand uncertainty. Energy 2022, 240, 122769.
4. Zhao, B.; Chen, J.; Zhang, L.; Zhang, X.; Qin, R.; Lin, X. Three representative island microgrids in the East China Sea: Key technologies and experiences. Renew. Sustain. Energy Rev. 2018, 96, 262–274.
5. Liu, S.; Wang, X.; Liu, P.X. Impact of communication delays on secondary frequency control in an islanded microgrid. IEEE Trans. Ind. Electron. 2015, 62, 2021–2031.
6. Mahmood, H.; Michaelson, D.; Jiang, J. Reactive power sharing in islanded microgrids using adaptive voltage droop control. IEEE Trans. Smart Grid 2015, 6, 3052–3060.
7. Espina, E.; Llanos, J.; Burgos-Mellado, C.; Cardenas-Dobson, R.; Martinez-Gomez, M.; Saez, D. Distributed control strategies for microgrids: An overview. IEEE Access 2020, 8, 193412–193448.
8. Yue, D.; He, Z.; Dou, C. Cloud-Edge Collaboration Based Distribution Network Reconfiguration for Voltage Preventive Control. IEEE Trans. Ind. Inf. 2023.
9. Nguyen, T.L.; Guillo-Sansano, E.; Syed, M.H.; Nguyen, V.H.; Blair, S.M.; Reguera, L.; Tran, Q.T.; Caire, R.; Burt, G.M.; Gavriluta, C.; et al. Multi-agent system with plug and play feature for distributed secondary control in microgrid—Controller and power hardware-in-the-loop implementation. Energies 2018, 11, 3253.
10. Hosseinzadeh, M.; Schenato, L.; Garone, E. A distributed optimal power management system for microgrids with plug&play capabilities. Adv. Control Appl. Eng. Ind. Syst. 2021, 3, e65.
11. Lai, J.; Lu, X.; Yu, X.; Monti, A. Cluster-oriented distributed cooperative control for multiple AC microgrids. IEEE Trans. Ind. Inf. 2019, 15, 5906–5918.
12. Lu, X.; Lai, J.; Yu, X. A novel secondary power management strategy for multiple AC microgrids with cluster-oriented two-layer cooperative framework. IEEE Trans. Ind. Inf. 2020, 17, 1483–1495.
13. Gao, S.; Xiang, C.; Yu, M.; Tan, K.T.; Lee, T.H. Online optimal power scheduling of a microgrid via imitation learning. IEEE Trans. Smart Grid 2021, 13, 861–876.
14. Shi, W.; Li, N.; Chu, C.C.; Gadh, R. Real-time energy management in microgrids. IEEE Trans. Smart Grid 2015, 8, 228–238.
15. Paul, T.G.; Hossain, S.J.; Ghosh, S.; Mandal, P.; Kamalasadan, S. A quadratic programming based optimal power and battery dispatch for grid-connected microgrid. IEEE Trans. Ind. Appl. 2017, 54, 1793–1805.
16. Tabar, V.S.; Jirdehi, M.A.; Hemmati, R. Energy management in microgrid based on the multi objective stochastic programming incorporating portable renewable energy resource as demand response option. Energy 2017, 118, 827–839.
17. Abdolrasol, M.G.; Hannan, M.A.; Mohamed, A.; Amiruldin, U.A.U.; Abidin, I.B.Z.; Uddin, M.N. An optimal scheduling controller for virtual power plant and microgrid integration using the binary backtracking search algorithm. IEEE Trans. Ind. Appl. 2018, 54, 2834–2844.
18. Yousif, M.; Ai, Q.; Gao, Y.; Wattoo, W.A.; Jiang, Z.; Hao, R. An optimal dispatch strategy for distributed microgrids using PSO. CSEE J. Power Energy Syst. 2019, 6, 724–734.
19. Raghav, L.P.; Kumar, R.S.; Raju, D.K.; Singh, A.R. Optimal energy management of microgrids using quantum teaching learning based algorithm. IEEE Trans. Smart Grid 2021, 12, 4834–4842.
20. Fan, L.; Zhang, J.; He, Y.; Liu, Y.; Hu, T.; Zhang, H. Optimal scheduling of microgrid based on deep deterministic policy gradient and transfer learning. Energies 2021, 14, 584.
21. Liu, J.F.; Chen, J.L.; Wang, X.S.; Zheng, J.; Huang, Q.Y. Energy Management and Optimization of Multi-energy Grid Based on Deep Reinforcement Learning. Power Syst. Technol. 2020, 44, 3794–3803.
22. Zhao, J.; Li, F.; Mukherjee, S.; Sticht, C. Deep reinforcement learning-based model-free on-line dynamic multi-microgrid formation to enhance resilience. IEEE Trans. Smart Grid 2022, 13, 2557–2567.
23. Li, T.; Yang, D.; Xie, X.; Zhang, H. Event-triggered control of nonlinear discrete-time system with unknown dynamics based on HDP(λ). IEEE Trans. Cybern. 2021, 52, 6046–6058.
24. Yu, L.; Xie, W.; Xie, D.; Zou, Y.; Zhang, D.; Sun, Z.; Zhang, L.; Zhang, Y.; Jiang, T. Deep reinforcement learning for smart home energy management. IEEE Internet Things J. 2019, 7, 2751–2762.
25. Ji, Y.; Wang, J.H. Online optimal scheduling of a microgrid based on deep reinforcement learning. Control Decis. 2022, 37, 1675–1684.
26. De Azevedo, R.; Cintuglu, M.H.; Ma, T.; Mohammed, O.A. Multiagent-based optimal microgrid control using fully distributed diffusion strategy. IEEE Trans. Smart Grid 2017, 8, 1997–2008.
27. Zhang, J.; Pu, T.; Li, Y.; Wang, X.; Zhou, X. Multi-agent deep reinforcement learning based optimal dispatch of distributed generators. Power Syst. Technol. 2022, 46, 3496–3504.
28. Olfati-Saber, R.; Fax, J.A.; Murray, R.M. Consensus and cooperation in networked multi-agent systems. Proc. IEEE 2007, 95, 215–233.
29. Woo, J.; Yu, C.; Kim, N. Deep reinforcement learning-based controller for path following of an unmanned surface vehicle. Ocean Eng. 2019, 183, 155–166.
30. Zhang, F.; Li, J.; Li, Z. A TD3-based multi-agent deep reinforcement learning method in mixed cooperation-competition environment. Neurocomputing 2020, 411, 206–215.
31. Fujimoto, S.; Hoof, H.; Meger, D. Addressing function approximation error in actor-critic methods. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10 July 2018.
32. Du, Y.; Wu, D. Deep reinforcement learning from demonstrations to assist service restoration in islanded microgrids. IEEE Trans. Sustain. Energy 2022, 13, 1062–1072.
33. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Hassabis, D. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
34. Meng, L.; Gorbet, R.; Kulić, D. The effect of multi-step methods on overestimation in deep reinforcement learning. In Proceedings of the International Conference on Pattern Recognition, Milan, Italy, 10 January 2021.
35. Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10 July 2018.
36. Jiang, T.; Tang, S.; Li, X.; Zhang, R.; Chen, H.; Li, G. Resilience Boosting Strategy for Island Microgrid Clusters against Typhoons. Proc. CSEE 2022, 42, 6625–6641.
37. Zhao, P.; Wu, J.; Wang, Y.; Zhang, H. Operation optimization strategy of microgrid based on deep reinforcement learning. Electr. Power Autom. Equip. 2022, 42, 9–16.
38. Chen, T.; Bu, S.; Liu, X.; Kang, J.; Yu, F.R.; Han, Z. Peer-to-peer energy trading and energy conversion in interconnected multi-energy microgrids using multi-agent deep reinforcement learning. IEEE Trans. Smart Grid 2021, 13, 715–727.
39. Liu, Y.; Qie, T.; Yu, Y.; Wang, Y.; Chau, T.K.; Zhang, X.; Manandhar, U.; Li, S.; Lu, H.H.; Fernando, T. A Novel Integral Reinforcement Learning-Based Control Method Assisted by Twin Delayed Deep Deterministic Policy Gradient for Solid Oxide Fuel Cell in DC Microgrid. IEEE Trans. Sustain. Energy 2022, 14, 688–703.
40. Wu, T.; Wang, J.; Lu, X.; Du, Y. AC/DC hybrid distribution network reconfiguration with microgrid formation using multi-agent soft actor-critic. Appl. Energy 2022, 307, 118189.
41. Arwa, E.O.; Folly, K.A. Reinforcement learning techniques for optimal power control in grid-connected microgrids: A comprehensive review. IEEE Access 2020, 8, 208992–209007.
42. Samende, C.; Cao, J.; Fan, Z. Multi-agent deep deterministic policy gradient algorithm for peer-to-peer energy trading considering distribution network constraints. Appl. Energy 2022, 317, 119123.
43. Li, Y.; Wang, R.; Yang, Z. Optimal scheduling of isolated microgrids using automated reinforcement learning-based multi-period forecasting. IEEE Trans. Sustain. Energy 2021, 13, 159–169.
Ref. No | Method | Advantages | Disadvantages |
---|---|---|---|
[5] | Centralized control | Guarantees power distribution by capacity | Failure of the central controller disrupts the whole system |
[6] | Adaptive droop strategy | Compensates for the voltage drop of the feeder to improve reactive power sharing accuracy | Failure of the central controller disrupts the whole system |
[8] | Cloud-edge collaboration | Alleviates the computational pressure caused by excessively centralized computation tasks | Requires the construction of a cloud-based service platform, which is expensive to build |
[9] | Multi-agent system based multi-layer architecture | Achieves the plug-and-play function of distributed MTs with guaranteed frequency recovery | Does not consider the operational economics of island microgrids |
[10] | Distributed control using two sensors | Achieves the plug-and-play function of distributed MTs with guaranteed optimal power flow | Does not consider the operational economics of island microgrids |
[11] | Dual-layer consensus control | Achieves capacity-based allocation of MT output power within and between microgrids | Does not consider the operational economics of island microgrids |
[12] | Equal micro-increment dual-layer consensus control | Reduces the operational cost of distributed power sources | Does not consider the operating costs of other devices in the microgrid |
[22] | Improved DDQN algorithm | Greatly improves the learning ability of the algorithm | The action space is discrete, which limits computational accuracy |
[23] | Model-actor-critic reinforcement learning combined with event-triggered control | Reduces the computational and communication requirements of the control process | The calculation process is comparatively complicated |
[24] | DDPG algorithm | Provides effective scheduling strategies for household microgrids | Overlooks the overestimation problem that can occur as the DDPG algorithm is updated iteratively |
[25] | Improved SAC algorithm | Provides effective scheduling strategies for microgrids | The training process can adversely affect the learning efficiency of the SAC algorithm |

Time/h | Electricity Purchase Price [CNY/(kW·h)] | Electricity Sales Price [CNY/(kW·h)] |
---|---|---|
1–6, 22–24 | 0.37 | 0.28 |
7–9, 14–17, 20, 21 | 0.82 | 0.65 |
10–13, 18, 19 | 1.36 | 0.78 |
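
A direct encoding of this time-of-use tariff as a lookup function (Python, written for illustration):

```python
def tou_prices(hour: int) -> tuple[float, float]:
    """Return (purchase, sale) electricity prices in CNY/(kW·h) for a given
    hour of day (1-24), following the time-of-use table above."""
    if hour in (10, 11, 12, 13, 18, 19):           # peak hours
        return 1.36, 0.78
    if hour in (7, 8, 9, 14, 15, 16, 17, 20, 21):  # shoulder hours
        return 0.82, 0.65
    return 0.37, 0.28                              # valley: 1-6 and 22-24

assert tou_prices(12) == (1.36, 0.78)
```
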
Main Parameters | Values | Main Parameters | Values |
---|---|---|---|
Pnom | 200 kW | R2up | 60 kW |
P1min | 0 kW | R3up | 50 kW |
P2min | 0 kW | Pchmax | 100 kW |
P3min | 0 kW | Pdismax | 100 kW |
P1max | 160 kW | Ses | 1500 kW·h |
P2max | 120 kW | SOC(0) | 0.5 |
P3max | 100 kW | α | 0.0013 |
R1down | 80 kW | β | 0.553 |
R2down | 60 kW | c | 14.17 |
R3down | 50 kW | | 0.5 |
R1up | 80 kW | | 10 |
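
If α, β, and c in this table are the coefficients of a conventional quadratic MT fuel-cost model C(P) = αP² + βP + c, which is an assumption here rather than something stated in this excerpt, the hourly generation cost could be computed as:

```python
def mt_fuel_cost(p_mt: float, alpha: float = 0.0013,
                 beta: float = 0.553, c: float = 14.17) -> float:
    """Hourly MT generation cost in CNY, assuming the conventional quadratic
    cost model C(P) = alpha*P^2 + beta*P + c with the tabulated coefficients.
    Whether the paper uses exactly this form is an assumption."""
    return alpha * p_mt ** 2 + beta * p_mt + c

print(mt_fuel_cost(200.0))  # 0.0013*200**2 + 0.553*200 + 14.17 = 176.77
```
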
Variables | Agent | Space | Lower Boundary | Upper Boundary |
---|---|---|---|---|
PMT,sum | MT | State | 0 kW | 380 kW |
PPV | MT | State | 0 kW | 200 kW |
PWT | MT | State | 0 kW | 300 kW |
PL | MT | State | 0 kW | 500 kW |
σb | MT | State | 0 CNY | 2 CNY |
σs | MT | State | 0 CNY | 2 CNY |
t | ES | State | 0:00 | 24:00 |
PES | ES | State | −100 kW | 100 kW |
SOC | ES | State | 0 | 1 |
Pgrid | ES | State | −1000 kW | 1000 kW |
PTidal | ES | State | 0 kW | 200 kW |
PWave | ES | State | 0 kW | 200 kW |
PMT,ref | MT | Action | 0 kW | 380 kW |
PES,ref | ES | Action | −100 kW | 100 kW |
Schemes | Actor Network Learning Rate | Critic Network Learning Rate | Standard Deviation of Noise |
---|---|---|---|
1 | 1 × 10⁻² | 1 × 10⁻² | 4 |
2 | 1 × 10⁻² | 5 × 10⁻³ | 4 |
3 | 5 × 10⁻³ | 5 × 10⁻³ | 4 |
4 | 5 × 10⁻³ | 8 × 10⁻³ | 4 |
5 | 1 × 10⁻² | 1 × 10⁻² | 2 |
6 | 1 × 10⁻² | 1 × 10⁻² | 3 |
7 | 1 × 10⁻² | 1 × 10⁻² | 5 |
Schemes | Two-Stage Training | Single-Stage Training |
---|---|---|
1 | 779.66 | 189.06 |
2 | 821.93 | −28,863.41 |
3 | 516.11 | −2556.68 |
4 | 703.45 | 344.69 |
5 | 552.89 | −943.76 |
6 | 448.92 | −2163.31 |
7 | 391.78 | −30,037.94 |