A Multi-Agent Deep-Reinforcement-Learning-Based Strategy for Safe Distributed Energy Resource Scheduling in Energy Hubs
Abstract
1. Introduction
- (1) The DP-MADDPG algorithm is adopted for distributed management of the EH cluster system. Each agent independently controls its local system and adjusts its local policy based on real-time observations and reward signals, enhancing the robustness and reliability of the scheduling decisions. Furthermore, through collaboration among multiple agents, the method addresses complex scheduling problems and improves the system's energy utilization efficiency.
- (2) Data privacy concerns are effectively addressed by the presented method. The method dynamically introduces noise interference and uses an energy storage system (ESS) to attenuate the noise, preserving external transaction data while perturbing internal network data. Additionally, an evaluation mechanism for EH privacy protection is established to mitigate the impact of data correlation on evaluation results, enabling the agents to generate noise data that satisfy the constraints within a reasonable range.
2. Integrated Energy System Structure and Equipment Model
2.1. Integrated Energy System Structure
2.2. Model of Devices
- (1) Combined heat and power (CHP): CHP is an efficient energy utilization system that generates electricity and heat simultaneously by burning natural gas, achieving integrated utilization of energy. The model can be described as follows:
- (2) Electric boiler (EB): The EB converts electrical energy into thermal energy and meets the heat network's load requirements when the CHP is not operational.
- (3) Power to gas (P2G): P2G is an energy conversion technology that converts surplus electricity into natural gas, allowing energy to be stored in the natural gas grid for peak-demand periods or periods when renewable generation falls short. This helps mitigate the intermittency and fluctuation of renewable energy sources. The model is as follows:
- (4) Energy storage model: The energy storage system balances load and supply in each network. The output model can be described as follows:
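The device equations themselves are not reproduced in this extract. As a hedged sketch, the standard energy-hub conversion relations these models typically take (gas-to-power/heat for the CHP, power-to-heat for the EB, power-to-gas for the P2G, and a state-of-charge update for storage) can be written as follows; all efficiency values here are placeholders, not the paper's parameters:

```python
# Hedged sketch of standard energy-hub device relations.
# Efficiency values are illustrative defaults, not the paper's data.

def chp_output(gas_in, eta_ge=0.5, eta_gh=0.4):
    """CHP burns gas and produces electricity and heat simultaneously."""
    return eta_ge * gas_in, eta_gh * gas_in

def eb_output(power_in, eta_ph=0.9):
    """Electric boiler converts electrical energy into thermal energy."""
    return eta_ph * power_in

def p2g_output(power_in, eta_pg=0.6):
    """P2G converts surplus electricity into natural gas."""
    return eta_pg * power_in

def ess_update(soc, p_charge, p_discharge, dt=1.0,
               eta_ch=0.95, eta_dis=0.95, sigma=0.01):
    """One-step state-of-charge update with self-discharge rate sigma."""
    return (1 - sigma) * soc + (eta_ch * p_charge - p_discharge / eta_dis) * dt
```

The same pattern applies per network (electricity, heat, gas), with the appropriate efficiency for each conversion path.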
2.3. Constraints
- (1) Energy Balance Constraint: The power balance constraint of the entire integrated energy system is expressed as
- (2) Equipment Operating Constraints: The P2G, CHP, and EB devices must satisfy power limits and ramp constraints during operation as follows:
- (3) Energy Storage Device Constraints: The capacity and ramp constraints to be satisfied by the energy storage devices on the different networks are expressed as the following equations:
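The three constraint families above can be illustrated as simple feasibility predicates; the function names and the balance tolerance are illustrative, not from the paper:

```python
def within_power_limits(p, p_min, p_max):
    """A device's output must stay between its lower and upper power limits."""
    return p_min <= p <= p_max

def within_ramp_limit(p_now, p_prev, ramp_max):
    """Power change between consecutive time slots must not exceed the ramp limit."""
    return abs(p_now - p_prev) <= ramp_max

def energy_balanced(supply, demand, tol=1e-6):
    """Supply (generation, discharging, imports) must equal demand
    (load, charging, conversion inputs) on each network at every slot."""
    return abs(sum(supply) - sum(demand)) <= tol
```

In a scheduling loop, an action that violates any predicate would be clipped or penalized through the reward function.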
2.4. Carbon Trading Cost Model
- (1) CHP Carbon Trading Cost: CHP units are one of the main carbon emission sources in the energy system. Assuming that both the total carbon emission intensity and the quota are proportional to the actual output, the carbon-related cost can be calculated as follows:
- (2) P2G Carbon Trading Cost: The P2G unit can capture CO2 from power plants or biogas. As shown in Equation (21), the P2G conversion process can be divided into two steps, electrolytic hydrogen production and methanation, where the volume of CO2 consumed in this process is equal to the volume of CH4 produced.
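Under the stated proportionality assumption, the carbon trading terms can be sketched as below; the function and parameter names are illustrative, and the CO2 volumetric mass used in the test is a textbook value, not the paper's:

```python
def carbon_trading_cost(energy_out, intensity, quota, price):
    """Carbon cost = price * (actual emissions - free allowance), with both
    emissions and allowance assumed proportional to the energy generated.
    A negative result means surplus allowances can be sold."""
    return price * (intensity * energy_out - quota * energy_out)

def p2g_carbon_revenue(gas_volume, co2_mass_per_m3, price):
    """P2G consumes a volume of CO2 equal to the volume of CH4 produced,
    so the captured CO2 offsets emissions and earns carbon revenue."""
    return price * co2_mass_per_m3 * gas_volume
```

The net carbon term in the objective is then the CHP cost minus the P2G revenue for each EH.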
2.5. Objective Function
3. A Real-Time Optimal Energy Scheduling Method for EH Based on Distributed Deep Reinforcement Learning
3.1. MADDPG Algorithm
3.2. Parameter Space
- (1) State space: At time slot t, the state space of an EH cluster primarily encompasses the renewable energy generation (including wind power and photovoltaic generation) within each agent's region, the loads of the three energy networks, the gas consumption of the CHP units, the electricity consumption of the EB and P2G devices, the electricity price, the gas price, and the charging and discharging actions of the energy storage systems. It can be defined as follows:
- (2) Action space: The action space variables mainly comprise the controllable energy conversion devices and energy storage devices, which can be indicated as follows:
- (3) Reward function: The reward of agent i, given state s_{i,t} and action a_{i,t}, can be described as
- (4) Algorithm chart: The optimal energy scheduling process for the EH cluster based on MADDPG is shown in Algorithm 1.
Algorithm 1: Distributed Energy Management by MADDPG.
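Algorithm 1 itself is not reproduced in this extract. The following is a minimal structural sketch of the centralized-training, decentralized-execution loop that MADDPG follows, with placeholder policies and update stubs standing in for the actor-critic networks; all class and function names are illustrative:

```python
import random

class ReplayBuffer:
    """Shared experience buffer storing joint transitions of all agents."""
    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.data = []

    def push(self, transition):
        if len(self.data) >= self.capacity:
            self.data.pop(0)
        self.data.append(transition)

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))

class Agent:
    """Decentralized execution: each agent acts on its local observation only.
    In full MADDPG, its critic would additionally see all agents' observations
    and actions during training; here both networks are placeholders."""
    def __init__(self, index):
        self.index = index

    def act(self, local_obs, noise=0.0):
        # placeholder deterministic policy plus exploration noise,
        # clipped to a normalized action range
        return max(-1.0, min(1.0, 0.1 * local_obs + noise))

    def update(self, batch):
        pass  # actor and critic gradient steps would go here

def train(agents, env_step, episodes=2, steps=24):
    """Outer loop: interact with the environment, store joint transitions,
    and let every agent update from sampled minibatches."""
    buffer = ReplayBuffer()
    for _ in range(episodes):
        obs = [0.0] * len(agents)
        for _ in range(steps):
            acts = [a.act(o, noise=random.uniform(-0.1, 0.1))
                    for a, o in zip(agents, obs)]
            next_obs, rewards = env_step(obs, acts)
            buffer.push((obs, acts, rewards, next_obs))
            for a in agents:
                a.update(buffer.sample(32))
            obs = next_obs
    return buffer
```

The 24 steps per episode mirror an hourly day-ahead horizon; the environment step function would encapsulate the device models and constraint penalties.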
3.3. EH Privacy Protection Based on Differential Privacy
Algorithm Chart
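The privacy mechanism's details are elided in this extract. As a hedged illustration, below is a minimal sketch of the Laplace mechanism that ε-differential privacy is typically built on, together with the idea of letting the ESS absorb the injected noise so that external transaction data remain exact; the function names and the correction scheme are assumptions for illustration:

```python
import random

def laplace_noise(scale):
    """Sample Laplace(0, b): the difference of two iid Exp(1) variables
    scaled by b follows a Laplace distribution."""
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def perturb_internal_data(internal_powers, epsilon, sensitivity=1.0):
    """epsilon-DP perturbation of internal dispatch data: a smaller epsilon
    means stronger privacy and larger noise (scale = sensitivity / epsilon)."""
    scale = sensitivity / epsilon
    noise = [laplace_noise(scale) for _ in internal_powers]
    # The ESS is assumed to absorb the net injected noise, so the power
    # exchanged with the external grid is unchanged by the perturbation.
    ess_correction = -sum(noise)
    perturbed = [p + n for p, n in zip(internal_powers, noise)]
    return perturbed, ess_correction
```

Because the correction exactly cancels the summed noise, the externally visible exchange stays accurate while each internal device trajectory is masked.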
4. Case Studies
4.1. Analysis of Optimized Schedule Results
4.2. Optimization Results Analysis
4.3. Privacy Protection Results Analysis
4.4. Sensitivity Studies
4.4.1. Sensitivities on the Level of Renewable Energy Sources
4.4.2. Sensitivities on the Number of Agents
4.4.3. Sensitivities on Privacy Protection Levels
4.5. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Full name |
|---|---|
| DDPG | Deep deterministic policy gradient |
| DRL | Deep reinforcement learning |
| DP | Differential privacy |
| EH | Energy hub |
| ESS | Energy storage system |
| HE | Homomorphic encryption |
| IES | Integrated energy system |
| MADDPG | Multi-agent deep deterministic policy gradient |
| MADRL | Multi-agent deep reinforcement learning |
| MDP | Markov decision process |
| MG | Microgrid |
| MILP | Mixed-integer linear programming |
| RL | Reinforcement learning |
Indices and Sets
- Index/set of CHP units from 1 to N.
- Index/set of EH nodes from 1 to M.
- Index/set of time slots from 1 to T.
- Set of the states for agent i from time 1 to t.
- Set of the actions for agent i from time 1 to t.
Parameters
- The self-discharge efficiency of the electricity, heat, and gas networks.
- The charge efficiency of the electricity, heat, and gas networks.
- The discharge efficiency of the electricity, heat, and gas networks.
- The gas-to-electricity conversion efficiency.
- The gas-to-heat conversion efficiency.
- The power-to-heat conversion efficiency.
- The power-to-gas conversion efficiency.
- The price of carbon trading.
- The O&M costs for CHP.
- The O&M costs for EB.
- The O&M costs for ES.
- The O&M costs for P2G.
- The price of electricity and gas at time t.
- The small/large positive value as a reward weight.
- The carbon emission quota associated with the unit energy generated.
- The carbon emission intensity associated with the unit energy generated.
- The lower/upper limits of the ESS's power.
- The lower/upper bounds of gas consumption.
- The maximum ramping power of the nth CHP.
- The maximum ramping power of the P2G.
- The lower/upper limits of the EB's power.
- The maximum ramping power of the EB.
- The lower/upper limits of the P2G's power.
- The lower/upper limits of charging/discharging power.
Variables
- The stored energy of the ESS on each network at time t.
- The heat output of the nth CHP in node i at time t.
- The thermal output of the EB in node i.
- The gas consumption of the nth CHP in node i at time t.
- The gas output of the P2G in node i.
- The variation of from slot t to t + .
- The exchanged power with the external gas network in node i at time t.
- The values of the electrical, thermal, and gas loads at time t.
- The power output of the nth CHP in node i at time t.
- The electric power input of the EB in node i.
- The variation of from slot t to t + .
- The electric power input of the P2G in node i.
- The variation of from slot t to t + .
- The power generation of the photovoltaic unit in node i at time t.
- The power generation of the wind power unit in node i at time t.
- The charging/discharging power in node i at time t.
- The exchanged power with the main grid in node i at time t.
References
- Wang, Y.; Hu, J.; Liu, N. Energy Management in Integrated Energy System Using Energy–Carbon Integrated Pricing Method. IEEE Trans. Sustain. Energy 2023, 14, 1992–2005. [Google Scholar] [CrossRef]
- Liu, N.; Tan, L.; Sun, H.; Zhou, Z.; Guo, B. Bilevel Heat–Electricity Energy Sharing for Integrated Energy Systems With Energy Hubs and Prosumers. IEEE Trans. Ind. Inform. 2022, 18, 3754–3765. [Google Scholar] [CrossRef]
- Yan, C.; Bie, Z.; Liu, S.; Urgun, D.; Singh, C.; Xie, L. A reliability model for integrated energy system considering multi-energy correlation. J. Mod. Power Syst. Clean Energy 2021, 9, 811–825. [Google Scholar] [CrossRef]
- Zhang, Z.; Wang, C.; Yang, M.; Chen, X.; Lv, H. Day-ahead Optimal Dispatch for Integrated Energy System Considering Power-to-gas and Dynamic Pipeline Networks. IEEE Trans. Ind. Appl. 2021, 57, 3317–3328. [Google Scholar] [CrossRef]
- Alabdulwahab, A.; Abusorrah, A.; Zhang, X.; Shahidehpour, M. Coordination of interdependent natural gas and electricity infrastructures for firming the variability of wind energy in stochastic day-ahead scheduling. IEEE Trans. Sustain. Energy 2015, 6, 606–615. [Google Scholar] [CrossRef]
- Quelhas, A.; Gil, E.; McCalley, J.D.; Ryan, S.M. A multiperiod generalized network flow model of the US integrated energy system: Part I—Model description. IEEE Trans. Power Syst. 2007, 22, 829–836. [Google Scholar] [CrossRef]
- Rinaldi, S.M.; Peerenboom, J.P.; Kelly, T.K. Identifying, understanding, and analyzing critical infrastructure interdependencies. IEEE Control Syst. Mag. 2001, 21, 11–25. [Google Scholar]
- Shahidehpour, M.; Fu, Y.; Wiedman, T. Impact of natural gas infrastructure on electric power systems. Proc. IEEE 2005, 93, 1042–1056. [Google Scholar] [CrossRef]
- Du, X.; Wu, Z.; Zou, L.; Tang, Y.; Fang, C.; Wang, C. Optimal Configuration of Integrated Energy Systems Based on Mixed Integer Linear Programming. In Proceedings of the 2021 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Dalian, China, 14–16 April 2021; Volume 2, pp. 242–246. [Google Scholar]
- Laraki, M.H.; Brahmi, B.; El-Bayeh, C.Z.; Rahman, M.H. Energy management system for a Stand-alone Wind/Diesel/BESS/Fuel-cell Using Dynamic Programming. In Proceedings of the 2021 18th International Multi-Conference on Systems, Signals & Devices (SSD), Monastir, Tunisia, 22–25 March 2021; pp. 1258–1263. [Google Scholar]
- Zheng, J.; Wu, Q.; Jing, Z. Coordinated scheduling strategy to optimize conflicting benefits for daily operation of integrated electricity and gas networks. Appl. Energy 2017, 192, 370–381. [Google Scholar] [CrossRef]
- Lei, Y.; Xu, J.; Tong, N.; Shen, M. An Economically Optimized Planning of Regional Integrated Energy System Considering Renewable Energy and Energy Storage. In Proceedings of the 2022 IEEE PES 14th Asia-Pacific Power and Energy Engineering, Melbourne, Australia, 22–23 November 2022; pp. 1–6. [Google Scholar]
- Li, C.; Yang, H.; Shahidehpour, M.; Xu, Z.; Zhou, B.; Cao, Y.; Zeng, L. Optimal Planning of Islanded Integrated Energy System With Solar-Biogas Energy Supply. IEEE Trans. Sustain. Energy 2020, 11, 2437–2448. [Google Scholar] [CrossRef]
- Liu, X.; Wu, J.; Jenkins, N.; Bagdanavicius, A. Combined analysis of electricity and heat networks. Appl. Energy 2016, 162, 1238–1250. [Google Scholar] [CrossRef]
- Shi, M.; Wang, H.; Xie, P.; Lyu, C.; Jian, L.; Jia, Y. Distributed Energy Scheduling for Integrated Energy System Clusters With Peer-to-Peer Energy Transaction. IEEE Trans. Smart Grid 2023, 14, 142–156. [Google Scholar] [CrossRef]
- Zhang, Y.; Mou, Z.; Gao, F.; Jiang, J.; Ding, R.; Han, Z. UAV-enabled secure communications by multi-agent deep reinforcement learning. IEEE Trans. Veh. Technol. 2020, 69, 11599–11611. [Google Scholar] [CrossRef]
- Tang, F.; Kawamoto, Y.; Kato, N.; Liu, J. Future intelligent and secure vehicular network toward 6G: Machine-learning approaches. Proc. IEEE 2019, 108, 292–307. [Google Scholar] [CrossRef]
- Kha, Q.H.; Ho, Q.T.; Le, N.Q.K. Identifying SNARE proteins using an alignment-free method based on multiscan convolutional neural network and PSSM profiles. J. Chem. Inf. Model. 2022, 62, 4820–4826. [Google Scholar] [CrossRef]
- Le, N.Q.K. Potential of deep representative learning features to interpret the sequence information in proteomics. Proteomics 2022, 22, 2100232. [Google Scholar] [CrossRef]
- Li, S.; Wu, Y.; Cui, X.; Dong, H.; Fang, F.; Russell, S. Robust multi-agent reinforcement learning via minimax deep deterministic policy gradient. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 4213–4220. [Google Scholar]
- Lei, W.; Wen, H.; Wu, J.; Hou, W. MADDPG-based security situational awareness for smart grid with intelligent edge. Appl. Sci. 2021, 11, 3101. [Google Scholar] [CrossRef]
- Qiu, D.; Dong, Z.; Zhang, X.; Wang, Y.; Strbac, G. Safe reinforcement learning for real-time automatic control in a smart energy-hub. Appl. Energy 2022, 309, 118403. [Google Scholar] [CrossRef]
- Zhang, T.; Sun, M.; Qiu, D.; Zhang, X.; Strbac, G.; Kang, C. A Bayesian Deep Reinforcement Learning-based Resilient Control for Multi-Energy Micro-gird. IEEE Trans. Power Syst. 2023, 38, 5057–5072. [Google Scholar] [CrossRef]
- Xu, Z.; Han, G.; Liu, L.; Martínez-García, M.; Wang, Z. Multi-Energy Scheduling of an Industrial Integrated Energy System by Reinforcement Learning-Based Differential Evolution. IEEE Trans. Green Commun. Netw. 2021, 5, 1077–1090. [Google Scholar] [CrossRef]
- Park, L.; Lee, C.; Kim, J.; Mohaisen, A.; Cho, S. Two-stage IoT device scheduling with dynamic programming for energy Internet systems. IEEE Internet Things J. 2019, 6, 8782–8791. [Google Scholar] [CrossRef]
- Zhou, Y.; Jia, L.; Zhao, Y.; Zhan, Z. Optimal dispatch of an integrated energy system based on deep reinforcement learning considering new energy uncertainty. Energy Rep. 2023, 4, 804–809. [Google Scholar]
- Wang, Y.; Yang, Z.; Dong, L.; Huang, S.; Zhou, W. Energy Management of Integrated Energy System Based on Stackelberg Game and Deep Reinforcement Learning. In Proceedings of the 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), Wuhan, China, 30 October–1 November 2020; Volume 2, pp. 2645–2651. [Google Scholar]
- Wang, X.; Chen, S.; Yan, D.; Wei, J.; Yang, Z. Multi-agent deep reinforcement learning–based approach for optimization in microgrid clusters with renewable energy. In Proceedings of the 2021 International Conference on Power System Technology (POWERCON), Haikou, China, 8–9 December 2021; Volume 4, pp. 413–419. [Google Scholar]
- Ren, H.; Gao, W. A MILP model for integrated plan and evaluation of distributed energy systems. Appl. Energy 2010, 87, 1001–1014. [Google Scholar] [CrossRef]
- Nebuloni, R.; Meraldi, L.; Bovo, C.; Ilea, V.; Berizzi, A.; Sinha, S.; Tamirisakandala, R.B.; Raboni, P. A hierarchical two-level MILP optimization model for the management of grid-connected BESS considering accurate physical model. Appl. Energy 2023, 334, 120697. [Google Scholar] [CrossRef]
- Chen, Y.; Han, W.; Zhu, Q.; Liu, Y.; Zhao, J. Target-driven obstacle avoidance algorithm based on DDPG for connected autonomous vehicles. EURASIP J. Adv. Signal Process. 2022, 2022, 61. [Google Scholar] [CrossRef]
- Fan, P.; Ke, S.; Yang, J.; Li, R.; Li, Y.; Yang, S.; Liang, J.; Fan, H.; Li, T. A load frequency coordinated control strategy for multimicrogrids with V2G based on improved MA-DDPG. Int. J. Electr. Power Energy Syst. 2023, 146, 108765. [Google Scholar] [CrossRef]
- Ao, T.; Zhang, K.; Shi, H.; Jin, Z.; Zhou, Y.; Liu, F. Energy-Efficient Multi-UAVs Cooperative Trajectory Optimization for Communication Coverage: An MADRL Approach. Remote Sens. 2023, 15, 429. [Google Scholar] [CrossRef]
- Wang, S.; Duan, J.; Shi, D.; Xu, C.; Li, H.; Diao, R.; Wang, Z. A data-driven multi-agent autonomous voltage control framework using deep reinforcement learning. IEEE Trans. Power Syst. 2020, 35, 4644–4654. [Google Scholar] [CrossRef]
- Peng, H.; Shen, X. Multi-agent reinforcement learning based resource management in MEC-and UAV-assisted vehicular networks. IEEE J. Sel. Areas Commun. 2020, 39, 131–141. [Google Scholar] [CrossRef]
- Lee, H.; Jeong, J. Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO systems. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; Volume 4, pp. 1–6. [Google Scholar]
- Kovári, B.; Lövétei, I.; Aradi, S.; Bécsi, T. Multi-Agent Deep Reinforcement Learning (MADRL) for Solving Real-Time Railway Rescheduling Problem. In Proceedings of the Fifth International Conference on Railway Technology, Montpellier, France, 22–25 August 2022; Volume 1, pp. 1–6. [Google Scholar]
- Zhang, C.; Ahmad, M.; Wang, Y. ADMM based privacy-preserving decentralized optimization. IEEE Trans. Inf. Forensics Secur. 2018, 14, 565–580. [Google Scholar] [CrossRef]
- Yuan, Z.P.; Li, P.; Li, Z.L.; Xia, J. A Fully Distributed Privacy-Preserving Energy Management System for Networked Microgrid Cluster Based on Homomorphic Encryption. IEEE Trans. Smart Grid 2023, 2, 1. [Google Scholar] [CrossRef]
- Nozari, E.; Tallapragada, P.; Cortés, J. Differentially private average consensus: Obstructions, trade-offs, and optimal algorithm design. Automatica 2017, 81, 221–231. [Google Scholar] [CrossRef]
- Cheng, Z.; Ye, F.; Cao, X.; Chow, M.Y. A Homomorphic Encryption-Based Private Collaborative Distributed Energy Management System. IEEE Trans. Smart Grid 2021, 12, 5233–5243. [Google Scholar] [CrossRef]
- Zhang, T.; Zhu, T.; Xiong, P.; Huo, H.; Tari, Z.; Zhou, W. Correlated Differential Privacy: Feature Selection in Machine Learning. IEEE Trans. Ind. Inform. 2020, 16, 2115–2124. [Google Scholar] [CrossRef]
- Zhu, T.; Ye, D.; Wang, W.; Zhou, W.; Yu, P.S. More Than Privacy: Applying Differential Privacy in Key Areas of Artificial Intelligence. IEEE Trans. Knowl. Data Eng. 2022, 34, 2824–2843. [Google Scholar] [CrossRef]
- Aziz, R.; Banerjee, S.; Bouzefrane, S.; Le Vinh, T. Exploring Homomorphic Encryption and Differential Privacy Techniques towards Secure Federated Learning Paradigm. Future Internet 2023, 15, 310. [Google Scholar] [CrossRef]
- He, T.; Wu, X.; Dong, H.; Guo, F.; Yu, W. Distributed Optimal Power Scheduling for Microgrid System via Deep Reinforcement Learning with Privacy Preserving. In Proceedings of the 2022 IEEE 17th International Conference on Control & Automation (ICCA), Naples, Italy, 27–30 June 2022; Volume 1, pp. 820–825. [Google Scholar]
- Wang, Z.; Wan, R.; Gui, X.; Zhou, G. Deep Reinforcement Learning of Cooperative Control with Four Robotic Agents by MADDPG. In Proceedings of the 2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC), Chongqing, China, 6–8 November 2020; Volume 5, pp. 287–290. [Google Scholar]
- Mao, H.; Zhang, Z.; Xiao, Z.; Gong, Z. Modelling the dynamic joint policy of teammates with attention multi-agent DDPG. arXiv 2018, arXiv:1811.07029. [Google Scholar]
- Li, Y.; Wang, B.; Yang, Z.; Li, J.; Chen, C. Hierarchical stochastic scheduling of multi-community integrated energy systems in uncertain environments via Stackelberg game. Appl. Energy 2022, 308, 118392. [Google Scholar] [CrossRef]
- Yang, M.; Tjuawinata, I.; Lam, K.Y. K-Means Clustering With Local dx-Privacy for Privacy-Preserving Data Analysis. IEEE Trans. Inf. Forensics Secur. 2022, 17, 2524–2537. [Google Scholar] [CrossRef]
- Yang, W.; Lam, K.Y. Automated cyber threat intelligence reports classification for early warning of cyber attacks in next generation SOC. In Proceedings of the International Conference on Information and Communications Security 2019, Beijing, China, 15–17 December 2019; Volume 2, pp. 145–164. [Google Scholar]
- Lakshmi, R.; Baskar, S. Efficient text document clustering with new similarity measures. Int. J. Bus. Intell. Data Min. 2021, 18, 49–72. [Google Scholar] [CrossRef]
- Wang, S.; Zhang, X.; Cheng, Y.; Jiang, F.; Yu, W.; Peng, J. A fast content-based spam filtering algorithm with fuzzy-SVM and K-means. In Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing (BigComp), Shanghai, China, 15–17 January 2018; Volume 14, pp. 301–307. [Google Scholar]
- Ghezelbash, R.; Maghsoudi, A.; Carranza, E.J.M. Optimization of geochemical anomaly detection using a novel genetic K-means clustering (GKMC) algorithm. Comput. Geosci. 2020, 134, 104335. [Google Scholar] [CrossRef]
- Pradana, M.G.; Ha, H.T. Maximizing strategy improvement in mall customer segmentation using k-means clustering. J. Appl. Data Sci. 2021, 2, 19–25. [Google Scholar] [CrossRef]
- Du, W.; Bi, J.; Wang, T.; Wang, H. Impact of grid connection of large-scale wind farms on power system small-signal angular stability. CSEE J. Power Energy Syst. 2015, 1, 83–89. [Google Scholar] [CrossRef]
- Munkhchuluun, E.; Meegahapola, L.; Vahidnia, A. Long-term voltage stability with large-scale solar-photovoltaic (PV) generation. Int. J. Electr. Power Energy Syst. 2020, 117, 105663. [Google Scholar] [CrossRef]
- Mondal, A.; Illindala, M.S. Improved frequency regulation in an islanded mixed source microgrid through coordinated operation of DERs and smart loads. IEEE Trans. Ind. Appl. 2017, 54, 112–120. [Google Scholar] [CrossRef]
- Li, Y.; Gao, W.; Ruan, Y. Feasibility of virtual power plants (VPPs) and its efficiency assessment through benefiting both the supply and demand sides in Chongming country, China. Sustain. Cities Soc. 2017, 35, 544–551. [Google Scholar] [CrossRef]
- Rouzbahani, H.M.; Karimipour, H.; Lei, L. A review on virtual power plant for energy management. Sustain. Energy Technol. Assess. 2021, 47, 101370. [Google Scholar] [CrossRef]
- Zamani, A.G.; Zakariazadeh, A.; Jadid, S.; Kazemi, A. Stochastic operational scheduling of distributed energy resources in a large scale virtual power plant. Int. J. Electr. Power Energy Syst. 2016, 82, 608–620. [Google Scholar] [CrossRef]
- Tan, C.; Tan, Z.; Wang, G.; Du, Y.; Pu, L.; Zhang, R. Business model of virtual power plant considering uncertainty and different levels of market maturity. J. Clean. Prod. 2022, 362, 131433. [Google Scholar] [CrossRef]
- Shabanzadeh, M.; Sheikh-El-Eslami, M.K.; Haghifam, M.R. A medium-term coalition-forming model of heterogeneous DERs for a commercial virtual power plant. Appl. Energy 2016, 169, 663–681. [Google Scholar] [CrossRef]
| Feature \ Algorithm | MILP | DDPG | MADDPG | HE-MADDPG | DP-MADDPG |
|---|---|---|---|---|---|
| Lax requirement for model accuracy | | ✓ | ✓ | ✓ | ✓ |
| Adaptability to multi-agents | | | ✓ | ✓ | ✓ |
| Advantage in convergence speed | | | ✓ | | ✓ |
| Privacy protection | | | | ✓ | ✓ |
| Device | Max/min power [kW] | Max ramping power [kW] | Initial power [kW] | Efficiency |
|---|---|---|---|---|
| CHP 1 | 150/0 | 60 | 45 | 0.5/0.4 |
| CHP 2 | 110/0 | 45 | 36 | 0.5/0.4 |
| EB | 100/0 | 15 | 45 | 0.9025 |
| P2G | 150/0 | 24 | 56 | 0.83 |

| ESS | Max/min limits [kWh] | Initial [kWh] | Self-discharge rate | Charge/discharge efficiency |
|---|---|---|---|---|
| ESS 1 | 1000/−1000 | 0 | 0.01 | 0.95/0.95 |
| ESS 2 | 800/−800 | 0 | 0.01 | 0.95/0.95 |
| Parameters | Critic | Actor |
|---|---|---|
| Learning rate | 0.0001 | 0.001 |
| Soft update coefficient | 0.01 | 0.01 |
| Number of neural network layers | 2 | 2 |
| Number of neurons per layer | 64 | 64 |
| Hidden layer activation function | ReLU | ReLU |
| Output layer activation function | / | Tanh |
| Number of episodes | 10,000 | 10,000 |
| Time steps per episode | 24 | 24 |
| Size of experience replay buffer | 100,000 | 100,000 |
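The hyperparameters above can be mirrored in a small configuration sketch with a tiny pure-Python forward pass showing how the listed activations are applied; the dictionary keys and the weight layout are illustrative, not the authors' implementation:

```python
import math

# Configuration mirroring the hyperparameter table; keys are illustrative.
CFG = {
    "critic": {"lr": 1e-4, "hidden": [64, 64], "out_act": None},
    "actor": {"lr": 1e-3, "hidden": [64, 64], "out_act": math.tanh},
    "tau": 0.01,              # soft update coefficient
    "episodes": 10_000,
    "steps_per_episode": 24,
    "replay_size": 100_000,
}

def relu(x):
    return max(0.0, x)

def mlp_forward(x, layers, out_act=None):
    """Forward pass of a fully connected net: ReLU on hidden layers and an
    optional activation (e.g., tanh for the actor) on the output layer.
    Each layer is a list of rows; each row is [weights..., bias]."""
    h = x
    for i, layer in enumerate(layers):
        h = [sum(w * v for w, v in zip(row[:-1], h)) + row[-1] for row in layer]
        if i < len(layers) - 1:
            h = [relu(v) for v in h]
        elif out_act is not None:
            h = [out_act(v) for v in h]
    return h
```

The critic's output layer has no activation (it predicts an unbounded Q-value), while the actor's tanh output maps actions into a bounded range suitable for normalized device setpoints.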
| Number of EHs | Computational Time [s] | Average Reward |
|---|---|---|
| 1 | 2916 | −1830.68 |
| 2 | 4752 | −2750.65 |
| 4 | 8424 | −5128.89 |
| 8 | 11,232 | −9288.73 |
| Algorithm | Discrepancy Rate [%] | Computational Time [s] | Accuracy |
|---|---|---|---|
| MADDPG | 0 | 1685 | 0.954 |
| | 31.2 | 6242 | 0.923 |
| DP-MADDPG | 63.4 | 6351 | 0.902 |
| | 90.8 | 6532 | 0.883 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, X.; Wang, Q.; Yu, J.; Sun, Q.; Hu, H.; Liu, X. A Multi-Agent Deep-Reinforcement-Learning-Based Strategy for Safe Distributed Energy Resource Scheduling in Energy Hubs. Electronics 2023, 12, 4763. https://doi.org/10.3390/electronics12234763