Reinforcement Learning Model-Based and Model-Free Paradigms for Optimal Control Problems in Power Systems: Comprehensive Review and Future Directions
Abstract
1. Introduction
1. We study the structure of the MDP formulation and the nature of the learned policy for different control problems in power systems in the model-based and model-free paradigms, focusing on discrete versus continuous state spaces and on stochastic versus deterministic policies.
2. We identify correlations between reinforcement learning algorithms and control problems in power systems, and examine whether these correlations change when following a model-based or a model-free paradigm.
3. We highlight current trends, such as the growth of the literature on the subject, the RL algorithms that are most prominently investigated, and the data standardization that reflects reproducibility in the recent literature.
4. We discuss future research directions, including the incorporation of multi-agent reinforcement learning frameworks, the study of safe reinforcement learning, and the investigation of RL applications in emerging power system technologies.
2. Technical Background on Reinforcement Learning
2.1. Markov Decision Processes
2.2. Model-Based and Model-Free Reinforcement Learning
2.3. Model Computation
2.3.1. Dynamic Programming
2.3.2. Model Predictive Control (MPC)
2.4. Policy Learning Basic Concepts
2.4.1. Value Function
2.4.2. Policy Iteration
2.4.3. Value Iteration
2.4.4. Policy Gradient
2.5. Multi-Agent Reinforcement Learning
2.6. Safe Reinforcement Learning
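As a concrete anchor for the concepts outlined in Sections 2.4.1 through 2.4.3, the following is a minimal tabular value-iteration sketch in Python; the two-state MDP, its transition probabilities, rewards, and discount factor are illustrative assumptions, not values taken from any surveyed work.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (illustrative values only).
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.95  # discount factor

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup: V(s) = max_a [R(s,a) + gamma * sum_s' P(s,a,s') V(s')]
    Q = R + gamma * (P @ V)        # shape: (2 states, 2 actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop once the backup is a fixed point
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged value function
print("V* =", V, "greedy policy =", policy)
```

Policy iteration (Section 2.4.2) alternates full policy evaluation with greedy improvement over the same Bellman backup, while policy-gradient methods (Section 2.4.4) parameterize the policy directly and ascend the expected return instead of maintaining a value table.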
3. Model-Based Paradigm
3.1. Energy Market Management
3.2. Power Grid Stability and Control
3.3. Building Energy Management
3.4. Electric Vehicles
3.5. Energy Storage Management
3.6. Prominent Trends
4. Model-Free Paradigms
4.1. Energy Market Management
4.2. Power Grid Stability and Control
4.3. Building Energy Management
4.4. Electric Vehicles
4.5. Energy Storage Management
4.6. Prominent Trends
5. Comparison and Discussion
6. Challenges and Future Research Work
6.1. Challenges
6.1.1. Limited Real-World Data and Standardized Tools
6.1.2. Limited Scalability, Generalization, and the Curse of Dimensionality
6.1.3. Limited Robustness and Safety for Real-Time Applications
6.1.4. Nonstationarity and Time-Variant Environments
6.1.5. Reward Shaping and Global Objectives in Power Systems
6.2. Future Work
6.2.1. Explainability
6.2.2. Neural Architecture Search
6.2.3. Physics-Informed Neural Networks
6.2.4. Public Datasets
6.2.5. Safe Reinforcement Learning and Coping with Changing Conditions and Unforeseen Events
6.2.6. RL Integration with Different AI Techniques
6.2.7. RL Applications in Emerging Power System Technologies
Edge AI
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
RL | Reinforcement learning
MDP | Markov decision process
EM | Energy market
GSAC | Grid stability and control
BEM | Building energy management
EV | Electric vehicle
ESS | Energy storage system
PV | Photovoltaic
MG | Microgrid
DR | Demand response
ET | Energy trading
AC | Actor–critic & variations
PG | Policy gradient & variations
QL | Q-learning & variations
References
- Schneider, N. Population growth, electricity demand and environmental sustainability in Nigeria: Insights from a vector auto-regressive approach. Int. J. Environ. Stud. 2022, 79, 149–176. [Google Scholar] [CrossRef]
- Begum, R.A.; Sohag, K.; Abdullah, S.M.S.; Jaafar, M. CO2 emissions, energy consumption, economic and population growth in Malaysia. Renew. Sustain. Energy Rev. 2015, 41, 594–601. [Google Scholar] [CrossRef]
- Rahman, M.M. Exploring the effects of economic growth, population density and international trade on energy consumption and environmental quality in India. Int. J. Energy Sect. Manag. 2020, 14, 1177–1203. [Google Scholar] [CrossRef]
- Comello, S.; Reichelstein, S.; Sahoo, A. The road ahead for solar PV power. Renew. Sustain. Energy Rev. 2018, 92, 744–756. [Google Scholar] [CrossRef]
- Fathima, A.H.; Palanisamy, K. Energy storage systems for energy management of renewables in distributed generation systems. Energy Manag. Distrib. Gener. Syst. 2016, 157. [Google Scholar] [CrossRef]
- Heldeweg, M.A.; Saintier, S. Renewable energy communities as ‘socio-legal institutions’: A normative frame for energy decentralization? Renew. Sustain. Energy Rev. 2020, 119, 109518. [Google Scholar] [CrossRef]
- Urishev, B. Decentralized Energy Systems, Based on Renewable Energy Sources. Appl. Sol. Energy 2019, 55, 207–212. [Google Scholar] [CrossRef]
- Yaqoot, M.; Diwan, P.; Kandpal, T.C. Review of barriers to the dissemination of decentralized renewable energy systems. Renew. Sustain. Energy Rev. 2016, 58, 477–490. [Google Scholar] [CrossRef]
- Avancini, D.B.; Rodrigues, J.J.; Martins, S.G.; Rabêlo, R.A.; Al-Muhtadi, J.; Solic, P. Energy meters evolution in smart grids: A review. J. Clean. Prod. 2019, 217, 702–715. [Google Scholar] [CrossRef]
- Alotaibi, I.; Abido, M.A.; Khalid, M.; Savkin, A.V. A Comprehensive Review of Recent Advances in Smart Grids: A Sustainable Future with Renewable Energy Resources. Energies 2020, 13, 6269. [Google Scholar] [CrossRef]
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M. A Review of Machine Learning Approaches to Power System Security and Stability. IEEE Access 2020, 8, 113512–113531. [Google Scholar] [CrossRef]
- Krause, T.; Ernst, R.; Klaer, B.; Hacker, I.; Henze, M. Cybersecurity in Power Grids: Challenges and Opportunities. Sensors 2021, 21, 6225. [Google Scholar] [CrossRef] [PubMed]
- Yohanandhan, R.V.; Elavarasan, R.M.; Manoharan, P.; Mihet-Popa, L. Cyber-Physical Power System (CPPS): A Review on Modeling, Simulation, and Analysis with Cyber Security Applications. IEEE Access 2020, 8, 151019–151064. [Google Scholar] [CrossRef]
- Guerin, T.F. Evaluating expected and comparing with observed risks on a large-scale solar photovoltaic construction project: A case for reducing the regulatory burden. Renew. Sustain. Energy Rev. 2017, 74, 333–348. [Google Scholar] [CrossRef]
- Garcia, A.; Alzate, J.; Barrera, J. Regulatory design and incentives for renewable energy. J. Regul. Econ. 2011, 41, 315–336. [Google Scholar] [CrossRef]
- Glavic, M. (Deep) Reinforcement learning for electric power system control and related problems: A short review and perspectives. Annu. Rev. Control 2019, 48, 22–35. [Google Scholar] [CrossRef]
- Perera, A.; Kamalaruban, P. Applications of reinforcement learning in energy systems. Renew. Sustain. Energy Rev. 2021, 137, 110618. [Google Scholar] [CrossRef]
- Al-Saadi, M.; Al-Greer, M.; Short, M. Reinforcement Learning-Based Intelligent Control Strategies for Optimal Power Management in Advanced Power Distribution Systems: A Survey. Energies 2023, 16, 1608. [Google Scholar] [CrossRef]
- Chen, X.; Qu, G.; Tang, Y.; Low, S.; Li, N. Reinforcement Learning for Selective Key Applications in Power Systems: Recent Advances and Future Challenges. IEEE Trans. Smart Grid 2022, 13, 2935–2958. [Google Scholar] [CrossRef]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
- Graesser, L.; Keng, W. Foundations of Deep Reinforcement Learning: Theory and Practice in Python; Addison-Wesley Data and Analytics Series; Addison-Wesley: Boston, MA, USA, 2020. [Google Scholar]
- Qiang, W.; Zhongli, Z. Reinforcement learning model, algorithms and its application. In Proceedings of the International Conference on Mechatronic Science, Electric Engineering and Computer (MEC), Jilin, China, 19–22 August 2011; pp. 1143–1146. [Google Scholar] [CrossRef]
- Zhang, K.; Yang, Z.; Başar, T. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms. In Handbook of Reinforcement Learning and Control; Springer International Publishing: Cham, Switzerland, 2021; pp. 321–384. [Google Scholar] [CrossRef]
- Moerland, T.M.; Broekens, J.; Plaat, A.; Jonker, C.M. Model-based Reinforcement Learning: A Survey. Found. Trends Mach. Learn. 2023, 16, 1–118. [Google Scholar] [CrossRef]
- Huang, Q. Model-based or model-free, a review of approaches in reinforcement learning. In Proceedings of the 2020 International Conference on Computing and Data Science (CDS), Stanford, CA, USA, 1–2 August 2020; pp. 219–221. [Google Scholar] [CrossRef]
- Freed, B.; Wei, T.; Calandra, R.; Schneider, J.; Choset, H. Unifying Model-Based and Model-Free Reinforcement Learning with Equivalent Policy Sets. Reinf. Learn. J. 2024, 1, 283–301. [Google Scholar]
- Bayón, L.; Grau, J.; Ruiz, M.; Suárez, P. A comparative economic study of two configurations of hydro-wind power plants. Energy 2016, 112, 8–16. [Google Scholar] [CrossRef]
- Riffonneau, Y.; Bacha, S.; Barruel, F.; Ploix, S. Optimal power flow management for grid connected PV systems with batteries. IEEE Trans. Sustain. Energy 2011, 2, 309–320. [Google Scholar] [CrossRef]
- Powell, W.B. Approximate Dynamic Programming: Solving the Curses of Dimensionality; John Wiley & Sons: Hoboken, NJ, USA, 2007. [Google Scholar]
- Zargari, N.; Ofir, R.; Chowdhury, N.R.; Belikov, J.; Levron, Y. An Optimal Control Method for Storage Systems with Ramp Constraints, Based on an On-Going Trimming Process. IEEE Trans. Control Syst. Technol. 2023, 31, 493–496. [Google Scholar] [CrossRef]
- García, C.E.; Prett, D.M.; Morari, M. Model predictive control: Theory and practice—A survey. Automatica 1989, 25, 335–348. [Google Scholar] [CrossRef]
- Schwenzer, M.; Ay, M.; Bergs, T.; Abel, D. Review on Model Predictive Control: An Engineering Perspective. Int. J. Adv. Manuf. Technol. 2021, 117, 1327–1349. [Google Scholar] [CrossRef]
- Morari, M.; Garcia, C.E.; Prett, D.M. Model predictive control: Theory and practice. IFAC Proc. Vol. 1988, 21, 1–12. [Google Scholar] [CrossRef]
- Li, Z.; Wu, L.; Xu, Y.; Moazeni, S.; Tang, Z. Multi-Stage Real-Time Operation of a Multi-Energy Microgrid with Electrical and Thermal Energy Storage Assets: A Data-Driven MPC-ADP Approach. IEEE Trans. Smart Grid 2022, 13, 213–226. [Google Scholar] [CrossRef]
- Agarwal, A.; Kakade, S.M.; Lee, J.D.; Mahajan, G. On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift. J. Mach. Learn. Res. 2021, 22, 1–76. [Google Scholar] [CrossRef]
- Wooldridge, M. An Introduction to MultiAgent Systems; Wiley: Hoboken, NJ, USA, 2009. [Google Scholar]
- Altman, E. Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program. Math. Methods Oper. Res. 1998, 48, 387–417. [Google Scholar] [CrossRef]
- Achiam, J.; Held, D.; Tamar, A.; Abbeel, P. Constrained Policy Optimization. In Proceedings of the 34th International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 22–31. [Google Scholar]
- Xia, Y.; Xu, Y.; Feng, X. Hierarchical Coordination of Networked-Microgrids Toward Decentralized Operation: A Safe Deep Reinforcement Learning Method. IEEE Trans. Sustain. Energy 2024, 15, 1981–1993. [Google Scholar] [CrossRef]
- Yongli, Z.; Limin, H.; Jinling, L. Bayesian networks-based approach for power systems fault diagnosis. IEEE Trans. Power Deliv. 2006, 21, 634–639. [Google Scholar] [CrossRef]
- Chen, N.; Qian, Z.; Nabney, I.T.; Meng, X. Wind Power Forecasts Using Gaussian Processes and Numerical Weather Prediction. IEEE Trans. Power Syst. 2014, 29, 656–665. [Google Scholar] [CrossRef]
- Wen, S.; Zhang, C.; Lan, H.; Xu, Y.; Tang, Y.; Huang, Y. A Hybrid Ensemble Model for Interval Prediction of Solar Power Output in Ship Onboard Power Systems. IEEE Trans. Sustain. Energy 2021, 12, 14–24. [Google Scholar] [CrossRef]
- Chow, J.H.; Sanchez-Gasca, J.J. Power System Coherency and Model Reduction; Wiley-IEEE Press: Hoboken, NJ, USA, 2020. [Google Scholar]
- Saxena, S.; Hote, Y.V. Load Frequency Control in Power Systems via Internal Model Control Scheme and Model-Order Reduction. IEEE Trans. Power Syst. 2013, 28, 2749–2757. [Google Scholar] [CrossRef]
- Machlev, R.; Zargari, N.; Chowdhury, N.; Belikov, J.; Levron, Y. A review of optimal control methods for energy storage systems – energy trading, energy balancing and electric vehicles. J. Energy Storage 2020, 32, 101787. [Google Scholar] [CrossRef]
- Machlev, R.; Tolkachov, D.; Levron, Y.; Beck, Y. Dimension reduction for NILM classification based on principle component analysis. Electr. Power Syst. Res. 2020, 187, 106459. [Google Scholar] [CrossRef]
- Chien, I.; Karthikeyan, P.; Hsiung, P.A. Prediction-based peer-to-peer energy transaction market design for smart grids. Eng. Appl. Artif. Intell. 2023, 126, 107190. [Google Scholar] [CrossRef]
- Levron, Y.; Shmilovitz, D. Optimal Power Management in Fueled Systems with Finite Storage Capacity. IEEE Trans. Circuits Syst. I Regul. Pap. 2010, 57, 2221–2231. [Google Scholar] [CrossRef]
- Sanayha, M.; Vateekul, P. Model-based deep reinforcement learning for wind energy bidding. Int. J. Electr. Power Energy Syst. 2022, 136, 107625. [Google Scholar] [CrossRef]
- Wolgast, T.; Nieße, A. Approximating Energy Market Clearing and Bidding with Model-Based Reinforcement Learning. arXiv 2023, arXiv:2303.01772. [Google Scholar] [CrossRef]
- Sanayha, M.; Vateekul, P. Model-Based Approach on Multi-Agent Deep Reinforcement Learning with Multiple Clusters for Peer-To-Peer Energy Trading. IEEE Access 2022, 10, 127882–127893. [Google Scholar] [CrossRef]
- He, Q.; Wang, J.; Shi, R.; He, Y.; Wu, M. Enhancing renewable energy certificate transactions through reinforcement learning and smart contracts integration. Sci. Rep. 2024, 14, 10838. [Google Scholar] [CrossRef] [PubMed]
- Zou, Y.; Wang, Q.; Xia, Q.; Chi, Y.; Lei, C.; Zhou, N. Federated reinforcement learning for Short-Time scale operation of Wind-Solar-Thermal power network with nonconvex models. Int. J. Electr. Power Energy Syst. 2024, 158, 109980. [Google Scholar] [CrossRef]
- Nanduri, V.; Das, T.K. A Reinforcement Learning Model to Assess Market Power Under Auction-Based Energy Pricing. IEEE Trans. Power Syst. 2007, 22, 85–95. [Google Scholar] [CrossRef]
- Cai, W.; Kordabad, A.B.; Gros, S. Energy management in residential microgrid using model predictive control-based reinforcement learning and Shapley value. Eng. Appl. Artif. Intell. 2023, 119, 105793. [Google Scholar] [CrossRef]
- Ojand, K.; Dagdougui, H. Q-Learning-Based Model Predictive Control for Energy Management in Residential Aggregator. IEEE Trans. Autom. Sci. Eng. 2022, 19, 70–81. [Google Scholar] [CrossRef]
- Nord Pool. Nord Pool Wholesale Electricity Market Data. 2024. Available online: https://data.nordpoolgroup.com/auction/day-ahead/prices?deliveryDate=latest&currency=EUR&aggregation=Hourly&deliveryAreas=AT (accessed on 19 September 2024).
- Ausgrid. Australia Grid Data. 2024. Available online: https://www.ausgrid.com.au/Industry/Our-Research/Data-to-share/Average-electricity-use (accessed on 19 September 2024).
- Chinese Listed Companies, CNY. Carbon Emissions Data. 2024. Available online: https://www.nature.com/articles/s41598-024-60527-3/tables/1 (accessed on 19 September 2024).
- Hiskens, I. IEEE PES Task Force on Benchmark Systems for Stability Controls; IEEE: Piscataway, NJ, USA, 2013. [Google Scholar]
- Elia. Belgium Grid Data. 2024. Available online: https://www.elia.be/en/grid-data/ (accessed on 19 September 2024).
- Comed. Chicago Electricity Price Data. 2024. Available online: https://hourlypricing.comed.com/live-prices/ (accessed on 19 September 2024).
- Huang, R.; Chen, Y.; Yin, T.; Li, X.; Li, A.; Tan, J.; Yu, W.; Liu, Y.; Huang, Q. Accelerated Derivative-Free Deep Reinforcement Learning for Large-Scale Grid Emergency Voltage Control. IEEE Trans. Power Syst. 2022, 37, 14–25. [Google Scholar] [CrossRef]
- Hossain, R.R.; Yin, T.; Du, Y.; Huang, R.; Tan, J.; Yu, W.; Liu, Y.; Huang, Q. Efficient learning of power grid voltage control strategies via model-based Deep Reinforcement Learning. Mach. Learn. 2023, 113, 2675–2700. [Google Scholar] [CrossRef]
- Cao, D.; Zhao, J.; Hu, W.; Ding, F.; Yu, N.; Huang, Q.; Chen, Z. Model-free voltage control of active distribution system with PVs using surrogate model-based deep reinforcement learning. Appl. Energy 2022, 306, 117982. [Google Scholar] [CrossRef]
- Huang, Q.; Huang, R.; Hao, W.; Tan, J.; Fan, R.; Huang, Z. Adaptive Power System Emergency Control Using Deep Reinforcement Learning. IEEE Trans. Smart Grid 2020, 11, 1171–1182. [Google Scholar] [CrossRef]
- Duan, J.; Yi, Z.; Shi, D.; Lin, C.; Lu, X.; Wang, Z. Reinforcement-Learning-Based Optimal Control of Hybrid Energy Storage Systems in Hybrid AC–DC Microgrids. IEEE Trans. Ind. Inform. 2019, 15, 5355–5364. [Google Scholar] [CrossRef]
- Totaro, S.; Boukas, I.; Jonsson, A.; Cornélusse, B. Lifelong control of off-grid microgrid with model-based reinforcement learning. Energy 2021, 232, 121035. [Google Scholar] [CrossRef]
- Yan, Z.; Xu, Y. Real-Time Optimal Power Flow: A Lagrangian Based Deep Reinforcement Learning Approach. IEEE Trans. Power Syst. 2020, 35, 3270–3273. [Google Scholar] [CrossRef]
- Zhang, H.; Yue, D.; Dou, C.; Xie, X.; Li, K.; Hancke, G.P. Resilient Optimal Defensive Strategy of TSK Fuzzy-Model-Based Microgrids’ System via a Novel Reinforcement Learning Approach. IEEE Trans. Neural Networks Learn. Syst. 2023, 34, 1921–1931. [Google Scholar] [CrossRef]
- Aghaei, J.; Niknam, T.; Azizipanah-Abarghooee, R.; Arroyo, J.M. Scenario-based dynamic economic emission dispatch considering load and wind power uncertainties. Int. J. Electr. Power Energy Syst. 2013, 47, 351–367. [Google Scholar] [CrossRef]
- Zhang, H.; Yue, D.; Xie, X.; Dou, C.; Sun, F. Gradient decent based multi-objective cultural differential evolution for short-term hydrothermal optimal scheduling of economic emission with integrating wind power and photovoltaic power. Energy 2017, 122, 748–766. [Google Scholar] [CrossRef]
- Zhang, Z.; Zhang, C.; Lam, K.P. A deep reinforcement learning method for model-based optimal control of HVAC systems. In Proceedings of the SURFACE at Syracuse University, Syracuse, NY, USA, 24 September 2018. [Google Scholar] [CrossRef]
- Zhang, Z.; Chong, A.; Pan, Y.; Zhang, C.; Lam, K.P. Whole building energy model for HVAC optimal control: A practical framework based on deep reinforcement learning. Energy Build. 2019, 199, 472–490. [Google Scholar] [CrossRef]
- Chen, B.; Cai, Z.; Bergés, M. Gnu-rl: A precocial reinforcement learning solution for building hvac control using a differentiable mpc policy. In Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, Hangzhou, China, 13–14 November 2019; pp. 316–325. [Google Scholar]
- Drgoňa, J.; Picard, D.; Kvasnica, M.; Helsen, L. Approximate model predictive building control via machine learning. Appl. Energy 2018, 218, 199–216. [Google Scholar] [CrossRef]
- Arroyo, J.; Manna, C.; Spiessens, F.; Helsen, L. Reinforced model predictive control (RL-MPC) for building energy management. Appl. Energy 2022, 309, 118346. [Google Scholar] [CrossRef]
- Drgoňa, J.; Tuor, A.; Skomski, E.; Vasisht, S.; Vrabie, D. Deep learning explicit differentiable predictive control laws for buildings. IFAC-PapersOnLine 2021, 54, 14–19. [Google Scholar] [CrossRef]
- Kowli, A.; Mayhorn, E.; Kalsi, K.; Meyn, S.P. Coordinating dispatch of distributed energy resources with model predictive control and Q-learning. In Coordinated Science Laboratory Report no. UILU-ENG-12-2204, DC-256; Coordinated Science Laboratory: Urbana, IL, USA, 2012. [Google Scholar]
- Bianchi, C.; Fontanini, A. TMY3 Weather Data for ComStock and ResStock. 2021. Available online: https://data.nrel.gov/submissions/156 (accessed on 19 September 2024).
- Blum, D.; Arroyo, J.; Huang, S.; Drgoňa, J.; Jorissen, F.; Walnum, H.T.; Chen, Y.; Benne, K.; Vrabie, D.; Wetter, M.; et al. Building optimization testing framework (BOPTEST) for simulation-based benchmarking of control strategies in buildings. J. Build. Perform. Simul. 2021, 14, 586–610. [Google Scholar] [CrossRef]
- Wind Data and Tools. Wind Data. 2024. Available online: https://www.nrel.gov/wind/data-tools.html (accessed on 19 September 2024).
- Lee, H.; Cha, S.W. Energy management strategy of fuel cell electric vehicles using model-based reinforcement learning with data-driven model update. IEEE Access 2021, 9, 59244–59254. [Google Scholar] [CrossRef]
- Chiş, A.; Lundén, J.; Koivunen, V. Reinforcement learning-based plug-in electric vehicle charging with forecasted price. IEEE Trans. Veh. Technol. 2016, 66, 3674–3684. [Google Scholar]
- Zhang, F.; Yang, Q.; An, D. CDDPG: A deep-reinforcement-learning-based approach for electric vehicle charging control. IEEE Internet Things J. 2020, 8, 3075–3087. [Google Scholar] [CrossRef]
- Cui, L.; Wang, Q.; Qu, H.; Wang, M.; Wu, Y.; Ge, L. Dynamic pricing for fast charging stations with deep reinforcement learning. Appl. Energy 2023, 346, 121334. [Google Scholar] [CrossRef]
- Xing, Q.; Xu, Y.; Chen, Z.; Zhang, Z.; Shi, Z. A graph reinforcement learning-based decision-making platform for real-time charging navigation of urban electric vehicles. IEEE Trans. Ind. Inform. 2022, 19, 3284–3295. [Google Scholar] [CrossRef]
- Qian, T.; Shao, C.; Wang, X.; Shahidehpour, M. Deep reinforcement learning for EV charging navigation by coordinating smart grid and intelligent transportation system. IEEE Trans. Smart Grid 2019, 11, 1714–1723. [Google Scholar] [CrossRef]
- Vandael, S.; Claessens, B.; Ernst, D.; Holvoet, T.; Deconinck, G. Reinforcement learning of heuristic EV fleet charging in a day-ahead electricity market. IEEE Trans. Smart Grid 2015, 6, 1795–1805. [Google Scholar] [CrossRef]
- Jin, J.; Xu, Y. Optimal policy characterization enhanced actor-critic approach for electric vehicle charging scheduling in a power distribution network. IEEE Trans. Smart Grid 2020, 12, 1416–1428. [Google Scholar] [CrossRef]
- Qian, J.; Jiang, Y.; Liu, X.; Wang, Q.; Wang, T.; Shi, Y.; Chen, W. Federated Reinforcement Learning for Electric Vehicles Charging Control on Distribution Networks. IEEE Internet Things J. 2023, 11, 5511–5525. [Google Scholar] [CrossRef]
- Wang, Y.; Lin, X.; Pedram, M. Accurate component model based optimal control for energy storage systems in households with photovoltaic modules. In Proceedings of the 2013 IEEE Green Technologies Conference (GreenTech), Denver, CO, USA, 4–5 April 2013; pp. 28–34. [Google Scholar] [CrossRef]
- Gao, Y.; Li, J.; Hong, M. Machine Learning Based Optimization Model for Energy Management of Energy Storage System for Large Industrial Park. Processes 2021, 9, 825. [Google Scholar] [CrossRef]
- Liu, T.; Zou, Y.; Liu, D.; Sun, F. Reinforcement learning of adaptive energy management with transition probability for a hybrid electric tracked vehicle. IEEE Trans. Ind. Electron. 2015, 62, 7837–7846. [Google Scholar] [CrossRef]
- Kong, Z.; Zou, Y.; Liu, T. Implementation of real-time energy management strategy based on reinforcement learning for hybrid electric vehicles and simulation validation. PLoS ONE 2017, 12, e0180491. [Google Scholar] [CrossRef]
- Hu, X.; Liu, T.; Qi, X.; Barth, M. Reinforcement learning for hybrid and plug-in hybrid electric vehicle energy management: Recent advances and prospects. IEEE Ind. Electron. Mag. 2019, 13, 16–25. [Google Scholar] [CrossRef]
- Yan, Z.; Xu, Y.; Wang, Y.; Feng, X. Deep reinforcement learning-based optimal data-driven control of battery energy storage for power system frequency support. IET Gener. Transm. Distrib. 2020, 14, 6071–6078. [Google Scholar] [CrossRef]
- Wang, Y.; Lin, X.; Pedram, M. Adaptive control for energy storage systems in households with photovoltaic modules. IEEE Trans. Smart Grid 2014, 5, 992–1001. [Google Scholar] [CrossRef]
- Zhang, H.; Li, J.; Hong, M. Machine learning-based energy system model for tissue paper machines. Processes 2021, 9, 655. [Google Scholar] [CrossRef]
- Wang, Y.; Lin, X.; Pedram, M. A Near-Optimal Model-Based Control Algorithm for Households Equipped with Residential Photovoltaic Power Generation and Energy Storage Systems. IEEE Trans. Sustain. Energy 2016, 7, 77–86. [Google Scholar] [CrossRef]
- NREL. Measurement and Instrumentation Data Center. 2021. Available online: https://midcdmz.nrel.gov/ (accessed on 20 October 2024).
- BGE. Baltimore Load Profile Data. 2021. Available online: https://supplier.bge.com/electric/load/profiles.asp (accessed on 19 September 2024).
- Liu, T.; Zou, Y.; Liu, D.; Sun, F. Reinforcement learning–based energy management strategy for a hybrid electric tracked vehicle. Energies 2015, 8, 7243–7260. [Google Scholar] [CrossRef]
- Baah, G.K.; Podgurski, A.; Harrold, M.J. The Probabilistic Program Dependence Graph and Its Application to Fault Diagnosis. IEEE Trans. Softw. Eng. 2010, 36, 528–545. [Google Scholar] [CrossRef]
- Schaefer, A.M.; Udluft, S.; Zimmermann, H.G. A recurrent control neural network for data efficient reinforcement learning. In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Honolulu, HI, USA, 1–5 April 2007; pp. 151–157. [Google Scholar] [CrossRef]
- Bitzer, S.; Howard, M.; Vijayakumar, S. Using dimensionality reduction to exploit constraints in reinforcement learning. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 3219–3225. [Google Scholar] [CrossRef]
- Barto, A.G.; Mahadevan, S. Recent advances in hierarchical reinforcement learning. Discret. Event Dyn. Syst. 2003, 13, 341–379. [Google Scholar] [CrossRef]
- Cowan, W.; Katehakis, M.N.; Pirutinsky, D. Reinforcement learning: A comparison of UCB versus alternative adaptive policies. In Proceedings of the First Congress of Greek Mathematicians, Athens, Greece, 25–30 June 2018; p. 127. [Google Scholar] [CrossRef]
- Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022, 85, 1–22. [Google Scholar] [CrossRef]
- Zhu, Z.; Lin, K.; Jain, A.K.; Zhou, J. Transfer Learning in Deep Reinforcement Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 13344–13362. [Google Scholar] [CrossRef] [PubMed]
- Ren, C.; Xu, Y. Transfer Learning-Based Power System Online Dynamic Security Assessment: Using One Model to Assess Many Unlearned Faults. IEEE Trans. Power Syst. 2020, 35, 821–824. [Google Scholar] [CrossRef]
- Ye, Y.; Qiu, D.; Wu, X.; Strbac, G.; Ward, J. Model-Free Real-Time Autonomous Control for a Residential Multi-Energy System Using Deep Reinforcement Learning. IEEE Trans. Smart Grid 2020, 11, 3068–3082. [Google Scholar] [CrossRef]
- Zhang, S.; May, D.; Gül, M.; Musilek, P. Reinforcement learning-driven local transactive energy market for distributed energy resources. Energy AI 2022, 8, 100150. [Google Scholar] [CrossRef]
- Bose, S.; Kremers, E.; Mengelkamp, E.M.; Eberbach, J.; Weinhardt, C. Reinforcement learning in local energy markets. Energy Inform. 2021, 4, 7. [Google Scholar] [CrossRef]
- Li, J.; Wang, C.; Wang, H. Attentive Convolutional Deep Reinforcement Learning for Optimizing Solar-Storage Systems in Real-Time Electricity Markets. IEEE Trans. Ind. Inform. 2024, 20, 7205–7215. [Google Scholar] [CrossRef]
- Li, X.; Luo, F.; Li, C. Multi-agent deep reinforcement learning-based autonomous decision-making framework for community virtual power plants. Appl. Energy 2024, 360, 122813. [Google Scholar] [CrossRef]
- Ye, Y.; Papadaskalopoulos, D.; Yuan, Q.; Tang, Y.; Strbac, G. Multi-Agent Deep Reinforcement Learning for Coordinated Energy Trading and Flexibility Services Provision in Local Electricity Markets. IEEE Trans. Smart Grid 2023, 14, 1541–1554. [Google Scholar] [CrossRef]
- Chen, T.; Su, W. Indirect Customer-to-Customer Energy Trading with Reinforcement Learning. IEEE Trans. Smart Grid 2019, 10, 4338–4348. [Google Scholar] [CrossRef]
- Fang, X.; Zhao, Q.; Wang, J.; Han, Y.; Li, Y. Multi-agent Deep Reinforcement Learning for Distributed Energy Management and Strategy Optimization of Microgrid Market. Sustain. Cities Soc. 2021, 74, 103163. [Google Scholar] [CrossRef]
- Harrold, D.J.; Cao, J.; Fan, Z. Renewable energy integration and microgrid energy trading using multi-agent deep reinforcement learning. Appl. Energy 2022, 318, 119151. [Google Scholar] [CrossRef]
- Gao, S.; Xiang, C.; Yu, M.; Tan, K.T.; Lee, T.H. Online Optimal Power Scheduling of a Microgrid via Imitation Learning. IEEE Trans. Smart Grid 2022, 13, 861–876. [Google Scholar] [CrossRef]
- Chen, D.; Irwin, D. SunDance: Black-box Behind-the-Meter Solar Disaggregation. In Proceedings of the Eighth International Conference on Future Energy Systems, Shatin, Hong Kong, 16–19 May 2017; e-Energy ’17. pp. 45–55. [Google Scholar] [CrossRef]
- Mishra, A.K.; Cecchet, E.; Shenoy, P.J.; Albrecht, J.R. Smart: An Open Data Set and Tools for Enabling Research in Sustainable Homes. 2012. Available online: https://api.semanticscholar.org/CorpusID:6562225 (accessed on 19 September 2024).
- AEMO. Electricity Distribution and Prices Data. 2024. Available online: https://aemo.com.au/en/energy-systems/electricity/national-electricity-market-nem/data-nem/data-dashboard-nem (accessed on 19 September 2024).
- California ISO. California Electrical Power System Operational Data. 2024. Available online: https://www.caiso.com/ (accessed on 19 September 2024).
- Duan, J.; Shi, D.; Diao, R.; Li, H.; Wang, Z.; Zhang, B.; Bian, D.; Yi, Z. Deep-Reinforcement-Learning-Based Autonomous Voltage Control for Power Grid Operations. IEEE Trans. Power Syst. 2020, 35, 814–817. [Google Scholar] [CrossRef]
- Cao, D.; Zhao, J.; Hu, W.; Yu, N.; Ding, F.; Huang, Q.; Chen, Z. Deep Reinforcement Learning Enabled Physical-Model-Free Two-Timescale Voltage Control Method for Active Distribution Systems. IEEE Trans. Smart Grid 2022, 13, 149–165. [Google Scholar] [CrossRef]
- Diao, R.; Wang, Z.; Shi, D.; Chang, Q.; Duan, J.; Zhang, X. Autonomous Voltage Control for Grid Operation Using Deep Reinforcement Learning. In Proceedings of the 2019 IEEE Power & Energy Society General Meeting (PESGM), Atlanta, GA, USA, 4–8 August 2019; pp. 1–5. [Google Scholar] [CrossRef]
- Hadidi, R.; Jeyasurya, B. Reinforcement Learning Based Real-Time Wide-Area Stabilizing Control Agents to Enhance Power System Stability. IEEE Trans. Smart Grid 2013, 4, 489–497. [Google Scholar] [CrossRef]
- Chen, C.; Cui, M.; Li, F.; Yin, S.; Wang, X. Model-Free Emergency Frequency Control Based on Reinforcement Learning. IEEE Trans. Ind. Inform. 2021, 17, 2336–2346. [Google Scholar] [CrossRef]
- Zhao, J.; Li, F.; Mukherjee, S.; Sticht, C. Deep Reinforcement Learning-Based Model-Free On-Line Dynamic Multi-Microgrid Formation to Enhance Resilience. IEEE Trans. Smart Grid 2022, 13, 2557–2567. [Google Scholar] [CrossRef]
- Du, Y.; Li, F. Intelligent Multi-Microgrid Energy Management Based on Deep Neural Network and Model-Free Reinforcement Learning. IEEE Trans. Smart Grid 2020, 11, 1066–1076. [Google Scholar] [CrossRef]
- Zhou, Y.; Lee, W.; Diao, R.; Shi, D. Deep Reinforcement Learning Based Real-time AC Optimal Power Flow Considering Uncertainties. J. Mod. Power Syst. Clean Energy 2022, 10, 1098–1109. [Google Scholar] [CrossRef]
- Cao, D.; Hu, W.; Xu, X.; Wu, Q.; Huang, Q.; Chen, Z.; Blaabjerg, F. Deep Reinforcement Learning Based Approach for Optimal Power Flow of Distribution Networks Embedded with Renewable Energy and Storage Devices. J. Mod. Power Syst. Clean Energy 2021, 9, 1101–1110. [Google Scholar] [CrossRef]
- Birchfield, A.B.; Xu, T.; Gegner, K.M.; Shetye, K.S.; Overbye, T.J. Grid Structural Characteristics as Validation Criteria for Synthetic Networks. IEEE Trans. Power Syst. 2017, 32, 3258–3265. [Google Scholar] [CrossRef]
- Chen, C.; Zhang, K.; Yuan, K.; Zhu, L.; Qian, M. Novel Detection Scheme Design Considering Cyber Attacks on Load Frequency Control. IEEE Trans. Ind. Inform. 2018, 14, 1932–1941. [Google Scholar] [CrossRef]
- Qiu, S.; Li, Z.; Li, Z.; Li, J.; Long, S.; Li, X. Model-free control method based on reinforcement learning for building cooling water systems: Validation by measured data-based simulation. Energy Build. 2020, 218, 110055. [Google Scholar] [CrossRef]
- Zhang, X.; Chen, Y.; Bernstein, A.; Chintala, R.; Graf, P.; Jin, X.; Biagioni, D. Two-stage reinforcement learning policy search for grid-interactive building control. IEEE Trans. Smart Grid 2022, 13, 1976–1987. [Google Scholar] [CrossRef]
- Zhang, X.; Biagioni, D.; Cai, M.; Graf, P.; Rahman, S. An Edge-Cloud Integrated Solution for Buildings Demand Response Using Reinforcement Learning. IEEE Trans. Smart Grid 2021, 12, 420–431. [Google Scholar] [CrossRef]
- Mocanu, E.; Mocanu, D.C.; Nguyen, P.H.; Liotta, A.; Webber, M.E.; Gibescu, M.; Slootweg, J.G. On-Line Building Energy Optimization Using Deep Reinforcement Learning. IEEE Trans. Smart Grid 2019, 10, 3698–3708. [Google Scholar] [CrossRef]
- Wei, T.; Wang, Y.; Zhu, Q. Deep reinforcement learning for building HVAC control. In Proceedings of the 54th Annual Design Automation Conference 2017, Austin, TX, USA, 18–22 June 2017; pp. 1–6. [Google Scholar] [CrossRef]
- Yu, L.; Sun, Y.; Xu, Z.; Shen, C.; Yue, D.; Jiang, T.; Guan, X. Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings. IEEE Trans. Smart Grid 2021, 12, 407–419. [Google Scholar] [CrossRef]
- Shin, M.; Kim, S.; Kim, Y.; Song, A.; Kim, Y.; Kim, H.Y. Development of an HVAC system control method using weather forecasting data with deep reinforcement learning algorithms. Build. Environ. 2024, 248, 111069. [Google Scholar] [CrossRef]
- EnergyPlus Whole Building Energy Simulation Program. Available online: https://energyplus.net/ (accessed on 20 October 2024).
- Gao, G.; Li, J.; Wen, Y. DeepComfort: Energy-Efficient Thermal Comfort Control in Buildings Via Reinforcement Learning. IEEE Internet Things J. 2020, 7, 8472–8484. [Google Scholar] [CrossRef]
- Dey, S.; Marzullo, T.; Zhang, X.; Henze, G. Reinforcement learning building control approach harnessing imitation learning. Energy AI 2023, 14, 100255. [Google Scholar] [CrossRef]
- Ruelens, F.; Claessens, B.J.; Quaiyum, S.; De Schutter, B.; Babuška, R.; Belmans, R. Reinforcement Learning Applied to an Electric Water Heater: From Theory to Practice. IEEE Trans. Smart Grid 2018, 9, 3792–3800. [Google Scholar] [CrossRef]
- Tutiempo Weather Service. Weather Data. 2024. Available online: https://en.tutiempo.net/climate/ws-486980.html (accessed on 19 September 2024).
- Datadryad. Thermal Comfort Field Measurements. 2024. Available online: https://datadryad.org/stash/dataset/doi:10.6078/D1F671 (accessed on 19 September 2024).
- Pecan Street. Consumption Data. 2024. Available online: https://www.pecanstreet.org/ (accessed on 19 September 2024).
- EIA. Commercial Buildings Energy Consumption Data. 2024. Available online: https://www.eia.gov/consumption/commercial/data/2012/bc/cfm/b6.php (accessed on 19 September 2024).
- Jordan, U.; Vajen, K. Hot-Water Profiles. 2001. Available online: https://sel.me.wisc.edu/trnsys/trnlib/iea-shc-task26/iea-shc-task26-load-profiles-description-jordan.pdf (accessed on 19 September 2024).
- Zhang, C.; Liu, Y.; Wu, F.; Tang, B.; Fan, W. Effective charging planning based on deep reinforcement learning for electric vehicles. IEEE Trans. Intell. Transp. Syst. 2020, 22, 542–554. [Google Scholar] [CrossRef]
- Wang, R.; Chen, Z.; Xing, Q.; Zhang, Z.; Zhang, T. A modified rainbow-based deep reinforcement learning method for optimal scheduling of charging station. Sustainability 2022, 14, 1884. [Google Scholar] [CrossRef]
- Wang, S.; Bi, S.; Zhang, Y.A. Reinforcement learning for real-time pricing and scheduling control in EV charging stations. IEEE Trans. Ind. Inform. 2019, 17, 849–859. [Google Scholar] [CrossRef]
- Qian, T.; Shao, C.; Li, X.; Wang, X.; Shahidehpour, M. Enhanced coordinated operations of electric power and transportation networks via EV charging services. IEEE Trans. Smart Grid 2020, 11, 3019–3030. [Google Scholar] [CrossRef]
- Zhao, Z.; Lee, C.K. Dynamic pricing for EV charging stations: A deep reinforcement learning approach. IEEE Trans. Transp. Electrif. 2021, 8, 2456–2468. [Google Scholar] [CrossRef]
- Sadeghianpourhamami, N.; Deleu, J.; Develder, C. Definition and evaluation of model-free coordination of electrical vehicle charging with reinforcement learning. IEEE Trans. Smart Grid 2019, 11, 203–214. [Google Scholar] [CrossRef]
- Yeom, K. Model predictive control and deep reinforcement learning based energy efficient eco-driving for battery electric vehicles. Energy Rep. 2022, 8, 34–42. [Google Scholar] [CrossRef]
- Dorokhova, M.; Martinson, Y.; Ballif, C.; Wyrsch, N. Deep reinforcement learning control of electric vehicle charging in the presence of photovoltaic generation. Appl. Energy 2021, 301, 117504. [Google Scholar] [CrossRef]
- Wen, Z.; O’Neill, D.; Maei, H. Optimal demand response using device-based reinforcement learning. IEEE Trans. Smart Grid 2015, 6, 2312–2324. [Google Scholar] [CrossRef]
- Lee, S.; Choi, D.H. Reinforcement learning-based energy management of smart home with rooftop solar photovoltaic system, energy storage system, and home appliances. Sensors 2019, 19, 3937. [Google Scholar] [CrossRef]
- Cao, J.; Harrold, D.; Fan, Z.; Morstyn, T.; Healey, D.; Li, K. Deep Reinforcement Learning-Based Energy Storage Arbitrage with Accurate Lithium-Ion Battery Degradation Model. IEEE Trans. Smart Grid 2020, 11, 4513–4521. [Google Scholar] [CrossRef]
- Bui, V.H.; Hussain, A.; Kim, H.M. Double Deep Q-Learning-Based Distributed Operation of Battery Energy Storage System Considering Uncertainties. IEEE Trans. Smart Grid 2020, 11, 457–469. [Google Scholar] [CrossRef]
- Bui, V.H.; Hussain, A.; Kim, H.M. Q-Learning-Based Operation Strategy for Community Battery Energy Storage System (CBESS) in Microgrid System. Energies 2019, 12, 1789. [Google Scholar] [CrossRef]
- Chen, T.; Su, W. Local Energy Trading Behavior Modeling with Deep Reinforcement Learning. IEEE Access 2018, 6, 62806–62814. [Google Scholar] [CrossRef]
- Liu, F.; Liu, Q.; Tao, Q.; Huang, Y.; Li, D.; Sidorov, D. Deep reinforcement learning based energy storage management strategy considering prediction intervals of wind power. Int. J. Electr. Power Energy Syst. 2023, 145, 108608. [Google Scholar] [CrossRef]
- Zhou, H.; Erol-Kantarci, M. Correlated deep q-learning based microgrid energy management. In Proceedings of the 2020 IEEE 25th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), Pisa, Italy, 14–16 September 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Ji, Y.; Wang, J.; Xu, J.; Fang, X.; Zhang, H. Real-time energy management of a microgrid using deep reinforcement learning. Energies 2019, 12, 2291. [Google Scholar] [CrossRef]
- Liu, T.; Hu, X. A bi-level control for energy efficiency improvement of a hybrid tracked vehicle. IEEE Trans. Ind. Inform. 2018, 14, 1616–1625. [Google Scholar] [CrossRef]
- UK Government. UK Wholesale Electricity Market Prices. 2024. Available online: https://tradingeconomics.com/united-kingdom/electricity-price (accessed on 19 September 2024).
- Lopes, J.P.; Hatziargyriou, N.; Mutale, J.; Djapic, P.; Jenkins, N. Integrating distributed generation into electric power systems: A review of drivers, challenges and opportunities. Electr. Power Syst. Res. 2007, 77, 1189–1203. [Google Scholar] [CrossRef]
- Pfenninger, S.; Hawkes, A.; Keirstead, J. Energy systems modeling for twenty-first century energy challenges. Renew. Sustain. Energy Rev. 2014, 33, 74–86. [Google Scholar] [CrossRef]
- Nafi, N.S.; Ahmed, K.; Gregory, M.A.; Datta, M. A survey of smart grid architectures, applications, benefits and standardization. J. Netw. Comput. Appl. 2016, 76, 23–36. [Google Scholar] [CrossRef]
- Ustun, T.S.; Hussain, S.M.S.; Kirchhoff, H.; Ghaddar, B.; Strunz, K.; Lestas, I. Data Standardization for Smart Infrastructure in First-Access Electricity Systems. Proc. IEEE 2019, 107, 1790–1802. [Google Scholar] [CrossRef]
- Ren, C.; Xu, Y. Robustness Verification for Machine-Learning-Based Power System Dynamic Security Assessment Models Under Adversarial Examples. IEEE Trans. Control Netw. Syst. 2022, 9, 1645–1654. [Google Scholar] [CrossRef]
- Zhang, Z.; Yau, D.K. CoRE: Constrained Robustness Evaluation of Machine Learning-Based Stability Assessment for Power Systems. IEEE/CAA J. Autom. Sin. 2023, 10, 557–559. [Google Scholar] [CrossRef]
- Ren, C.; Du, X.; Xu, Y.; Song, Q.; Liu, Y.; Tan, R. Vulnerability Analysis, Robustness Verification, and Mitigation Strategy for Machine Learning-Based Power System Stability Assessment Model Under Adversarial Examples. IEEE Trans. Smart Grid 2022, 13, 1622–1632. [Google Scholar] [CrossRef]
- Laud, A.D. Theory and Application of Reward Shaping in Reinforcement Learning; University of Illinois at Urbana-Champaign: Urbana, IL, USA, 2004. [Google Scholar]
- Machlev, R.; Heistrene, L.; Perl, M.; Levy, K.; Belikov, J.; Mannor, S.; Levron, Y. Explainable Artificial Intelligence (XAI) techniques for energy and power systems: Review, challenges and opportunities. Energy AI 2022, 9, 100169. [Google Scholar] [CrossRef]
- Zhang, K.; Zhang, J.; Xu, P.D.; Gao, T.; Gao, D.W. Explainable AI in Deep Reinforcement Learning Models for Power System Emergency Control. IEEE Trans. Comput. Soc. Syst. 2022, 9, 419–427. [Google Scholar] [CrossRef]
- Ren, P.; Xiao, Y.; Chang, X.; Huang, P.Y.; Li, Z.; Chen, X.; Wang, X. A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions. ACM Comput. Surv. 2021, 54, 1–34. [Google Scholar] [CrossRef]
- Jalali, S.M.J.; Osório, G.J.; Ahmadian, S.; Lotfi, M.; Campos, V.M.A.; Shafie-khah, M.; Khosravi, A.; Catalão, J.P.S. New Hybrid Deep Neural Architectural Search-Based Ensemble Reinforcement Learning Strategy for Wind Power Forecasting. IEEE Trans. Ind. Appl. 2022, 58, 15–27. [Google Scholar] [CrossRef]
- Wang, Q.; Kapuza, I.; Baimel, D.; Belikov, J.; Levron, Y.; Machlev, R. Neural Architecture Search (NAS) for designing optimal power quality disturbance classifiers. Electr. Power Syst. Res. 2023, 223, 109574. [Google Scholar] [CrossRef]
- Huang, B.; Wang, J. Applications of Physics-Informed Neural Networks in Power Systems—A Review. IEEE Trans. Power Syst. 2023, 38, 572–588. [Google Scholar] [CrossRef]
- Misyris, G.S.; Venzke, A.; Chatzivasileiadis, S. Physics-Informed Neural Networks for Power Systems. In Proceedings of the 2020 IEEE Power & Energy Society General Meeting (PESGM), Montreal, QC, Canada, 2–6 August 2020; pp. 1–5. [Google Scholar] [CrossRef]
- Sami, N.M.; Naeini, M. Machine learning applications in cascading failure analysis in power systems: A review. Electr. Power Syst. Res. 2024, 232, 110415. [Google Scholar] [CrossRef]
- Miraftabzadeh, S.M.; Foiadelli, F.; Longo, M.; Pasetti, M. A Survey of Machine Learning Applications for Power System Analytics. In Proceedings of the 2019 IEEE International Conference on Environment and Electrical Engineering and 2019 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe), Genova, Italy, 11–14 June 2019; pp. 1–5. [Google Scholar] [CrossRef]
- Bedi, G.; Venayagamoorthy, G.K.; Singh, R.; Brooks, R.R.; Wang, K.C. Review of Internet of Things (IoT) in Electric Power and Energy Systems. IEEE Internet Things J. 2018, 5, 847–870. [Google Scholar] [CrossRef]
- Ngo, V.T.; Nguyen Thi, M.S.; Truong, D.N.; Hoang, A.Q.; Tran, P.N.; Bui, N.A. Applying IoT Platform to Design a Data Collection System for Hybrid Power System. In Proceedings of the 2021 International Conference on System Science and Engineering (ICSSE), Ho Chi Minh City, Vietnam, 26–28 August 2021; pp. 181–184. [Google Scholar] [CrossRef]
- Sayed, H.A.; Said, A.M.; Ibrahim, A.W. Smart Utilities IoT-Based Data Collection Scheduling. Arab. J. Sci. Eng. 2024, 49, 2909–2923. [Google Scholar] [CrossRef]
- Li, H.; He, H. Learning to Operate Distribution Networks with Safe Deep Reinforcement Learning. IEEE Trans. Smart Grid 2022, 13, 1860–1872. [Google Scholar] [CrossRef]
- Vu, T.L.; Mukherjee, S.; Yin, T.; Huang, R.; Tan, J.; Huang, Q. Safe Reinforcement Learning for Emergency Load Shedding of Power Systems. In Proceedings of the 2021 IEEE Power & Energy Society General Meeting (PESGM), Washington, DC, USA, 26–29 July 2021; pp. 1–5. [Google Scholar] [CrossRef]
- Chiam, D.H.; Lim, K.H. Power quality disturbance classification using transformer network. In Proceedings of the International Conference on Cyber Warfare, Security and Space Research, Jaipur, India, 9–10 December 2021; pp. 272–282. [Google Scholar] [CrossRef]
- Gooi, H.B.; Wang, T.; Tang, Y. Edge Intelligence for Smart Grid: A Survey on Application Potentials. CSEE J. Power Energy Syst. 2023, 9, 1623–1640. [Google Scholar] [CrossRef]
- Sodhro, A.H.; Pirbhulal, S.; de Albuquerque, V.H.C. Artificial Intelligence-Driven Mechanism for Edge Computing-Based Industrial Applications. IEEE Trans. Ind. Inform. 2019, 15, 4235–4243. [Google Scholar] [CrossRef]
- Lv, L.; Wu, Z.; Zhang, L.; Gupta, B.B.; Tian, Z. An Edge-AI Based Forecasting Approach for Improving Smart Microgrid Efficiency. IEEE Trans. Ind. Inform. 2022, 18, 7946–7954. [Google Scholar] [CrossRef]
Problem | States | Actions | Reward |
---|---|---|---|
EM | Power flow distribution decided by the system operator | Firms set their bids | Net profit achieved by the firms
GSAC | Voltage levels at different nodes | Adjusting the output of power generators | Cost of deviating from nominal voltage levels |
BEM | Indoor temperature and humidity levels | Adjusting thermostat set-points for heating and cooling | Cost of electricity |
EV | Traffic conditions and route information | Selecting a route based on traffic and charging station availability | Cost of charging, considering electricity prices and charging station fees |
ESS | Battery state of charge and current consumer power demand | The controller decides how much power to produce using the generator | The power generation cost the controller must pay |
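
As an illustration of how a row of this table maps onto code, here is a minimal Gym-style environment sketch for the ESS problem; the state follows the table (battery state of charge and current consumer demand), while the class name, capacity, cost coefficient, and demand distribution are hypothetical choices for illustration only, not a formulation from any surveyed work.

```python
import numpy as np

class StorageEnv:
    """Toy MDP for the ESS row: state = (state of charge, demand),
    action = generator power output, reward = negative generation cost.
    All constants are illustrative assumptions."""

    def __init__(self, capacity=10.0, cost_per_kwh=0.2, seed=0):
        self.capacity = capacity          # battery capacity [kWh]
        self.cost_per_kwh = cost_per_kwh  # generation cost coefficient
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.soc = self.capacity / 2          # state of charge [kWh]
        self.demand = self.rng.uniform(0, 5)  # consumer demand [kW]
        return np.array([self.soc, self.demand])

    def step(self, gen_power):
        # The battery absorbs whatever the generator over- or under-produces,
        # clipped to its physical limits.
        net = gen_power - self.demand
        self.soc = float(np.clip(self.soc + net, 0.0, self.capacity))
        reward = -self.cost_per_kwh * gen_power   # pay for generated energy
        self.demand = self.rng.uniform(0, 5)      # stochastic next demand
        return np.array([self.soc, self.demand]), reward, False, {}
```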
Ref. | Application | Algorithm | State Space | Policy | Dataset & Simulator |
---|---|---|---|---|---|
[49] | ET | AC | Continuous | Deterministic | [57] |
[51] | ET | AC | Discrete | Stochastic | [58] |
[52] | ET | QL | Continuous | Deterministic | [59] |
[54] | ET | Other | Discrete | Stochastic | Simulated data |
[55] | ET | PG, other | Continuous | Deterministic | Real data |
[50] | Dispatch | PG | Continuous | Deterministic | Simulated data |
[53] | Dispatch | AC, other | Continuous | Deterministic | [60,61] |
[56] | DR, microgrid | QL | Continuous | Deterministic | Real data, [62] |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[63] | Voltage control | Other | Continuous | Stochastic | IEEE 300, IEEE 9, [66] |
[64] | Voltage control | Other | Continuous | Stochastic | IEEE 300 |
[65] | Voltage control | AC | Continuous | Deterministic | IEEE 123 |
[66] | Voltage control | QL | Continuous | Stochastic | IEEE 39, [66] |
[67] | Microgrid | Other | Discrete | Deterministic | HIL platform “dSPACE MicroLabBox” |
[68] | Microgrid | PPO | Continuous | Stochastic | Empirical measurements |
[70] | Power flow, microgrid | AC | Continuous | Stochastic | [71,72] |
[69] | Power flow | PG | Continuous | Stochastic | IEEE 118 |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[73] | HVAC | AC | Discrete | Deterministic | [80] |
[74] | HVAC | AC | Continuous | Stochastic | Real data (available upon request)
[76] | HVAC | Other | Discrete | Deterministic | Simulated data |
[78] | HVAC | Other | Discrete | Deterministic | Simulated data |
[77] | HVAC | QL, other | Discrete | Deterministic | [81] |
[75] | HVAC | PPO | Discrete | Stochastic | “EnergyPlus” |
[79] | Dispatch | QL, other | Discrete | Deterministic | [82], Simulated data |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[83] | Power flow | QL | Discrete | Deterministic | MATLAB simulation |
[84] | Charge control | QL | Mixed | Deterministic | Historic prices |
[85] | Charge control | PG | Continuous | Deterministic | Simulated |
[86] | Charge control | AC | Continuous | Deterministic | Simulated |
[87] | Charge control | QL | Discrete | Deterministic | Open street map, ChargeBar |
[88] | Charge control | QL | Discrete | Deterministic | Simulated |
[90] | Charge scheduling | AC | Continuous | Deterministic | Simulated |
[91] | Charge control | AC | Continuous | Stochastic | Historic prices |
[89] | Load balancing | QL | Continuous | Deterministic | Simulated |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[100] | Smart grid | QL | Discrete | Stochastic | Simulated data |
[92] | Smart grid | Other | Discrete | Stochastic | Simulated data |
[98] | Smart grid | Other | Discrete | Stochastic | [101,102] |
[94] | EV | QL | Discrete | Stochastic | Simulated data |
[103] | EV | QL, other | Discrete | Deterministic | Simulated data |
[95] | EV | QL | Continuous | Deterministic | Simulated data |
[96] | EV | QL, other | Discrete | Stochastic | Simulated data |
[93] | Renewable energy | Other [104] | Discrete | Stochastic | Simulated data |
[97] | Battery ESS, frequency support | PG, AC | Continuous | Deterministic | Simulated data |
[99] | Energy system modeling | Other | Discrete | Stochastic | Simulated data |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[112] | ET | PG | Continuous | Deterministic | Real data |
[113] | ET | QL | Discrete | Deterministic | [122,123] |
[115] | ET | PG | Continuous | Deterministic | [124] |
[116] | ET | AC | Continuous | Stochastic | Simulated data |
[117] | ET | QL | Continuous | Deterministic | Real data |
[119] | Microgrid, dispatch | QL | Continuous | Stochastic | [125] |
[120] | Microgrid | PG | Continuous | Stochastic | Simulated data |
[121] | Microgrid | Other | Continuous | Deterministic | Real and simulated data |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[126] | Voltage control | PG | Continuous | Stochastic | Powerflow and Short circuit Assessment Tool (PSAT), 200-bus system [135] |
[127] | Voltage control | AC | Continuous | Stochastic | IEEE 33-, 123-, and 342-node systems |
[128] | Voltage control | QL | Discrete | Stochastic | IEEE 14-bus system |
[129] | Frequency control | QL | Discrete | Deterministic | Simulated data
[130] | Frequency control | PG | Discrete | Deterministic | Kundur’s 4-unit-13-bus system, New England 68-bus system, [136] |
[131] | Microgrid | QL | Continuous | Stochastic | 7-bus system and the IEEE 123-bus system |
[132] | Microgrid | Other | Discrete | Deterministic | Simulated data |
[133] | Power flow | PPO | Continuous | Stochastic | Illinois 200-bus system |
[134] | Power flow | PPO | Continuous | Stochastic | Simulated data, West Denmark wind data |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[137] | HVAC | QL | Discrete | Deterministic | Simulated data |
[146] | HVAC | PPO | Continuous | Deterministic | “EnergyPlus” |
[141] | HVAC | QL | Continuous | Stochastic | “EnergyPlus” |
[143] | HVAC | QL | Discrete | Deterministic | Simulated data |
[145] | HVAC | PG | Continuous | Stochastic | [148,149] |
[142] | HVAC | AC | Continuous | Stochastic | [150] |
[138] | HVAC, DR | PPO | Continuous | Deterministic | “EnergyPlus”
[140] | HVAC, DR | QL, PG | Continuous | Stochastic | [150] |
[139] | DR | QL, PG | Discrete | Deterministic | [151] |
[112] | Dispatch | PG | Continuous | Deterministic | Simulated data |
[147] | Dispatch | QL | Continuous | Stochastic | [61,152] |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[153] | Scheduling | QL | Discrete | Deterministic | “Open street map”, “ChargeBar” |
[154] | Scheduling | QL | Continuous | Deterministic | Simulated |
[155] | Scheduling | Other | Continuous | Deterministic | Historic data |
[156] | Scheduling | Other | Continuous | Deterministic | Simulated |
[158] | Scheduling | QL | Discrete | Deterministic | “ElaadNL” |
[157] | Cost reduction | Other | Mixed | Deterministic | Simulated |
[159] | Cost reduction | QL | Continuous | Deterministic | Simulated |
[162] | Cost reduction | QL | Discrete | Deterministic | Simulated |
[161] | DR | QL | Discrete | Deterministic | Simulated |
[160] | SoC control | QL, PG | Continuous | Deterministic | Historic data |
Ref. | Application | Algorithm | State Space | Policy | Dataset |
---|---|---|---|---|---|
[164] | Microgrids | QL | Continuous | Deterministic | Simulated data |
[165] | Microgrids | QL | Discrete | Deterministic | Simulated data |
[168] | Microgrids | QL | Continuous | Deterministic | Simulated data |
[169] | Microgrids | QL | Continuous | Deterministic | [125] |
[167] | Frequency control | Other | Continuous | Deterministic | Simulated data |
[97] | Frequency control | PG, AC | Continuous | Deterministic | Simulated data |
[163] | Energy trading | QL | Continuous | Deterministic | [171] |
[166] | Energy trading | QL | Continuous | Deterministic | Simulated data |
[170] | EV | QL | Continuous | Stochastic | Simulated data |
RL Expressions | Power Systems Application Expressions
---|---
“model-based” | “energy market management”
OR | OR
“model learning” | “voltage control”
OR | OR
“model-free” | “frequency control”
OR | OR
“data-driven” | “reactive power control”
AND/OR | OR
“reinforcement learning” | “grid stability”
 | OR
 | “microgrid”
 | OR
 | “building energy management”
 | OR
 | “building”
 | OR
 | “electrical vehicles”
 | OR
 | “EV”
 | OR
 | “energy storage control problems”
 | OR
 | “battery energy storage system”
 | OR
 | “local energy trading”
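
Reading the table as a boolean query (OR within each column, the two columns joined by AND), the full search string can be assembled programmatically; the sketch below is our interpretation of the table, not the literal string used in the review.

```python
rl_terms = ["model-based", "model learning", "model-free", "data-driven"]
app_terms = ["energy market management", "voltage control", "frequency control",
             "reactive power control", "grid stability", "microgrid",
             "building energy management", "building", "electrical vehicles", "EV",
             "energy storage control problems", "battery energy storage system",
             "local energy trading"]

# OR within each column; AND between columns. Per the table, "reinforcement
# learning" is AND/OR-combined with the paradigm terms; AND is assumed here.
rl_block = "(" + " OR ".join(f'"{t}"' for t in rl_terms) + ")"
app_block = "(" + " OR ".join(f'"{t}"' for t in app_terms) + ")"
query = f'{rl_block} AND "reinforcement learning" AND {app_block}'
print(query)
```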
Problem | QL (MB) | QL (MF) | PG (MB) | PG (MF) | AC (MB) | AC (MF) | PPO (MB) | PPO (MF) | Other (MB) | Other (MF)
---|---|---|---|---|---|---|---|---|---|---
ESS | 5 | 7 | 1 | 1 | 0 | 1 | 0 | 0 | 6 | 1
EV | 5 | 7 | 1 | 1 | 3 | 1 | 0 | 0 | 0 | 2
BEM | 2 | 6 | 0 | 4 | 2 | 1 | 1 | 2 | 3 | 0
GSAC | 1 | 3 | 1 | 2 | 2 | 1 | 1 | 2 | 3 | 1
EM | 2 | 3 | 2 | 3 | 3 | 1 | 0 | 0 | 3 | 2

MB = model-based; MF = model-free.
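
The totals behind the trend observations in Sections 3.6 and 4.6 follow directly from this table; a short tally (values transcribed from the rows above, the aggregation being ours) shows that Q-learning variants are the most investigated family overall, and that model-free studies outnumber model-based ones for most families.

```python
# Counts of reviewed works per algorithm family and paradigm, transcribed
# from the table above (MB = model-based, MF = model-free).
counts = {
    "QL":    {"MB": 5 + 5 + 2 + 1 + 2, "MF": 7 + 7 + 6 + 3 + 3},
    "PG":    {"MB": 1 + 1 + 0 + 1 + 2, "MF": 1 + 1 + 4 + 2 + 3},
    "AC":    {"MB": 0 + 3 + 2 + 2 + 3, "MF": 1 + 1 + 1 + 1 + 1},
    "PPO":   {"MB": 0 + 0 + 1 + 1 + 0, "MF": 0 + 0 + 2 + 2 + 0},
    "Other": {"MB": 6 + 0 + 3 + 3 + 3, "MF": 1 + 2 + 0 + 1 + 2},
}
for alg, c in counts.items():
    print(f"{alg}: MB={c['MB']}, MF={c['MF']}, total={c['MB'] + c['MF']}")
# QL: MB=15, MF=26, total=41 -- the largest family in the surveyed corpus.
```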
Category | Challenges
---|---
Lack of standardization | Limited real-world data and standardized tools (Section 6.1.1)
Lack of generalization | Limited scalability, generalization, and the curse of dimensionality (Section 6.1.2)
Limited safety | Limited robustness and safety for real-time applications (Section 6.1.3)
Nonstationary environments | Nonstationarity and time-variant environments (Section 6.1.4)
Reward shaping | Reward shaping and global objectives in power systems (Section 6.1.5)