1. Introduction
Lithium-ion batteries are rechargeable and widely recognized for their high energy density, long cycle life, and low self-discharge rate; they have revolutionized energy storage and become a fundamental technology in modern society [1,2,3]. Conventional charging methods, such as constant-current/constant-voltage (CC/CV) techniques, often fail to address the complexities of lithium-ion battery dynamics, resulting in suboptimal charging performance and potential battery degradation over time [4]. To address these challenges, researchers have explored advanced control algorithms and techniques for lithium-ion battery charging, aiming to improve efficiency, charging speed, and battery lifespan [5].
In this context, this paper presents a comparative analysis of three prominent intelligent control methods for lithium-ion battery charging: reinforcement learning (RL), fuzzy logic (FL), and classical proportional–integral–derivative (PID) control.
The RL-based controller was studied for its ability to learn optimal control strategies through interactions with the battery model and the power electronics; the latter was simplified using a small-signal model to speed up the training process. The RL controller trains a neural network with a reward function that penalizes current and voltage spikes to achieve greater stability in the charging process. The RL agent controls the input voltage to the battery: the neural network adjusts the output that activates the power electronics, evaluates the obtained response, and maximizes the reward value through multiple interactions. In addition to the voltage response, the system also monitors the battery state, rewarding or penalizing based on current values and aggressive control actions. RL algorithms, such as Q-learning or deep Q-networks, can discover charging policies that minimize charging time, energy consumption, and battery degradation while ensuring safe operation within the battery's limitations [6,7].
On the other hand, FL controllers offer a flexible and intuitive way to incorporate expert knowledge and heuristic rules into the charging process. By defining linguistic variables such as state of charge, temperature, and charging rate, and establishing a set of inference rules, two FL controllers are designed: the first regulates voltage, keeping it stable across different charging profiles; the second regulates current, avoiding excessive spikes and maintaining a stable value over time. Additionally, fuzzy-logic-based methods better handle the nonlinearity and inherent uncertainty of lithium-ion battery dynamics, leading to improved charging performance and extended battery life [8]. Finally, PID controllers can be tuned to optimize charging profiles, balancing factors such as charging time, energy efficiency, and battery health preservation.
This study compares the three implemented controllers, considering that the choice of the most suitable intelligent control method for lithium-ion battery charging depends on factors such as the specific application requirements, the available computational resources, the desired level of complexity, and the trade-offs between charging speed, energy efficiency, and battery lifespan. The three intelligent control methods are evaluated in MATLAB-Simulink R2024b, where the most relevant factors for achieving an efficient lithium battery charging method are analyzed. It is worth noting that MATLAB provides a detailed battery model for research purposes. The simulation algorithms are available to the community at: https://github.com/PrediJos/Intelligent-control-strategies-for-lithium-battery-chargers (accessed on 26 September 2024). The remaining sections of this paper are organized as follows: Methodology, Experiments and Results, and Conclusions.
2. Methodology
Simulating lithium battery chargers in MATLAB-Simulink offers a robust platform for analyzing and optimizing battery charging systems [9]. Lithium batteries provide numerous advantages that make them the preferred choice across various applications. Their high energy density ensures a superior energy-to-weight ratio, making them ideal for portable electronics and electric vehicles, where space and weight are crucial factors [10]. Moreover, their low self-discharge rate allows them to retain charge over extended periods, making them suitable for long-term energy storage. Additionally, recent advancements in recycling technologies and reduced environmental impact contribute to lithium batteries being a more sustainable and eco-friendly energy storage solution. The following sections outline control architectures designed to efficiently manage the lithium battery charging process.
2.1. Averaged Small-Signal Converter Modelling
Lithium-ion batteries require an effective design of the energy transmission system to avoid damage and guarantee optimal performance during discharge. Inefficient charging methods and components may degrade battery parameters such as cycle life, capacity, efficiency, state of charge (SOC), and state of health (SOH) [11,12]. This article relies on an isolated DC/DC converter power circuit to transfer energy from a source to the lithium-ion battery. The forward converter steps down the source voltage much like a DC/DC buck converter topology. In addition, the forward converter includes an extra stage between the input and the output with a transformer, which provides galvanic isolation and thus greater safety and protection for the battery against possible system failures, such as short circuits or overloads [13].
The small-signal model criterion examines the behavior of the forward converter under small disturbances near a stable operating point. Considering that the internal impedance of the battery acts as the resistive component, the model design accounts for the capacitive, inductive, and resistive components of the converter. According to Equation (1), the behavior of this circuit can be characterized as a second-order transfer function [14],

$$G(s) = \frac{G_0}{1 + \dfrac{s}{Q\,\omega_0} + \left(\dfrac{s}{\omega_0}\right)^2}, \tag{1}$$

where $\omega_0$ is the resonance frequency and $Q$ is the quality factor, which measures the ability of the circuit to filter frequencies close to $\omega_0$ [14]. The complete analysis of the circuit can be simplified by using the low-signal (low-Q) approximation, which assumes a small quality factor, so that the interactions between the inductive and capacitive components of the circuit are less evident. Equation (2) results from applying this criterion to Equation (1), making the transfer function more manageable and controllable:

$$G(s) \approx \frac{G_0}{\left(1 + \dfrac{s}{\omega_1}\right)\left(1 + \dfrac{s}{\omega_2}\right)}, \qquad \omega_1 \approx Q\,\omega_0, \quad \omega_2 \approx \frac{\omega_0}{Q}, \tag{2}$$

where $\omega_1$ and $\omega_2$ are the circuit cut-off frequencies. The converter model includes its working (duty) and transformation ratios, yielding

$$G_0 = n\,V_g, \qquad \omega_0 = \frac{1}{\sqrt{LC}}, \qquad Q = R_{bat}\sqrt{\frac{C}{L}}, \tag{3}$$

where $n$ is the transformer's transformation ratio, $D$ is the duty cycle (the operating point is $V_o = n\,D\,V_g$ for source voltage $V_g$), $L$ and $C$ are the inductive and capacitive components of the output filter of the forward converter, and $R_{bat}$ is the battery impedance. Based on these parameters, the transfer function becomes

$$G_{vd}(s) = \frac{n\,V_g}{1 + \dfrac{L}{R_{bat}}\,s + LC\,s^{2}}. \tag{4}$$
The converter simulation uses idealized components, such as the power supply, transformer, ideal diodes, and switches without delays or internal resistance, with the aim of improving simulation efficiency [15].
In Figure 1, two scenarios were analyzed: one without load and another with a lithium-ion battery as the load. For the no-load case, Figure 1a shows that the voltage and current curves obtained from the transfer function are smooth, although the converter output oscillates around its final value due to the charge and discharge cycles of the inductor and capacitor. Despite this, both the transfer function and the converter present similar voltage and current trends over time. In the second scenario, with the battery as the load, Figure 1c highlights that the main difference lies in the stabilization time. The idealization of the transfer function leads to a faster system response compared to the real converter, although the final output values are consistent.
One of the most notable advantages of using the transfer function is the simulation time. Although the converter model provides a more accurate representation of the real circuit, simulating it while designing the controllers is time consuming and often yields similar results.
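As a minimal illustration of this trade-off, the factored low-Q model of Equation (2) can be compared against the full second-order model of Equation (4) in a few lines of Python. The component values below are assumptions chosen only so that $Q \ll 1$; they are not the paper's actual converter parameters, which are defined in the MATLAB models on the project's GitHub repository.

```python
import numpy as np
from scipy import signal

# Assumed, illustrative parameters (not the paper's values)
n, Vg = 0.5, 48.0                # transformer ratio and source voltage
L, C, Rbat = 1e-3, 470e-6, 0.1   # output filter and battery impedance
# Here Q = Rbat*sqrt(C/L) ~= 0.07, so the low-Q approximation applies.

# Full second-order model, Equation (4): n*Vg / (1 + (L/Rbat)s + LC s^2)
Gvd = signal.TransferFunction([n * Vg], [L * C, L / Rbat, 1.0])

# Low-Q factored approximation, Equation (2): poles near Rbat/L and 1/(Rbat*C)
w1, w2 = Rbat / L, 1.0 / (Rbat * C)
Gvd_lowQ = signal.TransferFunction([n * Vg * w1 * w2],
                                   np.polymul([1.0, w1], [1.0, w2]))

# Compare step responses of the exact and approximated models
t = np.linspace(0.0, 0.05, 2000)
_, y_exact = signal.step(Gvd, T=t)
_, y_approx = signal.step(Gvd_lowQ, T=t)
print(f"max deviation: {np.abs(y_exact - y_approx).max():.4f} V")
```

With well-separated poles, the two responses nearly coincide, which is why the factored form is preferred when iterating on controller designs.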
2.2. Control Architectures Description
The control strategies use the lithium battery model in MATLAB-Simulink as a load, which provides detailed technical parameters for accurate simulation and analysis of battery performance. These parameters include SOC, open circuit voltage (OCV), internal resistance, capacity, charge/discharge rates, thermal dynamics, voltage limits, self-discharge rate, and equivalent circuit components. The controllers will handle the supplied voltage and current during the charging process.
The control architectures are modeled with the same structure for a fair comparison. The CC/CV charging process begins with a current control phase, where the current is set at a safe level, usually a fraction of the battery's nominal capacity. During this phase, the battery voltage gradually increases as charge accumulates, until it reaches a threshold of 3.855 volts per cell, slightly below the maximum value. The system then switches to the voltage control phase, keeping the voltage stable at this level [16,17]. During voltage control, the current gradually decreases as the battery approaches full charge; this reduction occurs because the voltage is held constant, so the charging current drops as the battery becomes almost fully charged. This CC/CV approach prevents overcharging, reduces battery stress, minimizes overheating risks, and extends battery life [16,17]. The control strategies that were assessed are summarized in Table 1 and detailed in the following sections.
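The two-phase switching just described can be summarized in a short, hedged Python sketch. The 3.855 V threshold and the 0.3 A current level are quoted from the paper; the function and signal names are illustrative rather than the authors' implementation.

```python
def cccv_mode(v_cell, i_meas, v_threshold=3.855, i_ref=0.3):
    """Select the active CC/CV phase and return the tracking error
    that the corresponding controller should drive to zero."""
    if v_cell < v_threshold:
        return "CC", i_ref - i_meas       # constant-current phase
    return "CV", v_threshold - v_cell     # constant-voltage phase
```

In the CV phase, the current is no longer regulated directly; it decays naturally as the cell approaches full charge.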
2.2.1. Reinforcement Learning Architecture
Reinforcement learning (RL) enables the controller to learn optimal control strategies through interaction with the system [18], adapting to varying conditions and improving performance over time. This approach allows the controller to handle complex, nonlinear dynamics that traditional control methods might struggle with. The RL architecture is summarized in Figure 2b, where the trained agent is executed using MATLAB's Reinforcement Learning Toolbox. The current control was addressed with a classic PI controller; this choice was made because of the high computational load experienced when training two agents, and because the battery model responds slowly to any action, which yields an excessively long training time.
The reward function should incentivize maintaining the voltage within a desired range while penalizing deviations from it. Additionally, it should consider other objectives such as minimizing energy consumption, reducing oscillations, and ensuring stability. In this sense, the reward at each step combines a reward constant $r$ and a penalization constant $p$; note that the penalizations are negative values that decrease the total reward. The function depends on the voltage error $e(t)$ at the current time $t$, on the control action $a(t-1)$ and the voltage observation $v(t-1)$ from the previous sample, and on two voltage constants, $V_{ref}$ and $V_{max}$, which denote the voltage reference and the maximum voltage overshoot threshold; the latter is usually set 10% above the cell's voltage level. The proposed reward function promotes lower errors and actions closer to the voltage setpoint, and penalizes large voltage observations and larger errors.
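As a hedged illustration only, and not the authors' exact formula, the description above could map to Python as follows. The tolerance bands and the pairing of constants are assumptions; the magnitudes $r = 200$ and $p = -10$ are those reported later in the training discussion.

```python
def reward(e_t, a_prev, v_prev, V_ref=3.855, r=200.0, p=-10.0):
    """Sketch of a reward consistent with the paper's description:
    reward small errors and near-setpoint actions, penalize overshoot
    and large errors. Tolerances are illustrative assumptions."""
    V_max = 1.10 * V_ref                            # 10% above the cell level
    R = 0.0
    R += r if abs(e_t) < 0.01 else 0.0              # small voltage error
    R += r if abs(a_prev - V_ref) < 0.10 else 0.0   # action near the setpoint
    R += p if v_prev > V_max else 0.0               # voltage overshoot penalty
    R += p if abs(e_t) > 0.50 else 0.0              # large-error penalty
    return R
```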
The agent's policy learning architecture consists of an actor–critic scheme, as shown in Figure 3a. The actor network, responsible for selecting actions, can handle continuous action spaces effectively, which is crucial for tasks requiring fine-grained control, such as voltage regulation. The critic network evaluates the actions by estimating the value function, providing feedback to the actor; this setup allows for stable and efficient learning by reducing variance in the policy gradient estimates. In this work, we use a Continuous Gaussian Actor Network (CGAN), which handles continuous action spaces by outputting the parameters of a Gaussian distribution: its mean and standard deviation. This stochastic approach facilitates exploration, allowing the agent to try various actions and learn optimal policies, and the smooth, differentiable nature of the Gaussian distribution supports efficient gradient-based optimization in complex control environments.
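To make the Gaussian-actor idea concrete, here is a minimal Python sketch of how such an actor turns an observation into a stochastic action. The linear parameterization and weight names are assumptions for illustration; the paper itself uses MATLAB toolbox networks with several fully connected layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_actor(obs, W_mu, W_sigma):
    """Map an observation to the mean and standard deviation of a normal
    distribution, then sample the action from it."""
    mu = W_mu @ obs
    sigma = np.log1p(np.exp(W_sigma @ obs))  # softplus keeps sigma positive
    return rng.normal(mu, sigma)             # sampling drives exploration
```

During training, the critic's value estimate determines whether actions sampled this way are reinforced or discouraged.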
The actor–critic agent was set up with several fully connected layers (universal function approximators) to handle the complex voltage relationships [19,20]. Most of the activation functions were Rectified Linear Units (ReLU) to keep the training computationally efficient, mitigate the vanishing gradient problem, and promote sparse activation of the layers, as shown in Figure 3a. The proposed agent was trained using MATLAB's Reinforcement Learning Toolbox. The training results displayed in Figure 3b show that the maximum average reward is reached after 200 epochs. It is worth noting that each epoch lasts approximately 5 min on a standard PC, yielding a long training time. The constants in the reward function were set empirically to maximize learning; since the most critical objective is error reduction, the two reward constants were set to 200 and the two penalization constants to −10. Other values might enhance the learning process; however, the slow response of the battery complicates the tuning process.
2.2.2. Fuzzy Architecture
The fuzzy proportional–derivative (PD) controller can handle nonlinearities and uncertainties in the system more effectively than traditional PD controllers, providing robust performance under varying operating conditions. It adapts to changes in battery characteristics, such as state-of-charge and temperature variations, ensuring stable and accurate voltage control. The fuzzy logic approach allows for smooth and gradual control actions, reducing the risk of overshooting and oscillations. The fuzzy architecture detailed in Figure 4a uses a fuzzy PD to control the voltage and a fuzzy PD + I, which includes an integral action, to control the current. The Fuzzy Inference Systems (FIS) were built with a Sugeno scheme, which is robust to system variations and uncertainties; it requires a simplified rule base, reducing design complexity, and ensures smooth output transitions, minimizing oscillations and overshooting for a more reliable and efficient charging process. In this work, we rely on normalized triangular membership functions to handle the error and its derivative, and the output uses three Sugeno normalized functions to handle the states minimum, zero, and maximum, as shown in Figure 4b. The input ranges in Figure 4b were constrained from −100 to 100 to avoid saturation caused by the variation of the input variables. The input range is scaled by the gains $K_e$ and $K_d$, and the output range by $K_u$; in our notation, the subindices $v$ and $i$ denote the voltage and current controllers, respectively.
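For readers unfamiliar with Sugeno fuzzy PD control, the following Python sketch shows the general mechanics under stated assumptions: three triangular terms per input, three Sugeno singleton outputs for the states minimum, zero, and maximum, and a deliberately simple illustrative rule base that is not the paper's actual FIS.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzy_pd(error, d_error, Ke=1.0, Kd=1.0, Ku=1.0):
    """Sketch of a normalized Sugeno fuzzy-PD controller; the gains
    Ke, Kd, Ku and the rule base are illustrative assumptions."""
    e = np.clip(Ke * error, -100.0, 100.0)
    de = np.clip(Kd * d_error, -100.0, 100.0)
    terms = {"neg": (-200, -100, 0), "zero": (-100, 0, 100), "pos": (0, 100, 200)}
    out = {"neg": -100.0, "zero": 0.0, "pos": 100.0}   # Sugeno singletons
    num = den = 0.0
    for name_e, mf_e in terms.items():
        for name_de, mf_de in terms.items():
            w = min(tri(e, *mf_e), tri(de, *mf_de))    # rule firing strength
            # Simple rule base: the error term dominates, the derivative
            # term breaks ties when the error is near zero.
            label = name_e if name_e != "zero" else name_de
            num += w * out[label]
            den += w
    return Ku * (num / den if den else 0.0)            # weighted-average defuzz
```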
2.2.3. Classic PID Architecture
Classic proportional–integral–derivative (PID) controllers are widely used in battery charging applications due to their simplicity, effectiveness, and proven performance. They are easy to implement and tune, making real-time adjustments to optimize charging conditions. PID controllers provide stable and accurate control by combining proportional, integral, and derivative actions, which ensures precise regulation of voltage and current. Their versatility allows them to be adapted to various battery types and charging scenarios. Additionally, PID controllers are cost-effective, requiring minimal computational resources, making them suitable for embedded systems and low-cost hardware implementations. The PID architecture uses only two classic controllers, for voltage and current, as shown in Figure 2a. The internal structure of a classic PID is assumed to be well known and is not detailed here due to space limitations.
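For reference, a textbook discrete PID has the following structure, shown here as a Python sketch; the gains are left as parameters, with the tuning procedure described in Section 3.1.

```python
class PID:
    """Classic discrete PID: u = kp*e + ki*integral(e) + kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt                   # integral action
        derivative = (error - self.prev_error) / self.dt   # derivative action
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```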
3. Experiments and Results
3.1. Performance Assessment
For the implemented controllers, the reinforcement learning reward function was determined heuristically, varying each constant of a given condition in small steps until the function that provided the best training was obtained by trial and error. For the fuzzy controller (PD + I), manual tuning was likewise carried out by trial and error, starting with the adjustment of the proportional parameter P until an adequate response was achieved, followed by the derivative and integral constants D and I, and finally the control gain K, in that order, for both the current and the voltage controllers. The PID controller was tuned using the MATLAB PID Tuner tool, which automatically computes the gains from the plant dynamics [21].
In the analysis of the controllers, several aspects are evaluated: precision in voltage and current regulation, response time, stability, and resistance to disturbances. For the controller based on neural networks trained with reinforcement learning, the system reached a stable nominal voltage without overshoot when the current was zero; see Figure 5b. However, when applying CC/CV control, the agent's lack of knowledge of the existing current caused fluctuations in the voltage, allowing the current to decrease variably while the battery continued to charge. The fuzzy controller, designed to maintain a stable charge, showed longer charge and stabilization times but no spikes in the transition between current and voltage control, as shown in Figure 5c. The PID controller, although it responded quickly to errors, presented greater disturbances, and during the transition from current to voltage control an excess current was observed in Figure 5b that could be dangerous for the battery. In terms of charging speed, the reinforcement learning controller was the fastest, although for practical use it would need to consider the circulating current to improve its performance. A chattering phenomenon also occurred, which could damage the electronic components and reduce battery life, indicating the need to improve the RL and fuzzy architectures. The fuzzy controller, although safer, was slower and could be optimized by adjusting the inference rules. The PID showed intermediate performance, without depending on the operator's experience.
3.2. Performance Metrics
To measure the performance of the different controllers, the voltage control action entering the DC/DC converter was analyzed, with the results reported in Table 2. The RMS value verifies the variation in the voltage control action and shows that the RL controller has a slightly higher value, indicating greater variations in its control action. This suggests a very fine-tuned response that reacts to even the smallest changes.
In addition to the analysis of the voltage delivered as a control action to the converter, the battery current was evaluated through its RMS value for all controllers, obtaining a constant value of 0.3 A. This result suggests that, despite differences in control strategies and variations in control action, all controllers manage to maintain the output current within the desired range. It indicates that, although the RL controller shows greater variability in its voltage control action, it does not compromise stability and consistency in terms of output current, which is crucial for the safe and efficient operation of the system.
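For reproducibility, the RMS metric used in these comparisons is the standard definition, shown here in Python rather than the paper's MATLAB:

```python
import numpy as np

def rms(x):
    """Root-mean-square value of a sampled signal."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2))
```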
4. Discussion
The results in Table 2 were obtained by simulating each controller over a 6 s scenario. Of the three controllers, the RL controller took the longest to execute, at 21,046 s. When compared with the other controllers applied to the DC/DC converter to regulate the voltage and current of a lithium-ion battery, the RL controller presented greater variations in its output, even when faced with small disturbances, causing the chattering phenomenon: high-frequency oscillations in the control actions that, when applied as an activation signal to real power electronics, can cause overheating of the battery and converter components, sensor noise, and a reduction in battery life. This variation in the RL controller is 0.02 V. To improve the controller and mitigate these oscillations, options include adding low-pass filters at the controller output, modifying the neural network architecture, or using a hybrid control approach where a PID or fuzzy controller regulates the output of the neural network.
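A minimal discrete first-order low-pass filter of the kind suggested above could look like this in Python; the smoothing factor is illustrative, not a tuned value from the paper.

```python
def low_pass(u_new, u_prev, alpha=0.1):
    """First-order low-pass filter on the control action: smaller alpha
    smooths more aggressively at the cost of a slower response."""
    return alpha * u_new + (1.0 - alpha) * u_prev
```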
The RL controller is highly adaptable to different situations because it reacts to disturbances with greater sensitivity, but these variations suggest the need to include more parameters in the agent training to improve its ability to adapt to a greater number of scenarios, increase charging efficiency, and avoid possible instabilities. On the other hand, the fuzzy controllers, through their inference rules, seek to avoid high current peaks and therefore perform a slower charge, while the PID reacts to the error by modifying the output signal and can adapt to dynamic changes.
Regarding the performance of the three controllers, their output was evaluated using the root mean square (RMS) value. The results show that all controllers maintain an average voltage similar to the nominal voltage of the battery, which is crucial to prevent overcharging and avoid overheating. Regarding current control, all controllers reach the reference without overshoot and with very short response times, also demonstrating good tolerance to external disturbances. Regarding voltage control, the controllers reach the reference consistently, which causes the current to decrease gradually until it reaches zero, thus completing the charging cycle. However, in the case of the reinforcement-learning-based controller, a slight oscillation in the current was observed, fluctuating between the current value and zero, which could require additional adjustments to improve its stability in the final phase of the charge.
5. Conclusions
Using PID, fuzzy, and agent-based controllers trained with reinforcement learning to regulate the charge of a battery offers different advantages and challenges. PID controllers are known for their simplicity and effectiveness in regulating voltage and current, providing a fast and stable response in many applications. Fuzzy controllers encode the experience of their designers, making the system adaptable to certain situations depending on the number of inference rules and controllers applied. On the other hand, controllers based on neural networks with reinforcement learning can dynamically adapt to changing conditions and optimize their performance over time. Although the latter can offer significant improvements in precision and efficiency, they also require a greater computational load, resulting in longer simulation times; for greater precision, they require more complex neural network architectures that increase simulation times further. When charging lithium-ion batteries, each type of controller offers specific advantages and disadvantages. The controller based on reinforcement learning stands out for its high adaptability and ability to adjust to various charging conditions, thanks to its capacity to handle small disturbances and optimize the charging process dynamically. However, it requires more extensive training and the inclusion of additional parameters to maximize its efficiency and avoid instabilities. On the other hand, fuzzy and PID controllers offer greater stability and lower variability, which can be beneficial in applications where more predictable and reliable performance is required. Although these controllers show less ability to adapt to changing conditions, their stability and simplicity may be suitable for less dynamic systems or where predictability is crucial.
Author Contributions
Conceptualization, P.R. and W.C.; methodology, P.R., W.C., and J.M.; software, P.R., W.C., J.M., and J.R.; validation, P.R., W.C., J.M., J.R., D.O., and L.S.; formal analysis, P.R.; investigation, P.R., W.C., and J.M.; resources, P.R.; data curation, W.C.; writing—original draft preparation, P.R.; writing—review and editing, J.M.; visualization, W.C.; supervision, J.R., D.O., and L.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Nishi, Y. The Dawn of Lithium-Ion Batteries. Electrochem. Soc. Interface 2016, 25, 71–74.
- Zubi, G.; Dufo-López, R.; Carvalho, M.; Pasaoglu, G. The lithium-ion battery: State of the art and future perspectives. Renew. Sustain. Energy Rev. 2018, 89, 292–308.
- Takeuchi, E.S.; Leising, R.A. Lithium Batteries for Biomedical Applications. MRS Bull. 2002, 27, 624–627.
- Chen, K.; Zhang, K.; Lin, X.; Zheng, Y.; Yin, X.; Hu, X.; Song, Z.; Li, Z. Data-Enabled Predictive Control for Fast Charging of Lithium-Ion Batteries with Constraint Handling. arXiv 2022, arXiv:2209.12862.
- Liu, K.; Zou, C.; Li, K.; Wik, T. Charging Pattern Optimization for Lithium-Ion Batteries with an Electrothermal-Aging Model. IEEE Trans. Ind. Inform. 2018, 14, 5463–5474.
- Park, S.; Pozzi, A.; Whitmeyer, M.; Perez, H.; Joe, W.T.; Raimondo, D.M.; Moura, S. Reinforcement Learning-based Fast Charging Control Strategy for Li-ion Batteries. In Proceedings of the 2020 IEEE Conference on Control Technology and Applications (CCTA), Montreal, QC, Canada, 24–26 August 2020; pp. 100–107.
- Chang, F.; Chen, T.; Su, W.; Alsafasfeh, Q. Control of battery charging based on reinforcement learning and long short-term memory networks. Comput. Electr. Eng. 2020, 85, 106670.
- Tian, N.; Fang, H.; Wang, Y. Real-Time Optimal Lithium-Ion Battery Charging Based on Explicit Model Predictive Control. IEEE Trans. Ind. Inform. 2021, 17, 1318–1330.
- MathWorks. Battery. Available online: https://la.mathworks.com/help/sps/powersys/ref/battery.html (accessed on 7 August 2024).
- Takkalaki, N.; Desai, S.G.; Mishra, R.; Dubey, K. Design and Simulation of Lithium-Ion Battery for Electric Vehicle. In Proceedings of the 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 6–8 July 2021; pp. 1–6.
- Qi, Y.; Zhi, P.; Ye, H.; Zhu, W. Research on Lifetime Optimization of Unmanned Ship Lithium Battery Pack Power Supply System Based on Artificial Fish Swarm Algorithm. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 5379–5384.
- Chowdhury, S.; Bin Shaheed, M.N.; Sozer, Y. An Integrated State of Health (SOH) Balancing Method for Lithium-Ion Battery Cells. In Proceedings of the 2019 IEEE Energy Conversion Congress and Exposition (ECCE), Baltimore, MD, USA, 29 September–3 October 2019; pp. 5759–5763.
- Mohan, N.; Undeland, T.M.; Robbins, W.P. Power Electronics: Converters, Applications, and Design, 3rd ed.; Wiley: Hoboken, NJ, USA, 2002.
- Erickson, R.W.; Maksimovic, D. Fundamentals of Power Electronics, 2nd ed.; Kluwer: Norwell, MA, USA, 2001.
- MATLAB. Power Electronics Simulation. Available online: https://la.mathworks.com/discovery/power-electronics-simulation.html (accessed on 7 August 2024).
- Aizpuru, I.; Iraola, U.; Canales, J.M.; Echeverria, M.; Gil, I. Passive balancing design for Li-ion battery packs based on single cell experimental tests for a CCCV charging mode. In Proceedings of the 2013 International Conference on Clean Electrical Power: Renewable Energy Resources Impact (ICCEP), Alghero, Italy, 11–13 June 2013; pp. 93–98.
- Reddy, B.S.T.; Reddy, K.S.; Deepa, K.; Sireesha, K. A FLC based Automated CC-CV Charging through SEPIC for EV using Fuel Cell. In Proceedings of the 2020 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT), Bangalore, India, 12–13 November 2020; pp. 177–183.
- Marahatta, A.; Rajbhandari, Y.; Shrestha, A.; Phuyal, S.; Thapa, A.; Korba, P. Model predictive control of DC/DC boost converter with reinforcement learning. Heliyon 2022, 8, e11416.
- MATLAB. rlACAgent. Available online: https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlacagent.html (accessed on 7 August 2024).
- Ye, J.; Guo, H.; Mei, S.; Hu, Y.; Zhang, X. A TD3 Algorithm Based Reinforcement Learning Controller for DC-DC Switching Converters. In Proceedings of the 2023 International Conference on Power Energy Systems and Applications (ICoPESA), Nanjing, China, 24–26 February 2023; pp. 358–363.
- MATLAB. PID Tuner. Available online: https://www.mathworks.com/help/control/ref/pidtuner-app.html (accessed on 7 August 2024).