An Evolutionary Deep Learning Framework for Accurate Remaining Capacity Prediction in Lithium-Ion Batteries

Liu, Yang; Han, Liangyu; Wang, Yuzhu; Zhu, Jinqi; Zhang, Bo; Guo, Jia

doi:10.3390/electronics14020400


by Yang Liu ¹, Liangyu Han ¹, Yuzhu Wang ¹, Jinqi Zhu ¹,²,*, Bo Zhang ³,* and Jia Guo

¹

College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China

² Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China

³ College of Physics and Materials Science, Tianjin Normal University, Tianjin 300387, China

* Authors to whom correspondence should be addressed.

Electronics 2025, 14(2), 400; https://doi.org/10.3390/electronics14020400

Submission received: 26 December 2024 / Revised: 17 January 2025 / Accepted: 19 January 2025 / Published: 20 January 2025

(This article belongs to the Special Issue Advanced Machine Learning, Pattern Recognition, and Deep Learning Technologies: Methodologies and Applications)


Abstract:

Accurate remaining capacity prediction (RCP) of lithium-ion batteries (LIBs) is crucial for ensuring their safety, reliability, and performance, particularly amidst the growing energy crisis and environmental concerns. However, the complex aging processes of LIBs significantly hinder accurate RCP, as traditional prediction methods struggle to effectively capture nonlinear degradation patterns and long-term dependencies. To tackle these challenges, we introduce an innovative framework that combines evolutionary learning with deep learning for RCP. This framework integrates Temporal Convolutional Networks (TCNs), Bidirectional Gated Recurrent Units (BiGRUs), and an attention mechanism to extract comprehensive time-series features and improve prediction accuracy. Additionally, we introduce a hybrid optimization algorithm that combines the Sparrow Search Algorithm (SSA) with Bayesian Optimization (BO) to enhance the performance of the model. The experimental results validate the superiority of our framework, demonstrating its capability to achieve significantly improved prediction accuracy compared to existing methods. This study provides researchers in battery management systems, electric vehicles, and renewable energy storage with a reliable tool for optimizing lithium-ion battery performance, enhancing system reliability, and addressing the challenges of the new energy industry.

Keywords:

remaining capacity prediction; lithium-ion batteries; evolutionary deep learning; TCN-BiGRU-Attention; hybrid SSA-BO algorithm

1. Introduction

Lithium-ion batteries (LIBs), renowned for their high energy density, long cycle life, and low self-discharge rate, have become indispensable in a wide range of industries, including unmanned aerial vehicles, electric vehicles, and portable electronics [1,2]. Despite their widespread adoption, LIBs inevitably degrade over time, resulting in reduced usable capacity and increased internal resistance, which can ultimately lead to battery failure. If not replaced before reaching their end-of-life (EOL), LIBs pose significant safety risks, potentially compromising the stability and reliability of the systems they power. For instance, in April 2021, a lithium iron phosphate battery short-circuited, causing a fire and explosion at a Beijing facility, which injured four people and resulted in damages of nearly 17 million yuan (approximately USD 2.6 million or EUR 2.4 million). Such incidents underscore the critical need for accurate remaining capacity prediction (RCP) to preemptively identify and mitigate potential hazards before batteries degrade beyond safe thresholds.

However, predicting the remaining capacity of LIBs is a challenging task. LIBs’ degradation process is highly complex, involving intricate physical and chemical mechanisms that vary over time and usage conditions. Purely theoretical models often fail to capture this complexity comprehensively, leading to limited predictive accuracy. Moreover, laboratory-based experimental approaches are impractical for large-scale applications due to their time- and resource-intensive nature. These limitations highlight the necessity of advanced predictive models that can effectively balance accuracy, efficiency, and scalability.

In recent years, data-driven approaches, particularly machine learning (ML), have emerged as powerful tools for predicting LIB performance. ML methods excel at capturing nonlinear relationships and long-term dependencies, making them well suited for deciphering the complex “composition–structure–process–property” relationships inherent in LIBs [3]. However, traditional ML models face notable limitations, including difficulty in capturing both short-term and long-term temporal dependencies, reliance on manual parameter tuning, and challenges in achieving the required precision and timeliness for RCP.

To address these issues, this study introduces an evolutionary deep learning framework tailored for RCP in LIBs, leveraging the complementary strengths of advanced deep learning models and intelligent optimization. The proposed framework integrates the following:

Temporal Convolutional Networks (TCNs): TCNs effectively capture short-term dependencies and local temporal features using causal convolutions, while dilated convolutions expand the receptive field to model long-term dependencies and multi-scale temporal patterns efficiently. This combination enables TCNs to handle both near-term and long-term degradation patterns in time-series data.
Bidirectional Gated Recurrent Units (BiGRUs): BiGRUs capture long-term dependencies by processing the sequence in both directions, providing a comprehensive view of the battery’s degradation trajectory.
Attention Mechanism: The attention mechanism dynamically assigns weights to different time steps, emphasizing critical degradation stages that significantly impact prediction accuracy.

Together, these components form a robust time-series feature extraction framework capable of addressing the multi-scale temporal dependencies and mitigating noise present in LIB degradation data. Additionally, a hybrid optimization algorithm combining the Sparrow Search Algorithm (SSA) and Bayesian Optimization (BO) is proposed to fine-tune the network’s hyperparameters and architecture. SSA ensures diverse candidate solutions through global exploration, while BO refines these solutions with high precision, effectively balancing exploration and exploitation. This hybrid approach enables faster convergence and improved model reliability compared to conventional optimization methods.

To validate the proposed framework, extensive experiments and ablation studies were conducted on three datasets: a publicly available NASA dataset and two custom silicon-based half-cell datasets, each incorporating different conductive agents. The results demonstrate that the proposed approach significantly outperforms traditional models and optimization techniques, achieving highly accurate and stable remaining capacity predictions across all datasets. These findings underscore the framework’s potential for real-world applications in battery health monitoring and management systems, contributing to safer and more reliable energy storage solutions.

The key contributions of this study are as follows:

(1): Advanced Time-Series Feature Extraction Model

We developed a novel feature extraction model combining TCNs, BiGRUs, and an attention mechanism. This architecture effectively captures short-term dependencies, long-term dependencies, and critical features, enabling comprehensive and accurate modeling of LIB degradation patterns.

(2): Intelligent Optimization Framework

A hybrid optimization algorithm that integrates SSA and BO is proposed to automatically fine-tune the network parameters. This approach achieves faster convergence, reduced optimization time, and improved reliability in identifying optimal solutions.

(3): Comprehensive Experimental Validation

Extensive experiments and ablation studies are conducted to rigorously evaluate the framework’s effectiveness. The results demonstrate significant improvements in prediction accuracy, robustness, and real-world applicability compared to traditional methods.

The remainder of this paper is organized as follows. Section 2 reviews existing technologies for RCP in LIBs, highlighting their limitations and challenges. Section 3 presents the proposed solution in detail. Section 4 describes the experimental design and validation process. Finally, Section 5 summarizes the main contributions and findings of this study.

2. Related Works

The widespread application of LIBs and the critical importance of accurately predicting their remaining capacity have garnered significant attention from researchers across multiple disciplines. Two commonly used metrics in this domain are State-of-Health (SOH) and Remaining Useful Life (RUL). SOH represents the ratio of the current maximum battery capacity to its rated capacity [4], while RUL refers to the estimated number of remaining charge–discharge cycles before the battery reaches its minimum acceptable capacity. SOH provides a quantitative representation of a battery’s health status, which is closely associated with its capacity degradation. Moreover, accurate and timely RUL predictions are essential for providing early warnings of potential failures and ensuring battery reliability [5].

Numerous methodologies have been developed to predict the SOH and RUL of LIBs. These approaches can be broadly categorized into three groups: model-based methods, data-driven methods, and fusion methods. Below, we provide a detailed review of these methodologies.

(1): Model-Based Methods

Model-based approaches rely on mathematical and physical representations of battery behavior, including electrochemical models, equivalent circuit models (ECMs), and empirical models:

(i): Electrochemical Models

These models provide detailed insights into the internal physicochemical processes of LIBs. For instance, Li et al. [6] and Chen et al. [7] utilized first-principles models to characterize degradation mechanisms. While offering high accuracy, these models are computationally intensive and challenging to adapt to varying operating conditions.

(ii): Equivalent Circuit Models (ECMs)

ECMs, such as those developed by Allafi et al. [8] and Naseri et al. [9], simplify LIB behavior by representing it with electrical components, such as resistors and capacitors. Notably, Naseri et al. [10] proposed an equivalent circuit model specifically designed for real-time battery management applications, offering a computationally efficient approach to accurately capture battery behavior under dynamic conditions. While these models are well suited for real-time applications due to their efficiency, they often fail to account for the complexities of internal state changes, which can limit their accuracy in highly dynamic operating environments.

(iii): Empirical Models

Empirical approaches, like those proposed by Wang Shuai et al. [11], rely on historical data to derive relationships between capacity and degradation factors. These models are easy to implement but struggle to accommodate real-time dynamic changes and environmental variability.

(2): Data-Driven Methods

Data-driven approaches do not require a deep understanding of the battery’s internal mechanisms. Instead, they leverage historical and operational data to predict remaining capacity. These methods can be further divided into three categories:

(i): Statistical Techniques

Models such as ARIMA and Kalman filtering [12,13,14,15,16] offer simplicity and computational efficiency. However, their predictive accuracy is limited, particularly for long-term capacity estimations and complex degradation patterns.

(ii): Stochastic Models

Gaussian Processes (GPs) [17,18,19] have been widely employed for uncertainty quantification and interval prediction. While effective in providing confidence intervals, they face challenges when applied to time-varying conditions and large-scale data.

(iii): Machine Learning (ML) and Deep Learning (DL)

ML techniques, such as support vector machines (SVMs) [20,21], have demonstrated success in handling nonlinear relationships. Recently, DL models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers [22,23,24,25,26,27,28], have excelled in recognizing complex patterns and predicting capacity degradation. However, DL models require high-quality datasets and substantial computational resources, which limits their scalability in real-time applications.

(3): Fusion Methods

Fusion methods combine the strengths of multiple approaches to counteract individual weaknesses and maximize prediction accuracy. For instance, hybrid models integrating CNNs and RNNs have shown improved State-of-Charge (SOC) prediction [29]. Advanced frameworks like SFTTN utilize CNNs and Bidirectional Long Short-Term Memory (BiLSTM) networks to address cross-domain estimation challenges [30]. Dong et al. [31] highlight the effectiveness of hybrid and optimization-driven approaches in enhancing SOH and RUL predictions. Li et al. [32] proposed a CNN-BiLSTM hybrid model for cross-domain capacity estimation of lithium-ion batteries, effectively addressing generalization challenges and achieving improved accuracy across diverse datasets. Wang et al. [33] introduced a transfer-learning-based framework for SOH prediction of lithium-ion batteries, enabling accurate cross-chemistry performance estimation and improving model adaptability to various battery types. Additionally, combined models incorporating Graph Convolutional Networks and attention mechanisms achieve enhanced predictions [34]. Zhu et al. [35] proposed graph-based methods for RUL predictions, leveraging spatial–temporal dependencies to improve prediction accuracy and robustness. In conclusion, while each method has distinct strengths, integrating methods through fusion approaches presents a promising path for more robust and accurate LIB capacity predictions.

Despite significant progress, challenges remain in predicting LIB remaining capacity: (i) Complexity of Degradation Mechanisms: Nonlinear and variable degradation processes often hinder model-based methods. (ii) Generalizability and Scalability: Many methods struggle to adapt across varying chemistries, conditions, and usage scenarios. (iii) Computational Efficiency: Deep learning and fusion methods demand substantial computational resources, limiting real-time applicability. To address these challenges, this study proposes an evolutionary deep learning framework that integrates advanced time-series feature extraction with intelligent optimization algorithms. By combining hybrid deep learning architectures with a tailored optimization strategy, the framework enables accurate, robust, and scalable predictions, enhancing safety and reliability in battery management systems.

3. Methodology

3.1. Overview

The framework of the proposed method, illustrated in Figure 1, comprises the following key steps: ① data collection, ② data preprocessing, ③ design of the time-series feature extraction model, ④ automatic hyperparameter optimization, ⑤ model training, and ⑥ model testing. Specifically, these steps aim to ensure a systematic approach to the development and evaluation of the proposed methodology. Each stage plays a critical role in addressing the challenges of the problem domain, with a particular focus on enhancing the accuracy and efficiency of time-series data analysis. Detailed explanations for each step numbered ① to ⑥ are provided below:

①: Data Collection

The capacity of an LIB is defined as the total charge it can deliver when fully discharged, and it serves as a key metric for evaluating battery health and performance. Two indicators, State-of-Health (SOH) and remaining capacity, are used in this study to quantify battery degradation and predict future performance.

SOH is defined as the ratio of the current maximum capacity to the rated capacity. It is expressed as

$$\mathrm{SOH}(t) = \frac{C_{t}}{C_{0}} \times 100\% \tag{1}$$

where $C_{0}$ denotes the rated capacity and $C_{t}$ denotes the measured capacity of cycle $t$. As the number of charge/discharge cycles increases, the capacity of LIBs degrades, with the End of Life (EOL) typically defined as the point where the remaining capacity reaches 70–80% of the initial rated capacity for most commercial batteries.

In this study, $C_{t}$ is derived from the following integral:

$$C_{t} = \int_{t_{1}}^{t_{2}} i(t)\, dt \tag{2}$$

where the following definitions hold:

$C_{t}$: Battery capacity in ampere-hours (Ah) or milliampere-hours (mAh).
$i(t)$: Current flowing through the battery at time $t$ (in amperes).
$t_{1}$: Start time of the discharge process.
$t_{2}$: End time of the discharge process, typically when the battery reaches a specified cutoff voltage.

Here, the capacity $C_{t}$ of an LIB is utilized as the direct health indicator, as it provides a quantitative measure of the remaining cycles. Considering the substantial volume of data generated during the charging and discharging processes, the integral calculation is approximated using a summation approach for practical implementation. The capacity $C_{t}$, based on this summation method, is expressed as

$$C_{t} = \sum_{i=2}^{n} (I_{i} \times \Delta t_{i}) \tag{3}$$

where the following definitions hold:

$C_{t}$: Total battery capacity in ampere-hours (Ah) or milliampere-hours (mAh).
$I_{i}$: Current at measurement interval $i$ (in amperes).
$\Delta t_{i}$: Duration of measurement interval $i$ (in hours).
$n$: Total number of measurement intervals.

This transformation simplifies the computational process while maintaining accuracy, enabling effective utilization of the large-scale data generated during the entire charge–discharge cycle.
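The summation in Equation (3) and the SOH ratio in Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the sample data, and the 2.5 Ah rated capacity are hypothetical, and the sketch assumes that $\Delta t_{i}$ is the gap between consecutive timestamps, which is one reading consistent with the sum starting at $i = 2$.

```python
def discharge_capacity_ah(times_h, currents_a):
    """Approximate Eq. (3): sum of I_i * dt_i over consecutive samples.

    times_h    -- sample timestamps in hours (assumed increasing)
    currents_a -- discharge current magnitude in amperes at each timestamp
    """
    if len(times_h) != len(currents_a):
        raise ValueError("times_h and currents_a must have the same length")
    capacity = 0.0
    # Start at the second sample so that dt_i = t_i - t_{i-1} is defined.
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        capacity += currents_a[i] * dt
    return capacity


def soh_percent(capacity_ah, rated_capacity_ah):
    """Eq. (1): SOH as a percentage of the rated capacity."""
    return capacity_ah / rated_capacity_ah * 100.0


# Hypothetical constant-current discharge: 2 A, sampled every 0.25 h for 1 h.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
currents = [2.0, 2.0, 2.0, 2.0, 2.0]
cap = discharge_capacity_ah(times, currents)    # 2 A for 1 h
soh = soh_percent(cap, rated_capacity_ah=2.5)   # relative to an assumed 2.5 Ah cell
```

With these illustrative numbers the cell delivers 2 Ah, which sits at the 80% boundary of the EOL range mentioned above.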

②: Data Preprocessing

To preprocess the collected data, the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method is employed to eliminate noise effectively. This technique ensures the preservation of intrinsic signals while reducing interference, thereby improving data quality for subsequent analysis (refer to [36] for further details).

After denoising, a sliding window technique is applied to segment the capacity data. This method transforms the original time-series data into a structured dataset, where each sample consists of a fixed-length sequence of input features and a corresponding output label. By doing so, the sliding window approach enables the model to capture local patterns and temporal trends, enhancing its predictive performance. This is illustrated in Figure 2, considering a dataset with N data points segmented using a sliding window of length L. This process generates N − L + 1 samples, where each sample comprises L − 1 data points as input features and the final data point as the output label. This segmentation method ensures that the model is trained on sequential data, allowing it to learn time-series dependencies and improve prediction accuracy.
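The segmentation described above can be sketched as follows. The function name and the capacity values are hypothetical. The sketch only illustrates the counting rule stated in the text: $N$ points and a window of length $L$ give $N - L + 1$ samples, each with $L - 1$ inputs and one label.

```python
def sliding_window_samples(series, window_len):
    """Split a 1-D series into (inputs, label) pairs.

    Each window of `window_len` consecutive points contributes its first
    window_len - 1 points as inputs and its last point as the label.
    """
    if window_len < 2:
        raise ValueError("window_len must be at least 2")
    if window_len > len(series):
        raise ValueError("window_len cannot exceed the series length")
    inputs, labels = [], []
    for start in range(len(series) - window_len + 1):
        window = series[start:start + window_len]
        inputs.append(window[:-1])
        labels.append(window[-1])
    return inputs, labels


# Hypothetical denoised capacity readings (Ah) over six cycles.
capacities = [2.00, 1.98, 1.97, 1.95, 1.92, 1.90]
X, y = sliding_window_samples(capacities, window_len=4)
# N = 6 and L = 4 give N - L + 1 = 3 samples with L - 1 = 3 inputs each.
```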

③: The Design of the Time-Series Feature Extraction Model

A time-series feature extraction model is developed based on the specified architecture (detailed in Section 3.2), with initialized hyperparameters tailored to the application. This model is specifically designed to identify and capture the intrinsic temporal patterns within the time-series data, thereby facilitating accurate predictions of the remaining capacity of LIBs.

④: Automatic Hyperparameter Optimization

Given the complexity of the time-series feature extraction model outlined in Step ③ and the extensive hyperparameter space, two evolutionary algorithms are employed for automatic hyperparameter optimization. The first approach utilizes Particle Swarm Optimization (PSO), while the second involves a hybrid method combining the Sparrow Search Algorithm (SSA) with Bayesian Optimization (BO), as detailed in Section 3.3. This automated approach overcomes the limitations and inefficiencies of manual tuning by systematically exploring the hyperparameter space to identify near-optimal configurations.

The hybrid SSA-BO method is particularly notable for its ability to synergize the global search efficiency of SSA with the local optimization precision of BO. Key hyperparameters optimized through this approach include the learning rate, the number of BiGRU neurons, the attention key-value size, and the convolution kernel size. This ensures that the model is well calibrated for training, ultimately achieving superior performance in predicting the remaining capacity of LIBs.

⑤: Model Training

The optimized time-series feature extraction model from Step ④ is trained using the prepared training dataset. This process involves iteratively feeding the preprocessed and segmented data through the model, enabling the adjustment of weights and biases to minimize the prediction error.

⑥: Model Testing

The trained model from Step ⑤ is evaluated using a separate test dataset to assess its generalization capability. The testing phase involves passing unseen data through the model to generate predictions, allowing for an evaluation of its accuracy in predicting the remaining capacity of LIBs. This step is critical for validating the model’s robustness and demonstrating its applicability to real-world scenarios. By ensuring strong performance on test data, the model’s effectiveness in practical applications is thoroughly verified.

3.2. TCN-BiGRU-Attention Model

For time-series feature extraction in the context of LIB aging monitoring, this study proposes the TCN-BiGRU-Attention model, as illustrated in Figure 3. The model integrates three techniques: Temporal Convolutional Networks (TCNs) [37], Bidirectional Gated Recurrent Units (BiGRUs) [38], and an attention mechanism [39]. This combination is specifically designed to enhance the efficiency and accuracy of predictions in LIB aging analysis.

Traditional models, such as LSTMs, have been widely employed in similar tasks due to their ability to capture long-range dependencies. However, LSTMs often struggle to effectively extract critical features, especially those spanning multiple time steps. To overcome these limitations, the proposed model incorporates several innovations:

(1): Temporal Convolutional Networks (TCNs)

Temporal Convolutional Networks (TCNs) serve as a core component of the model, enabling the effective capture of local features through their convolutional operations. These features are essential for identifying short-term variations in the LIB aging process. Mathematically, the output of a TCN layer can be represented as

$$y_{t} = f(x_{t-k:t}) \tag{4}$$

where $x_{t-k:t}$ denotes the input sequence from time $t-k$ to $t$, and $f$ represents a nonlinear transformation function composed of multiple dilated convolution layers. TCNs dynamically adjust their dilation factors to capture patterns at multiple time scales, uncovering multi-scale variations in battery degradation. The output of each dilated convolution layer is defined as

$$h_{t} = \sigma(W * x_{t-d^{l}:t} + b) \tag{5}$$

where $d^{l}$ is the dilation factor for the $l$th layer, $W$ and $b$ are the weight matrix and bias vector, respectively, $*$ denotes the convolution operation, and $\sigma$ is the activation function, typically ReLU. Furthermore, by employing causal convolution strategies, TCNs ensure that predictions are based solely on current and past data, preventing any leakage of future information. This makes TCNs highly suitable for real-time prediction tasks.
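To make the causal, dilated convolution of Equation (5) concrete, the following pure-Python sketch applies a single-channel filter with ReLU and stacks several layers. It is a simplified illustration under stated assumptions: the weights, the dilation schedule (1, 2, 4), and the function names are hypothetical, inputs before the start of the sequence are treated as zeros, and the residual connections usually found in TCN blocks are omitted.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def causal_dilated_conv1d(x, weights, bias, dilation):
    """h_t = ReLU(sum_j w_j * x[t - j*dilation] + b), zero-padded on the left.

    Only indices <= t are read, so the output at t never sees future inputs.
    """
    out = []
    for t in range(len(x)):
        acc = bias
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(relu(acc))
    return out


def tcn_stack(x, layers):
    """Apply a list of (weights, bias, dilation) layers in sequence."""
    h = x
    for weights, bias, dilation in layers:
        h = causal_dilated_conv1d(h, weights, bias, dilation)
    return h


def receptive_field(kernel_size, dilations):
    """Number of past-and-present inputs that can influence one output."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical 3-layer stack with kernel size 2 and dilations 1, 2, 4.
layers = [([0.5, 0.5], 0.0, 1), ([0.5, 0.5], 0.0, 2), ([0.5, 0.5], 0.0, 4)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = tcn_stack(x, layers)
span = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

Because each layer only looks backwards, changing the last input leaves every earlier output unchanged, which is the causality property described above. Growing the dilation with depth widens the receptive field without widening the kernel.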

(2): Bidirectional Gated Recurrent Units (BiGRUs)

To extract bidirectional time-series features, BiGRUs are introduced into the model. While a battery’s current state is influenced by its historical usage, future usage patterns can also provide valuable context. The forward GRU processes the sequence from past to present, while the backward GRU processes it from future to past. The combined output at time $t$ can be formulated as

$$h_{t} = [\overrightarrow{h_{t}}; \overleftarrow{h_{t}}] \tag{6}$$

where $\overrightarrow{h_{t}}$ and $\overleftarrow{h_{t}}$ represent the hidden states produced by the forward and backward GRUs, respectively, and $[\cdot; \cdot]$ denotes concatenation. This dual-direction processing allows the model to comprehensively capture sequential dependencies, making it particularly effective for assessing battery health.
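The concatenation in Equation (6) can be illustrated with a toy bidirectional GRU whose input and hidden state are both scalars. This is a sketch under stated assumptions, not the trained model: the gate equations follow one common GRU formulation, and all parameter values and names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU update for scalar input x and scalar hidden state h_prev."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])                # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])                # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])   # candidate
    return (1.0 - z) * h_prev + z * h_cand


def run_gru(xs, p):
    """Return the hidden state after each step, starting from h = 0."""
    h, states = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bigru(xs, p_fwd, p_bwd):
    """Eq. (6): pair the forward and backward hidden states at each time step."""
    fwd = run_gru(xs, p_fwd)
    bwd = run_gru(xs[::-1], p_bwd)[::-1]   # run on reversed input, then realign
    return [[f, b] for f, b in zip(fwd, bwd)]


# Hypothetical parameters and a short normalized capacity sequence.
p_fwd = {"wz": 0.5, "uz": 0.5, "bz": 0.0, "wr": 0.5, "ur": 0.5, "br": 0.0,
         "wh": 1.0, "uh": 1.0, "bh": 0.0}
p_bwd = dict(p_fwd)
xs = [1.00, 0.98, 0.95, 0.91]
states = bigru(xs, p_fwd, p_bwd)   # one [forward, backward] pair per time step
```

At the first time step, the forward state depends only on the first input, while the backward state already summarizes the whole window. In this setting the "future" context is limited to later points inside the input window.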

(3): Self-Attention Mechanism

The self-attention mechanism enhances the model’s ability to focus on key segments of the input sequence while filtering out irrelevant or redundant information. It computes the attention scores between each position $i$ and all other positions in the sequence. The attention score between position $i$ and position $j$ is given as

$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T} \exp(e_{ik})} \tag{7}$$

where $T$ is the sequence length and $e_{ij}$ is the energy score calculated as

$$e_{ij} = \frac{q_{i}^{\top} k_{j}}{\sqrt{d_{k}}} \tag{8}$$

where $q_{i}$ and $k_{j}$ are the query and key vectors at positions $i$ and $j$, and $d_{k}$ is the dimensionality of the keys. This improves the model’s resistance to noise, which is particularly beneficial for LIB aging data that often contain outliers or disturbances. By prioritizing relevant features and mitigating the impact of noise, the self-attention mechanism significantly enhances the overall reliability and predictive performance of the model.
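The scaled dot-product scores and their softmax normalization can be sketched in pure Python as below. The normalization runs over all key positions for a fixed query position $i$, so each row of weights sums to one. The value vectors and the weighted sum are an assumption added for illustration, since the text defines only queries and keys. The learned projections are also omitted, so the hidden states serve directly as queries, keys, and values.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of energy scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Return (context vectors, attention weights) for scaled dot-product attention."""
    d_k = len(keys[0])
    contexts, weights = [], []
    for q in queries:
        # Energy scores e_ij = q_i . k_j / sqrt(d_k) for every key position j.
        energies = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        alpha = softmax(energies)           # normalized over j, as in Eq. (7)
        weights.append(alpha)
        # Context vector: attention-weighted sum of the value vectors.
        contexts.append([sum(a * v[c] for a, v in zip(alpha, values))
                         for c in range(len(values[0]))])
    return contexts, weights


# Hypothetical hidden states for three time steps, used as q, k, and v.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, alpha = self_attention(h, h, h)
```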

(4): Overall TCN-BiGRU-Attention Model

The integrated model leverages these components synergistically:

(i): TCN captures local and multi-scale temporal features ( $h_{t}^{\mathrm{TCN}}$ ).
(ii): BiGRU extracts bidirectional sequential dependencies ( $h_{t}^{\mathrm{BiGRU}}$ ).
(iii): The self-attention mechanism enhances feature selection and noise resistance ( $h_{t}^{\mathrm{Attention}}$ ).

The final output for predicting the remaining capacity of LIBs is computed as

$$\hat{y} = f_{\mathrm{Output}}(h_{t}^{\mathrm{Attention}}) \tag{9}$$

where $f_{\mathrm{Output}}$ is a fully connected layer mapping the processed features to the predicted value.

The TCN-BiGRU-Attention model leverages the unique strengths of these components, offering a robust solution for LIB aging monitoring. Its capacity to capture multi-scale temporal patterns, bidirectional dependencies, and key information ensures superior accuracy and reliability in predicting the remaining capacity of LIBs.

3.3. Optimizing the Network Structure

The time-series feature extraction model from Section 3.2 is optimized to address issues of overfitting in large networks and underfitting in smaller ones. To achieve this, evolutionary algorithms are employed to refine both the model architecture and its hyperparameters. The optimization process focuses on achieving an appropriate balance between generalization capabilities and computational efficiency.

3.3.1. Hyperparameter Selection and Justification

This study optimizes four critical hyperparameters: the learning rate, the number of BiGRU neurons, the attention key-value dimension, and the convolution kernel size. The ranges and justifications for these hyperparameters are as follows:

(1): Learning Rate

The learning rate regulates the step size during weight updates in the training process. A smaller learning rate enhances stability and prevents overshooting the optimal solution, whereas a larger learning rate accelerates convergence but risks instability. The selected range is informed by prior research on deep learning for time-series data and preliminary experiments, which demonstrated its effectiveness in balancing convergence speed and stability.

(2): Number of BiGRU Neurons

The BiGRU layer controls the model’s ability to capture bidirectional temporal dependencies. Increasing the number of neurons boosts the model’s representational capacity but may lead to overfitting and elevated computational costs. The chosen range balances model complexity, risk of overfitting, and the nature of LIB degradation data, ensuring an appropriate trade-off between these factors.

(3): Attention Key-Value Dimension

The attention mechanism dynamically assigns importance to different time steps. A lower key-value dimension might restrict the model’s ability to identify critical patterns, while a higher dimension enhances detail representation but raises computational overhead. The range was carefully selected to achieve a balance between prediction accuracy and computational efficiency.

(4): Convolution Kernel Size

The kernel size in the TCN layer influences its capability to extract temporal features. Smaller kernel sizes emphasize fine-grained details, while larger ones focus on capturing broader temporal dependencies. The selected range provides sufficient flexibility to model multi-scale temporal features without imposing excessive computational complexity.

These hyperparameter ranges were determined based on a combination of empirical analysis, insights from prior literature, and preliminary experimental results, ensuring robust and reliable optimization tailored to the characteristics of the dataset.

3.3.2. Optimization Algorithms

Two evolutionary algorithms are used to optimize the model for LIB capacity prediction, focusing on four critical hyperparameters: the learning rate, the number of BiGRU neurons, the attention key-value dimension, and the convolution kernel size. These optimizations ensure an appropriately sized and tuned model, improving generalization capabilities while minimizing computational costs.

The two evolutionary algorithms used are the Particle Swarm Optimization (PSO) algorithm and a hybrid algorithm that combines the Sparrow Search Algorithm (SSA) with Bayesian Optimization (BO). PSO is a classical evolutionary algorithm, and detailed information can be found in Reference [40]. Below, we provide a detailed introduction to the hybrid algorithm proposed in this paper, which combines SSA and BO.

(1): Sparrow Search Algorithm (SSA)

The Sparrow Search Algorithm (SSA) is a meta-heuristic optimization method inspired by the foraging and anti-predator behaviors of sparrows [41]. It efficiently balances exploration and exploitation in complex search spaces by categorizing sparrows into three distinct roles: producers, scroungers, and those perceiving danger, enabling adaptive position updates within an n-dimensional space. SSA stands out for its simplicity, strong global search capability, minimal parameter requirements, and broad applicability. Its intuitive mathematical framework makes it straightforward to implement and adaptable to a wide range of optimization problems, including continuous, discrete, and combinatorial tasks. By mimicking the collective foraging behavior of sparrow groups, SSA effectively navigates large solution spaces.

Despite its advantages, SSA does have certain limitations. A major drawback is its tendency to prematurely converge to local optima, indicating inadequate exploration in highly complex or rugged landscapes. Additionally, the algorithm’s performance is highly sensitive to parameter settings, often necessitating manual fine-tuning to achieve optimal results. As iterations progress, population diversity tends to decline, reducing search efficiency and limiting its ability to escape local optima. These challenges become particularly pronounced in high-dimensional and complex optimization problems, where SSA may struggle to maintain robust and efficient search performance, ultimately impacting its overall effectiveness.
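The three roles described above can be sketched as a compact minimizer. The text does not list the update rules, so this sketch follows a simplified form of the rules commonly given for SSA and should be read as an illustration rather than the implementation used in this study. The role fractions, safety threshold, population size, and test objective are all hypothetical.

```python
import math
import random


def sparrow_search(objective, bounds, n_sparrows=20, n_iter=50,
                   producer_frac=0.2, scout_frac=0.1, safety_threshold=0.8, seed=0):
    """Minimize `objective` over box `bounds` with a simplified SSA."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(pos):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(pos, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_sparrows)]
    fit = [objective(p) for p in pop]
    best_f = min(fit)
    best_x = list(pop[fit.index(best_f)])
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))

    for _ in range(n_iter):
        # Rank sparrows so that the best (lowest objective) come first.
        order = sorted(range(n_sparrows), key=lambda i: fit[i])
        pop = [pop[i] for i in order]
        worst_x = list(pop[-1])

        # Producers: contract the search when safe, jump randomly when alarmed.
        alarm = rng.random()
        for i in range(n_prod):
            if alarm < safety_threshold:
                alpha = 1.0 - rng.random()          # uniform in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * n_iter))
                pop[i] = clip([v * scale for v in pop[i]])
            else:
                step = rng.gauss(0.0, 1.0)
                pop[i] = clip([v + step for v in pop[i]])
        best_prod = min(pop[:n_prod], key=objective)

        # Scroungers: the worst half relocates, the rest follow the best producer.
        for i in range(n_prod, n_sparrows):
            if i + 1 > n_sparrows / 2:
                q = rng.gauss(0.0, 1.0)
                pop[i] = clip([q * math.exp((w - v) / (i + 1) ** 2)
                               for v, w in zip(pop[i], worst_x)])
            else:
                step = sum(abs(v - b) * rng.choice((-1.0, 1.0))
                           for v, b in zip(pop[i], best_prod)) / dim
                pop[i] = clip([b + step for b in best_prod])

        # Danger-aware sparrows: a random subset moves relative to best or worst.
        fit = [objective(p) for p in pop]
        i_b = min(range(n_sparrows), key=lambda i: fit[i])
        i_w = max(range(n_sparrows), key=lambda i: fit[i])
        f_b, x_b, f_w, x_w = fit[i_b], list(pop[i_b]), fit[i_w], list(pop[i_w])
        for i in rng.sample(range(n_sparrows), n_scout):
            if fit[i] > f_b:
                beta = rng.gauss(0.0, 1.0)
                pop[i] = clip([b + beta * abs(v - b) for v, b in zip(pop[i], x_b)])
            else:
                k = rng.uniform(-1.0, 1.0)
                denom = abs(fit[i] - f_w) + 1e-12   # abs() keeps the divisor nonzero
                pop[i] = clip([v + k * abs(v - w) / denom for v, w in zip(pop[i], x_w)])
            fit[i] = objective(pop[i])

        # Keep the best solution seen so far, before or after the scout moves.
        i_b2 = min(range(n_sparrows), key=lambda i: fit[i])
        best_f, best_x = min([(best_f, best_x), (f_b, x_b), (fit[i_b2], list(pop[i_b2]))],
                             key=lambda t: t[0])
    return best_x, best_f


def shifted_sphere(x):
    """Hypothetical smooth test objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_f = sparrow_search(shifted_sphere, bounds)
```

The sketch tracks the best solution found so far separately from the population. The returned result can therefore never be worse than the best initial sparrow, even when population diversity collapses in the way described above.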

(2): Bayesian Optimization (BO)

Bayesian Optimization (BO) [42] is an efficient global optimization method designed for expensive black-box optimization tasks. It reduces the number of objective function evaluations by constructing a surrogate model to guide the search for the optimal solution. The primary steps of BO are as follows:

(i): Surrogate Model Construction

A Gaussian Process (GP) is commonly used as the surrogate model. GPs are capable of capturing the uncertainty and smoothness of the objective function, providing a probabilistic estimate of the function’s behavior over the search space.

(ii): Acquisition Function Selection

The acquisition function is responsible for selecting the next evaluation point by balancing exploration and exploitation. Typical acquisition functions include Expected Improvement (EI), Upper Confidence Bound (UCB), and Probability of Improvement (PI).

(iii): Acquisition Function Optimization

The next evaluation point is identified by optimizing the acquisition function, ensuring that the search prioritizes regions with high potential for improvement.

(iv): Surrogate Model Update

The surrogate model is updated with the results of the newly evaluated point, refining the model parameters to improve subsequent predictions.
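Steps (i) to (iv) can be illustrated with a minimal one-dimensional sketch written in plain Python. It assumes a zero-noise toy objective, a squared-exponential kernel with a fixed length scale, Expected Improvement for minimization, and random candidate sampling in place of a dedicated acquisition optimizer. All function names and default values are illustrative assumptions rather than the configuration used in this study.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rbf(a, b, length=0.2):
    """Squared-exponential kernel for scalar inputs (length scale is assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution for L y = b."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper(L, y):
    """Back substitution for L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit(X, y, noise=1e-4):
    """Steps (i) and (iv): build the surrogate from all evaluations so far."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    mean = sum(y) / len(y)
    alpha = solve_upper(L, solve_lower(L, [v - mean for v in y]))
    return L, alpha, mean


def gp_predict(X, model, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    L, alpha, mean = model
    k = [rbf(a, x_new) for a in X]
    mu = mean + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    var = max(1.0 - sum(vi * vi for vi in v), 1e-12)
    return mu, math.sqrt(var)


def expected_improvement(mu, sigma, best, xi=0.01):
    """Step (ii): Expected Improvement for a minimization problem."""
    gain = best - mu - xi
    z = gain / sigma
    return gain * STD_NORMAL.cdf(z) + sigma * STD_NORMAL.pdf(z)


def bayes_opt(f, lo, hi, n_init=4, n_iter=10, n_candidates=200, seed=0):
    rng = random.Random(seed)
    X = [rng.uniform(lo, hi) for _ in range(n_init)]
    y = [f(x) for x in X]
    for _ in range(n_iter):
        model = gp_fit(X, y)
        best = min(y)
        # Step (iii): maximize the acquisition over random candidates.
        candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
        x_next = max(candidates,
                     key=lambda c: expected_improvement(*gp_predict(X, model, c), best))
        X.append(x_next)
        y.append(f(x_next))
    i_best = min(range(len(y)), key=y.__getitem__)
    return X[i_best], y[i_best]


def toy(x):
    # Toy stand-in for an expensive black-box objective.
    return (x - 0.3) ** 2


x_best, y_best = bayes_opt(toy, 0.0, 1.0)
print(x_best, y_best)
```

The surrogate is refitted once per iteration and then queried for every candidate, so the number of objective evaluations stays at `n_init + n_iter` regardless of how many candidates the acquisition function scores.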

(3): Hybrid Optimization Algorithm

To address the limitations of SSA, this study proposes a hybrid optimization algorithm that integrates SSA with BO. This hybrid approach leverages the strengths of both methods, providing significant improvements in optimization performance. The key enhancements are as follows:

(i): Enhanced Global and Local Search Capabilities

SSA demonstrates strong global search capabilities during the early stages of optimization by simulating the foraging behavior of sparrow groups, effectively locating multiple promising solutions across a broad search space. BO, in contrast, excels in local search by constructing probabilistic models to predict high-potential regions for further exploration. By combining these approaches, the hybrid algorithm ensures that BO performs refined searches based on the diverse solutions generated by SSA, thereby improving the likelihood of identifying the global optimum.

(ii): Improved Robustness

The hybrid algorithm mitigates the risk of premature convergence. SSA’s extensive initial exploration generates diverse candidate solutions, which BO uses to construct robust surrogate models. These models reduce the chances of the optimization process being trapped in local optima, thereby enhancing convergence reliability.

(iii): Maintaining Population Diversity

The integration of SSA and BO achieves a balance between exploration and exploitation. SSA’s initial exploration ensures population diversity, while BO refines and improves the solutions without compromising diversity. This synergy enhances the algorithm’s ability to adapt to complex and high-dimensional optimization landscapes.

In summary, the proposed hybrid optimization framework combines the extensive global search capabilities of SSA with the precise local refinement of BO. This integration not only increases the probability of finding the global optimum but also improves the algorithm’s overall robustness, efficiency, and adaptability, making it highly suitable for solving complex optimization problems.

The core steps of the Hybrid Algorithm are visually illustrated in Figure 4 and are elaborated upon in the pseudo-code provided in Algorithm 1.

Algorithm 1: Hybrid Optimization Algorithm for TCN-BiGRU-Attention Model

Input:
(i) Objective function f (e.g., validation loss)
(ii) Search space S
(iii) Number of iterations T
(iv) Initial population size N
(v) Neural network model NN (TCN-BiGRU-Attention).
Output:
(i) Optimal hyperparameters θ^{*} (learning rate, number of BiGRU neurons, attention key value, convolution kernel size)
(ii) Trained final neural network model NN.
1: Define the search space:
    Set the ranges for hyperparameters, including learning rate, BiGRU neurons, attention key value, and convolution kernel size.
2: Initialize the population P:
    Generate N random individuals within the defined search space, P = {p_{1}, p_{2}, \dots, p_{N}}.
3: Evaluate the fitness of each individual in P using f:
    For each p_{i} \in P:
        Train and validate NN with hyperparameters p_{i};
        Compute f(p_{i}).
4: Apply SSA to generate a set of high-quality initial values I:
    Run SSA for a few iterations to refine the initial population P;
    Select the top k individuals from P based on their fitness values to form I.
5: Initialize the GP surrogate model M with the initial values I:
    Use the refined population to build a Gaussian Process (GP) model as the surrogate for the objective function.
6: for t = 1 to T do:
    (6.1) Choose the next evaluation point x^{*} by optimizing the acquisition function A(M);
        Use an acquisition function (Expected Improvement) to select x^{*};
        Train and validate NN with hyperparameters x^{*} to obtain f(x^{*}).
    (6.2) Update the surrogate model M with the new evaluation (x^{*}, f(x^{*})).
end for.
7: Select the optimal hyperparameters θ^{*} from the evaluated points:
    Choose the hyperparameters with the lowest validation loss.
8: Train the final neural network model NN with the optimal hyperparameters θ^{*}:
    Train NN using θ^{*} on the entire training dataset.
9: Return θ^{*} and the trained model NN:
    Return the optimal hyperparameters θ^{*} and the final trained model NN.
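The order of operations in Algorithm 1 can be sketched as a small driver with pluggable stages. The sketch below only illustrates the control flow: `refine` and `propose` are deliberately simple stand-ins (an elitist random perturbation, not SSA, and a perturbation of the best point, not a GP with Expected Improvement), the hyperparameter ranges in `BOUNDS` are hypothetical, and `toy_loss` replaces the expensive "train and validate NN" step. None of these names or values come from the study itself.

```python
import random


def hybrid_search(objective, sample, refine, propose,
                  n_pop=12, n_refine=3, top_k=5, n_bo=10):
    """Two-stage search that follows the order of steps in Algorithm 1."""
    # Steps 2-3: random initial population, evaluated with the objective.
    scored = [(p, objective(p)) for p in (sample() for _ in range(n_pop))]
    # Step 4: a few population-based refinement rounds (SSA in the paper).
    for _ in range(n_refine):
        scored = [(p, objective(p)) for p in refine(scored)]
    # The top-k individuals (the set I) seed the model-based stage.
    history = sorted(scored, key=lambda s: s[1])[:top_k]
    # Steps 5-6: propose a point from all evaluations, evaluate it, record it.
    for _ in range(n_bo):
        x = propose(history)
        history.append((x, objective(x)))
    # Step 7: lowest-loss configuration among the evaluated points.
    return min(history, key=lambda s: s[1]), history


rng = random.Random(0)
# Hypothetical ranges; the study's actual ranges are those listed in its Table 1.
BOUNDS = {"learning_rate": (1e-4, 1e-2), "kernel_size": (2.0, 8.0)}


def sample():
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def toy_loss(p):
    # Stand-in for the validation loss of a trained network.
    return ((p["learning_rate"] - 3e-3) * 1e3) ** 2 + (p["kernel_size"] - 5.0) ** 2


def jitter(p, scale):
    # Gaussian perturbation, clipped to the search space.
    return {k: min(max(p[k] + rng.gauss(0.0, scale * (hi - lo)), lo), hi)
            for k, (lo, hi) in BOUNDS.items()}


def refine(scored):
    # NOT SSA: keeps the best individual and resamples the others around it.
    best = min(scored, key=lambda s: s[1])[0]
    return [best] + [jitter(best, 0.2) for _ in scored[1:]]


def propose(history):
    # NOT BO: no surrogate model, only a small step around the best point so far.
    best = min(history, key=lambda s: s[1])[0]
    return jitter(best, 0.05)


best, history = hybrid_search(toy_loss, sample, refine, propose)
print(best)
```

Passing the stages as callables keeps the driver independent of how the population update and the surrogate are implemented, so an actual SSA update and a GP-based proposal could be substituted without changing the loop structure.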

3.3.3. Limitations of the Hybrid Approach

While the hybrid SSA + BO algorithm offers significant advantages, it also has the following limitations:

Computational Cost: The combined use of SSA and BO introduces additional computational overhead, particularly in constructing and optimizing the BO surrogate model.
Parameter Initialization: The performance of the hybrid algorithm depends on the initial parameter ranges and population size, which may require domain-specific knowledge or empirical tuning.
Scalability: For real-time applications, the optimization process may need simplification or hardware acceleration to meet latency requirements.

3.4. Summary

In this section, we presented a comprehensive framework for LIB degradation prediction, emphasizing the design of a time-series feature extraction model that integrates TCN, BiGRU, and an attention mechanism to effectively capture multi-scale temporal patterns. Additionally, we outlined the proposed methodology for optimizing the network structure and hyperparameters using a hybrid SSA-BO algorithm, which enhances both predictive accuracy and computational efficiency. The following section details the experimental design, datasets, and evaluation metrics used to validate the proposed framework, demonstrating its applicability and robustness under different conditions.

4. Experiment

4.1. Experimental Overview

The objective of the experimental validation is to assess the effectiveness and robustness of the proposed TCN-BiGRU-Attention model under varying datasets and optimization strategies. The experiments are designed to systematically evaluate the contributions of individual model components, as well as the impact of the hybrid optimization algorithm (SSA-BO). By comparing the results across different scenarios, we aim to demonstrate the model’s superiority over traditional methods in accurately predicting the remaining capacity of LIBs.

4.2. Dataset Description

(1): NASA Dataset

To assess the effectiveness of the proposed method, we used a lithium-ion battery dataset from the NASA Ames Prognostics Center of Excellence [43]. Batteries 5, 6, 7, and 18 were selected for analysis, having undergone testing under various conditions, including charging, discharging, and impedance measurements. During charging, a constant current of 1.5 A was applied until the battery voltage reached 4.2 V, followed by a constant voltage mode until the current dropped to 20 mA. During discharging, a constant load of 2 A was applied until the voltage reached specific thresholds: 2.7 V, 2.5 V, 2.2 V, and 2.5 V for batteries 5, 6, 7, and 18, respectively. The experiment ended when the battery capacity decreased by 30%. The initial capacity was 2 Ah, with end-of-life defined as 1.4 Ah. This dataset provides a comprehensive benchmark for evaluating the proposed method under diverse operational and degradation conditions.
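The end-of-life criterion follows directly from the figures above: a 30% fade from the 2 Ah initial capacity gives a threshold of 2 Ah × (1 − 0.30) = 1.4 Ah. A minimal sketch of how the end-of-life cycle could be located in a capacity sequence is given below; the function name and the capacity values are made up for illustration and are not taken from the NASA data.

```python
def end_of_life_cycle(capacities_ah, initial_capacity_ah=2.0, fade_fraction=0.30):
    """Return (first cycle at or below the end-of-life threshold, threshold)."""
    threshold = initial_capacity_ah * (1.0 - fade_fraction)
    for cycle, capacity in enumerate(capacities_ah, start=1):
        if capacity <= threshold:
            return cycle, threshold
    return None, threshold  # end of life not reached within the sequence


# Hypothetical discharge capacities in Ah, one value per cycle.
example_capacities = [2.0, 1.85, 1.7, 1.55, 1.38, 1.3]
eol_cycle, eol_threshold = end_of_life_cycle(example_capacities)
print(eol_cycle, eol_threshold)
```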

(2): Silicon-Based Anodes Half-cell Dataset

The second lithium-ion battery (LIB) dataset utilized in this study focuses on silicon–carbon (Si-C) anode LIBs, specifically Si/CNTs and Si/Graphene composites. Due to the inherently low electrical conductivity of pure silicon, its application in scenarios requiring high charge–discharge efficiency is limited. To address this limitation, a facile ultrasonic dispersion method was employed to incorporate conductive agents into nano-silicon particles, resulting in the fabrication of the Si/CNTs and Si/Graphene composite materials. For clarity in this study, these are referred to as Half-Cell (Si/CNTs) and Half-Cell (Si/Graphene), respectively. Electrochemical testing for these materials was conducted using a VMP-300 potentiostat/galvanostat (Bio-Logic, Grenoble, France), with the voltage range set between 0.01 V and 3 V. The experiment concluded when the battery capacity decayed to 50% of its initial value. This dataset provides unique insights into the performance and degradation characteristics of Si-C anode LIBs, offering a valuable resource for evaluating the proposed method in the context of distinct battery chemistries.

4.3. Main Experiments

The main experiments consisted of twelve experimental groups: four using the NASA dataset (batteries B5, B6, B7, and B18), four using the laboratory silicon-based half-cell dataset with Si/CNTs composite anodes, and four using the laboratory silicon-based half-cell dataset with Si/Graphene composite anodes. All groups employed the TCN-BiGRU-Attention network for time-series feature extraction, with key parameters either manually initialized or automatically optimized using evolutionary algorithms. This setup was designed to comprehensively evaluate the model’s adaptability and performance across different datasets, including varying material chemistries.

For each experimental group, three optimization schemes were implemented:

(1): No Optimization: Parameters were manually set within their initial ranges, providing a baseline for comparison.
(2): Particle Swarm Optimization (PSO): Parameters were optimized using PSO, leveraging its global search capabilities.
(3): Hybrid Optimization (SSA + BO): The hybrid algorithm combined SSA for broad exploration and BO for local refinement, dynamically tuning parameters to achieve optimal performance.

Key parameters optimized include the learning rate, the number of BiGRU neurons, the attention key-value dimension, and the convolution kernel size. These parameters were initially defined within specific ranges, as shown in Table 1. The Adam optimizer, which was consistently used throughout the training process, was manually initialized and not subjected to automatic optimization. The initialization settings for the optimization algorithms are presented in Table 2.

The experimental results presented in Figure 5 demonstrate that the automatic optimization of key parameters significantly enhances the performance of the TCN-BiGRU-Attention network. The evaluation metrics include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and R-squared (R²). These metrics provide a comprehensive assessment of the model’s predictive performance under different optimization strategies and datasets.
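For reference, the five metrics can be computed as in the following sketch, which uses their standard definitions (MAPE is expressed in percent). The function name and the sample values are illustrative assumptions and do not correspond to results from this study.

```python
import math


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, MAPE (in percent), and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    squared_error_sum = sum(e * e for e in errors)
    mse = squared_error_sum / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE assumes that no true value is zero, which holds for capacities in Ah.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    total_sum_of_squares = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - squared_error_sum / total_sum_of_squares
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}


# Hypothetical true and predicted capacities in Ah.
metrics = regression_metrics([2.0, 1.8, 1.6, 1.4], [1.9, 1.8, 1.7, 1.4])
print(metrics)
```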

The hybrid SSA + BO approach consistently outperformed both the unoptimized and PSO-optimized models, showcasing its ability to balance global exploration and local refinement during optimization. As shown in Table 3, Table 4 and Table 5, this conclusion holds across the NASA, Half-Cell (Si/CNTs), and Half-Cell (Si/Graphene) datasets. The consistent gains in predictive accuracy and robustness indicate that the approach is suitable for diverse datasets and challenging experimental scenarios.

As illustrated in Figure 6, the mean values of the evaluation metrics across the experimental groups are visualized using a radar chart, providing a clear comparison of the TCN-BiGRU-Attention network’s performance under different optimization methods. The radar chart includes five key evaluation metrics: MSE, RMSE, MAE, MAPE, and R². To ensure consistency and enhance visualization, the R² metric was transformed using 1 − R², and all metrics were normalized to a 0–1 scale for uniformity.
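The transformation used for the radar chart can be sketched as follows. The text does not state which normalization was applied, so the sketch assumes min-max scaling of each metric across the compared methods; the function name and the metric values are placeholders rather than results from this study.

```python
def radar_scores(metrics_by_method):
    """Map each metric to [0, 1] across methods, with lower meaning better."""
    # Replace R2 by 1 - R2 so that every axis is an error-like quantity.
    transformed = {}
    for method, metrics in metrics_by_method.items():
        row = {name: value for name, value in metrics.items() if name != "R2"}
        row["1-R2"] = 1.0 - metrics["R2"]
        transformed[method] = row

    names = list(next(iter(transformed.values())).keys())
    scores = {method: {} for method in transformed}
    for name in names:
        values = [row[name] for row in transformed.values()]
        lo, hi = min(values), max(values)
        for method, row in transformed.items():
            # Assumed min-max scaling; a constant metric maps to 0.
            scores[method][name] = 0.0 if hi == lo else (row[name] - lo) / (hi - lo)
    return scores


# Placeholder values for three optimization schemes.
example = {
    "No optimization": {"RMSE": 0.04, "MAE": 0.030, "R2": 0.90},
    "PSO": {"RMSE": 0.03, "MAE": 0.025, "R2": 0.95},
    "SSA + BO": {"RMSE": 0.02, "MAE": 0.020, "R2": 0.98},
}
scores = radar_scores(example)
print(scores)
```

With this convention the best method on a given axis sits at 0 and the worst at 1, so a smaller polygon on the radar chart indicates better overall performance.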

The results show that the hybrid optimization method (SSA + BO) delivers the best performance on every evaluation metric. In terms of RMSE on the NASA dataset, SSA + BO achieves a 31.2% reduction compared to the unoptimized baseline and a 10.6% reduction compared to the PSO-optimized network. On the Half-Cell (Si/CNTs) dataset, it reduces RMSE by 35.9% compared to the baseline and by 21.9% relative to PSO. On the Half-Cell (Si/Graphene) dataset, the gains are smaller but point in the same direction: RMSE falls by 15.0% relative to the baseline and by 3.96% relative to PSO. The method’s consistent advantage across evaluation metrics and datasets highlights its robustness and adaptability, and supports the hybrid approach as a reliable optimization strategy for this task.
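The percentages above are read here as relative reductions, that is, the RMSE difference divided by the RMSE of the model being compared against. This interpretation is an assumption, since the formula is not stated explicitly. A minimal sketch with made-up RMSE values:

```python
def relative_reduction(reference_rmse, new_rmse):
    """Percentage by which new_rmse is lower than reference_rmse."""
    return 100.0 * (reference_rmse - new_rmse) / reference_rmse


# Made-up values: an RMSE falling from 0.050 to 0.040 is a 20% reduction.
example_reduction = relative_reduction(0.050, 0.040)
print(round(example_reduction, 1))
```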

4.4. Model Ablation Study

An ablation study for lithium-ion battery remaining capacity prediction is conducted to evaluate the contribution of each component in the proposed model. By systematically removing or altering specific components of the TCN-BiGRU-Attention network, the changes in model performance are analyzed. This approach facilitates the identification of the components that most significantly influence the accuracy and robustness of the final predictions, offering valuable insights into the model’s architecture and design decisions.

4.4.1. Experiment Design

(1): Remove TCN: To assess the contribution of the Temporal Convolutional Network (TCN) component, this experiment removes the TCN layer from the TCN-BiGRU-Attention network. The resulting model, consisting of only CNN, BiGRU, and the attention mechanism, is evaluated on the test set. The performance of this modified model is compared with the full model to analyze the impact of the TCN on prediction accuracy. RMSE results for all twelve experimental groups are presented in Table 6.
(2): Remove Attention Mechanism: To evaluate the impact of the attention mechanism, this experiment removes the attention component from the TCN-BiGRU-Attention network, leaving a model comprising only TCN and BiGRU. The modified model is evaluated on the test set, and its performance is compared against the full model to determine the significance of the attention mechanism in enhancing prediction accuracy. The Root Mean Squared Error (RMSE) results for all twelve experimental groups are presented in Table 7.
(3): Replace BiGRU with GRU: To examine the contribution of the bidirectional structure in BiGRU, this experiment replaces BiGRU with a standard unidirectional Gated Recurrent Unit (GRU) in the TCN-BiGRU-Attention network. The modified model is evaluated on the test set, and its performance is compared with the full model to assess the impact of the bidirectional mechanism. The Root Mean Squared Error (RMSE) results for all twelve experimental groups are presented in Table 8.

With the experimental design outlined, we now turn to the analysis of the results from these ablation studies, which highlight the importance of each model component.

4.4.2. Result Analysis

The ablation study results are visualized in Figure 7 using box plots that display the maximum, minimum, and average RMSE values for each experimental group. The analysis highlights that the TCN, attention mechanism, and BiGRU components are essential to the model’s performance, as detailed in the following:

(1): Removing the TCN layer resulted in a significant increase in average RMSE by 43.3%, 16.6%, and 71.3% on the NASA, Half-Cell (Si/CNTs), and Half-Cell (Si/Graphene) datasets, respectively, compared to the full model. This underscores the critical role of TCN in capturing long-term dependencies in time-series data, which is vital for predicting complex patterns accurately.
(2): Eliminating the attention mechanism led to a substantial decline in performance, with average RMSE increases of 31.1%, 49.7%, and 75.2% on the same datasets. These findings highlight the importance of the attention mechanism in focusing on relevant input features, thereby enhancing prediction accuracy and robustness, particularly in highly variable datasets.
(3): Replacing the bidirectional BiGRU with a unidirectional GRU caused notable performance degradation, marked by average RMSE increases of 64.3%, 74.6%, and 92.6% across the datasets. This demonstrates that the bidirectional architecture of BiGRU is crucial for capturing both forward and backward dependencies in sequential data, significantly impacting prediction accuracy.

Figure 7’s box plots show that the complete TCN-BiGRU-Attention network not only achieves the lowest average RMSE across all groups but also exhibits the least variability in RMSE values. This indicates that integrating all three components leads to a more stable and robust model performance.

In conclusion, these findings emphasize the indispensable roles of the TCN, attention mechanism, and BiGRU in contributing to the overall effectiveness of the network. They also provide valuable insights into achieving superior prediction accuracy and robustness for future model design and optimization in time-series forecasting tasks.

4.5. Ablation Study of Network Optimization Algorithms

To evaluate the effectiveness of integrating BO with SSA, an ablation study was conducted to analyze the impact of BO on the overall performance of the optimization process. This study aims to quantify the contribution of BO to the hybrid algorithm and to highlight its critical role in enhancing optimization efficiency and model accuracy.

4.5.1. Experiment Design

To assess the contribution of BO in the hybrid algorithm, the network structure was optimized using only SSA, excluding BO. The modified algorithm’s performance was evaluated on the test set and compared to the full hybrid optimization approach. RMSE results for all twelve experimental groups are presented in Table 9.

With the experiment design outlined, we now turn to the analysis of the results, which highlight the significant impact of Bayesian Optimization (BO) on improving optimization performance.

4.5.2. Result Analysis

The experimental outcomes are depicted in Figure 8 through box plots that show the maximum, minimum, and average RMSE values across the twelve experimental groups. This visualization facilitates a detailed comparison between the SSA-only optimization approach and the hybrid SSA + BO method, underscoring BO’s substantial influence on optimization performance.

Specifically, the integration of Bayesian Optimization into the SSA framework results in a notable reduction in RMSE: by 21.4% for the NASA dataset, 36.0% for the Half-Cell (Si/CNTs) dataset, and 32.3% for the Half-Cell (Si/Graphene) dataset, relative to the SSA-only approach. These reductions highlight BO’s pivotal role in refining the optimization process, thereby enabling the SSA + BO method to achieve markedly superior accuracy and robustness.

Furthermore, the box plots in Figure 8 reveal that the SSA + BO method consistently yields lower average RMSE values across all experimental groups while also demonstrating reduced variability. This pattern indicates a more stable and reliable optimization performance, which is critical for practical applications.

In summary, these findings emphasize the importance of incorporating Bayesian Optimization with SSA for improved model optimization. They provide compelling evidence supporting the effectiveness of the hybrid SSA + BO method and offer valuable insights for the future design and development of optimization algorithms.

5. Conclusions

This study introduces an evolutionary deep learning framework for accurate remaining capacity prediction (RCP) of lithium-ion batteries (LIBs), achieving significant improvements in accuracy, robustness, and adaptability. By combining advanced time-series feature extraction with evolutionary optimization, the framework effectively addresses key challenges in battery capacity prediction.

However, the framework’s high computational demands pose challenges for real-time applications. Two primary strategies are considered to mitigate this issue: hardware acceleration and model simplification through domain-specific knowledge. Domain-specific knowledge refers to expert insights from the field, such as electrochemical principles and physical models of battery behavior, which can streamline feature extraction and reduce computational overhead without sacrificing accuracy. Future work will focus on integrating these electrochemical and physical insights to enhance efficiency and reduce complexity, as well as on validating the framework in real-world applications such as battery management systems, renewable energy storage, and electric vehicles, with the goal of ensuring scalability and applicability in complex and large-scale settings. In summary, by addressing current limitations and expanding into practical applications, this research aims to pave the way for more accurate, efficient, and sustainable battery management solutions across various industries.

Author Contributions

Conceptualization: Y.L., J.Z. and B.Z.; Methodology: Y.L.; Software: L.H.; Validation: L.H., Y.W. and J.Z.; Formal Analysis: Y.L.; Investigation: Y.L.; Resources: B.Z.; Data Curation: B.Z.; Writing—Original Draft Preparation: Y.L.; Writing—Review and Editing: J.G.; Visualization: Y.W.; Supervision: J.Z.; Project Administration: J.Z.; Funding Acquisition: J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by multiple funding sources, including the Municipal Government of Quzhou (Grants Nos. 2023D015, 2023D007, 2023D033, 2023D034, and 2023D035), which provided essential financial support for this study. Additionally, the Tianjin Science and Technology Program Projects (Grant No. 24YDTPJC00630) and the Tianjin Municipal Education Commission Research Program Project (Grant No. 2022KJ012) also contributed significantly to the completion of this research.

Data Availability Statement

The data supporting the findings of this study are partially available. NASA dataset: The NASA dataset used in this study is publicly available and can be accessed at https://data.nasa.gov/dataset/Battery-Data-Set/hz6w-zm6v (accessed on 17 January 2025). Half-Cell Dataset: The half-cell dataset used in this study is confidential and not publicly available due to privacy and ethical restrictions. Requests for access to the half-cell dataset should be directed to the corresponding author, [[email protected]].

Acknowledgments

The authors wish to acknowledge the support of all team members and institutions involved in this research. Additionally, we would like to express our gratitude for the financial support provided by the following funding agencies: Municipal Government of Quzhou (Grant Nos. 2023D015, 2023D007, 2023D033, 2023D034, and 2023D035); Tianjin Science and Technology Program Projects (Grant No. 24YDTPJC00630); and Tianjin Municipal Education Commission Research Program Project (Grant No. 2022KJ012). We appreciate the invaluable assistance and resources provided by these organizations, which have been crucial to the successful completion of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Fan, X.; Yang, Y.; Fei, Z.; Huang, Z.; Tsui, K.-L. Life Prediction of Lithium-Ion Batteries Based on Stacked Denoising Autoencoders. Reliab. Eng. Syst. Saf. 2021, 208, 107396. [Google Scholar] [CrossRef]
Xu, X.; Tang, S.; Yu, C.; Xie, J.; Han, X.; Ouyang, M. Remaining Useful Life Prediction of Lithium-Ion Batteries Based on Wiener Process under Time-Varying Temperature Condition. Reliab. Eng. Syst. Saf. 2021, 214, 107675. [Google Scholar] [CrossRef]
Liu, Y.; Zhao, T.; Ju, W.; Shi, S. Materials Discovery and Design Using Machine Learning. J. Mater. 2017, 3, 159–177. [Google Scholar] [CrossRef]
Rezvanizaniani, S.M.; Liu, Z.; Chen, Y.; Lee, J. Review and Recent Advances in Battery Health Monitoring and Prognostics Technologies for Electric Vehicle (EV) Safety and Mobility. J. Power Sources 2014, 256, 110–124. [Google Scholar] [CrossRef]
Zhou, B.; Cheng, C.; Ma, G.; Zhang, Y. Remaining Useful Life Prediction of Lithium-Ion Battery Based on Attention Mechanism with Positional Encoding. IOP Conf. Ser. Mater. Sci. Eng. 2020, 895, 012006. [Google Scholar] [CrossRef]
Li, D.; Yang, L.; Li, C. Control-Oriented Thermal-Electrochemical Modeling and Validation of Large Size Prismatic Lithium Battery for Commercial Applications. Energy 2021, 214, 119057. [Google Scholar] [CrossRef]
Chen, N.; Zhang, P.; Dai, J.; Gui, W. Estimating the State-of-Charge of Lithium-Ion Battery Using an H-Infinity Observer Based on Electrochemical Impedance Model. IEEE Access 2020, 8, 26872–26884. [Google Scholar] [CrossRef]
Zhang, C.; Allafi, W.; Dinh, Q.; Ascencio, P.; Marco, J. Online Estimation of Battery Equivalent Circuit Model Parameters and State of Charge Using Decoupled Least Squares Technique. Energy 2018, 142, 678–688. [Google Scholar] [CrossRef]
Naseri, F.; Schaltz, E.; Stroe, D.-I.; Gismero, A.; Farjah, E. An Enhanced Equivalent Circuit Model with Real-Time Parameter Identification for Battery State-of-Charge Estimation. IEEE Trans. Ind. Electron. 2022, 69, 3743–3751. [Google Scholar] [CrossRef]
Cai, C.; Gong, Y.; Fotouhi, A.; Auger, D.J. A Novel Hybrid Electrochemical Equivalent Circuit Model for Online Battery Management Systems. J. Energy Storage 2024, 99 Pt A, 113142. [Google Scholar] [CrossRef]
Wang, S.; Wang, Y.; Su, X.; Zhang, L.; Liu, M.; Chen, J.; Huang, Q.; Lee, T.; Zhao, F.; Gao, H. Impact of Energy Efficiency and Operating Temperature on the Remaining Life of Lithium-Ion Batteries. Intell. Comput. Appl. 2018, 8, 162–171. [Google Scholar]
Zhou, Y.; Huang, M. Lithium-Ion Batteries Remaining Useful Life Prediction Based on a Mixture of Empirical Mode Decomposition and ARIMA Model. Microelectron. Reliab. 2016, 65, 265–273. [Google Scholar] [CrossRef]
Kim, S.; Lee, P.-Y.; Lee, M.; Kim, J.; Na, W. Improved State-of-Health Prediction Based on Auto-Regressive Integrated Moving Average with Exogenous Variables Model in Overcoming Battery Degradation-Dependent Internal Parameter Variation. J. Energy Storage 2022, 46, 103888. [Google Scholar] [CrossRef]
Liu, K.; Shang, Y.; Ouyang, Q.; Widanage, W.D. A Data-Driven Approach with Uncertainty Quantification for Predicting Future Capacities and Remaining Useful Life of Lithium-Ion Battery. IEEE Trans. Ind. Electron. 2020, 68, 3170–3180. [Google Scholar] [CrossRef]
Xie, Y.X.; Wang, S.L.; Shi, W.H.; Xiong, X.; Chen, X. A New Method of Unscented Particle Filter for High-Fidelity Lithium-Ion Battery SOC Estimation. Energy Storage Sci. Technol. 2021, 10, 722–731. [Google Scholar]
Jiao, Z.Q.; Fan, X.M.; Zhang, X.; Luo, Y.; Liu, Y. State Tracking and Remaining Useful Life Predictive Method of Li-Ion Battery Based on Improved Particle Filter Algorithm. Trans. China Electrotech. Soc. 2020, 35, 3979–3993. [Google Scholar]
Li, X.; Yuan, C.; Wang, Z. Multitime-Scale Framework for Prognostic Health Condition of Lithium Battery Using Modified Gaussian Process Regression and Nonlinear Regression. J. Power Sources 2020, 467, 228358. [Google Scholar] [CrossRef]
Deng, Z.W.; Hu, X.S.; Lin, X.K.; Che, Y.; Xu, L.; Guo, W. Data-Driven State of Charge Estimation for Lithium-Ion Battery Packs Based on Gaussian Process Regression. Energy 2020, 205, 118000. [Google Scholar] [CrossRef]
Zhang, R.; Ji, C.H.; Zhou, X.; Liu, T.; Jin, G.; Pan, Z.; Liu, Y. Capacity Estimation of Lithium-Ion Batteries with Uncertainty Quantification Based on Temporal Convolutional Network and Gaussian Process Regression. Energies 2024, 297, 131154. [Google Scholar] [CrossRef]
Li, X.; Yuan, C.; Wang, Z. State of Health Estimation for Li-Ion Battery via Partial Incremental Capacity Analysis Based on Support Vector Regression. Energy 2020, 203, 117852. [Google Scholar] [CrossRef]
Liu, Z.Y.; He, H.J.; Xie, J.; Wang, K.; Huang, W. Self-Discharge Prediction Method for Lithium-Ion Batteries Based on Improved Support Vector Machine. J. Energy Storage 2022, 55, 105571. [Google Scholar] [CrossRef]
Qin, W.; Lv, H.; Liu, C.; Dey, N.; Jahanshahi, P. Remaining Useful Life Prediction for Lithium-Ion Batteries Using Particle Filter and Artificial Neural Network. Ind. Manag. Data Syst. 2019; ahead-of-print. [Google Scholar] [CrossRef]
Ren, L.; Dong, J.; Wang, X.; Meng, Z.; Zhao, L.; Deen, M.J. A Data-Driven Auto-CNN-LSTM Prediction Model for Lithium-Ion Battery Remaining Useful Life. IEEE Trans. Ind. Inform. 2020, 17, 3478–3487. [Google Scholar] [CrossRef]
Hong, J.; Lee, D.; Jeong, E.-R.; Yi, Y. Towards the Swift Prediction of the Remaining Useful Life of Lithium-Ion Batteries with End-to-End Deep Learning. Appl. Energy 2020, 278, 115646. [Google Scholar] [CrossRef]
Zhang, Y.; Xiong, R.; He, H.; Pecht, M.G. Long Short-Term Memory Recurrent Neural Network for Remaining Useful Life Prediction of Lithium-Ion Batteries. IEEE Trans. Veh. Technol. 2018, 67, 5695–5705. [Google Scholar] [CrossRef]
Park, K.; Choi, Y.; Choi, W.J.; Ryu, H.-Y.; Kim, H. LSTM-Based Battery Remaining Useful Life Prediction with Multi-Channel Charging Profiles. IEEE Access 2020, 8, 20786–20798. [Google Scholar] [CrossRef]
Shi, Z.; Chehade, A. A Dual-LSTM Framework Combining Change Point Detection and Remaining Useful Life Prediction. Reliab. Eng. Syst. Saf. 2021, 205, 107257.
Chen, D.; Hong, W.; Zhou, X. Transformer Network for Remaining Useful Life Prediction of Lithium-Ion Batteries. IEEE Access 2022, 10, 19621–19628.
Mao, J.; Yin, X.; Chen, R.; Ding, K.; Jiang, L. An Improved Approach Based on Transformer Network for Remaining Useful Life Prediction of Lithium-Ion Batteries. Energies 2022, 15, 9317.
Shen, L.; Li, J.; Zuo, L.; Zhu, L.; Shen, H.T. Source-Free Cross-Domain State of Charge Estimation of Lithium-Ion Batteries at Different Ambient Temperatures. IEEE Trans. Power Electron. 2023, 38, 6851–6862.
Borst, N.; Verhagen, W.J.C. Introducing CNN-LSTM Network Adaptations to Improve Remaining Useful Life Prediction of Complex Systems. Aeronaut. J. 2023, 127, 2143–2153.
Hafizhahullah, H.; Yuliani, A.R.; Pardede, H.; Ramdan, A.; Zilvan, V.; Krisnandi, D.; Kadar, J. A Hybrid CNN-LSTM for Battery Remaining Useful Life Prediction with Charging Profiles Data. In Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications (IC3INA’22), Virtual Event, 22–23 November 2022; pp. 106–110.
Hofmann, T.; Dubarry, M.; Hamar, J.; Erhard, S.; Schmidt, J.P. Transfer Learning from Synthetic Data for SOH Estimation. In ECS Meeting Abstracts, Vol. MA2024-02, A03: Accelerating Next-Generation Battery R&D Through Data-Driven Approaches; ECS—The Electrochemical Society: Pennington, NJ, USA, 2024; p. 364.
Wei, Y.; Wu, D. Prediction of State of Health and Remaining Useful Life of Lithium-Ion Battery Using Graph Convolutional Network with Dual Attention Mechanisms. Reliab. Eng. Syst. Saf. 2023, 230, 108289.
Srinivas, S.S.; Sarkar, R.K.; Runkana, V. Battery GraphNets: Relational Learning for Lithium-Ion Batteries (LIBs) Life Estimation. arXiv 2024, arXiv:2408.07624.
Colominas, M.A.; Schlotthauer, G.; Torres, M.E. Improved Complete Ensemble EMD: A Suitable Tool for Biomedical Signal Processing. Biomed. Signal Process. Control 2014, 14, 19–29.
Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555. Available online: https://arxiv.org/abs/1412.3555 (accessed on 15 March 2024).
Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems 30 (NIPS 2017); Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 5998–6008. Available online: https://arxiv.org/pdf/1706.03762 (accessed on 15 March 2024).
Tang, J.; Liu, G.; Pan, Q. A Review of Representative Swarm Intelligence Algorithms for Solving Optimization Problems: Applications and Trends. IEEE/CAA J. Autom. Sin. 2021, 8, 1627–1643.
Xue, J.; Shen, B. A Novel Swarm Intelligence Optimization Approach: Sparrow Search Algorithm. Syst. Sci. Control Eng. 2020, 8, 22–34.
Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; Curran Associates, Inc.: Red Hook, NY, USA, 2012; pp. 2951–2959.
Saha, B.; Goebel, K. Battery Data Set. NASA Ames Prognostics Data Repository 2007. Available online: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/ (accessed on 15 March 2024).

Figure 1. The framework of the proposed evolutionary deep learning method.

Figure 2. Sliding window technique for time-series data.
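
The sliding-window idea named in Figure 2 can be sketched in a few lines. This is only an illustration: the function name `sliding_windows`, the window length of 3, the one-step horizon, and the capacity values are hypothetical and are not taken from the article.

```python
def sliding_windows(series, window, horizon=1):
    """Split a 1-D series into (input window, target) pairs.

    Each sample uses `window` consecutive values as input and the value
    `horizon` steps after the window as the prediction target.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical capacity readings (Ah) over six cycles, for illustration only.
capacity = [2.00, 1.98, 1.97, 1.95, 1.94, 1.92]

for inputs, target in sliding_windows(capacity, window=3):
    print(inputs, "->", target)
```

With six readings and a window of three, the sketch yields three overlapping training samples, each shifted by one cycle.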

Figure 3. Architecture of the TCN-BiGRU-Attention model. (a) Input sequence data. (b) TCN: extracts temporal features with dilated convolutions, capturing dependencies at different scales; d = 1, d = 2, d = 4, … are the dilation rates, which give progressively larger receptive fields for modeling long-term dependencies. White and grey squares highlight the dilated convolutions. (c) BiGRU: processes the sequence in both forward and backward directions to extract contextual dependencies. (d) Attention: highlights critical features in the BiGRU output by assigning attention weights (α_1,1, α_1,2, …). (e) Output: the final prediction after the fully connected layer (indicated by cross signs).

Figure 4. Flowchart of the hybrid optimization algorithm.

Figure 5. Final test results of the twelve experimental groups. (a–d) Results on the NASA dataset. (e–h) Results on the Half-Cell (Si/CNTs) dataset. (i–l) Results on the Half-Cell (Si/Graphene) dataset.

Figure 6. Radar chart comparing the performance of the three optimization methods. (a) Performance on the NASA dataset. (b) Performance on the Half-Cell (Si/CNTs) dataset. (c) Performance on the Half-Cell (Si/Graphene) dataset.

Figure 7. Box plots comparing the performance of the proposed model and ablation models. (a) Results on the NASA dataset. (b) Results on the Half-Cell (Si/CNTs) dataset. (c) Results on the Half-Cell (Si/Graphene) dataset.

Figure 8. Box plots comparing the performance of the BO algorithm and the hybrid algorithm (SSA + BO). (a) Results on the NASA dataset. (b) Results on the Half-Cell (Si/CNTs) dataset. (c) Results on the Half-Cell (Si/Graphene) dataset.

Table 1. TCN-BiGRU-Attention network parameters.

Component	Parameter	Initial Range	Manual Initialization	Automatically Optimized
Training	Learning Rate	[0.001, 0.01]	RG	Adjusted dynamically by PSO/SSA + BO
	Optimizer	-	Adam	Adam
	Batch Size	-	32	32
TCN	Number of Layers	-	3	3
	Convolution Kernel Size	[2, 10]	RG	Determined by PSO/SSA + BO
	Dilation Factors	-	[1, 2, 4]	[1, 2, 4]
	Dropout Rate	-	0.2	0.2
BiGRU	Number of Neurons	[10, 50]	RG	Determined by PSO/SSA + BO
Attention	Key-Value Dimension	[2, 50]	RG	Determined by PSO/SSA + BO
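
As a rough check on how much history the TCN settings in Table 1 can cover, the sketch below computes the receptive field of a stack of dilated causal convolutions using the standard relation RF = 1 + (k − 1)·Σd. The function name and the default of one convolution per layer are assumptions made here; the table does not say how many convolutions each TCN layer contains, and implementations with two convolutions per residual block would double the (k − 1)·Σd term.

```python
def receptive_field(kernel_size, dilations, convs_per_layer=1):
    """Receptive field, in time steps, of stacked dilated causal convolutions.

    Assumes every layer uses the same kernel size and that each layer
    contains `convs_per_layer` convolutions with that layer's dilation.
    """
    return 1 + convs_per_layer * (kernel_size - 1) * sum(dilations)


DILATIONS = [1, 2, 4]  # dilation factors listed in Table 1

# Lower and upper bounds of the kernel-size search range in Table 1.
for kernel_size in (2, 10):
    print(kernel_size, receptive_field(kernel_size, DILATIONS))
```

Under these assumptions the kernel-size range [2, 10] corresponds to receptive fields between 8 and 64 time steps, which is one way to see why the kernel size is worth optimizing.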

Table 2. Optimization algorithm initialization parameters.

Algorithm	Parameter	Value/Description
PSO	Swarm Size (N)	6
	Maximum Iterations (T_max)	4
	Cognitive Coefficient (c₁)	1.5
	Social Coefficient (c₂)	1.5
	Inertia Weight (w)	0.8
SSA	Population Size (N)	6
	Maximum Iterations (T_max)	4
	Alarm Threshold (R₂)	0.8
	Safety Threshold (k)	0.5–1
BO	Surrogate Model	Gaussian Process
	Acquisition Function	Expected Improvement (EI)
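
To make the PSO settings in Table 2 concrete, the following sketch runs the textbook PSO velocity and position update with those values (swarm size 6, 4 iterations, c₁ = c₂ = 1.5, w = 0.8) over the search ranges from Table 1. It is not the authors' implementation: `toy_objective` is a hypothetical stand-in for the validation loss of a trained network, and clamping particles to the bounds is an assumption made here.

```python
import random

# Search ranges from Table 1: learning rate, kernel size,
# BiGRU neurons, attention key-value dimension.
BOUNDS = [(0.001, 0.01), (2, 10), (10, 50), (2, 50)]


def pso(objective, bounds, n_particles=6, n_iter=4,
        w=0.8, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` with a basic particle swarm (Table 2 defaults)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Assumption: keep particles inside the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(x):
    """Hypothetical loss: normalised squared distance to each range's centre."""
    return sum(((v - (lo + hi) / 2) / (hi - lo)) ** 2
               for v, (lo, hi) in zip(x, BOUNDS))


best, best_val = pso(toy_objective, BOUNDS)
print(best, best_val)
```

In a real run the integer-valued hyperparameters (kernel size, neuron count, key-value dimension) would need rounding before building the network, and each objective evaluation would involve training a model, which is presumably why the swarm size and iteration count in Table 2 are kept small.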

Table 3. Performance comparison of the three experimental groups on the NASA dataset.

Optimization Method	GROUP	MSE	RMSE	MAE	MAPE	R²
Unoptimized	B5	0.0003849	0.01962	0.01464	0.0097	0.9882
	B6	0.0019812	0.04451	0.02199	0.0139	0.9120
	B7	0.0039188	0.06260	0.07914	0.0374	0.8750
	B18	0.0003752	0.01937	0.01287	0.0080	0.9885
	Mean	0.0016650	0.03653	0.03216	0.0172	0.9434
PSO	B5	0.0002373	0.01541	0.01045	0.0066	0.9927
	B6	0.0018106	0.04255	0.01909	0.0120	0.9195
	B7	0.0014178	0.03765	0.02902	0.0198	0.9717
	B18	0.0002857	0.01690	0.01084	0.0068	0.9912
	Mean	0.0009379	0.02813	0.01735	0.0113	0.9688
Hybrid (SSA + BO)	B5	0.0002122	0.01457	0.00884	0.0057	0.9936
	B6	0.0016997	0.04123	0.01645	0.0104	0.9239
	B7	0.0009132	0.03022	0.02225	0.0143	0.9823
	B18	0.0002127	0.01458	0.00898	0.0057	0.9936
	Mean	0.0007595	0.02515	0.01413	0.0090	0.9733
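
For readers who want to reproduce the columns of Table 3, the sketch below gives the usual definitions of MSE, RMSE, MAE, MAPE, and R². Two points are assumptions rather than statements from the article: MAPE is returned as a fraction (which matches the magnitude of the tabulated values, e.g. 0.0097) rather than a percentage, and the capacity values in the example are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (as a fraction) and R² for two sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)
    mse = sse / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # Assumes no true value is zero; true capacities are positive.
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2": 1 - sse / ss_tot,
    }


# Hypothetical true and predicted capacities (Ah), for illustration only.
y_true = [2.00, 1.90, 1.80, 1.70]
y_pred = [1.98, 1.92, 1.79, 1.71]
print(regression_metrics(y_true, y_pred))

# Consistency check against Table 3 (unoptimized, B5): RMSE = sqrt(MSE).
print(round(math.sqrt(0.0003849), 5))
```

The final line recovers the tabulated RMSE of 0.01962 from the tabulated MSE of 0.0003849, which is a quick way to sanity-check individual rows.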

Table 4. Performance comparison of the three experimental groups on the laboratory Half-Cell (Si/CNTs) dataset.

Optimization Method	GROUP	MSE	RMSE	MAE	MAPE	R²
Unoptimized	I	0.0032252	0.05679	0.05074	0.0275	0.9709
	II	0.0028417	0.05331	0.04344	0.0245	0.9678
	III	0.0031221	0.05588	0.04776	0.0239	0.9665
	IV	0.0045294	0.06730	0.05579	0.0321	0.9630
	Mean	0.0034296	0.05832	0.04943	0.0270	0.9670
PSO	I	0.0027646	0.05258	0.02924	0.0142	0.9751
	II	0.0025061	0.05062	0.02747	0.0136	0.9709
	III	0.0020725	0.04552	0.02866	0.0135	0.9778
	IV	0.0018385	0.04288	0.03091	0.0152	0.9850
	Mean	0.0022954	0.04790	0.02907	0.0141	0.9772
Hybrid (SSA + BO)	I	0.001337	0.03657	0.02597	0.0133	0.9879
	II	0.001262	0.03552	0.02400	0.0117	0.9857
	III	0.001339	0.03659	0.02464	0.0123	0.9857
	IV	0.001673	0.04092	0.02917	0.0151	0.9863
	Mean	0.001403	0.03740	0.02594	0.0131	0.9864

Table 5. Performance comparison of the three experimental groups on the laboratory Half-Cell (Si/Graphene) dataset.

Optimization Method	GROUP	MSE	RMSE	MAE	MAPE	R²
Unoptimized	I	0.011862	0.10891	0.07959	0.05499	0.98491
	II	0.080990	0.28459	0.08683	0.04744	0.90090
	III	0.005486	0.07987	0.06019	0.03307	0.98592
	IV	0.009826	0.09913	0.06546	0.05845	0.98704
	Mean	0.027041	0.14313	0.07302	0.04849	0.96469
PSO	I	0.009574	0.09785	0.07081	0.03377	0.98782
	II	0.065282	0.25550	0.08229	0.03954	0.92012
	III	0.004885	0.06989	0.05319	0.02412	0.99199
	IV	0.006832	0.08266	0.06348	0.03094	0.99099
	Mean	0.021643	0.12648	0.06744	0.03209	0.97273
Hybrid (SSA + BO)	I	0.009378	0.09684	0.06615	0.03055	0.98807
	II	0.065217	0.25538	0.08220	0.03948	0.92000
	III	0.002743	0.05237	0.04300	0.02366	0.99550
	IV	0.006689	0.08179	0.06176	0.03097	0.99118
	Mean	0.021006	0.12160	0.06328	0.03117	0.97369
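
The "Mean" rows of Tables 3–5 can be turned into relative error reductions with a short calculation. The sketch below uses only the mean RMSE values copied from those rows; the dictionary layout and the helper name `relative_reduction` are choices made here for illustration.

```python
# Mean RMSE values copied from the "Mean" rows of Tables 3-5.
MEAN_RMSE = {
    "NASA": {"unoptimized": 0.03653, "pso": 0.02813, "hybrid": 0.02515},
    "Half-Cell (Si/CNTs)": {"unoptimized": 0.05832, "pso": 0.04790, "hybrid": 0.03740},
    "Half-Cell (Si/Graphene)": {"unoptimized": 0.14313, "pso": 0.12648, "hybrid": 0.12160},
}


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


for dataset, rmse in MEAN_RMSE.items():
    vs_unoptimized = relative_reduction(rmse["unoptimized"], rmse["hybrid"])
    vs_pso = relative_reduction(rmse["pso"], rmse["hybrid"])
    print(f"{dataset}: {vs_unoptimized:.1%} vs unoptimized, {vs_pso:.1%} vs PSO")
```

On these figures the hybrid (SSA + BO) configuration lowers mean RMSE relative to the unoptimized model by roughly 31% on NASA, 36% on Si/CNTs, and 15% on Si/Graphene.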

Table 6. Comparative results of CNN-BiGRU-Attention and TCN-BiGRU-Attention models.

Dataset	GROUP	CNN-BiGRU-Attention	TCN-BiGRU-Attention
NASA	B5	0.01483	0.01457
	B6	0.02720	0.02709
	B7	0.05682	0.01284
	B18	0.03744	0.02271
Half-Cell (Si/CNTs)	I	0.04776	0.03657
	II	0.15991	0.12844
	III	0.03580	0.03553
	IV	0.04081	0.03661
Half-Cell (Si/Graphene)	I	0.01036	0.00938
	II	0.07810	0.01197
	III	0.00468	0.00356
	IV	0.00334	0.00269

Table 7. Comparative results of TCN-BiGRU and TCN-BiGRU-Attention models.

Dataset	GROUP	TCN-BiGRU	TCN-BiGRU-Attention
NASA	B5	0.01392	0.01457
	B6	0.04171	0.02709
	B7	0.01392	0.01284
	B18	0.04334	0.02271
Half-Cell (Si/CNTs)	I	0.13416	0.03657
	II	0.19977	0.12844
	III	0.06738	0.03553
	IV	0.06970	0.03661
Half-Cell (Si/Graphene)	I	0.00546	0.00938
	II	0.08857	0.01197
	III	0.00385	0.00356
	IV	0.01347	0.00269

Table 8. Comparative results of TCN-GRU-Attention and TCN-BiGRU-Attention models.

Dataset	GROUP	TCN-GRU-Attention	TCN-BiGRU-Attention
NASA	B5	0.04527	0.01457
	B6	0.06269	0.02709
	B7	0.06787	0.01284
	B18	0.04020	0.02271
Half-Cell (Si/CNTs)	I	0.24148	0.03657
	II	0.28569	0.12844
	III	0.19593	0.03553
	IV	0.20875	0.03661
Half-Cell (Si/Graphene)	I	0.12131	0.00938
	II	0.08503	0.01197
	III	0.07737	0.00356
	IV	0.08740	0.00269

Table 9. Comparative results of BO algorithm and hybrid algorithm (SSA + BO).

Dataset	GROUP	BO	SSA + BO
NASA	B5	0.06871	0.06445
	B6	0.02427	0.02033
	B7	0.10656	0.08013
	B18	0.10801	0.08836
Half-Cell (Si/CNTs)	I	0.08911	0.12844
	II	0.16727	0.12844
	III	0.06497	0.05501
	IV	0.07898	0.05185
Half-Cell (Si/Graphene)	I	0.01004	0.00938
	II	0.09201	0.07263
	III	0.00372	0.00356
	IV	0.03161	0.01829

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
