1. Introduction
The widespread use of fossil fuels in the automotive field has been regarded over the last few decades as one of the major contributors to urban pollution and climate change. These issues have given a strong impulse to the development of alternative propulsion systems that can compete with conventional internal combustion engine (ICE)-based powertrains in terms of operational costs, performance, and autonomy, as well as industrial feasibility. In this scenario, hybrid-electric powertrains have proven to be a good solution, as they enable a smooth transition from ICE-based to full-electric powertrains while combining the benefits of both. Indeed, hybrid-electric vehicles (HEVs) achieve the low exhaust and acoustic emissions, significant energy savings, and improved drivability typical of an electric powertrain, along with the high autonomy guaranteed by the ICE [1,2].
The key to the proper operation of HEVs is energy management, which is aimed at defining the most suitable power split among the multiple energy sources available onboard. The energy management strategy (EMS) is in charge of instantaneously evaluating the optimal power ratio between the ICE and the electric motor-generator (EMG), in relation to the requested drive power and the energy available in the battery (i.e., its state of charge, SOC), with the aim of minimizing fuel consumption and exhaust emissions.
Recently, a considerable number of studies have been carried out on EMS, with different methodologies and goals depending on HEV topology (e.g., series, parallel, plug-in), mathematical formulation (e.g., rule-based, optimization-based), and available route information (e.g., causal, non-causal) [3,4]. In particular, EMSs based on optimal control theory (i.e., optimization-based) can achieve a globally optimal solution, but they require a priori knowledge of the vehicle velocity trajectory (i.e., they are non-causal). In this framework, dynamic programming (DP) and Pontryagin’s minimum principle (PMP) are widely used in the literature [5,6] for vehicle optimization problems [7,8]. On the other hand, in the majority of real-world driving applications, the vehicle’s velocity trajectory is not available in advance, since it is determined by the causal interaction of the vehicle with the environment and other vehicles along the route. Therefore, the aforementioned techniques (i.e., DP and PMP) are implemented in offline global optimization studies to obtain benchmark solutions that support the onboard EMS design or the powertrain system configuration/sizing [9]. Other optimization-based techniques, such as the equivalent consumption minimization strategy (ECMS) and model predictive control (MPC), are extensively used for real-time EMS, but only a sub-optimal solution can be achieved in these cases [10,11,12].
In order to overcome these issues, a speed prediction scheme has to be implemented in the supervisory controller, so that the globally optimized EMS can be evaluated along the estimated load trajectory. The prediction can be performed by making use of both historical data and look-ahead information retrieved by onboard vehicle sensors and connectivity [13]. Hedge et al. [14] presented a methodology to predict the vehicle’s speed trajectories from look-ahead data and concluded that good accuracy could be achieved by suitably tuning the level of look-ahead data to the route typology. The authors proposed in [15] a load estimate via a recurrent neural network coupled with DP to optimize the supervisory control strategy over a future time horizon.
In this context, the purpose of this work is to design a supervisory controller for the optimal EMS of a parallel HEV based on the application of DP over a receding horizon (RH) of limited length, derived from advanced driver-assistance systems (ADAS) and vehicle-to-everything (V2X) technologies. The proposed methodology limits the uncertainty associated with predicting the whole speed trajectory and makes the optimal control solution more realistic.
3. Receding Horizon (RH)-Based Energy Management Strategy (EMS)
A parallel hybrid architecture has one degree of freedom through which power flows can be managed optimally, considering the different operating modes. Therefore, as remarked in the introduction section, fuel consumption can be minimized by managing these energy flows with a suitable energy management strategy (EMS) that involves a supervisory controller [9,16].
The strategies that can be adopted by the supervisory controller can be divided into two main classes: optimal and heuristic. The former use optimal control theory to formulate a problem that is subsequently solved, whereas the latter follow rule-based indications calibrated in advance through optimal control techniques. A further classification can be made according to the information used: causal and non-causal. Non-causal control strategies require full knowledge of all inputs (e.g., those related to the travel path), whereas causal strategies use only past and present information.
Energy management optimization can be performed either offline or online. Offline design generally exploits knowledge of the driving cycle in advance; when the driving cycle changes during operation, online approaches are required. Offline optimization generally uses a non-causal strategy (i.e., complete knowledge of the entire future path) to achieve minimum fuel consumption [4]. Conversely, online optimization uses causal control strategies (i.e., only knowledge of the previous and current state), but it can also use non-causal control strategies when the vehicle follows a predetermined path (e.g., public transport, trains) or when future driving conditions are somehow predicted [11].
The formulation of the basic optimization problem consists of the choice of a control variable u(t) that maximizes (or minimizes) an integral objective function J:

$$J = \int_{t_0}^{t_f} L\big(x(t), u(t), t\big)\,dt$$

where x(t) represents the state variable, t_0 and t_f are the initial and final times, and L is an instantaneous cost function that combines several pieces of information defined for the specific problem. In HEVs, the state variable that is usually considered is the battery SOC. When a charge-sustaining approach is adopted, the constraint on the final SOC is taken into account by adding a penalty function \phi on the final state to the objective function:

$$J = \phi\big(x(t_f)\big) + \int_{t_0}^{t_f} L\big(x(t), u(t), t\big)\,dt$$
Among the different procedures to solve an optimal control problem, dynamic programming (DP) finds the globally optimal solution while respecting multiple constraints. Its main disadvantage is the heavy computational load, which increases exponentially with the number of system state and control variables [12]. DP was first introduced in 1957 as a solution for multi-stage decision problems in which time plays a significant role and the order of operations may be crucial [17].
The specific objective of this study can be stated as the identification of the optimal battery power profile (i.e., the optimal power split between the battery and the thermal engine) to be applied along the mission profile in order to minimize the overall fuel consumption of the vehicle. Therefore, the instantaneous cost function becomes the fuel mass flow, whereas the system state and control variables are the battery SOC and the battery power, respectively.
The minimum and maximum battery power values are strictly related to the battery charge and discharge characteristics. The engaged gear is the one that gives the lowest fuel consumption, whereas the minimum and maximum limits for the battery SOC are imposed and represent an input for the strategy implementation.
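As a compact summary of the problem just described, the formulation can be sketched as follows. The symbols are assumptions introduced here for illustration and are not taken from the original equations: \dot{m}_f is the fuel mass flow, P_{batt} the battery power, \phi a penalty on the final SOC, and t_0, t_f the start and end of the mission.

```latex
\begin{aligned}
\min_{P_{batt}(t)} \quad & J = \phi\big(SOC(t_f)\big) + \int_{t_0}^{t_f} \dot{m}_f\big(P_{batt}(t), t\big)\,dt \\
\text{subject to} \quad & SOC_{min} \le SOC(t) \le SOC_{max}, \\
& P_{batt,min} \le P_{batt}(t) \le P_{batt,max}.
\end{aligned}
```

The SOC dynamics and the gear selection rule are left implicit in this sketch; they enter through the dependence of the fuel mass flow on the battery power and on time.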
DP implementation requires the discretization of time as well as of the feasible sets of the state and control variables. Thus, the solution found with DP can be considered optimal within the error limits introduced by the discretization. The overall mission cost in Equation (18) can be written as:

$$J = \phi(x_N) + \sum_{k=0}^{N-1} L_k(x_k, u_k)$$

where the integral form is now replaced by the summation of the costs L_k related to the N time intervals plus the final cost \phi(x_N). The DP algorithm starts from the end of the mission profile and proceeds backward in time towards the start of the mission. During this process, at each k-th discrete time instant, the algorithm stores the optimal value of the battery power and the cost-to-go (J_k), which represents the minimum cost of the transition from each i-th SOC value at t_k to the SOC at the end (t_N). After computing the costs from all possible initial SOC values to the final SOC, the algorithm evaluates the optimal state trajectory through a forward evaluation in time. More details on the DP approach can be found in [18,19].
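The backward and forward passes described above can be illustrated with a minimal Python sketch. Everything numerical in it is a hypothetical placeholder rather than the model used in this study: the affine `fuel_rate` map, the battery capacity, the penalty weight, and the grids are assumptions chosen only to make the example run. The battery power grid is assumed to contain 0 so that at least one control is always feasible, and the forward pass recomputes the best control from the stored cost-to-go instead of reading a stored policy table.

```python
import numpy as np

def fuel_rate(p_eng, idle=1.0e-4, slope=6.6e-8):
    """Hypothetical affine fuel map (kg/s) versus engine power (W)."""
    return np.where(p_eng > 0.0, idle + slope * p_eng, 0.0)

def control_costs(p_dem_k, soc, j_next, soc_grid, p_batt_grid, dt, capacity_j):
    """Stage cost plus cost-to-go for every candidate battery power."""
    soc_next = soc - p_batt_grid * dt / capacity_j    # discharging lowers SOC
    p_eng = np.maximum(p_dem_k - p_batt_grid, 0.0)    # engine covers the remainder
    cost = fuel_rate(p_eng) * dt + np.interp(soc_next, soc_grid, j_next)
    feasible = (soc_next >= soc_grid[0] - 1e-9) & (soc_next <= soc_grid[-1] + 1e-9)
    return np.where(feasible, cost, np.inf), soc_next

def solve_dp(p_dem, soc0, soc_grid, p_batt_grid, soc_target,
             dt=1.0, capacity_j=3.6e6, penalty=1.0e3):
    """Return cost-to-go table, SOC trajectory, battery power and fuel (kg)."""
    n = len(p_dem)
    cost_to_go = np.empty((n + 1, len(soc_grid)))
    # Final cost: soft penalty on the distance from the target SOC.
    cost_to_go[n] = penalty * np.abs(soc_grid - soc_target)
    # Backward pass: from the end of the mission towards its start.
    for k in range(n - 1, -1, -1):
        for i, soc in enumerate(soc_grid):
            costs, _ = control_costs(p_dem[k], soc, cost_to_go[k + 1],
                                     soc_grid, p_batt_grid, dt, capacity_j)
            cost_to_go[k, i] = costs.min()
    # Forward pass: follow the minimum-cost control from the initial SOC.
    soc, soc_traj, p_batt_opt, fuel = soc0, [soc0], [], 0.0
    for k in range(n):
        costs, soc_next = control_costs(p_dem[k], soc, cost_to_go[k + 1],
                                        soc_grid, p_batt_grid, dt, capacity_j)
        u = int(np.argmin(costs))
        p_batt_opt.append(p_batt_grid[u])
        fuel += float(fuel_rate(max(p_dem[k] - p_batt_grid[u], 0.0)) * dt)
        soc = float(soc_next[u])
        soc_traj.append(soc)
    return cost_to_go, np.array(soc_traj), np.array(p_batt_opt), fuel
```

The triple loop over time, SOC values, and candidate battery powers makes the growth of the computational load with grid resolution explicit; adding a second state or control variable would add a further nested dimension.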
In this work, the DP method is combined with the receding horizon (RH) approach to find solutions that are sub-optimal with respect to those achieved with knowledge of the entire driving path (full horizon, FH). In real applications, it is generally not possible to know the whole driving horizon, especially for long journeys, due to factors such as alternative routes and traffic. Therefore, if the optimal solution is calculated on a predetermined path and the driving conditions then change for some reason, the vehicle no longer follows the planned optimal trajectory but a non-optimal one. The RH approach solves the optimal control problem on a shorter horizon and updates the solution at a periodic interval (which could be either in time or in space) [13,14]. Hence, two main parameters need to be defined, as illustrated in the example of Figure 3: (i) the path length (Hp) and (ii) the updating interval (Hs). The choice of these parameters is fundamental to achieving a satisfactory solution that represents a compromise between the optimal solution achieved with the FH and the practical applicability of the RH-based approach.
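The interplay between Hp and Hs can be made concrete with a short Python sketch that lists, for a mission of given duration, the window over which each optimization is planned and the portion of it that is actually applied before the next update. The function name and the choice of truncating the last windows at the end of the mission are assumptions made for illustration, and the horizon is expressed in time although it could equally be defined in space.

```python
def receding_schedule(total_s, hp_s, hs_s):
    """List (start, plan_end, apply_end) for each receding-horizon update.

    The optimization is planned on [start, plan_end) with length Hp, but only
    [start, apply_end) with length Hs is applied before the plan is refreshed.
    """
    if hs_s <= 0 or hp_s < hs_s:
        raise ValueError("expected 0 < Hs <= Hp")
    schedule = []
    start = 0
    while start < total_s:
        plan_end = min(start + hp_s, total_s)    # truncate at mission end
        apply_end = min(start + hs_s, total_s)
        schedule.append((start, plan_end, apply_end))
        start += hs_s
    return schedule

# Example: an 1800 s mission with Hp = 300 s and Hs = 150 s.
for start, plan_end, apply_end in receding_schedule(1800, 300, 150):
    print(f"plan {start:4d}-{plan_end:4d} s, apply {start:4d}-{apply_end:4d} s")
```

With these values the mission is covered by 12 updates, each planned on a window that overlaps the following one by Hp - Hs = 150 s.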
4. Case Study Results and Discussion
The RH-based EMS is applied to a WLTC driving cycle, and the performance in terms of fuel consumption is investigated with respect to that achieved with an FH approach. Moreover, a parametric analysis is performed by varying Hp and Hs and observing the deviation of the fuel consumption so as to identify the limit conditions under which the EMS solution degrades with respect to the FH solution.
Figure 4a,b present the desired vehicle speed and the related traveled distance, respectively, for the WLTC cycle, whereas Figure 4c represents the slope accounted for in the study and Figure 4d reports the traveled height related to the applied slope and desired speed. For the initial urban path, an increasing slope is assumed, followed by a decrease and a stabilization in the last highway section of the cycle. As previously commented, the battery management follows a charge-sustaining procedure, with minimum and maximum battery SOC of 0.5 and 0.7, respectively. Moreover, the initial SOC is assumed to be 0.6, a value that must also be restored at the end of the whole FH optimization as well as at the end of each Hp horizon for the RH optimization.
Figure 5 reports with solid blue lines the results achieved with the FH approach on the considered cycle in terms of SOC (Figure 5a) and fuel consumption (Figure 5b) over time. It can be observed that the strategy based on knowledge of the entire path calls for an almost continuous discharge of the battery until the minimum SOC is reached at 900 s. Afterward, the battery is charged, taking advantage of both engine charging and regenerative braking, until the maximum allowed SOC is reached at around 1500 s. In the last part of the highway route, the battery is discharged to meet the imposed SOC at the end of the path (i.e., 0.6 reached at 1800 s).
The fuel consumption over time reflects the constraints imposed by the cycle, especially those related to the slope. Although the battery is discharged in the first part of the cycle, the fuel consumption increases at an almost constant rate, because of the traction demand caused by the increasing road slope. Since the EMS is aware of the future road slope variation, the battery is used to overcome the initial part of the cycle, where more energy is required, while in the second part the battery SOC can be recovered thanks to regenerative braking made possible by the reduction in path slope. This is evident between 900 s and 1500 s, where the cumulative fuel consumption is almost constant at 0.6 kg. In the final part of the cycle, the fuel consumption instead increases more steeply, because the road slope no longer decreases and a higher speed is requested. The final result is 0.98992 kg of fuel consumed over the entire cycle.
The application of the RH-based approach requires the setting of both the partial horizon Hp and the update interval Hs. For a preliminary analysis, an initial value of 300 s is chosen for Hp, whereas Hs is set to 150 s. This means that the WLTC cycle is analyzed with a sliding window of 300 s that is updated every 150 s. The results of this process are compared with those achieved with the FH in Figure 5a in terms of SOC over time and in Figure 5b in terms of fuel consumption, with red dashed lines. From Figure 5a, it can be observed that, unlike the FH results, the SOC is kept practically constant at around 0.6 in the first part of the cycle. This is because the algorithm can only update its knowledge of the path through a narrower window and, since the road slope increases up to 900 s, it cannot consume battery energy without violating the constraint on the SOC at the end of the Hp window. This condition changes as soon as the algorithm encounters a slope inversion within the Hp window, as can be seen after 750 s, which allows the battery to be recharged after its use. In the second part of the cycle, the SOC trend is similar to that of the FH, although smoothed, since the algorithm can use more of the battery (the vehicle being in descent), and the recharging process takes place before the final part of the highway route to fulfill the charge-sustaining constraint. The change in SOC behavior is reflected in a change in the final fuel consumption achieved by the RH approach. However, although the SOC trend is quite different, the fuel consumption trend is almost the same as that of the FH approach, as can be seen in Figure 5b. As a final result, the fuel consumption achieved for the WLTC cycle with the imposed Hp and Hs windows is about 1.0574 kg, an increase of 6.8% with respect to the FH consumption.
To investigate how the EMS results change with different Hp windows, the same Hs value of 150 s is kept, and two further values of 200 s and 400 s are considered for Hp. The related results in terms of SOC trend are shown in Figure 6, together with the FH and the RH results discussed above. It can be immediately seen that, as the horizon (i.e., the path) length increases, the SOC trend tends towards that of the FH, as expected.
The fuel consumption achieved for Hp = 200 s is 1.0622 kg, an increase of 7.3% with respect to the FH consumption, while for Hp = 400 s, the fuel consumption is 1.0081 kg (an increase of 1.83%). These results show that, with an Hp window of 400 s, the results are almost comparable with those of the FH, with a fuel consumption increase lower than 2%.
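The percentage increases quoted for the three Hp values at Hs = 150 s can be checked against the FH result with a few lines of Python. The fuel masses are the ones reported in the text; the variable and function names are illustrative. Because the inputs are rounded to a few significant digits, the recomputed percentages may differ from the quoted ones in the last digit.

```python
FH_FUEL_KG = 0.98992  # full-horizon fuel consumption reported for the WLTC cycle

# Reported RH fuel consumption (kg), keyed by (Hp, Hs) in seconds.
rh_fuel_kg = {
    (200, 150): 1.0622,
    (300, 150): 1.0574,
    (400, 150): 1.0081,
}

def deviation_percent(rh_kg, fh_kg=FH_FUEL_KG):
    """Relative increase of the RH fuel consumption over the FH one, in %."""
    return 100.0 * (rh_kg - fh_kg) / fh_kg

deviations = {key: deviation_percent(val) for key, val in rh_fuel_kg.items()}
for (hp, hs), dev in sorted(deviations.items()):
    flag = "within" if dev < 2.0 else "above"
    print(f"Hp = {hp} s, Hs = {hs} s: +{dev:.2f}% ({flag} the 2% limit)")
```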
Further extending the parametric analysis to the Hs window length, different combinations of Hs-Hp values are investigated, and the related results are presented together in Figure 7. The optimal fuel consumption achieved with the FH approach is depicted with a dashed black line at the bottom of the figure. The x-axis reports the values of the Hp windows, from 100 s up to 400 s, while the different curves are parametric with respect to Hs (50 s, 100 s, 150 s and 200 s). Some numerical results are presented in Table 1.
From the trends reported in Figure 7 and the values listed in Table 1, it can be clearly seen that fuel consumption decreases as the path horizon Hp increases, while it grows as the update length Hs increases. A deviation from the FH consumption within the 2% limit is achieved with Hp equal to 400 s and only if Hs is kept below 200 s.