1. Introduction
Over the past few years, RMPC has gained widespread acceptance in practical applications, including trajectory tracking, industrial process control, and energy systems [1,2,3]. The successful implementation of RMPC across these fields stems from its prominent advantages. In particular, RMPC provides an integrated solution for controlling systems subject to model uncertainty, additive disturbance, and constraints. This feature has attracted considerable theoretical attention to the analysis and synthesis of different forms of RMPC, and several RMPC algorithms have been investigated in the literature [4,5,6,7].
Recently, application requirements concerning practical constraints and the deployment environment have pushed RMPC research in new directions. For instance, the growing demand for algorithms that integrate optimization performance with control robustness has propelled the development of tube-based MPC (TMPC) [8,9,10]. The deployment of tubes gives rise to a set of strictly set-theoretic strategies for RMPC synthesis that allow a computationally efficient treatment of uncertainties and of their interaction with the system dynamics and constraints. In [11,12], for a class of linear systems with bounded disturbance and convex constraints, the nominal system was separated from the actual system by adopting a separation control strategy. Notably, the conservatism of the proposals employing this construction in [13,14] was caused by the use of fixed tube cross-section shape sets. To mitigate this conservatism, the homothetic tube model predictive control (HTMPC) strategy proposed in [15,16] explored the impact of disturbances by constructing locally accurate reachable sets centered around nominal system trajectories. In light of these developments, HTMPC has emerged as an enhanced and more adaptable framework for RMPC synthesis. What sets it apart is its capacity to parameterize the cross-sections of the state tube and control tube in terms of associated centers and scaling sequences. This study investigates this concept further by considering variations in the state error and changes in the cost function value during error adjustment when designing the tube size controller and the auxiliary control law. This distinguishes it from previous work [15], which incorporates scaling vector optimization into the OCP, thereby increasing computational complexity while driving the scaling vectors to a specific value. However, an inherent drawback of the HTMPC approach lies in its computational complexity, which grows significantly with the number of constraints, as measured by the proliferation of polytopic regions.
Furthermore, the issue of computational complexity is generally associated with dynamic programming in the presence of constraints and uncertainties, which has inspired the development of parameterized RMPC [17,18,19]. The parameterized optimization problem is commonly approximated using a neural network (NN) or deep neural network (DNN) to enhance computational efficiency [20,21]. Certain studies have even turned to symmetric neural networks (SNNs) due to their unique properties [22]. SNNs, characterized by symmetric weight initialization and activation functions, have demonstrated their ability to accelerate convergence and improve the robustness of neural-network-based approaches [23]. Some studies adopted an offline approach to generate nominal systems [24,25]. While effective in reducing online computation time, this method leads to a more conservative control strategy in highly uncertain scenarios, necessitating a trade-off with control performance. Other studies addressed system uncertainty by establishing a linear parameter-varying system model [26,27]. This approach facilitates adaptive learning to handle system changes and uncertainties, making it better suited to variations in the parameter-varying system. However, applying this technique in complex, large-scale systems demands substantial computational resources for DNN training and inference, potentially leading to real-time control delays. As online learning technology continues to mature, its integration with RMPC holds promise for enhancing the real-time capabilities and scalability of the control scheme. Notably, previous studies employed reinforcement learning techniques to solve linear quadratic regulator and MPC problems, providing convergence proofs for the associated issues [28,29]. Advanced deep reinforcement learning algorithms have further demonstrated their potential within an RMPC framework, emphasizing the iterative interaction between optimal control actions and performance indices [30,31]. These instances underscore the capacity of online learning techniques to address quadratic programming problems. Therefore, integrating online learning techniques, including DNNs with a symmetric architecture, holds great potential for enhancing the real-time capabilities and scalability of RMPC. Our proposed approach, which leverages the computational power of GPUs for real-time acquisition of time-varying nominal system information, not only ensures real-time control performance but also optimizes efficiency.
Building upon the above research, a promising approach emerges: incorporating tubes with increased degrees of freedom into the optimization process while employing function approximation and online learning techniques within the RMPC framework to enhance computational efficiency. The main contributions of this paper are threefold:
A fuzzy-based tube size controller is investigated to adjust the local error tube-scaling vector. Specifically, the controller is designed by considering the state error between the nominal and the actual systems; the error and error variety rate bounds are then established, and the fuzzy IF-THEN rules are derived. The tightened sets on state error are developed to satisfy the system constraints in the case of external disturbances and model uncertainties.
A novel auxiliary control law pertaining to the scaling vector of the error tube is designed. By considering variations in the system's cost function, the auxiliary control law effectively mitigates the impact of interference on the system.
A theoretically rigorous and technically achievable framework for RMPC with online parameter estimation is developed, based on a constrained DNN with symmetry properties to improve computational performance: the OCP is defined based on the online learning parameters; the DNN structure is extended using Dykstra's projection algorithm to ensure the feasibility of the successor state and control input; and a time-varying nominal system is generated on this basis to fulfill the robustness requirements of the system.
The remainder of this paper is organized as follows: preliminaries and the problem formulation are presented in Section 2. In Section 3, a novel RMPC scheme is developed based on the fuzzy-based tube size controller and a constrained DNN algorithm. Section 4 provides two numerical examples to illustrate the feasibility and effectiveness of the proposed control scheme. In Section 5, conclusions are drawn.
2. Preliminaries and Problem Formulation
2.1. Nomenclatures
The set of non-negative reals is denoted by ℝ≥0, and ℕ denotes the set of non-negative integers. For a set A and a real matrix M of compatible dimensions, the image of A under M is denoted by MA = {Ma : a ∈ A}. Given two subsets C and B of ℝⁿ, the Minkowski set addition is defined by C ⊕ B = {c + b : c ∈ C, b ∈ B}, and the Minkowski set subtraction is defined by C ⊖ B = {x : x ⊕ B ⊆ C}. The distance of a point y from a point z is denoted by |y − z|. conv{·} denotes the convex hull of its elements. For an unknown vector v, the notation v* represents its optimal value.
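For concreteness, the Minkowski sum above can be illustrated with a small numerical sketch. This is our illustration rather than part of the paper, and the vertex data are placeholders: for polytopes given by vertex sets, the vertices of the sum lie among the pairwise sums of vertices, and the sum itself is the convex hull of those points.

```python
import numpy as np
from itertools import product

def minkowski_sum_vertices(C, B):
    """Pairwise vertex sums of C and B; conv of these points equals C ⊕ B.

    Redundant (interior) points can be pruned afterwards with a convex hull.
    """
    return np.array([c + b for c, b in product(C, B)])

# Unit square and a small diamond (placeholder vertex representations).
C = np.array([[1, 1], [1, -1], [-1, -1], [-1, 1]], dtype=float)
B = 0.2 * np.array([[1, 0], [0, 1], [-1, 0], [0, -1]])

S = minkowski_sum_vertices(C, B)
print(S.shape)  # (16, 2); conv(S) = C ⊕ B
```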
2.2. Problem Formulation
Consider a discrete-time linear system with bounded disturbance (the actual system) of the form

x_{k+1} = A x_k + B u_k + w_k,  (1)

where N is the horizon length, x_k and u_k are the state vector and the control input of the actual system subject to the bounded disturbance w_k, and w_k takes values in the set W. x_{k+1} denotes the successor state of the actual system. The system variables are selected such that the following constraints are satisfied:

x_k ∈ X,  u_k ∈ U,  (2)

where X and U are compact and convex and contain the origin as an interior point. The compact set W contains the origin.
Let the nominal (reference) system without any disturbance, corresponding to (1), be defined by

z_{k+1} = A z_k + B v_k,  (3)

where z_k and v_k are the state and control input of the nominal system without accounting for any uncertainty, respectively, and z_{k+1} denotes the desired value of the successor state of system (1). The state error is represented as

e_k = x_k − z_k.  (4)
Assumption 1. The matrix pair (A, B) is known and stabilizable; the state x_k can be measured at each sampling time; the current disturbance w_k and the future disturbances are not known and can take arbitrary values in W.
In this paper, the fixed shape set of the error tube is expressed as E. For any non-empty set E, the error tube is a sequence of sets {E_0, E_1, …, E_N}, where the cross-section E_k is given by

E_k = α_k E,  (5)

where α_k is the scaling vector. Meanwhile, for each relevant k, the state tube X_k and control tube U_k corresponding to HTMPC [18] are indirectly determined by the following form

X_k = z_k ⊕ α_k E,  (6)
U_k = v_k ⊕ K(α_k E),  (7)

where z_k and v_k are the sequences of state tube and control tube centers determined by the state error e_k, and K is the disturbance rejection gain [32]. The corresponding control policy is a sequence of control laws with

π_k(x) = v_k + K(x − z_k),  x ∈ X_k.  (8)

Referring to Equations (5)–(8), it is clear that, given the set E, the error tube, state tube, control tube, and control policy are determined by the sequences of centers and scaling vectors. Consequently, a decision variable collecting these sequences is introduced.
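To make the homothetic parameterization concrete, the following minimal Python sketch (our illustration; the shape set, centers, and scalings are placeholder values, and the symbols z_k and α_k follow the reconstruction above) builds tube cross-sections by scaling the fixed shape set and translating it to the tube center.

```python
import numpy as np

def tube_cross_section(z_k, alpha_k, E_vertices):
    """Vertices of the homothetic cross-section z_k ⊕ α_k·E."""
    return z_k + alpha_k * E_vertices

# Fixed error-tube shape set E (a small square), given by its vertices.
E = np.array([[0.3, 0.3], [0.3, -0.3], [-0.3, -0.3], [-0.3, 0.3]])

# A nominal trajectory (centers) and shrinking scalings along the horizon.
centers = [np.array([2.0, 1.0]), np.array([1.2, 0.5]), np.array([0.5, 0.1])]
alphas = [1.8, 1.3, 1.0]

tube = [tube_cross_section(z, a, E) for z, a in zip(centers, alphas)]
for k, X_k in enumerate(tube):
    print(f"cross-section {k} vertices:\n{X_k}")
```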
Subsequently, the OCP is defined through a cost function composed of a stage cost and a terminal cost. Here, the stage cost is employed to achieve the desired control performance, while the terminal cost ensures stability and recursive feasibility; Q, R, and P are known positive definite symmetric matrices. For any state error, the set of permissible decision variables corresponds to the value of a set-valued map whose final condition involves the terminal constraint set [33].
Similar to the tube MPC principle [8], if the nominal state satisfies its tightened constraint and the nominal control input satisfies its tightened constraint, then the constraints imposed on the actual system state and control input are also met. In this work, the state tightening is related to the error tube, while the input tightening is concerned with the ancillary control law; thus, it is imperative to satisfy both conditions. Furthermore, at step N, if the terminal state fulfills the terminal constraint (Equation (21) provides its formulation and limitation), the system state is guaranteed to comply with the imposed requirement.
Constraints (15) and (19) represent the set dynamics of the error tube and the homothetic state tube, respectively, which contribute to the dynamic relaxation in [8]. In addition, the terminal constraint set must satisfy the corresponding terminal condition. The performance evaluation of the terminal control necessitates the definition of a 0-step homothetic tube controllability set [15], which must satisfy its own set of constraints, expressed in terms of the projection of the relevant set onto the corresponding subspace.
2.3. Controller Synthesis
The objective of this paper is to design an optimal control policy, based on any given initial state error, that not only renders the local state asymptotically tracking the reference state (i.e., the state error asymptotically approaches zero), but also minimizes the OCP. The problem of solving the conventional control policy of (1) is thus converted into finding the nominal optimal control input and designing an appropriate disturbance rejection gain while ensuring that the related constraints are satisfied.
The controller synthesis for the proposed RMPC scheme is specified as

u_k = v_k + K e_k,  (23)

where u_k is the control action obtained from the presented method. The ancillary control law K e_k keeps the local state x_k within the error tube centered around the trajectory of the nominal state z_k, and v_k is the output obtained by online learning with the state errors as input.
Consider the error system obtained by combining Equations (1), (3), and (4):

e_{k+1} = A e_k + B(u_k − v_k) + w_k,  (24)

where e_{k+1} is the successor state error. Under the controller synthesis above, system (24) can be rewritten as

e_{k+1} = (A + BK) e_k + w_k.  (25)
3. DNN-Based RMPC with a Fuzzy-Based Tube Size Controller
This section presents the design of the novel RMPC scheme, which incorporates updates to scaling and policy iteration for nominal control. The innovative RMPC framework consists of a fuzzy-based tube size controller and a constrained DNN-based nominal RMPC component. The former calculates the error tube-scaling vector by considering both state error and error variety rate, while the latter determines a sequence of constraints associated with scaling to ensure optimal control policy generation. Concurrently, the DNN-based nominal RMPC offers a time-varying nominal system that exhibits enhanced computational efficiency. Moreover, by incorporating variations in the cost function value into the auxiliary control law design, it effectively mitigates the adverse effects of interference on the system.
3.1. Error Tube and Constraint Satisfaction
This subsection discusses how fuzzy control is used to estimate (predict) the corresponding error tube-scaling vector, allowing for computational feasibility of the OCP. More importantly, an auxiliary control law pertaining to the scaling vector of the error tube is designed; by considering variations in the system's cost function, it effectively mitigates the impact of interference on the system.
If Assumption 2 is satisfied, then for any established scaling it holds that the state error remains within the scaled tube. Further, the nominal state and control input are indirectly restrained by the correspondingly tightened sets. It is clear that, under these conditions, the satisfaction of the original constraints for the actual system is guaranteed by using the proposed control scheme.
Next, the fuzzy-based tube size controller is employed to estimate the error tube scaling; it generates the scaling vector by considering the local error and the error variety rate. The components of the fuzzy controller [35] include a set of fuzzy IF-THEN rules and a fuzzy inference engine. The fuzzy inference engine utilizes the IF-THEN rules to map the input error and error variety rate to an output variable; the lower and upper bound values of the two inputs are specified accordingly. Furthermore, the two-dimensional plane spanned by the error and the error variety rate is divided into nine distinct regions, as depicted in Figure 1; the upper and lower limits of both inputs define these regions, and each region corresponds to a specific IF-THEN rule. Based on the provided input, the fuzzy controller determines the region of the plane in which a given pair of input values is located and then employs the IF-THEN rules to calculate the appropriate scaling variables. As an illustrative example, when the error is positive and its variety rate is positive, the system's state error exhibits a relatively high positive deviation that is gradually increasing. In such circumstances, the controller outputs a diminished scaling value, ensuring that the system's state error exhibits a tightening trend.
To be specific, the fuzzy IF-THEN rules are written as
(). IF and or and THEN takes on a smaller value;
(). IF and or and THEN takes on a slightly larger value;
(). IF and or and THEN takes a value as small as possible;
(). IF and or and THEN takes on a larger value;
(). IF and THEN takes a value as large as possible.
For convenience, let the universe of the error be [a, b] and set the universe of the error variety rate as [c, d]. The membership degree function is taken as the triangular function. Then, the singleton fuzzifier and the center-average defuzzifier [36] are used to calculate the output based on the feedback values of the error and the error variety rate, where the output combines the membership degrees of the five cases mentioned above with an adjustable weight parameter for each context. Afterward, the successor value of the scaling vector is determined by Equation (27).
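As an illustration of the singleton fuzzifier and center-average defuzzifier described above, consider the following Python sketch. It is not the paper's implementation: the triangular membership placement, the nine output centers for the regions of Figure 1, and the universes are placeholder choices, consistent only with the bounds quoted later (3.7 and 2.5) and the typical output range [1, 2].

```python
import numpy as np

def tri(x, left, center, right):
    """Triangular membership function peaking at `center`."""
    if x < left or x > right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

def fuzzy_scaling(e, de, e_max=3.7, de_max=2.5):
    """Singleton fuzzifier + center-average defuzzifier for the scaling output.

    Placeholder rule base: labels negative/zero/positive per input, and
    illustrative output centers in the typical range [1, 2].
    """
    mu_e  = [tri(e,  -2 * e_max,  -e_max,  0.0),
             tri(e,  -e_max,      0.0,     e_max),
             tri(e,   0.0,        e_max,   2 * e_max)]
    mu_de = [tri(de, -2 * de_max, -de_max, 0.0),
             tri(de, -de_max,     0.0,     de_max),
             tri(de,  0.0,        de_max,  2 * de_max)]
    # One output center per region R_ij of Figure 1 (nine regions in total).
    centers = np.array([[2.0, 1.8, 1.2],
                        [1.6, 1.0, 1.6],
                        [1.2, 1.8, 2.0]])
    w = np.outer(mu_e, mu_de)  # rule firing strengths
    return float((w * centers).sum() / w.sum()) if w.sum() > 0 else 1.0

print(fuzzy_scaling(1.5, 0.4))  # scaling output for the pair (e, de)
```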
Theorem 1. Given system (1) controlled with the control policy, the state error is restricted to the error tube. Specifically, the design of the disturbance rejection gain ensures that the error remains within the error tube at every time step.
Lemma 1 ([37]). For any vectors x and y and any positive definite matrix M of compatible dimensions, 2xᵀy ≤ xᵀMx + yᵀM⁻¹y.
Proof of Theorem 1. Consider the error system (25). The disturbance rejection gain guarantees that the state error is constrained to remain inside the corresponding set. Since the nominal system (3) is robustly stable, the nominal state converges to the origin; the state error must then converge to the error tube. Finally, the state error is restricted to a variable error tube centered at the origin by implementing the ancillary control law.
Here, the disturbance rejection gain K is obtained by solving Equation (29), where H is determined by an associated equation and Q is a positive definite matrix; the identity matrix involved has the same dimensions as the state vector. Then, the terminal weight P is the solution of the Lyapunov Equation (30), whose construction involves the maximum eigenvalue of the corresponding matrix.
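Since the explicit forms of Equations (29) and (30) appear in the paper's displays, the sketch below only shows one common way to compute such a pair (K, P) numerically: an LQR-type gain from the discrete algebraic Riccati equation, followed by a discrete Lyapunov solve for the closed-loop matrix. The system matrices and weights are placeholders, and this construction is our assumption rather than the paper's exact equations.

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Placeholder double-integrator data (cf. Example 1) and weights.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# LQR-type disturbance rejection gain: u = K e with A + BK Schur stable.
P_are = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(R + B.T @ P_are @ B, B.T @ P_are @ A)

AK = A + B @ K
assert np.all(np.abs(np.linalg.eigvals(AK)) < 1.0)  # closed loop is stable

# Terminal weight P from the discrete Lyapunov equation AK' P AK - P = -Q.
P = solve_discrete_lyapunov(AK.T, Q)

# Sanity check of the decrease V(e+) - V(e) = -e' Q e (disturbance-free case).
e = np.array([1.0, -0.5])
V, V_next = e @ P @ e, (AK @ e) @ P @ (AK @ e)
print(V_next - V, -(e @ Q @ e))  # the two numbers agree
```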
The Lyapunov candidate function is represented by (32). Consider its first difference (33). By substituting Equation (25) into Equation (33) and applying Lemma 1, the difference can be bounded. According to Equation (28), the disturbance is bounded as well. By substituting Equation (29) into inequality (37), we further obtain that the first difference is negative. It follows that the function (32) is decreasing, and the state error converges. □
This section shows that the optimal cross-sections of the error tube are calculated online by considering the adjustable tube-scaling parameters, which are affected by a combination of the error and the error variety rate. Theorem 1 shows that the successor estimate of the actual system has a non-increasing estimation error at each time step. The design of the fuzzy-based tube size controller and the auxiliary control law considers both variations in the state error and changes in the value of the cost function during error adjustment, unlike the previous literature [15], which incorporates the optimization of the scaling vector into the OCP, thereby increasing computational complexity while driving the scaling vectors to a specific value. In addition, we find that an appropriate choice of the scheme for generating the nominal system can improve prediction accuracy; nonetheless, an invariant nominal system is used during prediction in [24,25]. To improve the control performance, our main concern here is to define a parameter estimation scheme that generates a time-varying nominal system based on the DNN algorithm while still enabling a computationally tractable RMPC algorithm, as presented in the following.
3.2. Design of DNN-Based Nominal RMPC
This section focuses on designing the DNN-based nominal RMPC to construct a parameter estimation synthesis that provides a time-varying nominal system for the control scheme. The cost function for the constrained system proposed in conventional RMPC is reformulated as an online learning problem by introducing a series of reference control inputs parameterized by θ. The modified OCP, solved online, is defined by (39) and (40). The parameters θ are updated in the direction of the gradient of the cost function by adopting the policy gradient method. In the architecture of the constrained DNN-based nominal RMPC, the state errors are used as input, and the optimal control policy is created as the output of the DNN.
This paper employs a DNN characterized by inherent symmetry, featuring symmetric weights that facilitate efficient parameter sharing. Consequently, the network demands fewer computational resources than conventional network structures, rendering it advantageous in resource-constrained environments. Assuming the network has L hidden layers, layers 1 and L each consist of the same number of neurons. The architecture of the deep neural network is illustrated in Figure 2. The superiority of the network architecture employed in this paper over a typical neural network structure is demonstrated in Table 1. From Table 1, it can be observed that in symmetric neural networks the number of weights to be computed is reduced, since each shared connection is computed only once. Notably, despite having fewer parameters, deep neural networks with symmetric structures achieve higher accuracy under the same computational resources. Regarding convergence, symmetric neural networks require 35.79% fewer iterations than conventional neural networks.
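The exact weight-sharing pattern of the symmetric network is not spelled out here, so the following Python sketch assumes one common realization: layers mirrored about the middle of the network share transposed weight matrices, roughly halving the number of trainable weights relative to an unshared stack of the same depth. The width profile and initialization are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_symmetric_mlp(layer_sizes):
    """Weights for an MLP where mirrored layers share transposed matrices.

    Requires a palindromic width profile, e.g. [2, 16, 32, 16, 2]; only the
    first half of the weight matrices is stored independently.
    """
    assert layer_sizes == layer_sizes[::-1], "widths must be symmetric"
    n = len(layer_sizes) - 1  # number of weight matrices
    half = [rng.normal(0, 0.1, (layer_sizes[i + 1], layer_sizes[i]))
            for i in range((n + 1) // 2)]
    # Mirror: the second half reuses the first half, transposed.
    Ws = half + [W.T for W in reversed(half[:n // 2])]
    bs = [np.zeros(layer_sizes[i + 1]) for i in range(n)]
    return Ws, bs

def forward(Ws, bs, x):
    """ReLU hidden layers with an affine output layer."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        x = np.maximum(0.0, W @ x + b)
    return Ws[-1] @ x + bs[-1]

Ws, bs = init_symmetric_mlp([2, 16, 32, 16, 2])
n_independent = sum(W.size for W in Ws[: (len(Ws) + 1) // 2])
print(forward(Ws, bs, np.array([0.5, -0.2])), n_independent)
```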
The output of the DNN-based nominal RMPC is formulated as an affine function of the last hidden layer, where the linear coefficient matrix and the bias vector between the hidden layer and the output layer are the affine function parameters to be optimized. The hidden layers use the rectified linear unit (ReLU) activation function, and the input of the network is set to the state error.
Since the neural network may output a potentially infeasible control input for a given error, Dykstra's projection algorithm [38] is introduced to ensure that the subsequent states and controls remain feasible. Its structure is shown in Figure 3.
Theorem 2. By applying Dykstra's projection algorithm, the optimal control input converges to the orthogonal projection of the network output onto the feasible polytope as the number of iterations tends to infinity.
Proof of Theorem 2. First, define the orthogonal projection of the network output onto the polytope, and let a series of auxiliary variables be generated from the DNN structure extended by Dykstra's projection algorithm, which then iterates cyclically over the constraint sets. Under the stated starting conditions, the iterates converge as the iteration count grows (i.e., the nominal control input converges to the orthogonal projection). □
Thus, given a state error, the control policy outputs the corresponding feasible nominal control input.
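For reference, a generic Python sketch of Dykstra's alternating projection scheme is given below. This is our illustration: the feasible region is simplified to a box intersected with a halfspace, both with closed-form projections, whereas the paper projects onto the polytopic constraint sets of the OCP.

```python
import numpy as np

def proj_box(u, lo, hi):
    """Orthogonal projection onto the box {u : lo <= u <= hi}."""
    return np.clip(u, lo, hi)

def proj_halfspace(u, a, b):
    """Orthogonal projection onto the halfspace {u : a·u <= b}."""
    viol = a @ u - b
    return u if viol <= 0 else u - viol * a / (a @ a)

def dykstra(v, projections, iters=200):
    """Dykstra's algorithm: converges to the projection of v onto the
    intersection of convex sets, unlike plain alternating projections."""
    x = v.copy()
    p = [np.zeros_like(v) for _ in projections]  # correction terms
    for _ in range(iters):
        for i, proj in enumerate(projections):
            y = proj(x + p[i])
            p[i] = x + p[i] - y
            x = y
    return x

# Project an infeasible network output onto box ∩ halfspace (placeholders).
v = np.array([2.0, 1.5])
sets = [lambda u: proj_box(u, -1.0, 1.0),
        lambda u: proj_halfspace(u, np.array([1.0, 1.0]), 1.0)]
print(dykstra(v, sets))  # feasible control input closest to v
```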
According to the policy gradient theory presented in [39], the gradient of the value function with respect to the policy parameters θ involves a multivariate Gaussian probability density function used to sample control inputs, centered at the DNN output with diagonal covariance Σ; the covariance Σ is annealed to 0 at the end of training to recover the deterministic control policy.
The neural network parameters are updated using stochastic gradient descent, the learning rate of the DNN is selected as a positive number, and the termination criterion for the iteration is defined accordingly. For the application of the proposed approach, instead of focusing on constructing a set of polytopic regions, function approximation and reinforcement learning techniques are used to directly learn an approximate optimal control policy. Furthermore, the policy gradient method guarantees that the control action converges to a locally optimal solution by applying function approximation to generate unbiased estimates of the gradient with respect to the parameter θ. The proposed optimization method significantly enhances the computational performance of the system control while ensuring the feasibility of the control inputs.
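The update rule can be sketched as a REINFORCE-style estimator with a Gaussian policy, as described above. In the following Python sketch (our illustration), the DNN is replaced by a linear policy, the dynamics and quadratic stage cost are placeholders, and a running baseline is subtracted to reduce the variance of the single-rollout gradient estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholders: a linear policy v = theta @ e stands in for the DNN output,
# and a double-integrator model stands in for the nominal system.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R = np.eye(2), np.array([[0.1]])
theta = rng.normal(0.0, 0.1, (1, 2))

def rollout(theta, sigma, horizon=10):
    """One rollout under the Gaussian policy N(theta @ e, sigma^2 I).
    Returns the score sum_k d(log pi)/d(theta) and the accumulated cost."""
    e = np.array([0.5, -0.2])
    score, cost = np.zeros_like(theta), 0.0
    for _ in range(horizon):
        mean = theta @ e
        v = mean + sigma * rng.normal(size=mean.shape)  # sampled input
        score += np.outer((v - mean) / sigma**2, e)     # d log pi / d theta
        cost += e @ Q @ e + v @ R @ v                   # quadratic stage cost
        e = A @ e + B @ v
    return score, cost

sigma, lr, baseline = 0.3, 1e-4, 0.0
for it in range(5000):
    score, cost = rollout(theta, sigma)
    baseline = 0.99 * baseline + 0.01 * cost            # variance reduction
    theta -= lr * score * (cost - baseline)             # stochastic gradient step
    sigma = max(0.05, sigma * 0.9995)                   # anneal exploration

print(theta)  # rough, high-variance estimate of a cost-reducing feedback gain
```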
3.3. The Feedback Mechanism of the Control Synthesis
In this paper, the feedback loop encompasses the state error, the state error variety rate, and the cost function, as illustrated in Figure 4. Specifically, the state error and the error variety rate are conveyed to the fuzzy controller, which then yields the error scaling vector associated with the constraints at the subsequent time step. Simultaneously, the state error contributes to the optimization process of the cost function. The resulting cost function value is then fed back into the auxiliary control law, thereby determining the auxiliary control rate for the upcoming sampling time.
The comparison between the computational performance of the proposed algorithm and HTMPC is shown in Table 2, in which the first group of quantities denotes the numbers of affine inequalities in the irreducible representations of the sets employed in the proposed scheme, and the remaining quantities are the numbers of affine inequalities in the irreducible representations of the state homothetic set and the terminal constraint set, respectively.
Table 2 clearly demonstrates that assigning the scaling vector to the fuzzy controller's specialized treatment not only provides a more comprehensive consideration of the impact of the state error and the error variety rate on the error tube scaling, but also effectively reduces the number of decision variables and inequality constraints in the optimization process. Furthermore, the design of a symmetric constrained DNN structure addresses the exponential growth in the number of polyhedra to be constructed as the number of constraints increases during optimization. Consequently, the proposed algorithm allows for a substantial reduction in computational complexity while enhancing the flexibility of the system control.
3.4. The DNN-Based RMPC with a Fuzzy-Based Tube Size Controller Structure
To recapitulate, the proposed RMPC scheme comprises a fuzzy-based tube size controller and a DNN-based nominal RMPC part. The fuzzy-based tube size controller is employed to adjust the error tube scaling. Meanwhile, the tightened sets (i.e., the minimal disturbance invariant set with an adjustable parameter) and the disturbance rejection gain are computed online to restrain the state error. Then, the DNN-based nominal RMPC is used to generate the time-varying nominal system such that the associated constraints are satisfied. This provides a theoretically rigorous and technically achievable framework for RMPC with online parameter estimation to improve computational performance.
In this paper, we obtain the error tube shape set E by computing the minimal robust positively invariant set using the method described in [34]. Moreover, we set the error bound to 3.7 and the bound on the error variation rate to 2.5. As the scaling parameter typically ranges between 1 and 2, we design the fuzzy rule table in the format of Table 3.
Table 3 shows that inputting a data pair of error and error variety rate determines a reasonable output value, which subsequently dictates the value of the scaling vector according to Equation (27). The determination of the scaling vector further influences the associated constraint sets. Additionally, the disturbance rejection gain and the terminal cost for control can be determined from Equations (29) and (30). To meet the specified constraints, the constrained time-varying nominal system trajectories are computed through the constrained DNN extended with Dykstra's projection algorithm. The actual system then tracks the nominal trajectory while satisfying the relevant conditions.
Section 4 will explicitly discuss the DNN parameter settings, which depend on the dimensionality of the input and output variables. Specifically, Algorithm 1 gives the main procedure of the proposed control scheme, and its overall structure diagram is presented in Figure 5.
Algorithm 1 DNN-based RMPC with a fuzzy-based tube size controller
Given the initial conditions and the weighting matrices, determine the set E.
Compute the disturbance rejection gain and the terminal weight matrix P by using (29) and (30).
1: Randomly initialize θ
2: Set learning rate
3: for each time instant k = 0, 1, 2, …, N do
4:  Compute the polytopic constraint sets
5:  if constraints (41)–(46) are satisfied then
6:   repeat calculate θ by using (51)
7:   until convergence
8:  else
9:   let
10:  end if
11:  Solve the optimization problem (39) and (40) to obtain the nominal control input
12:  Compute the error variety rate and the corresponding scaling vector, then obtain the successor scaling vector by using (27)
13:  Calculate the control input and implement it on the system
14: end for
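To make the data flow of Algorithm 1 concrete, the following Python skeleton (our illustration; the polytope computations, the OCP solve, and the fuzzy rule base are reduced to simple stubs, and all numerical values are placeholders) runs one closed loop: obtain the nominal input from the learned policy (step 11), update the tube scaling from the fuzzy controller (step 12), and apply u = v + Ke to the plant (step 13).

```python
import numpy as np

rng = np.random.default_rng(2)

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
K = np.array([[-0.6, -1.2]])        # placeholder disturbance rejection gain

def nominal_input(z):
    """Stub for the constrained-DNN policy (here: simple linear feedback)."""
    return K @ z

def fuzzy_alpha(e_norm, de_norm):
    """Stub for the fuzzy tube-size controller (outputs in [1, 2])."""
    return float(np.clip(1.0 + 0.3 * e_norm + 0.2 * de_norm, 1.0, 2.0))

x = np.array([2.0, 1.0])            # actual state
z = x.copy()                        # nominal state
prev_e = 0.0
for k in range(15):
    v = nominal_input(z)            # step 11: nominal control from the policy
    e = x - z                       # state error (4)
    u = v + K @ e                   # step 13: u_k = v_k + K e_k
    alpha = fuzzy_alpha(np.linalg.norm(e),
                        abs(np.linalg.norm(e) - prev_e))  # step 12
    prev_e = np.linalg.norm(e)
    w = rng.uniform(-0.05, 0.05, 2) # bounded disturbance
    x = A @ x + B @ u + w           # actual system (1)
    z = A @ z + B @ v               # nominal system (3)
    print(k, np.round(x, 3), round(alpha, 2))
```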
4. Simulations and Comparison Study
In this section, the advantages of Algorithm 1 are illustrated through simulation examples of both 2-D and 4-D systems. The simulation experiments were conducted in MATLAB, and the polyhedral constraint sets were constructed using the Mosek and MPT toolboxes. The convex optimization problem of the actual system was then solved, and deep learning toolboxes were employed to train the neural networks that determine the optimal control inputs of the nominal system.
Example 1. Consider a 2-D double integrator discrete-time system of the form (1), with the state constraints, disturbance bound, and control constraint specified accordingly. The performance index function is defined in (39)–(46); the terminal cost is the value function, while P is calculated from (30). Then, the disturbance rejection gain is computed by using (29), and the corresponding set is computed as a polytope. The horizon length is selected, the system is simulated from the given initial conditions, and the scaling value is induced by Equation (27). For the determination of the neural network architecture, Figure 6 compares the nominal state trajectories of the system when employing different network structures and deep neural networks with varying numbers of layers.
Indeed, from Figure 6, it is evident that when utilizing a symmetric DNN with six hidden layers, the nominal state trajectories of the system reach the desired values more rapidly (i.e., they reach the origin by the 12th sampling instant).
The state trajectories for the proposed RMPC scheme are shown in Figure 7. The solid line represents the state trajectory of the nominal system (3), while the dash-dotted line is the state trajectory of the actual system (1). The error tube is depicted by green polytopes, while the 0-step homothetic tube controllability set X_0 is represented by the dark gray area. Evidently, the local state at each instant is regulated within an error tube centered around the trajectory of the nominal state. As anticipated, the cross-section of the error tube diminishes as the nominal state converges towards the origin.
Then, in order to make the comparison between the control performance of Algorithm 1 and the HTMPC algorithm more apparent, N is set to 25. Figure 8 shows the state curves for Algorithm 1 and the HTMPC strategy, with the state constraint shown as the gray region. Under Algorithm 1, trajectories initiating from an initial condition significantly distant from the desired equilibrium point converge faster to the target state while maintaining a narrower range of fluctuation in the state error, all while satisfying the original constraints on the disturbances and the state. Figure 9 presents the control input curves generated by the two optimization methods; the gray region is the control constraint set. Obviously, the control action of the actual system (1) consistently satisfies the control constraint. Meanwhile, Algorithm 1 accelerates the convergence of the control input toward the desired equilibrium point with reduced overshoot.
To validate the efficacy of Algorithm 1 in reducing the optimization time, a statistical analysis of the optimization time was conducted. Furthermore, to investigate the trend of the optimization time, a slightly larger horizon (N = 50) was selected during the experimentation. As shown in Figure 10, the computational efficiency of Algorithm 1 is generally 2–3 orders of magnitude higher than that of HTMPC. In addition, as N increases, the computation time of HTMPC exhibits an exponential growth trend. In contrast, the computation time required by Algorithm 1 grows ever more slowly and eventually stabilizes within 0.16 ms. Specifically, Algorithm 1 saves an average of 339.54 times the optimization time relative to HTMPC; when N = 50, Algorithm 1 saves 726.23 times the optimization time compared with HTMPC.
Example 2. To further validate the proposed approach, consider a system of the form (1) with four state dimensions and two control input dimensions, with constraints given by the corresponding inequalities. The parameters are set to horizon N = 30 together with the weighting matrices. The system is simulated from the provided initial condition, and the other parameters are kept the same as in Example 1. Algorithm 1 is implemented on this system to test its control performance for large-scale systems. Furthermore, the final DNN structure is determined by comparing the Euclidean norms of the state errors generated when applying different deep neural network architectures, as illustrated in Figure 11. Specifically, the chosen DNN configuration comprises a symmetric deep neural network with eight hidden layers, each containing 14 neurons.
The Euclidean norm of the state error is employed to depict the trend of the state error. As indicated in Figure 11, when applying a symmetric neural network with eight hidden layers, the system's state error is generally smaller and converges to a neighborhood of zero more quickly.
Figure 12 depicts the state variable curves for each dimension. The figure demonstrates that the time-varying nominal system obtained by online learning results in a small error and a short adjustment time during the convergence of the nominal state. The translucent area in these figures represents the range of error fluctuations; evidently, Algorithm 1 generally yields a tighter bound on the state errors than HTMPC, indicating greater flexibility in scaling the state tube. Furthermore, Table 4 provides a comparison using specific data to demonstrate the error-constraining capability when evaluating the tracking performance of the actual system against the nominal system under Algorithm 1 and HTMPC. To mitigate the extreme influence of outliers, we opted for the mean squared error (MSE), known for its numerical stability, as the metric for assessing the tracking performance.
As illustrated in Table 4, utilizing Algorithm 1 to control the 4-D system leads to a smaller MSE between the nominal and actual states across all four dimensions. The average MSE over the four dimensions is reduced by 67.86% when Algorithm 1 is employed, compared with its counterpart HTMPC. Consequently, the implementation of Algorithm 1 ensures a closer approximation of the actual state to the nominal state with reduced error.
Moreover, the time-varying nominal system generated by Algorithm 1, as depicted in Figure 13, exhibits enhanced control input stabilization capabilities with a faster convergence rate and reduced overshoot.
From a computational perspective, Algorithm 1 exhibits even more pronounced advantages in computational efficiency for large-scale systems. As illustrated in Table 5, the proposed method significantly reduces the computation time to less than 6 ms when applied to the four-dimensional system, whereas the HTMPC approach requires a much longer computation time. On average, Algorithm 1 achieves optimization 7218.07 times faster than HTMPC.