1. Introduction
In optimal control theory, one defines optimal control problems to determine control signals that minimize a performance index as well as satisfy the physical constraints [
1]. Whereas it is difficult to find explicit expressions for the optimal control of nonlinear systems, linear systems can provide explicit equations for the optimal control input. For the advantage of using explicit equations, many works linearize nonlinear systems at the equilibrium point. One of the widely used state feedback control problems is defined for a linear system, where a quadratic performance index is defined using the state, control, terminal state, and associated unspecified weight matrices, called the linear quadratic regulator (LQR) problem. In general, the weight matrices are adjusted by users to find the optimal control and corresponding states that satisfy the requirement or desired behavior set by users. For instance, if it is important that the intermediate state is to be small, a large state weight matrix can be selected. The other weight matrices, such as the terminal state and control weight matrices concerning the terminal state and control effort, are selected in a similar manner. Although a key of solving the LQR problem is to determine the weight matrices properly, most studies have determined these matrices by trial-and-error, which is time-consuming work. For instance, considering a second-order system with one degree-of-freedom (DOF) requires determining seven symmetric components of the weight matrices (i.e., three for the state weight
, one for the control weight
, and three for the terminal state weight
). However, if one considers a two-DOFs system, the number of the symmetric components in the weight matrices increases to 23 (i.e., 10 for
, three for
, and 10 for
). That is, the number of components to be determined in the weight matrices exponentially increases if the size of the system increases. Moreover, determining the weight matrices generally depends on the engineers’ knowledge, and this approach does not always guarantee that the system response will meet the specified performance. For this reason, it is critical to design a proper optimization strategy for determining the weight matrices while satisfying the user-specified performance and reducing the engineers’ effort to find the proper weight matrices, especially for high DOF applications.
Many researchers have proposed weight matrices selection approaches to reduce the effort for determining weight matrices using bio-inspired heuristic algorithms. Among a number of bio-inspired optimization algorithms, some researchers employed a genetic algorithm (GA) to optimize the weight matrices for various system models and fitness functions. Marada et al. [
2] formulated the LQR problem based on the linearized inverted pendulum system and found the state and control weight matrices using the GA. Their approach minimizes a fitness function that consists of a weighted summation of the settling time and the control input considering the location of system closed-loop poles for system stability. Dhiman et al. [
3] also utilized the same system model and optimization algorithm, but the considered fitness function was the quadratic performance index. The GA-based weight optimization approach was also utilized for the vehicle suspension system by Yu et al. [
4] and linearized spacecraft attitude dynamics by Kukreti et al. [
5]. For the vehicle suspension system, the fitness function was composed of suspension performance indices, such as vertical acceleration of the car, dynamic tire load, and suspension working space. Similarly, Kukreti et al. considered the control performance for the fitness function, which is defined as the weighted summation of the final time and attitude and angular velocity errors. In addition to the GA, to optimize the weight matrices of the LQR problem, scholars applied many different optimization algorithms (e.g., a particle swarm optimization [
6], its variations [
7,
8], ant-lion algorithm [
9], ant colony optimization [
10], Jaya’s algorithm [
11], and bat algorithm [
12], etc.) into diverse dynamic systems, and more studies including the mentioned research are summarized in
Table 1. The aforementioned approaches commonly considered the diagonal components of the state and control weight matrices, as well as the algebraic Riccati matrix equation. Also, these approaches, based on bio-inspired optimization algorithms, take offline optimization which requires a huge computational burden.
Unlike weight selection using bio-inspired heuristic algorithms, some scholars proposed analytical approaches to determine the weight matrices. Elumalai and Raaja [
13] proposed a time-domain-based algebraic Riccati and Lagrange optimization technique to determine the weight matrices, where they satisfy a pre-defined performance for a two-dimensional torsion system. Sarkar and Dewan [
14] presented a pole-placement approach that satisfies user-defined time domain specifications for generating the state-feedback gain for an inverted pendulum system. Yang [
15] presented a pole assignment design of a quaternion-based spacecraft attitude control problem for generating the weight matrices by considering a balance between the performance and fuel consumption.
Table 1.
Relevant research papers.
Table 1.
Relevant research papers.
Refs. | System Model | Approach | Consideration |
---|
[2] | Inverted pendulum | Genetic algorithm | - Weighted summation of settling time and control input |
[3] | Inverted pendulum | Genetic algorithm | - Performance index of the LQR problem |
[4] | Active suspension system | Genetic algorithm | - Vertical acceleration of the car, dynamic tire load, and suspension working space |
[5] | Spacecraft attitude dynamics | Genetic algorithm | - Weighted summation of the final time, attitude, and angular velocity error |
[6] | Inverted double pendulum | Particle swarm optimization | - Performance index of the LQR problem |
[7] | Inverted pendulum | Adaptive particle swarm optimization | - Integrated state error |
[8] | Semi-active suspension system | Improved particle swarm optimization | - Vertical acceleration, suspension deflection and tire dynamic load of the passive suspension vehicle |
[9] | Inverted double pendulum | Ant-lion algorithm | - Crane position, upswing and downswing angle |
[10] | Vehicle suspension system | Ant colony optimization | - Suspension travel, suspension velocity, tire deflection, and tire velocity |
[11] | Deregulated power system | Jaya’s algorithm | - State error |
[12] | Vehicle suspension system | Bat algorithm | - Ride comfort and passenger safety |
[16] | Vehicle suspension system | Adaptive predator-prey optimization | - Integral square error of the state |
[17] | Two-axis CNC system | Artificial bee colony optimization | - Settling time and overshoot |
[18] | Inverted double pendulum | Neuro Evolution of Augmenting Topologies | - State error |
[13] | Two-dimensional torsion system | Lagrange optimization | - System response (overshoot, setting time, steady state error) |
[14] | Inverted pendulum | Pole-placement approach | - Time-domain specifications |
[15] | spacecraft attitude dynamics | Pole assignment design | - Control performance and fuel consumption |
Most of the existing studies only find diagonal components for the state and control weight matrices using bio-inspired heuristic optimization algorithms along with the infinite-time algebraic Riccati matrix equation, and the limited number of studies deals with analytical methods for finding diagonal components of the state and control weight matrices as shown in
Table 1. Considering only diagonal components for the weight matrices may not provide better control performance if the states of the considered system are coupled with each other [
19]. On the other hand, this work optimizes all symmetric components of the weight matrices in the LQR problem to minimize the state and control values at the final time by utilizing analytic gradients. Also, the optimization process only contains algebraic equations for the principal equations and the partial derivatives, which are developed by employing the steady-state and time-varying terms for the time-varying differential Riccati matrix equation [
20] and utilizing the coordinate transformation for the states [
21]. The main contributions of this work include the following:
We design a gradient-based optimization strategy for determining diagonal and off-diagonal components of the state, control, and terminal state weight matrices; to provide more flexibility for optimization.
We find only algebraic equations-contained closed-form solutions for the principal equations and the sensitivity partial derivatives, including the time-varying Riccati matrix equation, which require less computational cost for the optimization.
As a result, no numerical integration is required for solving the finite-time fixed-time optimal feedback control as well as optimizing the weight matrices. In simulation studies, two most widely used numerical examples are presented to demonstrate the effectiveness of the proposed approach, for second-order linear differential systems with one and two DOFs.
2. Formulation of Linear Quadratic Regulator Problem
The optimal state feedback control problem for a linear time-invariant and continuous-time model with a finite and fixed-time, called the LQR problem, is to find the control inputs that minimize the following performance index [
22]:
subject to
with the initial condition
and terminal condition
. The optimal control is obtained by [
22]:
where the time-varying Riccati matrix is found by integrating the following backward in time:
This work assumes that
is controllable and
is observable. Note that
with
is defined as
and can be obtained by Cholesky decomposition. In addition, the proof of the stability of the given closed-loop control system is discussed in
Appendix A. To find the optimal control
, indeed, it is required to properly select the weight matrices (
Q,
R, and
), which are the user-defined parameters in general. Then, one can compute the time-varying Riccati matrix, state, and control input through numerical integration. If the user-defined requirement, such as the norm of the states less than a certain value at the final time, is satisfied, one can obtain the state with the optimal control input as shown in
Figure 1. In general, the weight matrices are determined by the engineers’ knowledge and trial-and-error. The engineers need to change the weight matrices iteratively until the requirement is satisfied. However, it is not only time-consuming work but also hard to guarantee that the system response satisfies the specified performance.
3. Optimization of Weight Matrices
To efficiently determine the weight matrices of the LQR problem, this work proposes an optimization process for the weight matrices. Unlike the aforementioned studies in the introduction, this work considers all symmetric components of the weight matrices (
Q,
R, and
) in the formulation to gain more flexibility in optimization. Here, the unknown parameters for the optimization process are the symmetric elements of
Q,
R, and
are as follows:
For the simple notation, the symmetric components of the weight matrices are gathered into one vector as
For optimization, this work introduces a new variable consisting of the state and control, called an augmented state, as follows:
The goal of the proposed optimization process is to find the optimal weight matrices that minimize the performance index while
at the final time is to be zero. Thus, for Equation (
9), one applies Taylor expansion at
as a function of the weight matrices’ symmetric elements defined in Equations (
5)–(7), which leads to
It is important to note that
indicates the partial derivative of an arbitrary variable
with respect to an arbitrary variable
. That is, “,” between the variable and the subscript indicates the partial derivative. Targeting
and collecting the partial derivatives in Equation (
10) into a global Jacobian matrix
J leads to
where
Here, each element in the correction vector
represents changes in each symmetric element of the weight matrices. Since the number of unknown parameters (
) is larger than the number of final conditions (
), one can obtain the solution for
by minimizing the following:
The necessary conditions for the optimization are derived as
Then, manipulating two necessary conditions, the solution for
is given by
and the minimum norm optimization solution for the parameter correction vector is obtained as
Therefore, the symmetric components of the weight matrices are updated as
Figure 2 displays the procedure of the proposed optimization process.
Once initial weight matrices are assumed, one needs to confirm whether
is equal to
or not. This condition comes from deriving the closed-form solution for the time-varying Riccati matrix that will be explained in
Section 4.1.1. If
at the first step, users need to redefine the initial
. However, if
in a second or higher iteration,
obtained is slightly changed by replacing it with
. Note that
is a user-defined small constant value, which is considered not only to avoid the equality condition but also to use the values as close to the updated values as possible. After that, one requires to verify whether the weight matrices satisfy the definiteness condition or not. That is,
Q and
must be positive semi-definite, and
R must be positive definite. In particular, after finding
D matrix via Cholesky decomposition, the observability for
pair is evaluated. After confirming the definiteness condition for each updated weight matrix, one finds
,
, and
, and evaluates
using the confirmed weight matrices only. Note that it uses previous weight matrices if the updated weight matrices violate the definiteness conditions. For instance, if all weight matrices violate the definiteness condition, one should try different initial weight matrices. However, for example, if only the updated
Q violates the definiteness condition, it uses
Q obtained from the previous step and the updated
R and
for the next step. Next, it evaluates the two-stage stopping conditions. First, it terminates the process when the current iteration number exceeds the maximum iteration number, which is defined by users, and then repeats the entire process with the newly assumed initial weight matrices. This means that the initial guesses for the weight matrices used are not properly selected, and it requires starting from new initial guesses to find the optimal weight matrices. Otherwise, it evaluates the second stopping condition. If the resulting solution meets the requirement defined by users (
), it terminates the process and endorses the weight matrices as optimal. If not, it updates the weight matrices. To proceed with the update process, one evaluates the Jacobian
J. After computing
and updating
using
, one converts
into the corresponding updated weight matrix. This procedure continues until the resulting solution satisfies the requirement.
The proposed optimization process requires finding the state and control using Equations (
2)–(
4) to evaluate the augmented state at the final time and computing the partial derivatives of Equations (
2)–(
4) with respect to each symmetric component of the weight matrices to determine the Jacobian matrix in Equation (
12). Then, the partial derivatives for the state, control, and time-varying Riccati matrix with respect to each symmetric element of the weight matrices
(for
) are derived as
where
and
, because
and
are constants. So, to evaluate the augmented state, it first numerically integrates the differential Riccati matrix equation in Equation (
4) backward in time and then integrates the controlled system response computed by substituting the optimal control defined in Equation (
3) into Equation (
2) forward in time. In the same manner, to compute the Jacobian matrix, it integrates the partial derivatives in Equations (
20) and (22) forward in time. In fact, this process is technically straightforward. However, it requires the exponential computational load as the dimension of the state increases because of multiple numerical integrations. Hence, this work develops closed-form algebraic expressions for the principal equations and partial derivatives. The use of closed-form solutions not only eliminates the need for introducing numerical integration methods but also allows computing analytic sensitivity partial derivatives. Otherwise, it would require extensive numerical integration calculations.