1. Introduction
The problem of designing optimal feedback to stabilize dynamic systems can be regarded as one of the fundamental issues of control theory. In particular, for the class of linear time-invariant (LTI) systems, this problem is well understood. One of the essential control design approaches to stabilize these systems is based on the minimization of an energy-like quadratic performance index involving the state trajectory and input, which leads to the linear-quadratic regulator, commonly known as LQR. This approach has a long and rich history, dating back to the early work of Kalman [1], who discussed the problem of optimal feedback and provided the design equations for LQR.
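For reference, the standard infinite-horizon LQR problem for an LTI system can be stated as below. The symbols $A$, $B$, $Q$, $R$, $P$ and $K$ are generic textbook notation assumed here for illustration; they are not taken from the numbered equations of this paper.

```latex
\begin{aligned}
\dot{x} &= A x + B u, \qquad
J = \int_{0}^{\infty} \left( x^{\top} Q x + u^{\top} R u \right) \mathrm{d}t,
\quad Q \succeq 0,\ R \succ 0, \\
u &= -K x, \qquad K = R^{-1} B^{\top} P, \\
0 &= A^{\top} P + P A - P B R^{-1} B^{\top} P + Q .
\end{aligned}
```

Here $P$ is the stabilizing solution of the algebraic Riccati equation in the last line, which exists under the usual stabilizability and detectability assumptions.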
Feedback design becomes more challenging in the case of nonlinear systems, for which the search for an optimal controller can be a complex problem that cannot be solved by analytical methods. One important tool in control design is to resort to a linear approximation of the nonlinear dynamics around the equilibrium point, under the condition that the corresponding linear model is controllable in the sense of Kalman. A known limitation of this approach is the local nature of the solution, which imposes restrictions on the set of admissible initial conditions. In such circumstances, the LQR approach is capable of providing only suboptimal results. Despite that, LQR still constitutes an essential method for dealing with the control of nonlinear systems. In particular, it is able to guarantee local exponential convergence, which provides robustness to some classes of uncertainties [2]. For this reason, the method is of great importance in robotics, where the predominant group of mechanical systems exhibits strongly nonlinear properties. Recent publications discuss applications of the LQR method, often in the context of complex hybrid techniques for the control of manipulators with rigid or compliant joints [3,4,5]. In particular, LQR-based control solutions have been proposed for a class of underactuated systems [6,7,8,9]. In some cases, the LQR approach is used as a tuning method for strictly nonlinear controllers; see, for instance, [10]. Modifications of LQR for nonlinear systems have been considered, such as the state-dependent Riccati equation (SDRE) approach [11] and its various implementations [12,13,14]. In addition, techniques inspired by differential dynamic programming [15,16] lead to iterative LQR (iLQR) [17,18], in which the dynamics are successively linearized and the cost function is evaluated about a given nominal trajectory to find an LQR control policy. Based on a similar idea, the LQR-equivalent of Kalman smoothing has been reported in [18].
In this paper, we refer to nonlinear control theory and use the concept of feedback-equivalent control systems [19] in order to improve the performance of the LQR design by extending the set of feasible initial conditions that define the so-called basin of attraction. The considered methodology is to use a linear state feedback designed for the linear approximation of a feedback-equivalent control system, and to determine a stabilizer taking advantage of state and input maps. A key ingredient is the transformation of the original dynamics to a form that exhibits better characteristics for synthesizing a linear regulator. In a particular case, the equivalent dynamics can even be linear [20], which can considerably improve the controller performance in the subset of the state space in which linearization is possible. In this context, it may be important to estimate the area of convergence. Important tools here are Lyapunov analysis [21,22] and nonlinear numerical and analytical methods such as the sum of squares (SOS) [23,24], which allow for a less restrictive approximation.
From the point of view of control objectives, the design of the LQR should take into account the quality index determined for the original system, i.e., the state of this system and the energy expenditure associated with the actual input. Therefore, when the feedback design is based on the transformed system, the question of the LQR tuning criterion becomes important [25,26]. This issue is analyzed in this work. We present a method for determining the gains that ensure comparable dynamics of the closed-loop system in the vicinity of the set point regardless of the state and input representation adopted. In addition, we provide an equivalent quadratic form that describes the optimization criterion with respect to the transformed dynamics.
In this paper, the control methodology considered is used to design a Pendubot stabilizer. This system, along with the Acrobot and the inverted pendulum, can be considered a benchmark underactuated system in robotics [27,28,29]. We deal with the stabilization of the Pendubot at the upright position without taking into account the swing-up problem. Instead, the problem investigated here is focused on the design of a smooth state feedback in a neighborhood of the desired point, taking advantage of the concept of feedback-equivalent control systems. Although the linear approximation of the Pendubot at an equilibrium point is controllable, the system is not fully linearizable using state and input transformations. Furthermore, as discussed in detail in [30], there are also significant obstructions to the application of the partial linearization approach, and this method cannot be directly employed for stabilization due to the presence of singular points [31].
To investigate the effect of the state-space representation on the characteristics of the closed-loop system, the original Lagrange dynamics of the Pendubot are transformed into two alternative forms. These forms take advantage of the so-called quasi-velocities, which in analytical mechanics are understood as linear combinations of generalized velocities with coefficients that are functions of the generalized coordinates [32,33]. In the first form, the inertial normalized quasi-velocities (NQV) proposed by Jain and Rodriguez in [34,35], which come from the factorization of the inertia matrix, are used. In particular, the transformations proposed for the Pendubot to design a swing-up controller [36] are used in this paper for the stabilization task. It turns out that quasi-velocities, along with a nonlinear transformation of the coordinates, can be used to represent the Pendubot dynamics in the so-called normal form [37]. Such a form, considered among others in classification problems, highlights in an organized way the essential features of a nonlinear dynamic system [30].
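The idea of quasi-velocities obtained from a factorization of the inertia matrix can be illustrated with a minimal sketch. It assumes the common lumped-parameter form of a two-link inertia matrix with hypothetical parameter values `th1`, `th2`, `th3`, and it uses a plain Cholesky factor as a stand-in; the specific factorization of Jain and Rodriguez used in the paper may differ. The point shown is only that, with $M(q) = L L^{\top}$ and $\nu = L^{\top}\dot{q}$, the kinetic energy becomes $\tfrac{1}{2}\nu^{\top}\nu$.

```python
import math

def inertia_matrix(q2, th1=0.03, th2=0.01, th3=0.01):
    """Two-link inertia matrix in lumped-parameter form (parameter values are hypothetical)."""
    c2 = math.cos(q2)
    return [[th1 + th2 + 2.0 * th3 * c2, th2 + th3 * c2],
            [th2 + th3 * c2, th2]]

def cholesky_2x2(M):
    """Lower-triangular L with M = L L^T for a symmetric positive-definite 2x2 matrix."""
    l11 = math.sqrt(M[0][0])
    l21 = M[1][0] / l11
    l22 = math.sqrt(M[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def quasi_velocities(q2, dq):
    """nu = L(q)^T dq: a configuration-dependent linear combination of joint velocities."""
    L = cholesky_2x2(inertia_matrix(q2))
    return [L[0][0] * dq[0] + L[1][0] * dq[1], L[1][1] * dq[1]]

def kinetic_energy(q2, dq):
    """Kinetic energy 0.5 * dq^T M(q) dq in the original generalized velocities."""
    M = inertia_matrix(q2)
    return 0.5 * sum(dq[i] * M[i][j] * dq[j] for i in range(2) for j in range(2))

# Example: the energy is the same whether computed from dq or from nu.
q2, dq = 0.7, [1.2, -0.4]
nu = quasi_velocities(q2, dq)
energy_from_nu = 0.5 * (nu[0] ** 2 + nu[1] ** 2)
energy_from_dq = kinetic_energy(q2, dq)
```

In these variables the inertia matrix is normalized to the identity, which is the sense in which such quasi-velocities are called normalized.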
The new contributions of this paper include the following:
Comparison of Pendubot mathematical models using different representations, including the application of quasi-velocities;
Synthesis of suboptimal controllers according to the LQR strategy, ensuring equal dynamics at the equilibrium point;
Simulation comparison of the controllers and determination of the convergence area under constrained input conditions;
Experimental tests and results illustrating the properties of the controllers.
It is noteworthy that another purpose of the work is to adapt the control algorithm to the real system and to show reproducible results. To the authors’ knowledge, many publications on the stabilization of Acrobot- and Pendubot-type mechanical structures do not explicitly investigate models of real systems and primarily recall the basic models proposed in Spong’s works. Other works often do not contain sufficient information about the model and controller parameters, or are tested on a non-physical system. Our aim is to dispel these doubts through experimental verification and an analysis of the stability and the area of convergence of the algorithm. Therefore, the experiments have been carried out on a system that can be built from components of a commercially available system. For this reason, the results shown in the paper can be treated as a basis for future comparisons.
The paper is organized as follows. In Section 2, the nominal dynamics of the Pendubot are recalled. Section 3 deals with feedback-equivalent control systems and describes two equivalent models of the Pendubot taking into account the inertial normalized quasi-velocities (NQV) and transformation to the normal form (NF). In Section 4, the design of the controller based on the LQR approach and its stability issues are discussed. In Section 5, simulation and experimental results are presented. Section 6 discusses the results obtained, and Section 7 ends with general conclusions and plans for future research.
5. Results
In this section, a comparison of Pendubot stabilization controllers is considered. The research was conducted using both numerical simulation and experimental methods. In order to ensure that the results can be compared, the simulation model takes into account the properties of the laboratory system used in the experiments. Its parameters are summarised in Table 3. Furthermore, the torque input u was saturated according to the DC motor model. The saturation level corresponds to a maximal motor input voltage of 10 V.
The laboratory system shown in Figure 1b is built on Quanser’s rotary double inverted pendulum [42] and consists of the main unit (Rotary Servo Base Unit), which includes the motor, a gear with an anti-backlash (clearance-eliminating) mechanism, and an encoder coupled with the motor, together with the passive double pendulum module.
5.1. Simulations
Here three control design approaches are compared. The gains of the linear controller (
34) are designed according to the LQR procedure applied to the approximated model of the Pendubot
(cf.
Table 2) and taking advantage of the performance index (
32) parameterized by the following weight matrices:
,
and
. The gains of the two nonlinear controllers described by (
57) are established using (
58) along with (
59) collected in
Table 1.
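How gains designed in one representation can be carried over to another while keeping the same local closed-loop dynamics can be sketched for the simplest case. The example assumes a purely linear change of state coordinates $z = T x$ (with $T$ standing for the Jacobian of the state map at the equilibrium) and an unchanged input. The paper's own formulas also involve an input map, so this is an illustration of the principle and not a reproduction of those formulas. All matrices below are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(T):
    """Inverse of a 2x2 matrix."""
    d = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return [[T[1][1] / d, -T[0][1] / d], [-T[1][0] / d, T[0][0] / d]]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace2(M):
    return M[0][0] + M[1][1]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Toy linearization in the original coordinates x (all numbers are hypothetical).
A = [[0.0, 1.0], [2.0, 0.0]]
B = [[0.0], [1.0]]
K = [[6.0, 4.0]]                # some stabilizing gain for (A, B)
Q = [[1.0, 0.0], [0.0, 0.1]]    # state weight of the original index

# Assumed Jacobian of the state map z = T x at the equilibrium.
T = [[1.0, 0.0], [0.5, 2.0]]
T_inv = inv2(T)

A_z = matmul(matmul(T, A), T_inv)                  # dynamics matrix in z
B_z = matmul(T, B)                                 # input matrix in z
K_z = matmul(K, T_inv)                             # gain producing the same input signal
Q_z = matmul(matmul(transpose(T_inv), Q), T_inv)   # weight producing the same state cost

Acl_x = sub(A, matmul(B, K))        # closed loop in x
Acl_z = sub(A_z, matmul(B_z, K_z))  # closed loop in z, similar to Acl_x
```

Because `Acl_z` equals `T * Acl_x * T^{-1}`, both closed loops share the same eigenvalues, and `x^T Q x` equals `z^T Q_z z` for corresponding states, so the quadratic criterion is preserved as well.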
In the first attempt, the convergence sets were established using the Lyapunov analysis considered in the proof of the local exponential stability in
Section 4.1. The Lyapunov function (
38) was chosen for each closed-loop system taking into account the original and transformed errors defined by
and
, respectively. The quadratic bounds of the residual terms
and
were then numerically approximated in the assumed vicinity of the desired point. In this way, the constant
C in (
44) can be determined and the upper bound
can be computed. The set of feasible initial conditions can be found by searching for such states for which
. The sets obtained in the three cases are roughly illustrated in
Figure 3. Since the state space is 4-dimensional, the set cannot be visualized on a 2D figure. Therefore, two velocity components
and
are replaced by
presented on the z-axis.
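The estimation procedure described above follows a standard pattern, which can be sketched as follows. The notation is assumed for illustration only ($e$ for the error, $A_c$ for the closed-loop matrix of the linear approximation, $r(e)$ for the residual term, $P$ and $Q_L$ for a Lyapunov pair) and does not reproduce the numbered equations referred to in the text.

```latex
\begin{aligned}
\dot{e} &= A_c e + r(e), \qquad \|r(e)\| \le C \|e\|^{2} \ \text{in the considered neighborhood}, \\
V(e) &= e^{\top} P e, \qquad A_c^{\top} P + P A_c = -Q_L, \quad P \succ 0,\ Q_L \succ 0, \\
\dot{V}(e) &= -e^{\top} Q_L e + 2 e^{\top} P\, r(e)
 \le -\bigl( \lambda_{\min}(Q_L) - 2 C \|P\| \|e\| \bigr) \|e\|^{2}, \\
\dot{V}(e) &< 0 \quad \text{for} \quad 0 < \|e\| < \rho := \frac{\lambda_{\min}(Q_L)}{2 C \|P\|}, \\
\Omega &= \bigl\{ e : V(e) < \lambda_{\min}(P)\, \rho^{2} \bigr\} \subset \{ e : \|e\| < \rho \}.
\end{aligned}
```

Under these assumptions the sublevel set $\Omega$ is a conservative estimate of the basin of attraction, provided it lies inside the neighborhood in which the bound on $r(e)$ was computed. A larger constant $C$ directly shrinks the estimate.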
Although the obtained results confirm the local stability of the closed-loop systems, the main task of the research is to compare the basins of attraction of the controllers under more realistic conditions. To achieve such a comparison, the trajectories of the closed-loop systems were evaluated. Such an analysis requires many simulation trials; thus, an efficient implementation of the simulation models is important. For this reason, the simulations were implemented in C++, using libraries for solving non-stiff differential equations.
For each controller, a discrete set of initial configurations is defined in the form of a two-dimensional grid. Each cell of the grid corresponds to an initial condition represented by
and zero velocity
. If, for the given condition, the state trajectory converges to the desired point, this trial is considered a positive result and the initial configuration considered is added to the set representing the basin of attraction. The results obtained are presented in
Figure 4. Darker cells present an approximation of the basin of attraction, whereas white cells indicate that, for the corresponding initial condition, the control goal has not been accomplished properly.
Table 4 presents a comparison of the algorithms considered. The Area [%] index specifies the percentage of positive results related to the entire searched grid.
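The grid-based procedure and the Area [%] index can be sketched with a deliberately simplified stand-in plant. The example below uses a normalized single inverted pendulum with a saturated linear feedback; the gains `K1`, `K2`, the limit `U_MAX`, the horizon and the tolerance are all hypothetical. Unlike the paper's grid over the two joint angles with zero initial velocity, the toy grid spans the initial angle and the initial velocity, because the stand-in plant has a single joint. The paper's simulations were written in C++; Python is used here only for brevity.

```python
import math

U_MAX = 0.5          # hypothetical input limit (normalized torque)
K1, K2 = 4.0, 3.0    # hypothetical stabilizing gains, not the paper's LQR gains

def control(theta, omega):
    """Linear state feedback followed by input saturation."""
    u = -(K1 * theta + K2 * omega)
    return max(-U_MAX, min(U_MAX, u))

def dynamics(theta, omega):
    """Normalized inverted pendulum; theta = 0 is the upright equilibrium."""
    return omega, math.sin(theta) + control(theta, omega)

def rk4_step(theta, omega, dt):
    """One classical Runge-Kutta step of the closed-loop system."""
    k1 = dynamics(theta, omega)
    k2 = dynamics(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = dynamics(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = dynamics(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
    omega += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return theta, omega

def converges(theta0, omega0, t_end=10.0, dt=0.01, tol=1e-2):
    """True if the trajectory ends near the desired point without falling over."""
    theta, omega = theta0, omega0
    for _ in range(int(round(t_end / dt))):
        theta, omega = rk4_step(theta, omega, dt)
        if abs(theta) > math.pi:      # treated as a failed trial
            return False
    return math.hypot(theta, omega) < tol

def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def basin_grid(theta_vals, omega_vals):
    """One boolean per grid cell: did the trial from that initial condition succeed?"""
    return [[converges(th, om) for th in theta_vals] for om in omega_vals]

def area_percent(grid):
    """Percentage of successful cells relative to the whole searched grid."""
    cells = [c for row in grid for c in row]
    return 100.0 * sum(cells) / len(cells)

grid = basin_grid(linspace(-1.0, 1.0, 11), linspace(-1.0, 1.0, 11))
area = area_percent(grid)
```

The result depends on the grid extent and resolution, so such an index is meaningful only when the controllers are compared on the same grid, as is done in the comparison table.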
As part of the extended analysis, the waveforms of the robot configuration and the control input obtained during the simulations for a particular choice of initial configuration are presented along with the experimental results in Section 5.2. To facilitate a comparison between the simulation and the experimental results, the control input is represented by the motor voltage signal instead of the torque u. To further quantify the performance of each algorithm for the chosen initial condition, the index (32) is computed and presented in Table 5.
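Evaluating a quadratic index along a recorded trajectory can be sketched as follows. The form of the index is an assumption made for illustration, namely $J = \int (x^{\top} Q x + r u^{2})\,\mathrm{d}t$ with a scalar input; the exact definition used in the paper is the one given by its own performance index. The integral is approximated by the trapezoidal rule over the available samples.

```python
def quadratic_cost(times, states, inputs, Q, r):
    """Trapezoidal approximation of the integral of x^T Q x + r u^2 over sampled data."""
    def stage(x, u):
        n = len(x)
        return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n)) + r * u * u

    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        total += 0.5 * dt * (stage(states[k - 1], inputs[k - 1]) + stage(states[k], inputs[k]))
    return total

# Example with a constant state and input: the stage cost is 1 + 0.5 * 4 = 3 over 1 s.
J_example = quadratic_cost(
    times=[0.0, 0.5, 1.0],
    states=[[1.0, 0.0]] * 3,
    inputs=[2.0] * 3,
    Q=[[1.0, 0.0], [0.0, 1.0]],
    r=0.5,
)
```

Since the simulated and experimental trajectories are finite, the value obtained this way is a finite-horizon approximation of the infinite-horizon criterion.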
5.2. Experiment
For the hardware implementation, a dedicated LabVIEW environment and a driver with an amplifier provided by Quanser are used together with a PC, whose task is supervision, monitoring, and measurement recording.
Taking into account the basin of attraction determined for each simulated algorithm, presented in Figure 4, an experimental verification of these algorithms was carried out for a particular selection of initial conditions that belong to the obtained sets. Two different initial postures of the Pendubot were selected. In Scenario 1, the Pendubot is initially tilted to the right, while in Scenario 2 it is tilted to the left. The desired point has been chosen according to (65). For each of the cases considered, a table is provided that contains the simulation and experiment conditions, as well as a schematic visualization of the initial position of the robot. Additionally, the table contains both the controller parameters used during the simulation and the corresponding settings used in the experiment. To make the presentation clearer, the following cases are distinguished:
Case A: the linear controller designed for the nominal dynamics
is used, see
Table 6 and
Table 7;
Case B: the nonlinear controller designed based on transformed dynamics
is used, see
Table 8 and
Table 9;
Case C: the nonlinear controller designed based on transformed dynamics
is used, see
Table 10 and
Table 11.
6. Discussion
The simulation results obtained show that the LQR design, taking advantage of equivalent systems, can improve the attraction basin in the task of stabilizing the Pendubot. Based on the conservative estimation of attraction sets by the Lyapunov method, the attraction basin for the classical version of LQR is the most limited. This is due to the dependence of the input matrix on the state, which introduces additional nonlinearity, shown in (
40). In contrast, for the transformed systems considered
, the input matrix is constant, leading to an increase in the attraction basin. It is worth noting that the largest volume of attraction set is obtained for system
, which is defined in terms of inertial quasi-velocities.
The results of the convergence analysis, based on extensive simulations, see
Figure 4, indicate that the considered Lyapunov-based method is too restrictive. Comparing the areas occupied by the cells corresponding to feasible initial configurations presented in
Table 4 one can conclude that algorithms based on Formula (
57) make it possible to increase the area of acceptable initial configurations by a factor of 2.5 to 4.5 compared to the nominal case. It is interesting that the basin of attraction is the largest for the system
while the Lyapunov-based analysis suggests better characteristics with respect to system
.
The results of the simulations performed for the same initial conditions and presented in Figures 5a, 6a, 7a, 8a, 9a, 10a, 11a, 12a, 13a, 14a, 15a and 16a confirm that the step responses of the closed-loop systems are similar. However, comparing Figures 5a and 7a with Figures 9a, 11a, 13a and 15a more thoroughly, one can state that the transient response obtained for the nominal feedback more clearly exhibits the characteristics inherent in nonlinear systems, while for the nonlinear controllers the time plots are smoother and more characteristic of linear systems. A similar conclusion can be drawn from the analysis of the control inputs presented in Figures 6a, 8a, 10a, 12a, 14a and 16a. The similar values of the performance index shown in Table 5 also confirm that the dynamics of the closed-loop system are preserved.
Based on the outcomes presented in Section 5.2, it can be seen that the simulation results and their counterparts obtained in experiments (cf. Figures 5a,b, 6a,b, 7a,b, 8a,b, 9a,b, 10a,b, 11a,b, 12a,b, 13a,b, 14a,b, 15a,b and 16a,b) are fairly similar. Thus, one can cautiously conclude that the mathematical model describes the real system quite closely; however, some uncertainties can affect the results of the experiment. The similarity manifests itself in the signal amplitudes, but time parameters such as the regulation time are in most cases longer in the experiments. The differences can be explained by effects omitted in the dynamic model of the plant, which include, e.g., static and dynamic friction (occurring in the drive system as well as due to aerodynamic phenomena), backlash and spring effects, and measurement uncertainties. It was observed that the resolution of the encoders was not sufficient to obtain high-quality velocity estimates. In particular, the velocity of the second joint cannot be accurately estimated. Taking into account these limitations, adjustments to the gains were required to ensure proper operation of the real system. In particular, in the experiments it was necessary to decrease the gains associated with the velocity components because of the insufficient quality of the velocity estimation. However, decreasing these gains makes the response of the closed-loop system more oscillatory, which can be clearly observed in Figure 5, Figure 7 and Figure 13.
In the simulation and experimental tests under consideration, attention should also be paid to the problem of input saturation, which has a significant impact on the attraction set. Despite this, it can be seen that even under input signal constraints, the alternative representation of the dynamics enlarges the convergence set.
There is another issue worth highlighting. Namely, the application of a variable transformation may also have the negative effect of increasing the sensitivity of the control system to measurement noise. This may explain why the noise in the input signal u is higher for the cases including transformations (cf. Figures 10b, 12b, 14b and 16b) than for the nominal case (cf. Figures 6b and 8b).
7. Conclusions
This paper considers the issue of LQR design for nonlinear systems using a smooth state and input transformation. The proposed design methodology is applied to the Pendubot stabilization task. The properties of the controllers studied were investigated both in a simulation environment and in experimental tests. Despite some limitations and technical imperfections of the experimental stand, one can conclude that the considered methods are to some extent robust to unmodelled effects and make it possible to obtain satisfactory results in real applications.
The results of the tests carried out support the hypothesis that the controller using quasi-velocities makes it possible to increase the range of stabilizer convergence while maintaining the same dynamics of the closed-loop system at the desired point. This property results from the introduction of nonlinearities in the stabilizer equation, which have a positive effect on the properties of the closed-loop system.
To the best of the authors’ knowledge, this work compares for the first time the properties of LQR controllers using different representations of Pendubot dynamics. The detailed forms of transformations and linear approximations given can be regarded as ready-made procedures that can be applied to stabilize similar mechanical systems in robotics.
In the future, the control methodology discussed in this paper can be applied for trajectory tracking, making it possible to also consider swing-up control problems addressed for a class of underactuated systems.