1. Introduction
As an efficient way to access space, flight vehicles have attracted considerable attention owing to their high civilian and military value [1,2]. The rapid development of various flight vehicles provides efficient and convenient tools for missions such as remote attacks, autonomous detection, and material transportation. Owing to their potential applications in industry, agriculture, the military, and other fields, extensive research results on flight vehicles have emerged. Stability analysis, control strategy design, and formation control of flight vehicles have been studied by scholars at home and abroad [3,4]. For instance, event-triggered tracking control was proposed in [5], an observer-based backstepping control strategy was presented in [6], and an extended state observer was proposed in [7]. However, high nonlinearity, complex dynamics, strong couplings, and model uncertainties seriously degrade the performance of flight vehicles and impose higher requirements on reliability and control performance.
Nonlinear systems have attracted much attention because most practical systems are nonlinear in nature [8,9]. The most popular approaches to analyzing nonlinear systems fall into two categories: nonlinear design methods and linear design methods [10]. Nonlinear design methods, such as backstepping and adaptive control, take the nonlinear properties of the system into account directly. Linear design methods approximate the nonlinear system by linear systems around its operating points based on Jacobian linearization. The controller design and analysis for a complex nonlinear system can thus be converted into designing controllers for linear systems, using, for example, gain scheduling, proportional-integral-derivative (PID) control, or robust control. Linear design methods can provide solvable results and stability guarantees for complex nonlinear systems. As an effective tool for the design of complex nonlinear systems, switched systems theory has been widely investigated, and there is a large body of literature on modeling, stability analysis, switching strategy design, fault detection, and fault tolerance [11,12]. Switched systems establish the connection between complicated nonlinear systems and simplified linear systems [13] and have therefore attracted a great deal of attention.
In recent literature, many results have been presented for various problems of switched systems. Stability analysis, as a fundamental problem in controller design, has been extensively studied, and several methods have been proposed. For example, the common Lyapunov function (CLF) method was presented for switched systems with arbitrary switching: a single Lyapunov function is shared by all the subsystems, so switching among subsystems cannot increase the energy. However, in most practical situations, finding a CLF for all the subsystems is difficult, which limits the applicability of the CLF method. Moreover, in many practical applications the switching logic depends on time, on the states, or on a combination of both, which has motivated the investigation of the average dwell time (ADT) and mode-dependent average dwell time (MDADT) methods. These methods mainly apply to restricted switching and require the system to dwell in each subsystem long enough, on average, to offset any increase of the Lyapunov function at the switching instants. Compared with the CLF approach, the ADT and MDADT methods lead to less conservative results. In [14], the quantized filtering problem was investigated for switched Takagi-Sugeno (T-S) fuzzy systems, and the ADT method was utilized to guarantee the exponential stability of the error system with a given performance. Quantization and parameter perturbations were considered, and a fuzzy-based Lyapunov function approach was provided. In [1], the stability of highly nonlinear switched stochastic systems with time-varying delay was analyzed. Lyapunov functions and the ADT method were employed to derive sufficient conditions that ensure stability and avoid the inappropriate response induced by the time delay. The proposed method extends the stability results to settings with time delay, which is closer to practice. However, the ADT method imposes a common dwell time on all the subsystems, which corresponds to the worst case and results in conservativeness. The MDADT approach was presented in [15] to derive tighter bounds on the dwell time. It accounts for the features of every subsystem, each of which has its own dwell time. The dwell time thus depends on the system mode, which relaxes the restrictions of the ADT method [16]. Therefore, the MDADT method has been widely applied to stability analysis and stabilization. In [17], a fault estimation observer was proposed, and the MDADT method was used to guarantee the stability and performance of the augmented system. Compared with the conventional ADT approach, less conservative results were obtained. In [18], the stability and robust control of switched systems were studied. Multiple discontinuous Lyapunov functions and MDADT were integrated to guarantee stability and a prescribed weighted performance, and the proposed method was shown to yield smaller bounds on the dwell time. In [19], the event-triggered exponential filter design was investigated, and the MDADT approach was adopted to derive exponential stability conditions. In that study, an event-triggered communication scheme was applied to save limited network resources. The time-varying delay and bounded disturbance were considered, and the linear matrix inequality (LMI) technique was employed to derive sufficient conditions for the desired performance. Compared with conventional results, the presented approach is more applicable and less conservative. Obtaining less conservative results and tighter dwell-time bounds is one of the primary goals in the study of switched systems and has motivated researchers in recent years.
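The MDADT condition discussed above can be illustrated with a short sketch. It assumes the commonly used definition, in which the number of activations N_p of mode p over an interval is bounded by N_0p + T_p / tau_p, where T_p is the total running time of mode p. The function names, the switching sequence, and the dwell-time values are hypothetical, and the check is performed only over the whole horizon rather than over every sub-interval.

```python
from collections import Counter


def mode_statistics(sigma):
    """Count activations and running time (in samples) of each mode."""
    activations, run_time = Counter(), Counter()
    previous = None
    for mode in sigma:
        run_time[mode] += 1
        if mode != previous:  # a new activation of `mode` starts here
            activations[mode] += 1
        previous = mode
    return dict(activations), dict(run_time)


def satisfies_mdadt(sigma, tau, chatter_bound=1.0):
    """Check N_p <= N_0p + T_p / tau_p for every mode p over the whole horizon.

    `chatter_bound` plays the role of N_0p and is assumed equal for all modes.
    """
    activations, run_time = mode_statistics(sigma)
    return all(
        activations[p] <= chatter_bound + run_time[p] / tau[p]
        for p in activations
    )


# Hypothetical switching sequence in which modes 1 and 2 alternate.
sigma = [1] * 4 + [2] * 6 + [1] * 5 + [2] * 7
print(mode_statistics(sigma))
print(satisfies_mdadt(sigma, tau={1: 4.0, 2: 6.0}))
print(satisfies_mdadt(sigma, tau={1: 10.0, 2: 6.0}))
```

Because each mode carries its own tau_p, a slowly decaying mode can be given a long dwell time without forcing the same requirement on the other modes, which is the source of the reduced conservativeness compared with a single ADT.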
In most practical systems, a time-varying delay is inevitable due to the transmission limitations of the flight vehicle network, which degrades performance and causes asynchronous switching. Asynchronous switching means that the controller switching lags behind the system mode switching. Every subsystem therefore has matched and unmatched periods, and the Lyapunov energy function increases in the unmatched periods and decreases in the matched ones. There are many results on time delay and asynchronous switching. In [12], the stability and stabilization problems were studied for switched systems with impulsive switching signals under asynchronous switching. A novel Lyapunov-like function was established, and conditions ensuring the system’s exponential stability were given in terms of edge-dependent switching signals. In [20], finite-time stabilization and finite-time bounded stabilization were investigated for switched systems. Asynchronous switching was considered, and sufficient stabilization conditions were presented as nonlinear differential matrix inequalities. The results confirm that asynchronous switching can affect the system’s dynamic performance and that it is essential to avoid the undesirable response it induces. Moreover, multiple event-triggered strategies for switched systems with asynchronous switching were investigated in [21]. A controller-mode-dependent Lyapunov function was established under an asynchronous switching strategy. Multiple event-triggered schemes were applied, and the stability criteria were provided based on the ADT method. A state feedback controller was proposed that avoids Zeno behavior. Additionally, the conventional method for time-delay systems is based on the Lyapunov–Krasovskii functional: the bounds of the time-varying delay are considered, and robust stability is established for the worst case of the unknown delay. In [22], the formation-containment control problem was studied for multi-agent systems (MAS) with time-varying delays and switching topologies. It was assumed that the leaders can communicate through switching topologies with time-varying delays. The Lyapunov–Krasovskii functional method was applied to ensure the convergence of the formation-containment error, and an edge-based state observer was developed to estimate the MAS states in the presence of the time delay. The results demonstrate that the Lyapunov–Krasovskii functional method can successfully solve the stabilization problem for time-delay systems. In [8], the adaptive fuzzy control problem was investigated for switched nonlinear systems with input delay. A nonlinear disturbance observer for arbitrary switching was presented, and a piecewise switched adaptive law was developed. Padé approximation and dynamic surface control were used to address the input delay problem. Unlike the traditional method, the proposed method yields less conservative and more widely applicable results. From the above discussion, it is evident that developing more applicable and less conservative methods for switched systems is both interesting and necessary.
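The notion of matched and unmatched periods can be made concrete with a small sketch. It assumes, purely for illustration, that the controller mode follows the system mode with a constant lag of a few samples. In the setting described above the lag is induced by a time-varying delay, so the constant `lag` and the mode sequence below are hypothetical simplifications.

```python
def controller_modes(system_modes, lag):
    """Controller mode sequence that follows the system mode `lag` samples late."""
    first = system_modes[0]
    return [
        system_modes[k - lag] if k >= lag else first
        for k in range(len(system_modes))
    ]


def unmatched_steps(system_modes, lag):
    """Sample indices at which the controller mode differs from the system mode."""
    ctrl = controller_modes(system_modes, lag)
    return [k for k, (s, c) in enumerate(zip(system_modes, ctrl)) if s != c]


# Hypothetical mode sequence: the system switches at k = 5 and k = 10.
system_modes = [1] * 5 + [2] * 5 + [1] * 4
print(unmatched_steps(system_modes, lag=2))
```

The unmatched samples are exactly those immediately after each system switch, where the previous controller is still active. These are the intervals in which the Lyapunov function is allowed to grow in an asynchronous-switching analysis.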
Model-dependent methods are proposed under the assumption that the structure or bounds of the model uncertainties are known a priori [23,24]. However, in many realistic situations, information about the uncertainties cannot be obtained, which motivates the study of model-free methods. With the growth of computational capability, intelligent methods have been applied to the control of flight vehicles [25]. Among these intelligent methods, deep learning and reinforcement learning have been extensively employed in many military and economic fields. Deep neural networks realize online fitting, and reinforcement learning achieves better performance by trial and error. Data-based algorithms such as deep learning and reinforcement learning can therefore improve controller performance in the presence of system uncertainties. Deep reinforcement learning (DRL), which integrates the benefits of deep learning and reinforcement learning, has been widely studied, and fruitful results have emerged. In [26], a DRL algorithm was employed to design a missile guidance law by formulating a Markov decision process in which the reward function was designed to trade off accuracy, energy consumption, and interception time. The deep deterministic policy gradient algorithm was adopted, and the guidance gain was scheduled online. In [27], a PID controller was combined with a DRL algorithm to improve the control performance, with the DRL algorithm compensating for the system uncertainties. However, the existing literature on DRL shows that ensuring the convergence of the algorithm remains a challenge. Accordingly, an algorithm should be developed that simultaneously ensures stability and dynamic performance.
According to the above discussion, performance enhancement and stability preservation should be addressed simultaneously. The controller design problem for switched systems in more practical environments has not been fully addressed. Design flexibility and control performance can be enhanced by deriving tighter dwell-time bounds. Moreover, it is essential to combine the benefits of conventional robust control with those of an intelligent algorithm. Therefore, the current study investigates the asynchronous tracking control problem and its optimization for switched flight vehicles with time-varying delays. The switched model of the flight vehicles is established from the nonlinear dynamic model by Jacobian linearization. The proposed controller comprises a dynamic-based sub-controller and a learning-based sub-controller. The nominal tracking controller is designed considering the asynchronous switching induced by the time-varying delay, and the multiple Lyapunov function (MLF) and MDADT approaches are integrated to ensure stability and attenuation performance. To ensure stability and transient performance simultaneously, the learning-based sub-controller is proposed to compensate for the system uncertainties, and the deep Q-learning (DQL) algorithm is adopted to achieve better convergence. The essential novelties of the current study are summarized as follows: (1) a more applicable and less conservative asynchronous tracking controller is proposed for switched flight vehicles with time-varying delays by utilizing the MLF and MDADT approaches, and the LMI approach is adopted to derive sufficient conditions ensuring stability and a prescribed attenuation index; (2) the presented tracking controller consists of a dynamic-based sub-controller and a learning-based sub-controller, where the former is designed for the nominal case and the latter compensates for the model uncertainties, so that the advantages of conventional tracking control and an intelligent algorithm are combined; (3) the DQL algorithm is adopted, and the online scheduling is described as a Markov process in which the controller parameters are defined as the output action, so as to simultaneously realize stability, robustness, and dynamic performance.
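The idea of treating the controller parameters as the action of a Markov process can be sketched with a tabular Q-learning update. This is a simplified stand-in rather than the DQL scheme of the paper: deep Q-learning replaces the table with a neural network. The candidate gains, the two discretized error levels, and the reward value are hypothetical.

```python
import numpy as np

# Hypothetical candidate gains; each action index selects one of them.
CANDIDATE_GAINS = np.array([0.2, 0.5, 0.8])


def select_action(Q, state, epsilon, rng):
    """Epsilon-greedy choice over the candidate gains."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[state]))


def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q[s, a] toward r + gamma * max_a' Q[s', a']."""
    target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (target - Q[state, action])
    return Q


rng = np.random.default_rng(0)
Q = np.zeros((2, CANDIDATE_GAINS.size))  # 2 discretized error levels x 3 gains
# A single transition with a negative reward (for example, a large tracking error).
Q = q_update(Q, state=1, action=2, reward=-0.5, next_state=0)
best = select_action(Q, state=1, epsilon=0.0, rng=rng)
print(Q[1], CANDIDATE_GAINS[best])
```

After the penalized transition, the greedy policy no longer prefers the gain that produced the negative reward. This is the trial-and-error mechanism by which the scheduled parameters are adjusted online.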
The remainder of the paper is arranged as follows. The problem formulation is presented in Section 2. In Section 3, the intelligent tracking controller is given. The simulation results are given in Section 4 to evaluate the effectiveness of the presented approach. Finally, the paper is concluded in Section 5.
2. Preliminaries and Problem Formulation
The current study considers the discrete-time switched systems with the time-varying delay as:
where
stands for the state vector;
describes the input signal;
stands for the output signal;
describes the external disturbance with
;
describes the time-varying delay in the system caused by the network’s transmission limitation. We define
,
,
,
and
as the state-space matrices with proper dimensions;
represents the piecewise continuous switching signal.
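To make the class of systems concrete, the following sketch simulates a generic discrete-time switched system with a time-varying state delay. The form x(k+1) = A_p x(k) + Ad_p x(k - d(k)) + B_p u(k) + E_p w(k), y(k) = C_p x(k) is an assumption chosen to match the quantities listed above (state, input, output, disturbance, delay, and five mode-dependent matrices); it is not necessarily identical to system (1). All matrix names and numerical values are hypothetical.

```python
import numpy as np


def simulate(A, Ad, B, E, C, sigma, delay, u, w, x0):
    """Simulate an assumed delayed switched system over len(sigma) samples."""
    n_steps = len(sigma)
    x = np.zeros((n_steps + 1, x0.size))
    y = np.zeros((n_steps, C[sigma[0]].shape[0]))
    x[0] = x0
    for k in range(n_steps):
        p = sigma[k]                # active mode at sample k
        kd = max(k - delay[k], 0)   # hold the initial state when k - d(k) < 0
        y[k] = C[p] @ x[k]
        x[k + 1] = A[p] @ x[k] + Ad[p] @ x[kd] + B[p] @ u[k] + E[p] @ w[k]
    return x, y


# Hypothetical scalar two-mode example with zero input and zero disturbance.
one = np.array([[1.0]])
A = {1: np.array([[0.5]]), 2: np.array([[0.8]])}
Ad = {1: np.array([[0.1]]), 2: np.array([[0.1]])}
B = {1: one, 2: one}
E = {1: one, 2: one}
C = {1: one, 2: one}
sigma = [1, 1, 2, 2]
delay = [0, 1, 1, 2]
u = np.zeros((4, 1))
w = np.zeros((4, 1))
x, y = simulate(A, Ad, B, E, C, sigma, delay, u, w, x0=np.array([1.0]))
print(x[:, 0])
```

The delayed term reads an older sample of the state, so the trajectory depends on the delay sequence as well as on the switching sequence.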
We define the command signal as
. Therefore, the tracking error is described as:
Remark 1. The command signal is bounded with .
The tracking control problem of switched systems in (1) can be described as the controller design problem, such that
The tracking error integral is defined in (4).
Then, the tracking controller for (1) can be proposed as:
where
and
are unknown parameter matrices to be determined.
Due to the time-varying delay in the system, the tracking controller switching always lags behind the system mode switching. All subsystems have unmatched and matched periods. The activated time instant of the ith subsystem is defined as , and the activated time instant of the corresponding controller is defined as , where denotes the unmatched period length in the ith subsystem.
Now, the following closed-loop switched systems can be derived:
where
4. Numerical Example
The flight vehicles studied in this paper are the HiMAT vehicles, which can be modeled as switched systems over the flight envelope. This paper considers switching among subsystems 1, 2, 8, 12, and 18. The flight conditions of the operating points are given in Table 1:
Therefore, the sampling time is chosen as
. The switched model of longitudinal motion dynamics can be described as:
where
stands for the state vector, and the initial value of
is chosen as
.
The current paper considers the switched systems parameters as: [parameter values missing from the source text]. Therefore, the MDADT in (16) can be calculated as follows:
The ADT can be obtained as .
The dwell time obtained by the MDADT approach is evidently smaller than that obtained by the ADT method, which leaves more room for controller design: the system can stay longer in the modes with better performance.
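The relation between the two kinds of bounds can be illustrated numerically. The sketch below uses the standard discrete-time bounds for synchronous switching, tau_p* = -ln(mu_p) / ln(1 - lambda_p) for MDADT and tau* = -ln(mu) / ln(1 - lambda) with mu = max mu_p and lambda = min lambda_p for ADT. These are not the asynchronous conditions of (16), and the decay rates and jump factors are hypothetical.

```python
import math


def mdadt_bounds(decay, jump):
    """Per-mode bound tau_p* = -ln(mu_p) / ln(1 - lambda_p)."""
    return {p: -math.log(jump[p]) / math.log(1.0 - decay[p]) for p in decay}


def adt_bound(decay, jump):
    """Common bound built from the worst-case jump factor and decay rate."""
    mu = max(jump.values())
    lam = min(decay.values())
    return -math.log(mu) / math.log(1.0 - lam)


# Hypothetical values: 0 < lambda_p < 1 (decay rate), mu_p > 1 (jump factor).
decay = {1: 0.3, 2: 0.1}
jump = {1: 1.2, 2: 1.5}
per_mode = mdadt_bounds(decay, jump)
common = adt_bound(decay, jump)
print(per_mode, common)
```

The common ADT bound is driven by the worst mode, so every per-mode MDADT bound is at most as large. A fast-decaying mode such as mode 1 in this example is allowed to switch much sooner.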
We set the attenuation performance index to 0.7. According to the conditions mentioned above, the nominal tracking controller can be derived using Theorem 3. The time delay and switching logic are presented in Figure 1 and Figure 2, respectively.
First, we compare the ADT and MDADT methods, as shown in Figure 3, Figure 4, Figure 5, and Figure 6. The MDADT method realizes a better performance than the conventional ADT method. The angle-of-attack response obtained with the MDADT method avoids the undesirable response induced by the external disturbance, so the resulting controller is more robust to the environment. Moreover, the actuator response is admissible.
Figure 7, Figure 8, Figure 9, Figure 10, and Figure 11 compare the MDADT approach with the presented method. The transient performance and robustness are enhanced by the DQL algorithm: the learning-based controller compensates for the adverse effects of the system uncertainties, which yields a better angle-of-attack response. Moreover, the actuator response of the proposed controller is admissible.
Accordingly, the proposed method realizes better transient performance and robustness than the conventional method. Less conservative results are derived through the MDADT approach. The presented method can eliminate the undesirable response induced by time-varying delays and asynchronous switching, and the learning-based tracking controller compensates for the system uncertainties. Therefore, the presented method simultaneously ensures stability, robustness, and transient performance.