1. Introduction
The development of human society is inseparable from the exploitation and use of various resources. With land resources gradually being exhausted, attention has turned toward the deep ocean. Numerous manganese nodules, deep-sea oil and gas reserves, hydrothermal deposits, and gas hydrates exist on and beneath the seabed. Thus, the efficient and safe exploration of such resources has become an urgent matter. Autonomous underwater vehicles (AUVs) are auxiliary intelligent tools for ocean exploration and play a key role in ocean environment observation [1,2], seabed geomorphology measurement [3], and military reconnaissance [4].
Owing to their onboard power, cable-free autonomy, good masking performance, and wide search range, AUVs offer very broad application prospects in both the civil marine and coastal defense fields. In particular, AUVs are essential in deep-sea search operations. As early as the 1960s, the cooperation of the US "Alvin" and "Kov" underwater robots to search for and salvage a lost hydrogen bomb in waters off Spain was a successful case. In early 2014, the US used the "Bluefin-21" AUV to conduct a large-scale underwater search at a depth of 4500 m in the Indian Ocean for the missing Malaysia Airlines flight MH370.
Ocean Infinity, which was founded in the US in 2017, has rapidly grown into a leading global maritime technology company. In early 2020, Ocean Infinity launched Armada, a new ship technology and data company, and introduced a fleet of marine robots. The Armada fleet can carry remotely operated vehicles (ROVs) and AUVs, as well as a variety of other sensors and equipment, thereby replacing traditional support vessels in seabed mapping, oilfield material transport, subsea construction support, salvage and rescue, military operations, and other activities.
In 2011, researchers at the Artificial Life Lab in Graz, Austria, unveiled what was then the largest cluster of underwater unmanned vehicles in the world: the CoCoRo AUV cluster. The project, which was funded by the European Union and led by Thomas Schmickl, consisted of 41 AUVs that could work together to accomplish tasks, with the main purpose of underwater monitoring and searching. The cluster system was scalable, reliable, and flexible in terms of its behavioral potential. The researchers studied collective self-knowledge through experiments inspired by behavioral and psychological science, thereby allowing collective self-knowledge to be quantified.
Furthermore, the European Commission supported a project known as Smart and Networking Underwater Robots in Cooperation Meshes. Its aim was to select, combine, and integrate different heterogeneous communication technologies, components, and solutions to achieve the best performance in the management and control of underwater vehicles as the robots completed different missions and tasks. The project concluded with sea trials.
With the increasing complexity of AUV search tasks, which are often difficult for a single AUV to complete, multi-AUV (MAUV) systems have become an important research direction in the development of underwater vehicles. Compared with single-AUV systems, MAUV systems can provide more solutions, higher work efficiency, a higher level of intelligence, and better fault tolerance.
MAUV systems (also known as swarms of agents) have arisen mainly because the intelligence of a single agent is limited by current technology and cannot be extended indefinitely; it is therefore hoped that coordination and cooperation among multiple agents can handle complex tasks that a single agent cannot. This concept has received significant attention in scientific research and engineering circles [5,6].
The consensus problem at the center of MAUV cooperative control has been investigated over the past several years using various methods. Achieving consensus requires a suitable control protocol to be designed so that all agents converge to a common value under the premise that they can exchange information only with their neighbors [7]. The consensus performance is clearly affected by the dynamics of the agents and by the network topology, and numerous results have been obtained considering these two factors [8,9,10,11,12,13]. However, in physical and engineering systems, the consensus problem is expected to ensure that all agents converge along a certain trajectory to achieve the desired goal. Rapid convergence can be achieved by designing a consensus algorithm that determines the optimal weight matrix [14]. An appropriate objective function was constructed by maximizing the second smallest eigenvalue of the graph Laplacian, thereby optimizing the consensus algorithm [15]. Furthermore, the average consensus problem was addressed by developing an optimal interaction graph [16]. In another study, the consensus problem was expressed as an optimization problem using the linear matrix inequality method [17]. An optimal linear consensus algorithm based on the linear quadratic regulator has also been proposed [18].
The cooperative mission was studied in [19] by analyzing an approach to the averaging problem under assumptions based on a linear iterative form. In [20], each group member interacted with its neighbors' states through a linear stochastic matrix until all of them reached the same limit. In [21], a distributed algorithm for consensus under fixed topology was generalized. In [22], the coordinated motion of particles in a group was controlled by a specific model in which each particle updates its information from its closest neighbors. In [23], an optimistic optimization approach with a simple black-box, nonlinear structure was devised for controlling the agents' behavior.
This research focuses on the analysis of consensus models, the design of consensus protocols, convergence, equilibrium, and application prospects. Many scholars have applied different modeling methods and carried out in-depth research on, and extensions of, consensus theory from different directions. Consensus theory has developed rapidly, yielded fruitful results, and been widely applied to a variety of scientific and engineering problems, including the synchronization of coupled oscillators, formation control, swarm control, optimal cooperative control, clustering, and sensor networks [24,25].
In [26], the problem of consensus in multi-AUV recovery systems with time-varying delays was explored, and a new consensus control protocol for formations was proposed. In [27], the problem of multi-AUV formation control under constraints such as bounded communication delays and nonconvex control inputs was studied. In [28], an improved event-triggering mechanism for coordinating communication among heterogeneous AUVs was explored. In [29], some effective criteria for the consensus of a class of non-smooth opinion dynamics over a directed graph were presented. In [30], an integral sliding mode control protocol was proposed to address the formation control of multi-robot systems. In [31], the output consensus issue for linear multi-agent systems was addressed. It is clear that the convergence time of these systems depends on the initial conditions.
Coordination among multiple agents is critical; however, previously designed obstacle avoidance strategies considered neither optimality nor the interaction topology (consensus).
The contributions of this paper are described as follows.
(1) A new consensus algorithm is studied for single-integrator systems in an obstacle environment. (2) A novel control approach is developed that achieves multi-AUV consensus with the minimal obstacle avoidance cost. (3) A novel nonquadratic obstacle avoidance cost function is constructed using an inverse optimal control approach. (4) The proposed theory is verified by practical experiments.
The remainder of this paper is organized as follows: In Section 2, background knowledge on graph theory is presented. The consensus problem is formulated in Section 3. In Section 4, the main results of this study are presented. The simulation and data analysis are given in Section 5. The preliminary verification of the method using an experiment with two AUVs is described in Section 6. Finally, the conclusions are presented in Section 7.
3. Problem Specification
The AUV has a single-integrator dynamics model that is expressed as follows:
$$ \dot{x}_i = u_i, \qquad i = 1, 2, \ldots, n, $$
or, in matrix form,
$$ \dot{x} = u, $$
where $x = [x_1^{T}, x_2^{T}, \ldots, x_n^{T}]^{T}$ and $u = [u_1^{T}, u_2^{T}, \ldots, u_n^{T}]^{T}$, in which $x_i \in \mathbb{R}^{m}$ is the state of the $i$th AUV, $u_i \in \mathbb{R}^{m}$ is the control input of the $i$th AUV, $x$ collects the states of all AUVs, and $u$ collects the control inputs of all AUVs.
Figure 1 depicts multiple AUVs distributed in the sea; the largest AUV can provide power, and all the AUVs can exchange data. In this figure, different colors represent changes in water depth. MAUV systems can generally reside on the seabed for a long time. When a marine geological disaster occurs, an AUV senses its occurrence and automatically identifies possible disaster sites to evaluate the overall environment. When the system senses an unusual change in the environment, the AUVs cooperate: in this phase, they sense nearby AUVs through the sensors that they carry and determine the gathering location through negotiation.
In this study, the consensus problem involves the design of a distributed control law, which depends on the information exchange topology, such that the states of all AUVs converge to the same value, i.e., $\|x_i(t) - x_j(t)\| \to 0$ as $t \to \infty$ for all $i, j$. Furthermore, it must be guaranteed that obstacles along the AUV trajectories are avoided.
Figure 2 depicts an example scene of the consensus problem with five AUVs and one obstacle. Three zones are established: the collision, the diagnostic, and the reaction zones, which are defined as follows.
Collision zone for the $k$th obstacle: $\Omega_{c,k} = \{\, p : \|p - p_{o,k}\| \le r_k \,\}$, where $p$ denotes a position in space and $p_{o,k}$ denotes the center of the $k$th obstacle. The AUV absolutely cannot enter this region because each obstacle is solid.
Diagnostic zone for the $k$th obstacle: $\Omega_{d,k} = \{\, p : \|p - p_{o,k}\| \le \bar{r}_k \,\}$. This is the range within which the AUV can detect only this one obstacle. Outside this area, the AUV is safe from this obstacle and navigates according to the coordinated commands.
Reaction zone for the $k$th obstacle: $\Omega_{r,k} = \{\, p : r_k < \|p - p_{o,k}\| \le \bar{r}_k \,\}$. In this area, the AUV can sense and avoid the obstacle.
Here, $r_k$ denotes the radius of the obstacle and $\bar{r}_k$ denotes the range within which the AUV can detect the obstacle.
Accordingly, the entire safety area can be represented as $\Omega_s = \big(\bigcup_k \Omega_{c,k}\big)^{c}$ and the entire region outside the diagnostic zones can be represented as $\Omega_o = \big(\bigcup_k \Omega_{d,k}\big)^{c}$. The symbol $\bigcup$ and the superscript $c$ indicate the union and the complement of sets, respectively.
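As a simple illustration of these definitions (not part of the original formulation; the function and variable names are hypothetical, with r_obs and r_detect standing for $r_k$ and $\bar{r}_k$), the zone occupied by an AUV with respect to a single spherical obstacle can be determined from the two distance thresholds:

```python
import numpy as np

def classify_zone(p_auv, p_obs, r_obs, r_detect):
    """Classify an AUV position relative to one spherical obstacle.
    r_obs is the obstacle radius; r_detect > r_obs is the detection range."""
    d = np.linalg.norm(np.asarray(p_auv, dtype=float) - np.asarray(p_obs, dtype=float))
    if d <= r_obs:
        return "collision"   # forbidden region inside the solid obstacle
    if d <= r_detect:
        return "reaction"    # obstacle detected: avoidance becomes active
    return "safe"            # outside the diagnostic zone: consensus commands only

# Example: an AUV 4 m from the center of an obstacle of radius 2 m, detection range 6 m
print(classify_zone([4.0, 0.0, 0.0], [0.0, 0.0, 0.0], 2.0, 6.0))  # -> "reaction"
```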
The following three assumptions are included in this study:
A1. All obstacles can be modeled as spheroidal objects.
A2. $\Omega_{d,j} \cap \Omega_{d,k} = \emptyset$ for all $j \neq k$; that is, the diagnostic zones of different obstacles do not overlap.
A3. It is assumed that the topology of information exchange between the AUVs is undirected and connected.
According to A2, the diagnostic areas of multiple obstacles are completely independent. This assumption excludes the situation in which an AUV cannot determine which obstacle to avoid after entering an overlapping region. Thus, each AUV will encounter at most one obstacle at a given time.
4. Consensus Algorithm for Optimal Obstacle Avoidance
In this section, the consensus problem is expressed as an optimal control problem. A closed-form obstacle avoidance consensus law, which is a linear function of the local state information exchanged over the communication topology, is derived by the inverse optimal control method. Here, $\otimes$ denotes the Kronecker product, which is used to extend dimensions, and $I_m$ denotes the identity matrix of dimension $m$.
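As a concrete illustration of this notation (an example added for clarity, not taken from the original), consider two AUVs moving in three-dimensional space ($m = 3$) and connected by a single communication edge. Then
$$ L = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad L \otimes I_3 = \begin{bmatrix} I_3 & -I_3 \\ -I_3 & I_3 \end{bmatrix} \in \mathbb{R}^{6 \times 6}, $$
so that $(L \otimes I_3)\,x$ stacks the relative states $x_1 - x_2$ and $x_2 - x_1$, and $(L \otimes I_3)(\mathbf{1}_2 \otimes c) = (L\mathbf{1}_2) \otimes c = 0$ for any common vector $c \in \mathbb{R}^{3}$, where $\mathbf{1}_2 = [1, 1]^{T}$.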
For the sake of presentation, the error state is defined as follows:
$$ e = x - \mathbf{1}_n \otimes \delta, $$
where $\delta$ denotes the ultimate consensus state. For example, for motion in three-dimensional space, $\delta = [\delta_x, \delta_y, \delta_z]^{T}$, where $\delta_x$, $\delta_y$, and $\delta_z$ express the ultimate consensus positions on the x-axis, y-axis, and z-axis, respectively. Based on the property of the Laplacian $L$ in Equation (2), namely $L\mathbf{1}_n = 0$, we obtain
$$ (L \otimes I_m)(\mathbf{1}_n \otimes \delta) = (L\mathbf{1}_n) \otimes \delta = 0, \qquad \text{and hence} \qquad (L \otimes I_m)\,x = (L \otimes I_m)\,e. \qquad (5) $$
The ultimate consensus state $\delta$ is constant, and when all AUVs reach the consensus, i.e., $x = \mathbf{1}_n \otimes \delta$, the consensus law becomes zero.
Because $\delta$ is constant, the error dynamics become
$$ \dot{e} = \dot{x} = u. \qquad (6) $$
If the system (Equation (6)) is asymptotically stable, the consensus will be reached.
The cost function for optimal obstacle avoidance is constructed as
$$ J = \int_{0}^{\infty} \big[\, g_u(u) + g_c(e) + g_o(e) \,\big]\, dt, \qquad (7) $$
which consists of three cost terms: $g_u(u)$ expresses the control effort cost, $g_c(e)$ denotes the consensus cost, and $g_o(e)$ indicates the obstacle avoidance cost.
First, the control effort cost is
$$ g_u(u) = u^{T} R\, u, \qquad (8) $$
with
$$ R = r\,(I_n \otimes I_m). \qquad (9) $$
In Equation (8), $g_u(u)$ is a regular quadratic form, and in Equation (9), $R$ is a positive definite matrix defined through the scalar weighting parameter $r > 0$.
Second, the consensus cost is
$$ g_c(e) = w\, e^{T} (L \otimes I_m)\, e. \qquad (10) $$
In Equation (10), the Laplacian matrix $L$ is established by the undirected and connected communication graph and is therefore symmetric. The weight of the consensus error is represented by the scalar $w > 0$.
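To see why this quadratic form penalizes disagreement (a standard Laplacian identity, stated here for completeness rather than taken from the original), note that for an undirected graph with adjacency weights $a_{ij}$,
$$ e^{T}(L \otimes I_m)\, e \;=\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,\|e_i - e_j\|^{2} \;=\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,\|x_i - x_j\|^{2}, $$
so the consensus cost vanishes exactly when all AUV states coincide and grows with the pairwise disagreement between neighboring AUVs.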
Proposition 1 ([32]). The Laplacian $L$ is positive semidefinite and, when the graph is connected and undirected, $z^{T} L z = 0$ for $z \in \mathbb{R}^{n}$ if and only if all entries of $z$ are equal.
Remark 1. Proposition 1 indicates that $L \otimes I_m$ is a positive semidefinite matrix. The quadratic form in Equation (10) ensures that the optimal control law is a linear function of the error state weighted by the Laplacian of the communication graph, and hence entirely dependent on the flow of information in the topology, as shown in the proof of Theorem 1.
Finally, the obstacle avoidance cost is the nonquadratic term $g_o(e)$ in Equation (11), where $g_o(e)$ is not specified a priori but is constructed through the inverse optimal control method in Theorem 1.
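Although the specific form of the obstacle avoidance cost is obtained later through Theorem 1, a representative avoidance potential of the kind used in the inverse optimal control literature is (shown only as an illustration under the zone notation above; the paper's own construction appears in Equations (25)–(27))
$$ V_{o,ik}(p_i) \;=\; \left( \min\!\left\{ 0, \; \frac{\|p_i - p_{o,k}\|^{2} - \bar{r}_k^{2}}{\|p_i - p_{o,k}\|^{2} - r_k^{2}} \right\} \right)^{2}, $$
where $p_i$ denotes the position of the $i$th AUV. This potential is zero whenever the AUV is outside the detection range $\bar{r}_k$ of the $k$th obstacle and grows without bound as the AUV approaches the collision boundary $r_k$.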
The following lemma is established for proving both the asymptotic stability and optimality of the obstacle avoidance consensus algorithm.
Lemma 1. A nonlinear controlled dynamical system [20] is modeled as
$$ \dot{e}(t) = F\big(e(t), u(t)\big), \qquad e(0) = e_0, \qquad (12) $$
and the cost function is
$$ J\big(e_0, u(\cdot)\big) = \int_{0}^{\infty} \ell\big(e(t), u(t)\big)\, dt. \qquad (13) $$
In Equation (13), $u(\cdot)$ denotes an admissible control. The open sets $\mathcal{X} \subseteq \mathbb{R}^{nm}$ (with $0 \in \mathcal{X}$) and $\mathcal{U} \subseteq \mathbb{R}^{nm}$ are defined as the state domain and the set of admissible control values, respectively. Moreover, suppose that a continuously differentiable function $V: \mathcal{X} \to \mathbb{R}$ and a control law $\phi: \mathcal{X} \to \mathcal{U}$ exist such that
$$ V(0) = 0, \qquad (14) $$
$$ \phi(0) = 0, \qquad (15) $$
$$ V(e) > 0, \quad e \in \mathcal{X}, \; e \neq 0, \qquad (16) $$
$$ \frac{\partial V(e)}{\partial e}\, F\big(e, \phi(e)\big) < 0, \quad e \in \mathcal{X}, \; e \neq 0, \qquad (17) $$
$$ H\big(e, \phi(e)\big) = 0, \quad e \in \mathcal{X}, \qquad (18) $$
$$ H(e, u) \ge 0, \quad e \in \mathcal{X}, \; u \in \mathcal{U}. \qquad (19) $$
In Equation (19), $H(e, u) = \ell(e, u) + \frac{\partial V(e)}{\partial e}\, F(e, u)$ is the Hamiltonian function and $\frac{\partial V(e)}{\partial e}$ indicates partial differentiation with respect to $e$.
The state feedback control law has the following form:
$$ u = \phi(e). \qquad (20) $$
Then the solution $e(t) \equiv 0$ of the closed-loop system is locally asymptotically stable, and there exists a neighborhood of the origin $\mathcal{X}_0 \subseteq \mathcal{X}$ such that
$$ J\big(e_0, \phi(e(\cdot))\big) = V(e_0), \qquad e_0 \in \mathcal{X}_0. \qquad (21) $$
Moreover, if $e_0 \in \mathcal{X}_0$, the feedback control (20) minimizes $J\big(e_0, u(\cdot)\big)$ so that
$$ J\big(e_0, \phi(e(\cdot))\big) = \min_{u(\cdot) \in \mathcal{S}(e_0)} J\big(e_0, u(\cdot)\big), \qquad (22) $$
where $\mathcal{S}(e_0)$ represents the set of asymptotically stabilizing controllers for each initial condition $e_0$. Finally, if $\mathcal{X} = \mathbb{R}^{nm}$, $\mathcal{U} = \mathbb{R}^{nm}$, and $V(e) \to \infty$ as $\|e\| \to \infty$, the solution $e(t) \equiv 0$ of the closed-loop system is globally asymptotically stable.
Proof. Omitted. See reference [23]. □
This lemma emphasizes that the steady-state solution of the Hamilton–Jacobi–Bellman equation is a Lyapunov function for the nonlinear system, thereby ensuring the stability and optimality of the system. The following theorem expresses the main result of this study.
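To make the role of the Hamiltonian conditions more transparent (this is standard reasoning added for readability, not part of the original proof), note that along the closed-loop trajectory condition (18) gives $\ell\big(e, \phi(e)\big) = -\dot{V}(e)$, so
$$ J\big(e_0, \phi(e(\cdot))\big) = \int_{0}^{\infty} \ell\big(e(t), \phi(e(t))\big)\, dt = -\int_{0}^{\infty} \dot{V}\big(e(t)\big)\, dt = V(e_0) - \lim_{t \to \infty} V\big(e(t)\big) = V(e_0), $$
because asymptotic stability drives $e(t) \to 0$ and $V(0) = 0$. For any other admissible stabilizing control, condition (19) gives $\ell(e, u) \ge -\dot{V}(e)$ along its trajectory, so its cost is at least $V(e_0)$, which is exactly the optimality statement in Equations (21) and (22).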
Theorem 1. For the system of MAUVs (3) established under the three assumptions (A1–A3), with weighting parameters $w$ and $r$, the feedback control law given in Equation (24), in which the obstacle avoidance term is generated through Equation (25), is an optimal control for the consensus problem (7) with the weighting defined in Equation (9). In Equation (25), the potential obstacle avoidance function $V_o(e)$ is defined by Equations (26) and (27): it is positive inside the diagnostic zone and zero outside of it, and $\partial V_o(e)/\partial e$ means the derivative of $V_o(e)$ with respect to $e$.
Moreover, when $V(e) \to \infty$ as $\|e\| \to \infty$, the global asymptotic stability, i.e., the consensus, of the closed-loop system is guaranteed.
Proof. For this optimal consensus problem, the quantities in Lemma 1 become
$$ F(e, u) = u, \qquad \ell(e, u) = g_u(u) + g_c(e) + g_o(e) = u^{T}Ru + w\,e^{T}(L \otimes I_m)\,e + g_o(e), $$
where $e \in \mathbb{R}^{nm}$ and $u \in \mathbb{R}^{nm}$.
By selecting the Lyapunov function $V(e)$ in Equation (31), which consists of a quadratic consensus term and the obstacle avoidance potential $V_o(e)$, the weighting matrix of the quadratic term is the solution of a Riccati-type equation, which is expressed later.
For the function in Equation (31) to be a valid Lyapunov function, it must be continuously differentiable with respect to $e$; since the quadratic consensus term clearly is, $V_o(e)$ must also be continuously differentiable with respect to $e$. It can be observed from Equations (26) and (27) that $V_o(e)$ is continuously differentiable inside and outside the diagnostic zone, so it is continuously differentiable over the safety area $\Omega_s$ provided that $V_o(e)$ and $\partial V_o(e)/\partial e$ are continuous at the boundary of the diagnostic zone, i.e., at $\|p - p_{o,k}\| = \bar{r}_k$. Because Equations (26) and (27) make $V_o(e)$ equal to zero outside the diagnostic zone and drive it to zero as this boundary is approached from inside, $V_o(e)$ is continuous at $\|p - p_{o,k}\| = \bar{r}_k$, and thus continuous over $\Omega_s$. Furthermore, the derivative $\partial V_o(e)/\partial e$ also vanishes as the boundary is approached. Therefore, $\partial V_o(e)/\partial e$ is continuous at $\|p - p_{o,k}\| = \bar{r}_k$, and thus continuous over the safety area $\Omega_s$. As a result, $V_o(e)$ and $V(e)$ are continuously differentiable for $e$ in the safety area $\Omega_s$.
The Hamiltonian for the consensus problem becomes
$$ H(e, u) = u^{T}Ru + w\,e^{T}(L \otimes I_m)\,e + g_o(e) + \frac{\partial V(e)}{\partial e}\,u. \qquad (32) $$
Setting $\partial H(e, u)/\partial u = 0$ results in the optimal control law in Equation (33). From Equation (33), it follows that the optimal control consists of a consensus term and a term containing the gradient of $V_o(e)$, as written in Equation (34). Substituting Equations (33) and (34) into (32) yields Equation (35).
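For readability, the stationarity step can be written out explicitly (a sketch under the cost structure introduced above, with $R = r\,(I_n \otimes I_m)$ as in Equation (9); it is not claimed to reproduce the exact form of Equations (33)–(35)):
$$ \frac{\partial H(e, u)}{\partial u} = 2Ru + \left(\frac{\partial V(e)}{\partial e}\right)^{T} = 0 \;\;\Longrightarrow\;\; u = \phi(e) = -\frac{1}{2}R^{-1}\left(\frac{\partial V(e)}{\partial e}\right)^{T} = -\frac{1}{2r}\left(\frac{\partial V(e)}{\partial e}\right)^{T}, $$
so the quadratic consensus part of $V(e)$ produces the linear consensus term of the control, while the avoidance potential $V_o(e)$ produces the obstacle avoidance term.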
For the consensus problem (7), the control law (34) can be proven to be an optimal solution using Lemma 1, but it is necessary to verify conditions (14) to (19). By satisfying condition (18), i.e., by setting Equation (36) to zero, Equation (37) is obtained together with the requirement in Equation (38). Using Equations (34) and (36)–(38), it can be observed that
$$ H(e, u) = \big(u - \phi(e)\big)^{T} R\, \big(u - \phi(e)\big) \ge 0, $$
and condition (19) is validated.
By substituting the expressions of the consensus cost and the control weight $R$ into Equation (37), a candidate function for the quadratic part of $V(e)$ is obtained in Equation (39), such that the Lyapunov function (31) can be written explicitly in terms of the AUV states.
Note that the property in Equation (5) is used to convert the quadratic term in $e$ into the corresponding quadratic term in $x$. If the AUV states are not all equal, then, based on the property of the Laplacian matrix (Proposition 1), this quadratic term will not be zero, but positive. Note that the case in which all components of $e$ are equal but nonzero would mean that all AUVs share a common offset from the ultimate consensus state, which contradicts the definition of $\delta$; it is therefore only a special case of consensus and implies $e = 0$. Therefore, the quadratic term is positive if $e \neq 0$. Furthermore, $V_o(e)$ is defined by Equations (26) and (27), and it is easily shown that $V_o(e) \ge 0$. Hence $V(e) > 0$ for $e \neq 0$, and condition (16), i.e., $V(e) > 0$ for $e \in \mathcal{X}$, $e \neq 0$, can be met.
Subsequently, the obstacle avoidance cost $g_o(e)$ in the cost function (7) is constructed by solving Equation (37), which yields Equation (40) and becomes (25). The selection of appropriate values for the weighting parameters can guarantee $g_o(e) \ge 0$: when $w$ is given, a sufficiently small $r$ can always be determined so that the positive-definite term dominates the sign-indefinite term.
Next, the derivative of $V(e)$ along the closed-loop trajectories is evaluated using Equations (37) and (38), together with Equation (35). It follows that $\dot{V}(e) < 0$ for $e \neq 0$, and $\dot{V}(e) = 0$ only when $e = 0$. Thus, condition (17) can be satisfied.
Conditions (14) and (15) still need to be verified. According to Equations (31) and (34), when $e = 0$, we have $V(0) = 0$ and $\phi(0) = 0$, so conditions (14) and (15) are satisfied. According to Equations (24), (28) and (32), if all AUVs assemble in the reaction area, the avoidance force on each AUV will not be zero, and all AUVs will leave the reaction area until a new consensus point is reached. If the consensus point is beyond the reaction area of the obstacle, it can easily be observed that the same conditions hold at $e = 0$; thus, conditions (14) and (15) are satisfied.
The optimal control law in Equation (24) can be obtained by using Equation (40) and substituting it into (34). Note that, owing to Equation (5), the part containing the ultimate consensus state $\delta$ becomes zero. Thus, the control law (24) depends on $x$ and not on $\delta$. This satisfies the expectations because $\delta$ is not known a priori.
At this point, conditions (14) to (19) have all been satisfied. Thus, according to Lemma 1, the control law (24) is the optimal control law for problem (7) in the sense of Equations (21) and (22), and the closed-loop system is asymptotically stable. Thus, $e \to 0$ and the consensus is achieved.
Moreover, it can easily be determined from Equation (31) that $V(e) \to \infty$ as $\|e\| \to \infty$, so the closed-loop system is globally asymptotically stable. Note that the collision area is excluded from the global asymptotic stability region because no AUV can start inside a solid obstacle when it begins to avoid it. □
Remark 2. As shown in the proof of Theorem 1, the optimal consensus algorithm is obtained by the method of inverse optimal control: because the obstacle avoidance cost $g_o(e)$ is not specified a priori, it is constructed using the optimality condition in Equation (38). The obstacle avoidance behavior can be understood from Equations (25), (27) and (29): if the AUV is beyond the diagnostic zone, the avoidance potential is zero, and thus the avoidance term of the control vanishes; if the AUV is in the reaction area and close to the obstacle, the denominator in the potential (see Equation (27)) approaches zero and the avoidance term increases, which drives the AUV away from the obstacle. Therefore, the obstacle avoidance ability is guaranteed, while the asymptotic stability and the optimality of the system are ensured by Theorem 1.
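To make the mechanism described in Remark 2 concrete, the following sketch evaluates the gradient of an illustrative avoidance potential of the form shown earlier (the potential, gains, and names are assumptions for illustration, not the paper's exact expressions): the gradient is zero outside the detection range and grows as the AUV nears the collision boundary, and the avoidance part of the control subtracts a scaled version of it.

```python
import numpy as np

def avoidance_gradient(p_auv, p_obs, r_obs, r_detect):
    """Gradient, with respect to the AUV position, of the illustrative potential
    V_o = (min{0, (d^2 - r_detect^2) / (d^2 - r_obs^2)})^2, where d = ||p_auv - p_obs||."""
    p_auv = np.asarray(p_auv, dtype=float)
    p_obs = np.asarray(p_obs, dtype=float)
    diff = p_auv - p_obs
    d2 = float(diff @ diff)
    if d2 >= r_detect**2:              # outside the diagnostic zone: no avoidance force
        return np.zeros_like(diff)
    num, den = d2 - r_detect**2, d2 - r_obs**2
    g = num / den                      # negative inside the reaction zone
    dg_dd2 = (den - num) / den**2      # derivative of g with respect to d^2
    return 2.0 * g * dg_dd2 * 2.0 * diff   # chain rule: dV_o/dp = 2*g*g' * d(d^2)/dp

# Inside the reaction zone the gradient points toward the obstacle,
# so a control term proportional to -gradient pushes the AUV away from it.
print(avoidance_gradient([2.5, 0.0, 0.0], [0.0, 0.0, 0.0], r_obs=2.0, r_detect=6.0))
```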
Remark 3. We summarize the tuning of the optimal consensus algorithm. First, $w$ and $r$ are adjustable weighting parameters, where $w$ influences the consensus error and $r$ influences the control effort. Second, the condition $g_o(e) \ge 0$ must be ensured by these parameters. Changing these parameters plays the same role as changing the weighting matrices $Q$ and $R$ in the conventional LQR problem. In a linear single-integrator system, it is not complicated to change $w$ and $r$. However, the obstacle avoidance cost is a nonquadratic nonlinear function, so this optimal control problem differs from the LQR. As only two parameters appear in the cost, the basic principles for their selection are as follows: the consensus error needs to be balanced against the control effort, and for a given $w$, a sufficiently small $r$ needs to be selected such that the sign-indefinite term is always less than the positive term, so that the condition $g_o(e) \ge 0$ is obtained.
Remark 4. According to Equation (24), the optimal control law contains only two parts: the consensus law and the obstacle avoidance law.
The consensus law of the AUVs is a linear function of the relative states of neighboring AUVs. Only local information between the AUVs is required, exchanged over the communication topology, rather than the information of all AUVs. Therefore, the optimal control law requires only local information for its execution.
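As a closing illustration of this locality (a minimal sketch under the single-integrator model; the gain, graph, and names are assumed for the example and are not taken from the paper), each AUV evaluates its consensus term using only the states of its own neighbors:

```python
import numpy as np

def consensus_input(i, states, neighbors, gain):
    """Consensus part of the control for AUV i: -gain * sum_j (x_i - x_j),
    computed from the states of its neighbors j only."""
    xi = states[i]
    return -gain * sum(xi - states[j] for j in neighbors[i])

# Three AUVs on a line graph 0-1-2 with 2-D states
states = {0: np.array([0.0, 0.0]), 1: np.array([4.0, 2.0]), 2: np.array([8.0, 0.0])}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
dt, gain = 0.05, 1.0
for _ in range(200):                   # Euler integration of x_dot = u
    u = {i: consensus_input(i, states, neighbors, gain) for i in states}
    states = {i: states[i] + dt * u[i] for i in states}
print({i: np.round(states[i], 3) for i in states})  # all states converge to a common value
```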