1. Introduction
The problem of control system synthesis is an important and quite challenging issue when designing automatic control systems for autonomous objects. The general problem of control synthesis was first stated in the 1960s by V.G. Boltyansky [
1] as a problem of an optimal control search from any initial condition. At that time, the solution was searched analytically using Pontryagin’s Maximum Principle [
2] as a general solution for differential equations of control objects and conjugate variables with optimal control in the right parts. Not all differential equations have general solutions.
The complexity of the control synthesis problem lies in the fact that it is necessary to find a multidimensional control function, the argument of which is the state vector of the control object. If this function is substituted into the right parts of the differential equations of the control object model, then the resulting system of equations will have the following property: any particular solution from the initial state of the state space domain will reach the terminal state with the optimal value of the quality criterion.
For linear autonomous systems, the synthesis of the control system involves designing negative feedback based on the state error [
3]. Today, the control synthesis problem is typically solved manually by engineers and control specialists to implement the developed control system. Engineers analyze the problem and the control object and establish control channels, i.e., determine the necessary control actions to compensate for specific state errors. Then, to improve control quality, regulators are inserted into these channels [
4,
5,
6]. Most commonly, linear proportional (P), proportional–integral (PI) and proportional–integral–derivative (PID) controllers are used, each with their own gain coefficients. Traditional PID control algorithms feature simple structures, ease of implementation, and, at the same time, fixed parameters and lack of adaptability to complex environments [
7]. To eliminate drawbacks, the PID control is enhanced by other control technologies such as fuzzy logic [
8,
9], etc.
This work employs a numerical approach to the control synthesis problem based on machine learning using symbolic regression [
10,
11,
12,
13,
14]. In mathematics, a regression method was previously used to approximate experimental data. The researcher studied experimental data and wrote down the mathematical expression of the function, which approximates these data up to some parameters. The parameter values were then found by the least squares method. This type of regression is subjective and only possible for functions with few arguments. Symbolic regression is a modern computational method that allows you to find not only parameters, but the structure of a function. For this purpose, an alphabet of elementary functions is defined and the rules for encoding mathematical expressions using this alphabet are stipulated. Further, a special genetic algorithm searches for the optimal expression according to a given criterion of a mathematical expression in the code space. In the genetic algorithm, the main operation of the crossover ensures that codes of new mathematical expressions appear. All symbolic regression methods differ in the form of coding.
It is important to note that the control function is often a multidimensional nonlinear function of the vector argument. In practice, artificial neural networks are rarely used to approximate this function, as it appears on the right-hand side of the model equations and significantly alters the dynamic properties of the object. Moreover, since a mathematical expression for this function is not known in advance, it is impossible to obtain a sufficient training dataset to train a neural network.
Since 2008, symbolic regression methods have been utilized to address the control synthesis problem [
15]. The control synthesis problem includes finding a control function as a function of state. It is often solved by supplying a stability property of the control object relative to a point in the state space. Then, for non-complex mathematical models, analytical methods can be used, such as a backstepping integrator [
16,
17,
18] or dynamic programming [
19,
20,
21], that provide a numerical value of control for each state of the control object, but one cannot obtain a mathematical expression for optimal control function in the appropriate form.
Here, the control synthesis problem is solved by symbolic regression. These methods involve encoding mathematical expressions based on a selected alphabet of elementary functions and searching the space of codes for the optimal mathematical expression of the function according to a specified quality criterion using a specialized genetic algorithm.
Research on the application of symbolic regression methods for solving the control synthesis problem has shown that most symbolic regression methods cannot directly address this task. The crossover operation leads to significant changes in codes, ultimately generating entirely new potential solutions that do not retain the properties of the selected parent solutions. Instead of producing offspring solutions that inherit the characteristics of their parents, completely new solutions are generated; thus, the genetic crossover operation functions more like a generator of new possibilities, transforming the evolutionary genetic algorithm into a simple random search. For the successful application of symbolic regression in finding the structure of mathematical expressions, it is essential for the crossover operation to possess the inheritance property with specific characteristics. To ensure this property, a principle of small variations in the basic solution was developed [
22], and it has been demonstrated that implementing this principle allows the crossover operation to exhibit inheritance properties. This enhances the likelihood of finding the optimal solution compared to a simple random search.
The application of the principle of small variations in the baseline solution to the problem of general control synthesis has led to the modification of established symbolic regression methods. Further research on solving the general control synthesis problem using symbolic regression has revealed that these methods produce solutions in the form of encoded mathematical expressions. However, these solutions can differ significantly not only in terms of the optimization criterion value but also in the quality of the solutions themselves.
Unlike previous studies [
23], this work employs a training dataset to address the problem of general control synthesis. Initially, the optimal control problem is solved from each point in a given set of initial states, resulting in a collection of control functions expressed as functions of time. A reference model is then integrated into the control object model, which generates optimal motion trajectories using the derived optimal control functions. The control synthesis problem is framed as an approximation task for all optimal trajectories, where the control function is sought as a function of the deviation of the object from the specified terminal state. The optimization criterion for solving the synthesis problem is the accuracy of the object’s movement along the optimal trajectory. The work includes an example of solving the control synthesis problem for a mobile robot using a supervised machine learning method, namely original variational complete binary genetic programming.
2. The Problem of General Control Synthesis
In the general control synthesis problem, it is necessary to find a control function whose argument is a state vector. The replacement of the control vector on the right part of the model with a control function should change the properties of the system so that all its particular solutions from some state space domain must reach a terminal state with an optimal value of the quality criterion.
The mathematical model of the control object in the form of an ODE system is given as follows:
where
is a control state vector,
,
is a control vector,
and
is a compact set that often defines the constraints on the control.
where
and
are the lower and upper boundaries of the control vector, respectively.
The terminal state is given as
where
is a time of reaching the terminal state,
and
is a given positive value.
The set of initial states is given as
where
is a penalty coefficient,
where
is a given small positive value.
It is necessary to find a control function in the following form:
that satisfies the restrictions on the control (2) and supplies the minimum of the quality criterion (6).
Our approach includes solving the optimal control problem. Therefore, the statement of the optimal control problem is presented here.
Analytically, this problem, (1)–(8), is solved only for very simple models and criteria, since the solution is a multidimensional function of a multidimensional argument, and this function is almost always nonlinear. The universal numerical method became possible only after the development of symbolic regression methods. These methods allow the search for the structure and parameters of any function according to a given criterion. In this study, we use a symbolic regression method to approximate many solutions to optimal control problems.
The mathematical model of a control object in the form of an ODE system is given (1). The constraints on the control (2) are given. The initial state is given as
The terminal state (3) is given. The quality criterion is given as
where
is a time of reaching the terminal state, and it is described by Equation (7) for one initial state,
.
It is necessary to find a control function as a function of time satisfying the constraints on the control (2).
If the found optimal control function (11) is placed in the right part of the mathematical model (1), then the ODE system,
will have the particular solution
from the initial state (9), which reaches the terminal state (3) with the optimal value of the given quality criterion (10).
Firstly, the optimal control problem is solved for each initial state from the given set (4) of initial states. As a result, a set of optimal control functions, parameterized by time, is obtained.
Afterward, the reference model is integrated into the mathematical model of the control object.
Any particular solution of the reference model (15) for an initial state from the given set of initial states (4), along with the corresponding optimal control function from the set of optimal control functions (13), results in the generation of the optimal trajectory in the state space.
where
,
and
.
In the second stage, the control synthesis problem in (1)–(6) and (8) is addressed with the following quality criterion:
where
,
is defined by Equation (7).
3. Symbolic Regression for Solving the Control Synthesis Problem
Symbolic regression encodes all mathematical expressions in the form of a special code and utilizes a genetic algorithm to search for the optimal mathematical expression. Let us introduce variational complete binary genetic programming (VCBGP).
Consider examples of coding a mathematical expression using this method. Consider the following mathematical expression:
This is Cardano’s formula for solving the cubic equation
at
and
.
To encode this mathematical expression, an alphabet of elementary functions is required. For the given mathematical expression (18), the following set of elementary functions is sufficient:
- -
Functions with one argument:
- -
Functions with two arguments:
- -
Arguments of mathematical expression or functions without arguments:
To construct a graph of VCBGP, it is first necessary to create a graph of binary genetic programming and then extend it to form the complete binary genetic programming. To augment the graph of binary genetic programming with the required number of levels, it is essential to insert an addition function with zero arguments or a multiplication function with unit arguments into the graph. Therefore, the set of arguments includes unit elements for both the addition and multiplication functions.
All elements of the alphabet have two subscripts: the first subscript denotes the number of arguments, while the second subscript indicates the element’s position in the set. The set of functions with one argument (19) includes the identity function . All functions with two arguments (20) are commutative, associative and have their own unit element.
Binary genetic programming encodes a mathematical expression as a computational tree. This method employs only functions with one and two arguments. The representation of a mathematical expression in the form of binary genetic programming is illustrated in
Figure 1.
In
Figure 1, the nodes of the tree display the numbers of the functions with two arguments, while the edges of the graph are labeled with the numbers of the functions with one argument. The arguments of the mathematical expression are represented at the last level of the graph or at the leaves of the tree.
Complete binary genetic programming differs from standard binary genetic programming in that it has the maximum possible number of nodes at all levels. At level in complete binary genetic programming, there should be nodes.
According to the graph of binary genetic programming, the number of levels required to encode the mathematical expression (18) should be at least
. In the binary genetic programming graph, at the third level, the first and third functions with two arguments do not contain a second argument.
Figure 2 shows the completion of the binary genetic programming graph with the second argument for functions with two arguments.
Figure 2a presents a part of the graph from
Figure 1, where a function with two arguments uses only one argument.
Figure 2b shows the correction of the graph to represent it in the form of complete binary genetic programming.
In
Figure 2 of the binary genetic programming graph, at the third level, the arguments of the functions at nodes 2 and 4 are the arguments of the mathematical expression itself, although these arguments should be located at the sixth level of the complete binary genetic programming graph.
Figure 3a presents a part of the binary genetic programming graph from
Figure 1, which contains the arguments of the mathematical expression at the fourth level.
Figure 3b shows the same part of the graph, but with two additional levels added on top.
The redundancy of nodes in the complete binary genetic programming graph, compared to standard binary genetic programming, is necessary to more efficiently represent the graph in computer memory. Since the number of elements at each level in the complete binary genetic programming graph is known, representing this graph in computer memory requires only an ordered set of integers, which indicates the function numbers at each level from left to right. By sequentially listing the function numbers from left to right, it becomes easy to determine the number of arguments for each function based on its number. In total, the code for complete binary genetic programming for a graph with
levels must contain the following elements:
Accordingly, the last
elements represent the arguments of the mathematical expression.
Let us consider the code in complete binary genetic programming.
Element , , encodes the following elements of the mathematical expression:
If , then is the argument of the mathematical expression;
If and ,, then is the number of the function with two arguments;
If and , , then is the number of the function with one argument.
To determine the number of levels, we use the following relation:
The code for the complete binary genetic programming of the mathematical expression (18) has the following form:
The layers in (25) are separated by slashes.
In search of the mathematical expression, the VCBGP employs a variational genetic algorithm. This algorithm utilizes the principle of small variations in the basic solution. According to this principle, the initial population encodes only one basic solution using the code of VCBGP. All other possible solutions are encoded by an ordered set of codes representing small variations.
In VCBGP, an integer vector of two components is used to encode a small variation,
where
is the position number in code, and
is a new value of the element in the pointed position.
In the population, each possible solution is represented as an ordered set of small variation vectors,
where
is the depth of variation;
is the number of possible solutions in the population except the basic solution.
Any possible solution, represented by the VCBGP code
is a result of applying a set of small variations to the basic solution,
where
is a VCBGP code of the basic solution.
During the crossover operation, two possible solutions, represented as ordered sets of small variation vectors, are selected randomly.
A crossover point is determined randomly, .
Two new possible solutions are generated by exchanging the tails of the selected solutions after the crossover point.
4. Computational Experiment
Consider the synthesis of a stabilization system for a wheeled mobile robot with differential drive. The mathematical model of the control object is composed of three ordinary differential equations,
where
is a state vector of the mobile robot,
is the control vector of the mobile robot.
The components of the control vector are constrained.
A set of initial states is provided.
where
,
is a ternary notation of the number
and
is the Hadamar product of vectors.
In a sequence of (38) elements 12, 13 and 14 are deleted because they correspond to the initial conditions and . In total, the initial set includes 24 points of initial states.
The terminal state is specified as
It is necessary to find a control function (8) that minimizes the following quality criterion:
where
is a penalty coefficient,
,
, and
According to the above approach, the optimal control problem was solved 24 times, once for each initial state from the set of initial states (36). A direct approach was employed to solve the optimal control problem. The time axis was divided into equal intervals of
. For each component of control, the parameter values were determined at the boundaries of the intervals, which were then connected by straight lines. As a result, a piecewise linear approximation of the control function was obtained. Considering the constraints on control, the control function had the following form:
where
is a vector of desired parameters,
is a dimension of control,
,
is a number of time intervals, and
The solution to the optimal control problem is the vector of parameters,
To search for the optimal values of the parameter vectors, a hybrid evolutionary algorithm [
24,
25,
26,
27] was employed. The obtained parameter vectors are presented in
Appendix A.
In the second stage, the control synthesis problem for stabilizing the control object at the terminal point of the state space was addressed using symbolic regression. In solving this problem, the quality criterion specified in criterion (17) was employed to minimize the deviation from the optimal trajectories identified in the previous stage. The control synthesis problem was solved through machine learning utilizing VCBGP. The optimization via the variational genetic algorithm was performed with the following parameter values, where the number of possible solutions in the population was identified as follows: , a number of generations , a number of possible crossover operation in one generation , a depth of variations , and a probability of mutation .
VCBGP obtained the following solution:
,
,
,
,
,
.
Figure 4 and
Figure 5 present the codes of the mathematical expressions (51) and (52) in the form of complete binary trees.
In
Figure 4 and
Figure 5, the numbers in the nodes, except for the leaves of the trees, represent the function numbers with two arguments: 1—
and
. Numbers near arcs are the function numbers with one argument: 1—
, 2—
, 3—
, 4—
, 5—
, 6—
, 7—
, 8—
, 10—
, 11—
, 12—
, 13—
, 14—
, 15—
, 16—
, 17—
18—
, 19—
, and 23—
. In the leaves, arguments of mathematical equations are presented,
,
,
.
Figure 6 shows robot trajectories from eight initial states (solid black lines),
,
,
,
,
,
, and
, and optimal trajectories (dots) from the same initial states found when solving the optimal control problem.
Figure 7 shows robot trajectories from eight other initial states (solid black lines),
,
,
,
,
,
, and
, and optimal trajectories (dots) from the same initial states. As can be observed, the stabilization system facilitates the motion of the robots in the vicinity of the optimal trajectories.