Solving the Control Synthesis Problem Through Supervised Machine Learning of Symbolic Regression

Diveev, Askhat; Sofronova, Elena; Konyrbaev, Nurbek

doi:10.3390/math12223595


by Askhat Diveev ¹, Elena Sofronova ¹,² and Nurbek Konyrbaev ³,*

¹ Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Vavilova Str., 44, Build. 2, Moscow 119333, Russia

² Applied Informatics and Intelligent Systems in Human Sciences Department, RUDN University, Miklukho-Maklaya Str., 6, Moscow 117198, Russia

³ Institute of Engineering and Technology, Korkyt Ata Kyzylorda University, Aiteke bi Str. 29A, Kyzylorda 120014, Kazakhstan

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(22), 3595; https://doi.org/10.3390/math12223595

Submission received: 17 October 2024 / Revised: 12 November 2024 / Accepted: 13 November 2024 / Published: 17 November 2024

(This article belongs to the Special Issue Advanced Computational Intelligence)


Abstract:

This paper considers the control synthesis problem and its solution using symbolic regression. Symbolic regression methods, which were previously called genetic programming methods, allow one to use a computer to find not only the parameters of a given regression function but also its structure. Unlike other works on solving the control synthesis problem using symbolic regression, the novelty of this paper is that for the first time this work employs a training dataset to address the problem of general control synthesis. Initially, the optimal control problem is solved from each point in a given set of initial states, resulting in a collection of control functions expressed as functions of time. A reference model is then integrated into the control object model, which generates optimal motion trajectories using the derived optimal control functions. The control synthesis problem is framed as an approximation task for all optimal trajectories, where the control function is sought as a function of the deviation of the object from the specified terminal state. The optimization criterion for solving the synthesis problem is the accuracy of the object’s movement along the optimal trajectory. The paper includes an example of solving the control synthesis problem for a mobile robot using a supervised machine learning method. A relatively new method of symbolic regression, the method of variational complete binary genetic programming, is studied and proposed for the solution of the control synthesis problem.

Keywords:

machine learning control; optimal control; control synthesis; symbolic regression; mobile robot

MSC:

49M25; 68W50

1. Introduction

The problem of control system synthesis is an important and quite challenging issue when designing automatic control systems for autonomous objects. The general problem of control synthesis was first stated in the 1960s by V.G. Boltyansky [1] as a problem of an optimal control search from any initial condition. At that time, the solution was searched analytically using Pontryagin’s Maximum Principle [2] as a general solution for differential equations of control objects and conjugate variables with optimal control in the right parts. Not all differential equations have general solutions.

The complexity of the control synthesis problem lies in the fact that it is necessary to find a multidimensional control function, the argument of which is the state vector of the control object. If this function is substituted into the right parts of the differential equations of the control object model, then the resulting system of equations will have the following property: any particular solution from the initial state of the state space domain will reach the terminal state with the optimal value of the quality criterion.

For linear autonomous systems, the synthesis of the control system involves designing negative feedback based on the state error [3]. Today, the control synthesis problem is typically solved manually by engineers and control specialists. They analyze the problem and the control object and establish control channels, i.e., determine the control actions needed to compensate for specific state errors. Then, to improve control quality, regulators are inserted into these channels [4,5,6]. Most commonly, linear proportional (P), proportional–integral (PI) and proportional–integral–derivative (PID) controllers are used, each with its own gain coefficients. Traditional PID control algorithms have simple structures and are easy to implement, but their parameters are fixed and they adapt poorly to complex environments [7]. To mitigate these drawbacks, PID control is combined with other control technologies, such as fuzzy logic [8,9].

This work employs a numerical approach to the control synthesis problem based on machine learning using symbolic regression [10,11,12,13,14]. In classical regression, used to approximate experimental data, the researcher studies the data and writes down a mathematical expression for the approximating function up to some parameters, whose values are then found by the least squares method. This type of regression is subjective and is practical only for functions with few arguments. Symbolic regression is a computational method that finds not only the parameters but also the structure of a function. For this purpose, an alphabet of elementary functions is defined and rules for encoding mathematical expressions using this alphabet are specified. A special genetic algorithm then searches the code space for the mathematical expression that is optimal according to a given criterion. In the genetic algorithm, the crossover operation is the main source of codes of new mathematical expressions. Symbolic regression methods differ from one another in the form of coding.

It is important to note that the control function is often a multidimensional nonlinear function of the vector argument. In practice, artificial neural networks are rarely used to approximate this function, as it appears on the right-hand side of the model equations and significantly alters the dynamic properties of the object. Moreover, since a mathematical expression for this function is not known in advance, it is impossible to obtain a sufficient training dataset to train a neural network.

Since 2008, symbolic regression methods have been used to address the control synthesis problem [15]. The control synthesis problem consists in finding a control function as a function of state. It is often solved by ensuring the stability of the control object relative to a point in the state space. Then, for simple mathematical models, analytical methods such as integrator backstepping [16,17,18] or dynamic programming [19,20,21] can be used; these provide a numerical value of the control for each state of the control object, but they do not yield a mathematical expression for the optimal control function in a suitable form.

Here, the control synthesis problem is solved by symbolic regression. These methods involve encoding mathematical expressions based on a selected alphabet of elementary functions and searching the space of codes for the optimal mathematical expression of the function according to a specified quality criterion using a specialized genetic algorithm.

Research on the application of symbolic regression methods for solving the control synthesis problem has shown that most symbolic regression methods cannot directly address this task. The crossover operation leads to significant changes in codes, ultimately generating entirely new potential solutions that do not retain the properties of the selected parent solutions. Instead of producing offspring solutions that inherit the characteristics of their parents, completely new solutions are generated; thus, the genetic crossover operation functions more like a generator of new possibilities, transforming the evolutionary genetic algorithm into a simple random search. For the successful application of symbolic regression in finding the structure of mathematical expressions, it is essential for the crossover operation to possess the inheritance property with specific characteristics. To ensure this property, a principle of small variations in the basic solution was developed [22], and it has been demonstrated that implementing this principle allows the crossover operation to exhibit inheritance properties. This enhances the likelihood of finding the optimal solution compared to a simple random search.

The application of the principle of small variations in the baseline solution to the problem of general control synthesis has led to the modification of established symbolic regression methods. Further research on solving the general control synthesis problem using symbolic regression has revealed that these methods produce solutions in the form of encoded mathematical expressions. However, these solutions can differ significantly not only in terms of the optimization criterion value but also in the quality of the solutions themselves.

Unlike previous studies [23], this work employs a training dataset to address the problem of general control synthesis. Initially, the optimal control problem is solved from each point in a given set of initial states, resulting in a collection of control functions expressed as functions of time. A reference model is then integrated into the control object model, which generates optimal motion trajectories using the derived optimal control functions. The control synthesis problem is framed as an approximation task for all optimal trajectories, where the control function is sought as a function of the deviation of the object from the specified terminal state. The optimization criterion for solving the synthesis problem is the accuracy of the object’s movement along the optimal trajectory. The work includes an example of solving the control synthesis problem for a mobile robot using a supervised machine learning method, namely original variational complete binary genetic programming.

2. The Problem of General Control Synthesis

In the general control synthesis problem, it is necessary to find a control function whose argument is the state vector. Replacing the control vector in the right-hand side of the model with this control function should change the properties of the system so that all its particular solutions from some domain of the state space reach the terminal state with the optimal value of the quality criterion.

The mathematical model of the control object is given as the following ODE system:

\dot{x} = f (x, u),

(1)

where $x$ is the state vector of the control object, $x \in ℝ^{n}$, $u$ is the control vector, $u \in U \subseteq ℝ^{m}$, and $U$ is a compact set that typically defines the constraints on the control,

u^{-} \leq u \leq u^{+},

(2)

where $u^{-}$ and $u^{+}$ are the lower and upper bounds of the control vector, respectively.

The terminal state is given as

x (t_{f}) = x^{f} = {[x_{1}^{f} \dots x_{n}^{f}]}^{T},

(3)

where $t_{f}$ is the time of reaching the terminal state, $t_{f} \leq t^{+}$, and $t^{+}$ is a given positive value.

The set of initial states is given as

X_{0} = \{x^{0, 1}, x^{0, 2}, \dots, x^{0, K}\},

(4)

x^{0, j} = {[x_{1}^{0, j} \dots x_{n}^{0, j}]}^{T}, j = 1, \dots, K.

(5)

The quality criterion is given as

J_{0} = \sum_{j = 1}^{K} \left(\int_{0}^{t_{f, j}} f_{0} (x, u) d t + p_{1} {‖x^{f} - x (t_{f, j})‖}_{2}\right) \to \min_{u \in U},

(6)

where $p_{1}$ is a penalty coefficient,

t_{f, j} = \begin{cases} t, & \text{if } t < t^{+} \text{ and } {‖x^{f} - x (t)‖}_{2} \leq ε, \\ t^{+}, & \text{otherwise}, \end{cases} \quad j = 1, \dots, K,

(7)

and $ε$ is a given small positive value.

It is necessary to find a control function in the following form:

u = h (x^{f} - x),

(8)

that satisfies the restrictions on the control (2) and supplies the minimum of the quality criterion (6).

Our approach includes solving the optimal control problem. Therefore, the statement of the optimal control problem is presented here.

Analytically, this problem, (1)–(8), is solved only for very simple models and criteria, since the solution is a multidimensional function of a multidimensional argument, and this function is almost always nonlinear. The universal numerical method became possible only after the development of symbolic regression methods. These methods allow the search for the structure and parameters of any function according to a given criterion. In this study, we use a symbolic regression method to approximate many solutions to optimal control problems.

The mathematical model of a control object in the form of an ODE system is given (1). The constraints on the control (2) are given. The initial state is given as

x (0) = x^{0} = {[x_{1}^{0} \dots x_{n}^{0}]}^{T} .

(9)

The terminal state (3) is given. The quality criterion is given as

J_{1} = \int_{0}^{t_{f}} f_{0} (x, u) d t + p_{1} ‖x^{f} - x (t_{f})‖ \to \min_{u \in U},

(10)

where $t_{f}$ is the time of reaching the terminal state, defined by Equation (7) for a single initial state, $K = 1$.

It is necessary to find a control function as a function of time,

u = v (t), 0 \leq t \leq t_{f} \leq t^{+},

(11)

that satisfies the constraints on the control (2).

If the optimal control function (11) is substituted into the right-hand side of the mathematical model (1), then the ODE system

\dot{x} = f (x, v (t))

(12)

has the particular solution $x (t, x^{0})$ from the initial state (9), which reaches the terminal state (3) with the optimal value of the quality criterion (10).

In the first stage, the optimal control problem is solved for each initial state from the given set (4). The result is a set of optimal control functions of time,

V^{*} = \{v^{*, 1} (t), \dots, v^{*, K} (t)\} .

(13)

Afterward, the reference model is integrated into the mathematical model of the control object.

\dot{x} = f (x, u),

(14)

{\dot{x}}^{*} = f (x^{*}, v^{*} (t)) .

(15)

For each initial state from the set (4), the particular solution of the reference model (15) under the corresponding optimal control function from the set (13) yields an optimal trajectory in the state space,

x^{*} (t) = x^{*} (t, x^{0, j}), j = 1, \dots, K,

(16)

where $x^{0, j} \in X_{0}$, $v^{*, j} (t) \in V^{*}$ and $j \in \{1, \dots, K\}$.

In the second stage, the control synthesis problem in (1)–(6) and (8) is addressed with the following quality criterion:

J_{2} = \sum_{j = 1}^{K} \left(\int_{0}^{t_{f, j}} ‖x^{*} (t, x^{0, j}) - x (t, x^{0, j})‖_{2} d t + \max_{t} ‖x^{*} (t, x^{0, j}) - x (t, x^{0, j})‖_{2}\right) + p_{1} \sum_{j = 1}^{K} ‖x^{f} - x (t_{f, j}, x^{0, j})‖_{2} \to \min_{h (x^{f} - x) \in U},

(17)

where $t_{f, j}$, $j \in \{1, \dots, K\}$, is defined by Equation (7).
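As a rough illustration of how a criterion of the form (17) might be evaluated on sampled trajectories, the following Python sketch approximates the integral by a rectangle rule. The function name `tracking_criterion`, the uniform time step `dt` and the list-based trajectory layout are assumptions made for this example; they are not taken from the paper.

```python
import math

def norm2(a, b):
    """Euclidean distance between two state vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def tracking_criterion(reference, actual, x_f, dt, p1):
    """Approximate a criterion of the form (17) on sampled data.

    reference, actual: lists of K trajectories; each trajectory is a list of
    state vectors sampled with step dt (hypothetical data layout).
    """
    total = 0.0
    for ref_traj, act_traj in zip(reference, actual):
        errors = [norm2(r, a) for r, a in zip(ref_traj, act_traj)]
        integral = sum(errors) * dt          # rectangle-rule integral of the error
        worst = max(errors)                  # largest deviation along the trajectory
        terminal = norm2(x_f, act_traj[-1])  # miss distance at the final sample
        total += integral + worst + p1 * terminal
    return total

# One reference trajectory and one slightly different actual trajectory.
reference = [[[0.0, 0.0], [1.0, 0.0]]]
actual = [[[0.0, 0.0], [1.0, 1.0]]]
value = tracking_criterion(reference, actual, x_f=[1.0, 0.0], dt=0.5, p1=2.0)
```

The value is zero only when every actual trajectory coincides with its reference at all samples and ends at the terminal state, which is the sense in which the optimal trajectories act as a training set.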

3. Symbolic Regression for Solving the Control Synthesis Problem

Symbolic regression encodes all mathematical expressions in the form of a special code and utilizes a genetic algorithm to search for the optimal mathematical expression. Let us introduce variational complete binary genetic programming (VCBGP).

Consider examples of coding a mathematical expression using this method. Consider the following mathematical expression:

y = \sqrt[3]{- \frac{x_{2}}{q_{1}} + \sqrt{\frac{x_{2}^{2}}{q_{1}^{2}} + \frac{x_{1}^{3}}{q_{2}^{3}}}} + \sqrt[3]{- \frac{x_{2}}{q_{1}} - \sqrt{\frac{x_{2}^{2}}{q_{1}^{2}} + \frac{x_{1}^{3}}{q_{2}^{3}}}} .

(18)

This is Cardano’s formula for solving the cubic equation $y^{3} + x_{1} y + x_{2} = 0$ with $q_{1} = 2$ and $q_{2} = 3$.
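A quick numerical check of expression (18) can make the example more concrete. The Python sketch below evaluates the formula for one choice of coefficients and substitutes the result back into the cubic. The helper names `cbrt` and `cardano` are hypothetical, and the sketch assumes the expression under the square root is non-negative (the case of a single real root).

```python
import math

def cbrt(z):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(z) ** (1.0 / 3.0), z)

def cardano(x1, x2, q1=2.0, q2=3.0):
    """Evaluate expression (18); assumes the inner radicand is non-negative."""
    radicand = x2 ** 2 / q1 ** 2 + x1 ** 3 / q2 ** 3
    root = math.sqrt(radicand)
    return cbrt(-x2 / q1 + root) + cbrt(-x2 / q1 - root)

# The cubic y^3 + 3y - 4 = 0 has the real root y = 1.
y = cardano(x1=3.0, x2=-4.0)
residual = y ** 3 + 3.0 * y - 4.0
```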

To encode this mathematical expression, an alphabet of elementary functions is required. For the given mathematical expression (18), the following set of elementary functions is sufficient:

- Functions with one argument:

$F_{1} = \{f_{1, 1} (z) = z, f_{1, 2} (z) = - z, f_{1, 3} (z) = \sqrt{z}, f_{1, 4} (z) = z^{2}, f_{1, 5} (z) = z^{- 1}, f_{1, 6} (z) = \sqrt[3]{z}, f_{1, 7} (z) = z^{3}\},$

(19)

- Functions with two arguments:

$F_{2} = \{f_{2, 1} (z_{1}, z_{2}) = z_{1} + z_{2}, f_{2, 2} (z_{1}, z_{2}) = z_{1} z_{2}\},$

(20)

- Arguments of the mathematical expression, or functions without arguments:

$F_{0} = \{f_{0, 1} = x_{1}, f_{0, 2} = x_{2}, f_{0, 3} = q_{1}, f_{0, 4} = q_{2}, f_{0, 5} = q_{3}, f_{0, 6} = 1\} .$

(21)

To construct a graph of VCBGP, it is first necessary to create a graph of binary genetic programming and then extend it to form the complete binary genetic programming. To augment the graph of binary genetic programming with the required number of levels, it is essential to insert an addition function with zero arguments or a multiplication function with unit arguments into the graph. Therefore, the set of arguments includes unit elements for both the addition and multiplication functions.

All elements of the alphabet have two subscripts: the first subscript denotes the number of arguments, while the second subscript indicates the element’s position in the set. The set of functions with one argument (19) includes the identity function $f_{1, 1} (z) = z$. All functions with two arguments (20) are commutative and associative, and each has its own unit element.

Binary genetic programming encodes a mathematical expression as a computational tree. This method employs only functions with one and two arguments. The representation of a mathematical expression in the form of binary genetic programming is illustrated in Figure 1.

In Figure 1, the nodes of the tree display the numbers of the functions with two arguments, while the edges of the graph are labeled with the numbers of the functions with one argument. The arguments of the mathematical expression are represented at the last level of the graph or at the leaves of the tree.

Complete binary genetic programming differs from standard binary genetic programming in that it has the maximum possible number of nodes at all levels. At level $l$ of a complete binary genetic programming graph, there should be $2^{l - 1}$ nodes.

According to the graph of binary genetic programming, the number of levels required to encode the mathematical expression (18) should be at least $L = 6$. In the binary genetic programming graph, at the third level, the first and third functions with two arguments do not contain a second argument. Figure 2 shows the completion of the binary genetic programming graph with the second argument for functions with two arguments.

Figure 2a presents a part of the graph from Figure 1, where a function with two arguments uses only one argument. Figure 2b shows the correction of the graph to represent it in the form of complete binary genetic programming.

In Figure 2 of the binary genetic programming graph, at the third level, the arguments of the functions at nodes 2 and 4 are the arguments of the mathematical expression itself, although these arguments should be located at the sixth level of the complete binary genetic programming graph.

Figure 3a presents a part of the binary genetic programming graph from Figure 1, which contains the arguments of the mathematical expression at the fourth level. Figure 3b shows the same part of the graph, but with two additional levels added on top.

The redundancy of nodes in the complete binary genetic programming graph, compared to standard binary genetic programming, makes it possible to represent the graph more efficiently in computer memory. Since the number of elements at each level of the complete graph is known, the graph can be stored as an ordered set of integers that lists the function numbers at each level from left to right. The number of arguments of each function is then determined by its position in this list. In total, the code of complete binary genetic programming for a graph with $L$ levels must contain the following number of elements:

S = \sum_{i = 1}^{L} 2^{i} = 2 + 4 + \dots + 2^{L} = 2^{L + 1} - 2 .

(22)

Accordingly, the last $2^{L - 1}$ elements represent the arguments of the mathematical expression.

Let us consider the code in complete binary genetic programming,

B = (b_{1}, \dots, b_{M}) .

(23)

Element $b_{i}$, $i \in \{1, \dots, M\}$, $M = 2^{L + 1} - 2$, encodes the following elements of the mathematical expression:

If $2^{L} < i \leq 2^{L + 1}$ , then $b_{i}$ is the argument of the mathematical expression;
If $2^{k} < i \leq 2^{k + 1}$ and $k \mod 2 = 0$ , $k < L$ , then $b_{i}$ is the number of the function with two arguments;
If $2^{k} < i \leq 2^{k + 1}$ and $k \mod 2 \neq 0$ , $k < L$ , then $b_{i}$ is the number of the function with one argument.

To determine the number of levels, we use the following relation:

L = \log_{2} (M + 2) - 1,

(24)

The code for the complete binary genetic programming of the mathematical expression (18) has the following form:

B = (1, 1, / 6, 6, 1, 1, / 1, 2, 2, 2, 1, 1, 1, 1, / 3, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /

4, 7, 1, 1, 1, 1, 1, 1, 4, 7, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 2, /

5, 1, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 5, 1, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

3, 2, 4, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 2, 3, 2, 4, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 2),

(25)

The layers in (25) are separated by slashes.

In search of the mathematical expression, the VCBGP employs a variational genetic algorithm. This algorithm utilizes the principle of small variations in the basic solution. According to this principle, the initial population encodes only one basic solution using the code of VCBGP. All other possible solutions are encoded by an ordered set of codes representing small variations.

In VCBGP, an integer vector of two components is used to encode a small variation,

w = {[w_{1} w_{2}]}^{T},

(26)

where $w_{1}$ is the position number in the code, and $w_{2}$ is the new value of the element at that position.

In the population, each possible solution is represented as an ordered set of small variation vectors,

W_{i} = (w^{i, 1}, \dots, w^{i, d}), i = 1, \dots, H,

(27)

where $d$ is the depth of variation and $H$ is the number of possible solutions in the population, excluding the basic solution.

Any possible solution, represented by the VCBGP code $B_{i}$, is the result of applying a set of small variations to the basic solution,

B_{i} = W_{i} \circ B_{0} = w^{i, d} \circ \dots \circ w^{i, 1} \circ B_{0},

(28)

where $B_{0}$ is the VCBGP code of the basic solution.
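The composition in (28) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: positions are taken to be 1-based, variations are applied in the order $w^{i, 1}, \dots, w^{i, d}$, and no check is made that a new value is admissible for the level it lands on. The function name `apply_variations` and the short example code are hypothetical.

```python
def apply_variations(base_code, variations):
    """Return a new code obtained by applying small variations in order.

    Each variation is a pair (w1, w2): w1 is a 1-based position in the code
    and w2 is the new value stored at that position.
    """
    code = list(base_code)            # copy, so the basic solution is untouched
    for position, value in variations:
        code[position - 1] = value
    return code

base = [1, 1, 6, 6, 1, 1]             # toy basic solution, not the code (25)
W = [(3, 2), (5, 4), (3, 7)]          # a later variation overwrites an earlier one
new_code = apply_variations(base, W)
```

Because every candidate is stored as a short list of variations rather than a full code, all candidates stay within a few element changes of the basic solution, which is what the principle of small variations relies on.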

During the crossover operation, two possible solutions, represented as ordered sets of small variation vectors, are selected randomly,

W_{α} = (w^{α, 1}, \dots, w^{α, d}), W_{β} = (w^{β, 1}, \dots, w^{β, d}) .

(29)

A crossover point $c \in \{1, \dots, d\}$ is chosen at random. Two new possible solutions are generated by exchanging the tails of the selected solutions after the crossover point,

W_{H + 1} = (w^{α, 1}, \dots, w^{α, c}, w^{β, c + 1}, \dots, w^{β, d}),

(30)

W_{H + 2} = (w^{β, 1}, \dots, w^{β, c}, w^{α, c + 1}, \dots, w^{α, d}) .

(31)
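The tail exchange in (30) and (31) is a one-point crossover on lists of variation vectors. The Python sketch below shows one possible reading; the function name `crossover`, the optional `rng` argument and the returned crossover point are assumptions made for illustration.

```python
import random

def crossover(w_alpha, w_beta, rng=random):
    """One-point crossover of two ordered sets of small variations.

    Returns the two children and the crossover point c in {1, ..., d}.
    """
    d = len(w_alpha)
    c = rng.randint(1, d)                  # inclusive on both ends
    child_1 = w_alpha[:c] + w_beta[c:]     # head of alpha, tail of beta
    child_2 = w_beta[:c] + w_alpha[c:]     # head of beta, tail of alpha
    return child_1, child_2, c

parent_a = [(1, 1), (2, 2), (3, 3), (4, 4)]
parent_b = [(5, 5), (6, 6), (7, 7), (8, 8)]
child_1, child_2, c = crossover(parent_a, parent_b, random.Random(0))
```

Note that with $c = d$ the tails are empty, so the children are copies of the parents.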

4. Computational Experiment

Consider the synthesis of a stabilization system for a wheeled mobile robot with differential drive. The mathematical model of the control object consists of three ordinary differential equations,

{\dot{x}}_{1} = 0.5 (u_{1} + u_{2}) \cos (x_{3}),

(32)

{\dot{x}}_{2} = 0.5 (u_{1} + u_{2}) \sin (x_{3}),

(33)

{\dot{x}}_{3} = 0.5 (u_{1} - u_{2}),

(34)

where $x = {[x_{1} x_{2} x_{3}]}^{T}$ is the state vector of the mobile robot and $u = {[u_{1} u_{2}]}^{T}$ is its control vector.

The components of the control vector are constrained.

u^{-} = - 10 \leq u_{i} \leq 10 = u^{+}, i = 1, 2

(35)
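The model (32)–(34) with the constraints (35) is straightforward to code. The Python sketch below assumes that the cosine in (32) takes the heading $x_{3}$ as its argument, as the sine in (33) does, and uses a plain explicit Euler step; the integration scheme and the function names `robot_rhs` and `euler_step` are assumptions, since the paper does not state how the equations were integrated.

```python
import math

U_MIN, U_MAX = -10.0, 10.0    # control bounds from (35)

def clip(u):
    """Saturate a control component to the admissible interval."""
    return max(U_MIN, min(U_MAX, u))

def robot_rhs(x, u):
    """Right-hand side of the differential-drive model (32)-(34)."""
    u1, u2 = clip(u[0]), clip(u[1])
    v = 0.5 * (u1 + u2)                    # forward speed
    return [v * math.cos(x[2]), v * math.sin(x[2]), 0.5 * (u1 - u2)]

def euler_step(x, u, dt):
    """One explicit Euler step (integration scheme is an assumption)."""
    dx = robot_rhs(x, u)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]

x_next = euler_step([0.0, 0.0, 0.0], [10.0, 10.0], dt=0.01)
```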

A set of initial states is provided,

X_{0} = \{x^{0, 1}, \dots, x^{0, K}\},

(36)

x^{0, k} = x^{0} - Δ_{0} + {(i)}_{3} ⊙ Δ_{0}, k = 1, \dots, K = 24,

(37)

where $k = j_{i} + 1$, ${(i)}_{3}$ is the ternary notation of the number $i$, taken as a vector of three digits, $⊙$ is the Hadamard product of vectors,

j_{i} = \begin{cases} i, & \text{if } i \in \{0, \dots, 11\}, \\ i - 3, & \text{if } i \in \{15, \dots, 26\}, \end{cases} \quad i = 0, \dots, 11, 15, \dots, 26,

(38)

x^{0} = {[0 0 0]}^{T},

(39)

Δ_{0} = {[5 5 \frac{π}{18}]}^{T} .

(40)

The elements 12, 13 and 14 are removed from the sequence of $i$ in (38) because they correspond to the initial conditions $x_{1}^{0, i} = 0$ and $x_{2}^{0, i} = 0$. In total, the set contains 24 initial states.
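The construction of the initial set can be reproduced with a short Python sketch. It assumes that the most significant ternary digit of $i$ multiplies the first component of $Δ_{0}$; the digit-to-component ordering is not stated explicitly, so the enumeration order of the resulting points is an assumption, although the set of points does not depend on it. The function names are hypothetical.

```python
import math

def ternary_digits(i, n=3):
    """Digits of i in base 3, most significant first, padded to n digits."""
    digits = []
    for _ in range(n):
        digits.append(i % 3)
        i //= 3
    return digits[::-1]

def initial_states():
    """Build the 24 initial states from (37)-(40), skipping i = 12, 13, 14."""
    x0 = [0.0, 0.0, 0.0]
    delta0 = [5.0, 5.0, math.pi / 18]
    states = []
    for i in list(range(0, 12)) + list(range(15, 27)):
        d = ternary_digits(i)
        # Component-wise (Hadamard) product of the digit vector with delta0.
        states.append([x0[c] - delta0[c] + d[c] * delta0[c] for c in range(3)])
    return states

states = initial_states()
```

The skipped indices 12, 13 and 14 have ternary notations 110, 111 and 112, which is why they would place the robot at $x_{1} = 0$, $x_{2} = 0$.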

The terminal state is specified as

x^{f} = {[x_{1}^{f} x_{2}^{f} x_{3}^{f}]}^{T} = {[0 0 0]}^{T} .

(41)

It is necessary to find a control function (8) that minimizes the following quality criterion:

J_{3} = \sum_{k = 1}^{24} \left(t_{f, k} + p_{1} ‖x^{f} - x (t_{f, k}, x^{0, k})‖\right) \to \min_{h (x^{f} - x)},

(42)

where $p_{1}$ is a penalty coefficient, $p_{1} = 2$,

t_{f, k} = \begin{cases} t, & \text{if } t < t^{+} \text{ and } ‖x^{f} - x (t, x^{0, k})‖ \leq ε, \\ t^{+}, & \text{otherwise}, \end{cases} \quad k = 1, \dots, 24,

(43)

$ε = 0.025$, $t^{+} = 1$, and

‖x^{f} - x‖ = \sqrt{{(x_{1}^{f} - x_{1})}^{2} + {(x_{2}^{f} - x_{2})}^{2} + {(\sin (x_{3}^{f}) - \sin (x_{3}))}^{2}} .

(44)
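The terminal-time rule (43) and the norm (44) can be sketched in Python for a trajectory sampled on a uniform time grid. The grid, the function names `state_distance` and `terminal_time`, and the choice to report the first sample that satisfies the condition are assumptions of this sketch.

```python
import math

def state_distance(x_f, x):
    """Norm (44): the heading error enters through the sine of the angle."""
    return math.sqrt((x_f[0] - x[0]) ** 2
                     + (x_f[1] - x[1]) ** 2
                     + (math.sin(x_f[2]) - math.sin(x[2])) ** 2)

def terminal_time(trajectory, x_f, dt, eps=0.025, t_plus=1.0):
    """Rule (43): first sampled time t < t_plus within eps of x_f, else t_plus."""
    for step, x in enumerate(trajectory):
        t = step * dt
        if t < t_plus and state_distance(x_f, x) <= eps:
            return t
    return t_plus

x_f = [0.0, 0.0, 0.0]
reaching = [[1.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.01, 0.0, 0.0], [0.0, 0.0, 0.0]]
missing = [[1.0, 0.0, 0.0], [0.9, 0.0, 0.0], [0.8, 0.0, 0.0]]
t_reach = terminal_time(reaching, x_f, dt=0.1)
t_miss = terminal_time(missing, x_f, dt=0.1)
```

A trajectory that never enters the $ε$-neighborhood is charged the full time $t^{+}$, and its remaining distance to the terminal state is what the penalty term of the criterion measures.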

According to the above approach, the optimal control problem was solved 24 times, once for each initial state from the set of initial states (36). A direct approach was employed to solve the optimal control problem. The time axis was divided into equal intervals of $Δ t = 0.2$. For each component of the control, the parameter values were determined at the boundaries of the intervals and then connected by straight lines. As a result, a piecewise linear approximation of the control function was obtained. Taking the constraints on the control into account, the control function had the following form:

u_{i} (t) = v_{i} (t) = \begin{cases} u^{+}, & \text{if } u^{+} \leq {\tilde{u}}_{i} (t), \\ u^{-}, & \text{if } {\tilde{u}}_{i} (t) \leq u^{-}, \\ {\tilde{u}}_{i} (t), & \text{otherwise}, \end{cases} \quad i = 1, 2,

(45)

where

{\tilde{u}}_{i} (t) = q_{i + (j - 1) m} + (q_{i + j m} - q_{i + (j - 1) m}) \frac{t - (j - 1) Δ t}{Δ t},

(46)

(j - 1) Δ t \leq t < j Δ t, j = 1, \dots, S,

(47)

$q = {[q_{1} \dots q_{m (S + 1)}]}^{T}$ is the vector of desired parameters, $m$ is the dimension of the control, $m = 2$, $S$ is the number of time intervals, and

S = ⌈\frac{t^{+}}{Δ t}⌉ = ⌈\frac{1}{0.2}⌉ = 5 .

(48)

The solution to the optimal control problem is the vector of parameters,

\tilde{q} (x^{0, j}) = {[{\tilde{q}}_{1} (x^{0, j}) \dots {\tilde{q}}_{12} (x^{0, j})]}^{T} .

(49)
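A piecewise linear control with saturation, of the kind described above, can be sketched in Python as follows. The sketch assumes that the intervals are indexed $j = 1, \dots, S$, that the $j$-th interval starts at $(j - 1) Δ t$, and that the node values of component $i$ sit at positions $i, i + m, i + 2m, \dots$ of the parameter vector. The function name and the sample parameter vector are hypothetical.

```python
def piecewise_linear_control(q, t, i, m=2, dt=0.2, u_min=-10.0, u_max=10.0):
    """Value of control component i (1-based) at time t.

    q holds the node values q_1..q_{m(S+1)} in a 0-based Python list, so the
    node of component i at the start of interval j is q[i - 1 + (j - 1) * m].
    """
    n_intervals = len(q) // m - 1              # S
    j = min(int(t / dt) + 1, n_intervals)      # 1-based index of the interval containing t
    left = q[i - 1 + (j - 1) * m]
    right = q[i - 1 + j * m]
    u = left + (right - left) * (t - (j - 1) * dt) / dt
    return max(u_min, min(u_max, u))           # saturation to the control bounds

# Twelve parameters: component 1 rises 0, 2, ..., 10; component 2 falls 10, 8, ..., 0.
q = [0.0, 10.0, 2.0, 8.0, 4.0, 6.0, 6.0, 4.0, 8.0, 2.0, 10.0, 0.0]
u1_mid = piecewise_linear_control(q, 0.1, 1)
u2_mid = piecewise_linear_control(q, 0.1, 2)
```

Parameter values outside the admissible interval are allowed by this parameterization; the saturation simply holds the control at the bound over the corresponding part of the interval.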

To search for the optimal values of the parameter vectors, a hybrid evolutionary algorithm [24,25,26,27] was employed. The obtained parameter vectors are presented in Appendix A.

In the second stage, the control synthesis problem for stabilizing the control object at the terminal point of the state space was addressed using symbolic regression. The quality criterion (17) was employed to minimize the deviation from the optimal trajectories found in the previous stage. The control synthesis problem was solved through machine learning with VCBGP. The variational genetic algorithm was run with the following parameter values: the number of possible solutions in the population $H = 512$, the number of generations $G = 128$, the number of possible crossover operations per generation $R = 128$, the depth of variations $d = 7$, and the probability of mutation $p_{μ} = 0.75$.

VCBGP obtained the following solution:

u_{i} (t) = h_{i} (x^{f} - x) = \{\begin{array}{l} u^{+} if u^{+} \leq {\tilde{h}}_{i} (x^{f} - x) \\ u^{-} if {\tilde{h}}_{i} (x^{f} - x) \leq u^{-} \\ {\tilde{h}}_{i} (x^{f} - x), otherwise \end{array}, i = 1, 2,

(50)

{\tilde{h}}_{1} (x^{f} - x) = sgn (A + B) \sqrt{| A + B |}

(51)

{\tilde{h}}_{2} (x^{f} - x) = sgn (sgn (C) \sqrt{| C |} + D) \sqrt{| sgn (C) \sqrt{| C |} + D |}

(52)

where

A = (q_{3} {(Δ_{1})}^{- 1} q_{1}^{2} Δ_{2} \cos ({(μ (Δ_{2}) \tanh (Δ_{2}))}^{3} E),

B = q_{1}^{3} sgn (Δ_{3}) \sqrt{| Δ_{3} |} \exp (sgn (q_{1}) \ln (| Δ_{3} |)) + F,

C = {(\tanh (Δ_{3}) (Δ_{1} - Δ_{1}^{3}))}^{- 1} + Δ_{1} (Δ_{2} - Δ_{2}^{3}) + (- q_{2} + q_{1}) {(Δ_{3}^{3} + 1)}^{3},

D = q_{1} Δ_{1} + ρ_{18} (\sqrt[3]{Δ_{1}} + \tanh (Δ_{3})) + G,

E = ρ_{17} (Δ_{3}) sgn (Δ_{3}) \sqrt{| Δ_{3} |} - {(ρ_{17} (Δ_{3}) sgn (Δ_{3}) \sqrt{| Δ_{3} |})}^{3},

F = q_{1} Δ_{1}^{2} + ρ_{17} (Δ_{3}) + ρ_{18} (Δ_{3}) - {(q_{1} Δ_{1}^{2} + ρ_{17} (Δ_{3}) + ρ_{18} (Δ_{3}))}^{3},

G = \exp (- ρ_{18} (Δ_{3}) Δ_{2}) \exp (\arctan (Δ_{3}) + \tanh (Δ_{3})),

μ (α) = \begin{cases} α, & \text{if } | α | < 1, \\ sgn (α), & \text{otherwise}, \end{cases}

ρ_{17} (α) = sgn (α) \ln (| α | + 1), ρ_{18} (α) = sgn (α) (\exp (α) - 1), ρ_{19} (α) = sgn (α) \exp (- | α |),

$Δ_{1} = x_{1}^{f} - x_{1}$, $Δ_{2} = x_{2}^{f} - x_{2}$, $Δ_{3} = x_{3}^{f} - x_{3}$, $q_{1} = 13.65918$, $q_{2} = 10.41113$, $q_{3} = 15.86060$.

Figure 4 and Figure 5 present the codes of the mathematical expressions (51) and (52) in the form of complete binary trees.

In Figure 4 and Figure 5, the numbers in the nodes, except for the leaves of the trees, are the numbers of the functions with two arguments: 1 denotes $f_{2, 1} (z_{1}, z_{2}) = z_{1} + z_{2}$ and 2 denotes $f_{2, 2} (z_{1}, z_{2}) = z_{1} z_{2}$.

The numbers near the arcs are the numbers of the functions with one argument: 1: $f_{1, 1} (z) = z$; 2: $f_{1, 2} (z) = z^{2}$; 3: $f_{1, 3} (z) = - z$; 4: $f_{1, 4} (z) = sgn (z) \sqrt{| z |}$; 5: $f_{1, 5} (z) = z^{- 1}$; 6: $f_{1, 6} (z) = \exp (z)$; 7: $f_{1, 7} (z) = \ln (| z |)$; 8: $f_{1, 8} (z) = \tanh (z)$; 10: $f_{1, 10} (z) = sgn (z)$; 11: $f_{1, 11} (z) = \cos (z)$; 12: $f_{1, 12} (z) = \sin (z)$; 13: $f_{1, 13} (z) = \arctan (z)$; 14: $f_{1, 14} (z) = z^{3}$; 15: $f_{1, 15} (z) = \sqrt[3]{z}$; 16: $f_{1, 16} (z) = μ (z)$, where $μ (z) = z$ if $| z | < 1$ and $μ (z) = sgn (z)$ otherwise; 17: $f_{1, 17} (z) = ρ_{17} (z) = sgn (z) \ln (| z | + 1)$; 18: $f_{1, 18} (z) = ρ_{18} (z) = sgn (z) (\exp (z) - 1)$; 19: $f_{1, 19} (z) = ρ_{19} (z) = sgn (z) \exp (- | z |)$; and 23: $f_{1, 23} (z) = z - z^{3}$.

The leaves contain the arguments of the mathematical expressions, $Δ_{1} = x_{1}^{f} - x_{1}$, $Δ_{2} = x_{2}^{f} - x_{2}$, $Δ_{3} = x_{3}^{f} - x_{3}$.
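Several of the one-argument functions listed above are non-standard, so a direct transcription may help when reading the trees. The Python sketch below codes a handful of them literally from their definitions. The convention that the sign function returns 0 at zero, and all Python names, are assumptions of this sketch.

```python
import math

def sgn(z):
    """Sign function; returns 0 for z == 0 (convention assumed here)."""
    return (z > 0) - (z < 0)

def signed_sqrt(z):
    """Function 4: sgn(z) * sqrt(|z|)."""
    return sgn(z) * math.sqrt(abs(z))

def mu(z):
    """Function 16: identity inside (-1, 1), sign outside."""
    return z if abs(z) < 1 else float(sgn(z))

def rho17(z):
    """Function 17: sgn(z) * ln(|z| + 1)."""
    return sgn(z) * math.log(abs(z) + 1)

def rho18(z):
    """Function 18: sgn(z) * (exp(z) - 1), transcribed literally."""
    return sgn(z) * (math.exp(z) - 1)

def rho19(z):
    """Function 19: sgn(z) * exp(-|z|)."""
    return sgn(z) * math.exp(-abs(z))

def z_minus_cube(z):
    """Function 23: z - z^3."""
    return z - z ** 3

sample = [signed_sqrt(-4.0), mu(2.0), mu(0.5), z_minus_cube(2.0)]
```

Functions such as the signed square root and $\ln (| z |)$ are defined for arguments of either sign, which lets the genetic algorithm compose them freely.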

Figure 6 shows robot trajectories from eight initial states (solid black lines), $x^{0, 1}$, $x^{0, 3}$, $x^{0, 7}$, $x^{0, 9}$, $x^{0, 13}$, $x^{0, 15}$, $x^{0, 19}$, and $x^{0, 21}$, and optimal trajectories (dots) from the same initial states found when solving the optimal control problem.

Figure 7 shows robot trajectories from eight other initial states (solid black lines), $x^{0, 4}$, $x^{0, 6}$, $x^{0, 10}$, $x^{0, 12}$, $x^{0, 16}$, $x^{0, 18}$, $x^{0, 22}$, and $x^{0, 24}$, and optimal trajectories (dots) from the same initial states. As can be observed, the stabilization system keeps the motion of the robot in the vicinity of the optimal trajectories.

5. Conclusions

To address the synthesis problem of a stabilization system, machine learning through symbolic regression is employed. To ensure the desired quality properties of the stabilization system, a training sample is utilized during the machine learning process. This training sample is derived from solving the optimal control problem for a set of initial states. The quality criterion in the control synthesis problem is defined as the minimum sum of deviations from all optimal trajectories. This work presents a method for solving the control synthesis problem based on machine learning with supervision. An example of synthesizing a control stabilization system for a mobile wheeled robot with differential drive is demonstrated using symbolic regression techniques. The variational complete binary genetic programming, as a method of symbolic regression, is employed. This symbolic regression method utilizes the principle of small variations in the basic solution and encodes a mathematical expression in the form of complete binary computational trees.

6. Future Work

Future work will investigate how stabilization quality affects the sensitivity of control systems to external disturbances and to inaccuracies in the mathematical models. This research aims to advance the automatic construction of control systems for robots. It will focus on defining the requirements for stabilization systems and on determining the values of the quality criteria in optimal control problems.

Author Contributions

Conceptualization, A.D. and E.S.; methodology, A.D.; software, A.D. and E.S.; validation, E.S.; formal analysis, A.D.; investigation, E.S. and N.K.; resources, A.D.; writing—original draft preparation, A.D. and E.S.; writing—review and editing, E.S.; visualization, N.K.; supervision, A.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science Committee of the Ministry of Science and Higher Education of the Republic of Kazakhstan (Grant No. AP14869851).

Data Availability Statement

The parameter vectors in (50) are given in Appendix A.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The obtained parameter vectors in (50) are as follows:

\tilde{q} (x^{0, 1}) = \tilde{q} ({[\begin{matrix} 5 & 5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 19.91 & -15.76 & -19.49 & -15.42 & -15.74 & -14.64 & -16.34 & -19.05 & -16.07 & 1.09 & -15.11 & -0.76 \end{matrix}]}^{T},

\tilde{q} (x^{0, 2}) = \tilde{q} ({[\begin{matrix} 5 & 5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} 13.24 & -12.31 & -18.89 & -12.34 & -15.80 & -15.51 & -15.60 & -15.38 & -15.38 & -0.09 & 4.79 & 15.57 \end{matrix}]}^{T},

\tilde{q} (x^{0, 3}) = \tilde{q} ({[\begin{matrix} 5 & 5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 0.42 & -15.62 & -7.35 & -9.82 & -15.36 & -19.96 & -13.24 & -18.00 & -19.04 & 2.78 & 5.69 & 2.70 \end{matrix}]}^{T},

\tilde{q} (x^{0, 4}) = \tilde{q} ({[\begin{matrix} 0 & 5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -0.56 & 4.93 & -3.65 & 19.10 & 18.81 & 17.91 & 15.72 & 19.51 & 14.01 & -18.16 & 0.35 & -15.12 \end{matrix}]}^{T},

\tilde{q} (x^{0, 5}) = \tilde{q} ({[\begin{matrix} 0 & 5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} -19.87 & 15.10 & -6.52 & -19.39 & 11.30 & 10.70 & 18.42 & 0.78 & 7.72 & -19.83 & -3.87 & -15.18 \end{matrix}]}^{T},

\tilde{q} (x^{0, 6}) = \tilde{q} ({[\begin{matrix} 0 & 5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 7.77 & -12.37 & 1.49 & -13.00 & -19.97 & -13.00 & -11.94 & -8.04 & -15.28 & 15.09 & 15.07 & -0.58 \end{matrix}]}^{T},

\tilde{q} (x^{0, 7}) = \tilde{q} ({[\begin{matrix} -5 & 5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 0.45 & 13.65 & 6.19 & 15.09 & 18.80 & 13.20 & 15.90 & 12.47 & 12.19 & -1.00 & 15.32 & 0.80 \end{matrix}]}^{T},

\tilde{q} (x^{0, 8}) = \tilde{q} ({[\begin{matrix} -5 & 5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} -14.21 & 16.60 & 20.00 & 18.30 & 16.58 & 15.82 & 11.94 & 17.87 & 14.91 & -1.98 & 10.48 & -0.72 \end{matrix}]}^{T},

\tilde{q} (x^{0, 9}) = \tilde{q} ({[\begin{matrix} -5 & 5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -19.45 & 13.20 & 9.59 & 1.21 & 15.10 & 15.88 & 14.76 & 19.74 & 17.81 & -1.29 & 9.59 & 1.21 \end{matrix}]}^{T},

\tilde{q} (x^{0, 10}) = \tilde{q} ({[\begin{matrix} -5 & 0 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 15.30 & 7.12 & 14.39 & 8.88 & 20.00 & 16.51 & -2.20 & 15.61 & -15.28 & 9.23 & 0.94 & 15.28 \end{matrix}]}^{T},

\tilde{q} (x^{0, 11}) = \tilde{q} ({[\begin{matrix} -5 & 0 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} 18.59 & 10.53 & 10.52 & 10.83 & 15.05 & 11.20 & 8.33 & 13.81 & -5.01 & 0.86 & 19.32 & 12.07 \end{matrix}]}^{T},

\tilde{q} (x^{0, 12}) = \tilde{q} ({[\begin{matrix} -5 & 0 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 7.11 & 15.22 & 8.90 & 16.00 & 15.82 & 20.00 & 15.86 & -2.20 & 0.43 & -3.17 & -4.21 & -10.21 \end{matrix}]}^{T},

\tilde{q} (x^{0, 13}) = \tilde{q} ({[\begin{matrix} -5 & -5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 15.00 & -15.01 & 14.84 & 14.26 & 10.71 & 15.44 & 15.13 & 15.42 & -0.11 & 20.00 & 0.02 & 0.08 \end{matrix}]}^{T},

\tilde{q} (x^{0, 14}) = \tilde{q} ({[\begin{matrix} -5 & -5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} 15.83 & -14.18 & 15.80 & 18.81 & 10.71 & 15.12 & 10.56 & 10.43 & 0.71 & 16.86 & 0.38 & 15.66 \end{matrix}]}^{T},

\tilde{q} (x^{0, 15}) = \tilde{q} ({[\begin{matrix} -5 & -5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 15.72 & -10.22 & 15.73 & 20.00 & 19.99 & 14.51 & 11.68 & 12.01 & -1.57 & 14.77 & 8.08 & 0.27 \end{matrix}]}^{T},

\tilde{q} (x^{0, 16}) = \tilde{q} ({[\begin{matrix} 0 & -5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -15.22 & 7.06 & -14.27 & -1.41 & -14.89 & -18.08 & -10.10 & -16.69 & 20.00 & -16.34 & -18.44 & -9.81 \end{matrix}]}^{T},

\tilde{q} (x^{0, 17}) = \tilde{q} ({[\begin{matrix} 0 & -5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} 14.71 & 19.02 & -16.29 & -5.67 & -13.19 & -16.21 & -9.02 & -18.99 & 18.51 & -8.58 & 0.12 & 4.02 \end{matrix}]}^{T},

\tilde{q} (x^{0, 18}) = \tilde{q} ({[\begin{matrix} 0 & -5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} 14.13 & -19.06 & 16.30 & 7.88 & 18.35 & 15.53 & 6.56 & 17.53 & -20.00 & 8.35 & 7.57 & 3.36 \end{matrix}]}^{T},

\tilde{q} (x^{0, 19}) = \tilde{q} ({[\begin{matrix} 5 & -5 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -15.58 & 9.55 & -15.51 & -17.87 & -18.55 & -9.21 & -9.38 & -16.67 & 0.38 & -15.68 & -10.49 & -0.41 \end{matrix}]}^{T},

\tilde{q} (x^{0, 20}) = \tilde{q} ({[\begin{matrix} 5 & -5 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} -15.28 & 8.94 & -16.20 & -12.29 & -15.44 & -15.34 & -16.44 & -14.72 & 1.13 & -15.36 & 3.47 & -1.41 \end{matrix}]}^{T},

\tilde{q} (x^{0, 21}) = \tilde{q} ({[\begin{matrix} 5 & -5 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -15.70 & 11.35 & -15.84 & -10.74 & -15.09 & -15.58 & -16.42 & -16.70 & -0.02 & -15.04 & 1.81 & -15.35 \end{matrix}]}^{T},

\tilde{q} (x^{0, 22}) = \tilde{q} ({[\begin{matrix} 5 & 0 & -\frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -1.95 & -11.22 & -19.73 & -16.85 & -15.81 & -19.09 & -4.55 & 0.29 & -15.16 & 0.53 & 15.98 & 0.85 \end{matrix}]}^{T},

\tilde{q} (x^{0, 23}) = \tilde{q} ({[\begin{matrix} 5 & 0 & 0 \end{matrix}]}^{T}) = {[\begin{matrix} -12.52 & -20.00 & -20.00 & -13.34 & -19.29 & -20.00 & -15.86 & -19.33 & -0.74 & 2.20 & -17.81 & 7.20 \end{matrix}]}^{T},

\tilde{q} (x^{0, 24}) = \tilde{q} ({[\begin{matrix} 5 & 0 & \frac{π}{18} \end{matrix}]}^{T}) = {[\begin{matrix} -14.34 & -1.92 & -12.25 & -19.96 & -19.43 & -9.35 & -0.18 & -15.75 & 0.93 & 15.30 & -15.14 & 16.00 \end{matrix}]}^{T}.
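The 24 initial states listed in this appendix follow a regular pattern: eight positions on the boundary of the square with corners at (±5, ±5), each combined with three heading angles, -π/18, 0, and π/18. The Python sketch below reproduces this enumeration in the order of the appendix. The variable names are illustrative, and the grouping by figure is inferred from the indices quoted for Figures 6 and 7 rather than stated by the authors.

```python
import math

# Positions (x1, x2) in the order in which they appear in Appendix A.
positions = [(5, 5), (0, 5), (-5, 5), (-5, 0),
             (-5, -5), (0, -5), (5, -5), (5, 0)]
# Heading angles x3 used for every position.
angles = [-math.pi / 18, 0.0, math.pi / 18]

# Map the index k of x^{0,k} to the state (x1, x2, x3).
initial_states = {}
k = 1
for x1, x2 in positions:
    for x3 in angles:
        initial_states[k] = (x1, x2, x3)
        k += 1

# Indices quoted for the two trajectory figures.
fig6 = [1, 3, 7, 9, 13, 15, 19, 21]
fig7 = [4, 6, 10, 12, 16, 18, 22, 24]

for name, indices in (("Figure 6", fig6), ("Figure 7", fig7)):
    print(name)
    for i in indices:
        x1, x2, x3 = initial_states[i]
        print(f"  x^(0,{i}) = ({x1}, {x2}, {x3:+.4f})")
```

Under this reading, Figure 6 covers the four corner positions and Figure 7 the four edge midpoints, in both cases with the two nonzero heading angles.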


Figure 1. The graph of binary genetic programming for the mathematical expression (18).

Figure 2. Addition of BGP graph to complete BGP graph. (a) part of the graph (incomplete), (b) part of the graph (complete).

Figure 3. Addition of levels in BGP graph for transformation to complete BGP graph. (a) part of the graph (incomplete), (b) part of the graph (complete).

Figure 4. The VCBGP code for the mathematical expression of (51).

Figure 5. The VCBGP code for the mathematical expression of (52).

Figure 6. Robot trajectories from eight initial states (solid black lines), x^{0, 1}, x^{0, 3}, x^{0, 7}, x^{0, 9}, x^{0, 13}, x^{0, 15}, x^{0, 19}, and x^{0, 21}, and optimal trajectories (dots) from the same initial states.

Figure 7. Robot trajectories from eight initial states (solid black lines), x^{0, 4}, x^{0, 6}, x^{0, 10}, x^{0, 12}, x^{0, 16}, x^{0, 18}, x^{0, 22}, and x^{0, 24}, and optimal trajectories (dots) from the same initial states.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Diveev, A.; Sofronova, E.; Konyrbaev, N. Solving the Control Synthesis Problem Through Supervised Machine Learning of Symbolic Regression. Mathematics 2024, 12, 3595. https://doi.org/10.3390/math12223595

