1. Introduction
The problem of predicting chaotic dynamics has been studied for a long time in dynamical systems [
1,
2,
3,
4,
5]. Chaos is a non-periodic solution with unstable parameters in a dynamic system, but it has a limitation within a strange attractor. This has been introduced into widely known nonlinear equations, such as Lorenz, Rossler [
1], and Van der Pol. The classical method to analyze a dynamic system uses a time series and a phase portrait. On the other hand, after the Cooley–Tudkey algorithm in the 1960s, research and development of many dynamic systems have led to studies, such as higher-order spectrum analysis (HOSA), logistic maps, the Lyapunov exponent, and recurrence plots to analyze methods, including high-dimensional nonlinearity. Among them, the Lyapunov exponent has been developed and studied in many papers as the most popular method for quantifying chaos. Wolf et al. [
6] developed a numerical algorithm determining the Largest Lyapunov exponent. On the other hand, this algorithm cannot be applied to dynamic systems including discontinuities, and can only be used in smooth dynamic systems. Muller’s algorithm, developed to overcome this problem, can be used in a nonsmooth dynamical system, making it possible to calculate the Lyapunov exponent by applying it to impact and friction problems, including discontinuities [
7].
In mechanical vibration, chaotic signals are a phenomenon that occurs in various applications, such as friction-induced vibration, impact noise, and fault diagnosis. In particular, mechanical vibrations can generate a chaotic phenomenon because they contain strong nonlinearity stemming from the discontinuity of friction and impact. Vibration problems occurring in general friction systems, including nonlinearity, were analyzed for stability via linearization [
8,
9,
10]. On the other hand, after frictional vibration occurs, the nonlinear signal of the vibration increases and can become chaotic. Kang [
11] examined the chaos phenomenon caused by friction-induced vibration using a 2-DOF model that includes the negative slope of a continuous friction curve. In particular, the negative slope was found to be a control parameter capable of expressing chaotic characteristics. In addition, studies on the parameters that express the chaotic characteristics occurring in dynamic systems, including friction, have been investigated in many papers [
2,
3,
4,
5].
In the impact condition, the collision force is extremely discontinuous at the moment of impact. The resulting kinetic phenomena for impact motion have been studied extensively to assess the chaos phenomenon. Serweta et al. [
12,
13] investigated the chaotic characteristics by calculating the Lyapunov exponent for a dynamic system with discontinuities through Muller’s procedure for Hertz’s and Newton’s contact models. Kang [
14] calculated the Lyapunov exponent for the truncated number of modes of the impact beam under distributed contact using the continuous beam model.
In addition to the theoretical approach, the vibration signal can be analyzed using the visualization method of the time series. The analysis method of visualization of the vibration signals, including high-order nonlinearity, was developed in studies, such as Gauss wavelets [
15,
16] and recurrence plots [
17]. Marwan et al. [
17] introduced various recurrence plot analyses as a visualization method of dynamic characteristics occurring in a complex system and analyzed the dynamic characteristics. The phase space is reconstructed using the time delay method. The embedding dimension and time delay must be determined to express the reconstructed phase space by the time delay method. The time delay method is a dynamic problem that can be determined using the Mutual information method [
18]. The embedding dimension is a geometric problem that is being studied to obtain the minimum physical embedding dimension [
18,
19]. In this study, the embedding dimension was determined using the method of false nearest neighbors (FNN) proposed by Kennel. Despite the many studies to understand the characteristics of dynamic systems, the decision on the visualized dynamical characteristics was made by the subjective judgment of the engineer. Therefore, uncertainty is always present, and a dynamical property cannot be judged without a trained engineer.
As studies of visualization of time series have been carried out, methods of distinguishing the characteristics of dynamic signals through the procedure of quantifying recurrence characteristics have been proposed. In particular, recurrence quantification analysis (RQA) can determine recurrence patterns through indices expressed as such in the recurrence rate (RR), the determinism (DET), and the average diagonal line length [
3,
20,
21,
22]. Cluster numbers provided the visualization of the bifurcation [
21]. However, these methods for quantifying recurrence patterns provide indices as a reference, and a professional engineer is required to reinterpret these numbers. On the contrary, the Lyapunov exponent is known as the most deterministic tool to determine chaos when the governing equation is explicitly known [
20].
In the recent rapid development of artificial intelligence, however, many algorithms using machine learning have been developed. In particular, image classification developed numerous algorithms based on a convolutional neural network. In 2015, ResNet in the algorithm proposed by He et al. [
23] exceeded the human cognitive ability in image classification. A network adopting the Residual Learning method has an error rate of 4%, which is lower than the error of 5% of human classification ability and has improved learning ability over the existing learning speed. Based on the above results, this paper proposes a convolutional neural network (CNN) model that can classify chaos by machine learning. The model verified the appropriateness of the proposed CNN model by calculating the Lyapunov exponent. Therefore, the present method is proposed in order to more quickly and accurately determine chaos characteristics of general complex signals after learning the predetermined data using the Lyapunov exponent of explicitly known analytical equations.
2. Methods
Figure 1 shows the vibro-impact model with mass, linear spring, and nonlinear elastic contact. The spring
and
are linear spring coefficients, and the system can be under external kinematic excitation forcing. The spring
at the left end has an amplitude
and an excitation frequency
, and is excited harmonically. The distance between the impact surface at the static equilibrium position of the system is
. In addition, the nonlinear elastic model of the collision force was defined as Hertz’s contact model [
12]. The dynamic behavior of the system can be made dimensionless for the generalization and efficiency of interpretation, as shown in
Figure 1b.
Consider a mass m attached to a spring stiffness coefficient
,
and coefficient of viscous damping
. If the impact force is 0, it is only a single degree-of-freedom equation of motion by excitation. The equation of motion is as follows:
Using the dimensionless time,
and the coordinate transformation
, the dimensionless equation of motion can be written as:
where prime is the differentiation with respect to
, and
is the natural frequency of the undamped system.
is the relation between the excitation and natural frequency.
is the dimensionless damping ratio,
is the relation between the spring coefficient, and
is the dimensionless displacement related to the magnitude of excitation forcing,
The most popular Hertz’s model of contact is the nonlinear elastic model, and the dimensionless collision force is given by:
where
is the dimensionless ratio of the stiffness of the surface, which is dependent on the elastic properties and geometry of the rigid body. The dimensionless equation of motion with impact force can be written as:
where
is the Heaviside function. For numerical integration, Equation (4) can be rewritten in the state space such that:
The dynamical equation of motion can be converted to the state-space vector form as follows:
Here,
is the initial condition of the perturbed solution, and Equation (7) is taken for the Taylor series at
such that:
By letting
substitute into perturbation Equation (8), the variation equation can be written as:
where
,
, and
denotes the Jacobian matrix, identity matrix, and solution of the variational equation, respectively. The spectrum of the Lyapunov exponent for the linearized Equation (9) is estimated through Wolf’s algorithm using the QR-factorization orthonormalization [
6,
24].
In this model, Equation (6) represents a nonsmooth dynamical system with discontinuous impact effects. The dynamic system, including the discontinuity at the moment of impact, can be rewritten from Muller’s method. The discontinuity occurs at the point of impact (
).
The perturbed trajectory is given by:
and the perturbed trajectory satisfies the following equation.
where each interval of discontinuities is smooth, and
and
are the indicator function and the transition condition, respectively. The plus and minus denote the right- and left-sided limits, and
in which
are the Jacobian matrix of the indicator function and transition condition at point
, respectively, and
and
. For the impact oscillator with Hertz’s model of contact, the Jacobian matrix of transition condition and indicator function becomes the following matrix:
The recurrence plot (RP) is used as a tool to visualize the recurrence pattern of a dynamical system. A recurrence matrix is expressed by a series of solution vectors. The corresponding RP is based on the following recurrence matrix as follows:
where
is an L-2 norm,
is the measured points, and
is a system’s trajectory in its phase space. Here, the element of phase space indicates the possible state of the system for the time evolution law. In such cases, phase space needs to be reconstructed. The most popular method of reconstruction is the time delay method
where
is the discrete-time series,
is the sampling rate,
is the embedding dimension,
is the time delay, and
are unit vectors. The reconstruction does not change the dynamical properties, and the reconstructed phase space can be expressed through an appropriately selected embedding dimension and time delay. In general, the time delay can be determined appropriately using the Mutual information method.
During time-delay reconstruction, all self-crossing of the trajectory for the attractor’s dimension
can disappear if it is set to the embedding dimension
. From a physical point of view, it is very important to determine the minimum embedding dimension to minimize the numerical calculation of the Lyapunov exponent. From Equation (1) in dimension
,
is the
th nearest neighbor of
, and the square of the Euclidean distance between the two vectors is
As it expands on dimension
to dimension
by time-delay embedding, the Euclidean distance between
th neighbors for dimension
can be written as follows:
Here, the error for the minimum embedding dimension may be determined through the rate of change of the Euclidean distance as follows:
where
is the threshold. According to a lecture, false neighbors can be identified clearly if
[
18,
19]. Another condition for determining false neighbors is defined based on the fact that the actual value of
is similar to the attractor’s standard deviation,
, using the finite data onto the noise signal. Thus, the Euclidean distance from the dimension
becomes
. The second criterion for determining false neighbors is as follows:
They are decided as false neighbors based on the conditions of Equations (28) and (29).
3. Results
This study aimed to show that it is possible to classify chaos signals and non-chaos signals using a CNN and verify it using the Lyapunov exponent.
Figure 2 shows the flow chart of the proposed methodology. First, the nonlinear time series data onto the parametric dynamic system were obtained by numerical analysis using the Runge–Kutta method. Second, the obtained time series data were determined using the FNN algorithm to determine the embedding dimension, and the phase space was reconstructed. Third, the reconstructed data were presented as an unthresholded recurrence plot to visualize the dynamic characteristics. The Lyapunov exponent corresponding to the change in the control parameter was calculated, and it was trained and verified using the proposed CNN model.
Table 1 shows the layer type, filter size, and shape of each layer of the proposed CNN model.
For image classification using a CNN, many sophisticated models have been developed that transcend human cognitive ability, but the proposed model is simply composed of a two-stage structure to distinguish only chaos and non-chaos, as shown in
Figure 3. In each stage, the convolution layer, activation function, and pooling layer were included. The proposed model had two convolution layers with a 32 − 3 × 3 filter and a 64 – 3 × 3 filter, respectively.
As mentioned above, the unthresholded recurrence plot is a tool to visualize the recurrence characteristics of a dynamic system. In the case of chaotic signals, the repetition characteristics can occur in a very short interval. Therefore, the filter size was set as small as possible. In addition, two max pooling layers of 2 × 2 were used. At the end of the last stage, through two convolution layers, the feature maps were flattened into a column vector, and the features of the image were classified into a fully connected layer for the two types of signals. Softmax was used as the activation function of the output.
One of the gradient-based optimization methods was used to estimate the minimized cost function of the proposed model. The Adam optimizer is an optimization function based on the gradient decent algorithm and achieved faster convergence [
25].
The weight initialization is a very important problem. Many problems can occur if the weight setup is wrong, such as convergence problems and local minimum problems. LeCun initialization follows a Gaussian distribution and uniform distribution of weight initialization for effective backpropagation [
26]. Xavier initialization sets the weight depending on the number of previous and next nodes [
27]. This is the most generalized method, but the output value showed inefficient results when used in the ReLU function. He initialization was developed to compensate for this [
28]. In the proposed model, the weight initialization was performed using the He initialization method following a Gaussian distribution.
The Lorenz system was used as the typical chaotic system for preliminary analysis and CNN verification.
was selected as the control parameter; the other parameters were
and
. The initial conditions were
and
. The Lorenz equation is described in the state space such that:
where
is the state vector and
,
, and
are parameters.
Figure 4 shows the Lorenz system with chaotic attractors.
In the Lorenz system, infinitesimal differences in the parameters or initial conditions alter the state of the system. As shown in
Figure 4a, the trajectory shows a butterfly shape in phase space and moves in an infinite trajectory at a bounded steady state.
Figure 4b illustrates the time series result of
.
Figure 5a presents the Lyapunov exponent of the Lorenz equation with the change in the control parameter.
Figure 5b shows the unthresholded recurrence plot of visualized chaos and non-chaos. [1 0] is a chaos image and [0 1] is a non-chaos image.
The explanation of critical A for the Lyapunov exponent of the Lorenz system has been investigated [
6]. To summarize this system briefly, if
, it represents a strange attractor for the control parameter
and the initial condition. On the other hand, strange attractors are not displayed under some conditions. On the other hand, if
, the system was represented as a stable attractor. In 3D phase space, the Lyapunov exponent has four types of attractors, such as stable fixed points, stable limit cycles, stable two-torus, and strange attractors (Chaos), but only chaos and non-chaos were distinguished in this system. In other words, the unthresholded recurrence matrix can be obtained from the flow of the proposed method in the control parameter representing the chaotic attractor (
) of the dynamic system, which is classified using the CNN.
The Lorenz system consisted of 3800 datasets and 200 × 200 pixels images. The time series analysis and Lyapunov exponents were calculated with a time step of 0.01 and an orthonormalization of 0.1, respectively. The dataset was divided into three main parts. The total dataset was divided into 70% of the training dataset and 30% of the testing dataset. From the divided training dataset, it was divided into 20% of the validation dataset. The dataset samples used for training are described in
Table 2. Because the images are generated sequentially as the Lyapunov exponent increases, errors due to the sequential dataset were removed by shuffling the dataset.
Figure 6 presents the results of the proposed CNN model. The chaos characteristics were trained for 2128 training data samples and simultaneously verified by 532 data samples during each epoch. The tests were then performed on 1140 data samples using the trained CNN model. The batch size was set to 10, and the learning rate of the optimization function was 0.0001. The accuracies and losses were collected as the training and verification data for each epoch, and are plotted in
Figure 6a,b. As shown in the learning results, the accuracies were illustrated as a logarithmic function and reached almost 100%. The losses were indicated as a negative exponential function and converge close to 0%. This suggests that the proposed model detects the recurrence properties of the Lorenz system very well.
Figure 7 shows the feature map in the first layer.
Figure 7a shows the filter map for weight and basis using the He Gaussian method, and
Figure 7b presents the feature map corresponding to each filter. In
Figure 7b, the white image means that the feature cannot be found by the corresponding filter. Overall, the filter finds chaos characteristics well. Over 100 epochs, the proposed CNN model classified the chaotic characteristics with 96% accuracy. This suggests that the proposed CNN model can analyze complex dynamic characteristics that humans cannot recognize with high accuracy. Besides this, the experimental results for 1140 datasets not used for learning were classified with 95% accuracy. This shows that the CNN can extract more complex dynamical properties using generalized features from the unthresholded recurrence plot. Therefore, a numerical experiment was performed on the vibro-impact problem, as below.
In the case of the Lorenz system, it is relatively easy to distinguish the recurrence properties because it is not a system that oscillates for all control parameters. In other words, the system does not vibrate if the Lyapunov exponent reaches a stable equilibrium. Therefore, it is not well expressed for the recurrence properties of the dynamical system. On the other hand, the vibration system with discontinuity has a highly complex recurrence property. Therefore, the classification procedure of the proposed methodology was performed by applying Muller’s method to Equation (6).
Figure 8 shows the Lyapunov exponent of the vibro-impact model calculated for the control parameter
; the other parameters and initial conditions were
,
,
, and
. In this study, only chaos and non-chaos dynamic characteristics were distinguished for the system with impact and excitation, so other detailed types of attractors were not considered. In the analysis results, the Lyapunov exponent was changed drastically if the control parameter increases as infinitesimal.
Figure 9 plots the characteristics of typical attractors for each generated type.
Figure 9a and b shows the time series plot and the corresponding phase portrait for the attractor, respectively, for the case
. The other cases
are shown in
Figure 9c,d. The attractor showed a stable limit cycle without impact, and the sign of the largest Lyapunov exponent was
, as shown in
Figure 9a. If the sign of the largest Lyapunov exponent is zero and the others are negative, it means that the system does not diverge and the oscillation is stable. In other words, it was not made a chaotic trajectory. On the other hand,
Figure 9b shows that an attractor diverges within the bounded steady state to reach the critical region. A chaotic trajectory different from the periodic or quasi-periodic solution was plotted. Here, the sign of the largest Lyapunov exponent is generated as
. This suggests that it was generated as a strange attractor.
Therefore, some of the unthresholded recurrence plots are shown in
Figure 10 for classification into two types:
and others. [1 0] and [0 1] below the image represent chaos and non-chaos, respectively. As shown in
Figure 10, the images by the chaos system and by the non-chaos system always oscillated. Therefore, images were expressed in a very complex form. The complex systems require more sophisticated neural networks. Therefore, we constructed a neural network using only eight residual blocks based on ResNet, one of the most sophisticated neural networks. In total, 3800 datasets were used for training and consisted of 200 × 200 pixels images. The results are shown in
Figure 11.
The proposed CNN model was converged with 95% accuracy over the 200 epochs and appeared in the shape of a logarithmic function. Here, the reason for the fluctuation of accuracy up to the initial 80 epochs was to give a large running rate for rapid convergence of the experiment, and the learning was performed by decreasing the running rate after 80 epochs. On the other hand, the loss was converged over 200 epochs to 0% in the shape of the negative exponential function. The proposed procedure and method of classification using the CNN model was found on the chaotic characteristics of the vibro-impact model well and classified with high accuracy. As a result of experimenting with 1140 datasets not used for training, the chaotic signal was classified with a high accuracy of approximately 92%. This shows that if the characteristics of a relatively very complex dynamic system can be visualized, the characteristics of the signal can be analyzed clearly by the CNN.