1. Introduction
Bearings are among the most important and widely used components of rotating machinery [1]. A lack of timely diagnosis and replacement of faulty bearings can disrupt the operation of machinery. For instance, 40–50% of all electrical motor failures are associated with bearing failure [2]. Prompt fault detection in bearings can reduce financial loss and health risks.
The vibration signals of bearings usually exhibit nonlinear behavior due to the effects of coupling and nonlinear interactions, friction, damping, and stiffness [3], and faults affect signal complexity at different signal scales. Hence, measurements of signal complexity at various scales can contribute to diagnosis and are commonly used for this purpose.
Entropy is a measure of the disorder and predictability of a signal. It is one of the most powerful concepts used to evaluate signal characteristics [4]. Several entropies have been introduced to date, such as sample entropy (SampEn) and permutation entropy (PerEn). We recently introduced dispersion entropy (DispEn) [5] and demonstrated its advantages over PerEn and SampEn [5]. In addition to being fast, DispEn can provide a better representation of dynamic signal changes. PerEn considers only the order of the amplitudes with respect to each other, whereas DispEn also takes into account the values of the amplitudes. Unlike SampEn, DispEn is also defined for short time series [6]. Moreover, DispEn is relatively insensitive to noise [3]. Rostaghi et al. investigated the potential applications of DispEn in rotating machinery diagnosis and demonstrated its superiority over PerEn and approximate entropy (ApEn) [3]. Liu et al. combined DispEn and wavelet packets to extract the features used for bearing diagnosis [7]. They calculated the DispEn of each wavelet packet. Li et al. computed the intrinsic mode function (IMF) components of the signals via an improved complete ensemble empirical mode decomposition and used the DispEn of the first few IMF components for bearing diagnosis [8]. Zhenzhen et al. employed variational mode decomposition (VMD) and DispEn for bearing diagnosis [9].
Disorder and complexity have different physical meanings [10,11]. Accordingly, conventional entropies cannot represent complexity without using other algorithms. Therefore, Costa et al. introduced the multiscale algorithm in 2002 to quantify complexity and analyze non-stationary and nonlinear signals [12]. They applied this algorithm to SampEn. Subsequently, the algorithm was applied to various entropies and enhanced multiple times. Aziz et al. introduced multiscale permutation entropy (MPerEn) [13]. Wu et al. introduced refined composite multiscale entropy (RCMSampEn) [14], Humeau-Heurtier et al. introduced refined composite multiscale permutation entropy (RCMPerEn) [15], and Azami et al. introduced refined composite multiscale dispersion entropy (RCMDispEn) [16].
Wang et al. utilized multiscale dispersion entropy (MDispEn) for feature extraction in bearing diagnosis [17]. Congzhi et al. calculated the RCMDispEn of vibration signals and classified them using a support vector machine (SVM) for bearing diagnosis [18]. Zhang et al. utilized RCMDispEn and an improved SVM based on the whale optimization algorithm for fault detection in rolling bearings [19]. Lou et al. employed RCMDispEn and a deep belief network-extreme learning machine optimized by the improved firework algorithm for rolling bearing sub-health recognition [20].
Various techniques have been used along with RCMDispEn for bearing diagnosis. These techniques include fast ensemble empirical mode decomposition [21], adaptive sparsest narrow-band decomposition [22], improved empirical wavelet transform [23,24], VMD [25], and improved VMD (IVMD) [26].
Costa et al. introduced generalized multiscale entropy (GMSE) in 2015 [27]. Generalized algorithms use other statistical properties, such as variance, for coarse-graining. Costa et al. specifically proposed and utilized the standard deviation (SD) and variance [27]. Wei et al. stated that, unlike the first moment, the second moment simultaneously separates the high- and low-frequency contents during coarse-graining [28], and employed variance-based generalized multiscale fuzzy entropy for diagnosis in rotating machinery [28]. Zheng et al. utilized variance-based generalized composite multiscale permutation entropy and the Laplacian score for bearing diagnosis [29]. Liu et al. detected bearing faults using variance-based generalized composite multiscale amplitude-aware permutation entropy and the dual-tree complex wavelet packet transform [30].
Because of the advantages of DispEn-based algorithms over SampEn-, fuzzy entropy (FuzEn)-, and PerEn-based algorithms [6], the present paper investigates refined composite generalized multiscale dispersion entropy (RCGMDispEn) based on variance and skewness, used together with RCMDispEn, for bearing fault diagnosis. It is worth mentioning that generalized multiscale dispersion entropy is proposed in this study for the first time to probe the properties of time series related to higher moments (the second and third moments, i.e., variance and skewness).
Combinations of several classifiers have been used to overcome the limitations of each individual classifier and achieve higher efficiency [31,32,33]. In numerous studies, several classifiers have been used together with a further classifier that takes the results of the others as inputs for the final classification [32,34,35]. Belaout et al. combined several Sugeno adaptive neuro-fuzzy inference systems (ANFIS) to construct an output vector and introduced a multiclass ANFIS based on the winner-takes-all rule [36]. Similarly, a multiclass ANFIS based on fuzzy c-means clustering (FCM-ANFIS) was used in this study to classify different kinds of faults.
The rest of the paper is organized as follows. Section 2.1 reviews the theory of DispEn, and Section 2.2 and Section 2.3 introduce the calculation of GMDispEn and RCGMDispEn, respectively. Section 3 introduces the theory behind combining ANFIS networks. In Section 4, the GMDispEn and RCGMDispEn methods are compared to MDispEn and RCMDispEn, respectively, in terms of diagnosis capability on simulated bearing signals. Section 5 uses three different datasets to demonstrate that simultaneously using RCGMDispEn and RCMDispEn in practical applications can provide better efficiency than MDispEn and RCMDispEn. Finally, Section 6 concludes the paper.
2. Generalized Refined Composite Multi-Scale Dispersion Entropy
2.1. Dispersion Entropy
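The DispEn of a time series $x = \{x_1, x_2, \ldots, x_N\}$ with a length of $N$ can be calculated in six steps [5]:
Step 1. The signal is normalized between 0 and 1. The series $y = \{y_1, y_2, \ldots, y_N\}$ is obtained according to Equation (1) from the normal cumulative distribution function (NCDF) of the series $x$:
$$y_i = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x_i} \exp\left(-\frac{(t-\mu)^2}{2\sigma^2}\right) dt \tag{1}$$
Here, $\sigma$ and $\mu$ denote the SD and mean value of the time series $x$, respectively.
Step 2. Each member of the time series $y$ is mapped to an integer between 1 and $c$ (Equation (2)):
$$z_i^c = \mathrm{round}(c \cdot y_i + 0.5) \tag{2}$$
$c$ is the class parameter and indicates the number of classes to which the members of the time series $y$ can be assigned. $z_i^c$ is the $i$th member of the classified series $z^c$.
Step 3. All the template vectors $z_j^{m,c}$ ($j = 1, 2, \ldots, N-(m-1)d$) are created as follows:
$$z_j^{m,c} = \{z_j^c, z_{j+d}^c, \ldots, z_{j+(m-1)d}^c\} \tag{3}$$
where $m$ and $d$, respectively, denote the embedding dimension and time delay. The embedding dimension is the dimension of the state space used for reconstruction.
Step 4. Each series $z_j^{m,c}$ is mapped to a dispersion pattern $\pi_{v_0 v_1 \ldots v_{m-1}}$ based on its values, while the following holds:
$$z_j^c = v_0,\; z_{j+d}^c = v_1,\; \ldots,\; z_{j+(m-1)d}^c = v_{m-1} \tag{4}$$
The number of possible dispersion patterns that can be attributed to each series $z_j^{m,c}$ is equal to $c^m$, because each $z_j^{m,c}$ has $m$ members, and each of them can be an integer from 1 to $c$ [5].
Step 5. For each of the $c^m$ possible dispersion patterns $\pi_{v_0 v_1 \ldots v_{m-1}}$, the relative frequency is obtained using Equation (5); i.e., the number of series $z_j^{m,c}$ to which the dispersion pattern $\pi_{v_0 v_1 \ldots v_{m-1}}$ is attributed is divided by the total number of $m$-dimensional series created:
$$p(\pi_{v_0 v_1 \ldots v_{m-1}}) = \frac{\#\{\, j \mid j \le N-(m-1)d,\; z_j^{m,c} \text{ has type } \pi_{v_0 v_1 \ldots v_{m-1}} \,\}}{N-(m-1)d} \tag{5}$$
$p(\pi_{v_0 v_1 \ldots v_{m-1}})$ is the probability of the dispersion pattern $\pi_{v_0 v_1 \ldots v_{m-1}}$.
Step 6. DispEn with the embedding dimension $m$ and number of classes $c$ is calculated according to Equation (6):
$$\mathrm{DispEn}(x, m, c, d) = -\sum_{\pi=1}^{c^m} p(\pi_{v_0 v_1 \ldots v_{m-1}}) \ln p(\pi_{v_0 v_1 \ldots v_{m-1}}) \tag{6}$$
To calculate the normalized DispEn (NDispEn) according to Equation (7), DispEn is divided by the largest possible DispEn, $\ln(c^m)$:
$$\mathrm{NDispEn}(x, m, c, d) = \frac{\mathrm{DispEn}(x, m, c, d)}{\ln(c^m)} \tag{7}$$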
When m or c is too large, the computation time is high, although larger values make the DispEn values more reliable [5]. In addition, if the embedding dimension m is too small, dynamic changes in the signal may not be detected, whereas a large m may leave DispEn unable to observe small variations [5]. Based on these considerations and previous studies [3,5], the parameters m = 2 and c = 8 were used to calculate DispEn.
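The six steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation. The function name `dispersion_entropy`, its optional `mu` and `sigma` arguments, and the use of the population SD (division by N) are assumptions made for the example.

```python
import math
from collections import Counter


def dispersion_entropy(x, m=2, c=8, d=1, mu=None, sigma=None, normalize=False):
    """Dispersion entropy of the sequence x, following Steps 1 to 6."""
    n = len(x)
    n_vectors = n - (m - 1) * d
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and d")
    if mu is None:
        mu = sum(x) / n
    if sigma is None:
        # Assumption: population SD (divide by N).
        sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")

    # Step 1: map every sample into (0, 1) with the normal CDF.
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]

    # Step 2: assign classes 1..c. floor(c*y) + 1 equals round(c*y + 0.5)
    # with halves rounded up; min() guards the edge case y == 1.0.
    z = [min(c, int(math.floor(c * v)) + 1) for v in y]

    # Steps 3 and 4: build the template vectors and count their patterns.
    counts = Counter(
        tuple(z[j + k * d] for k in range(m)) for j in range(n_vectors)
    )

    # Steps 5 and 6: relative frequencies and Shannon entropy.
    entropy = -sum(
        (cnt / n_vectors) * math.log(cnt / n_vectors) for cnt in counts.values()
    )
    return entropy / math.log(c ** m) if normalize else entropy
```

For the alternating series `[0.0, 1.0] * 50`, only two dispersion patterns occur with almost equal frequency, so `dispersion_entropy` returns a value close to ln 2.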
2.2. Generalized Multiscale Dispersion Entropy
Multiscale dispersion entropy (MDispEn) and generalized MDispEn (GMDispEn) compute DispEn at several consecutive scales based on the first and higher moments, respectively. The nth-moment-based generalized MDispEn is denoted GMDispEn$_n$. These methods are implemented as follows:
First, the signal is coarse-grained: for each scale factor $\tau$, a coarse-grained time series is constructed from the time series x based on the nth moment [19]:
For MDispEn, based on the first moment:
For GMDispEn2, based on the second moment (variance):
For GMDispEn3, based on the third moment (skewness):
where
.
Next, the DispEn of the coarse-grained signal is computed. Here, the mean and the SD of the original signal, obtained before coarse-graining, are used for the NCDF-based mapping. This approach is similar to keeping r constant while calculating the multiscale entropy (MSE), where r = 0.15 × SD of the original signal is used for all scales.
Finally, the scale factor $\tau$ is changed, usually by adding 1 to it, and Steps 1 and 2 are repeated until the desired scale is reached.
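For DispEn, the parameters must be set in such a way that the number of possible dispersion patterns $c^m$ becomes smaller than the signal length $N$. Because the signal length for GMDispEn is reduced to $\lfloor N/\tau \rfloor$ due to coarse-graining, $c^m < \lfloor N/\tau_{max} \rfloor$ is recommended for GMDispEn, where $\tau_{max}$ is the largest scale factor used.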
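The coarse-graining step can be illustrated with a short Python sketch. The function name `coarse_grain` is hypothetical. The variance is computed here with a 1/scale normalization and the third moment is taken as the plain third central moment of each window; both normalizations are assumptions for illustration and may differ from the exact definitions used in this paper.

```python
def coarse_grain(x, scale, moment=1):
    """Coarse-grain x over non-overlapping windows of `scale` samples.

    moment=1: window mean (as in MDispEn)
    moment=2: window variance (as in GMDispEn2), assumed 1/scale normalization
    moment=3: window third central moment (as in GMDispEn3), assumed unnormalized
    """
    if moment not in (1, 2, 3):
        raise ValueError("moment must be 1, 2, or 3")
    out = []
    for j in range(len(x) // scale):
        window = x[j * scale:(j + 1) * scale]
        mean = sum(window) / scale
        if moment == 1:
            out.append(mean)
        else:
            out.append(sum((v - mean) ** moment for v in window) / scale)
    return out
```

Each call shortens a series of length N to floor(N / scale) points, which is why the constraint on the number of dispersion patterns has to be checked against the coarse-grained length rather than the original one.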
2.3. Generalized Refined Composite Multi-Scale Dispersion Entropy
In the calculation of RCMDispEn and the nth-moment-based RCGMDispEn (RCGMDispEn$_n$) with a scale factor of $\tau$, $\tau$ different time series are constructed by coarse-graining based on the first and higher moments, respectively, each with a different starting point. The relative frequency of the dispersion patterns is calculated from each of these $\tau$ time series. The kth coarse-grained time series of the series x is obtained based on the nth moment and the scale factor $\tau$ as follows:
where
.
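Hence, for every scale factor $\tau$, RCGMDispEn$_n$ is defined as follows:
$$\mathrm{RCGMDispEn}_n(x, m, c, d, \tau) = -\sum_{\pi=1}^{c^m} \bar{p}(\pi) \ln \bar{p}(\pi)$$
where $\bar{p}(\pi) = \frac{1}{\tau}\sum_{k=1}^{\tau} p_k^{(\tau)}(\pi)$. $p_k^{(\tau)}(\pi)$ is the relative frequency of the dispersion pattern $\pi$ in the kth coarse-grained time series.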
In RCGMDispEn, coarse-grained time series with a length of are considered. Thus, the total number of samples calculated in RCGMDispEn is . Therefore, RCGMDispEn with a length of produces reliable results. This special property is significant in short-length signals.
It must be noted that the scale starts from 2 for calculating GMDispEn2 and RCGMDispEn2, and from 3 for calculating GMDispEn3 and RCGMDispEn3 [37,38].
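The refined composite procedure of this section can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the authors' code. The function and helper names are hypothetical, the variance and third-moment normalizations (division by the scale, no further normalization) are assumptions, and the NCDF mapping is assumed to use the mean and SD of the original signal at every scale, as described for GMDispEn in Section 2.2.

```python
import math
from collections import Counter


def _coarse_grain(x, scale, start, moment):
    """nth-moment coarse-graining of x, starting at index `start`."""
    out = []
    for j in range((len(x) - start) // scale):
        window = x[start + j * scale:start + (j + 1) * scale]
        mean = sum(window) / scale
        if moment == 1:
            out.append(mean)
        else:  # assumed normalization: central moment divided by scale
            out.append(sum((v - mean) ** moment for v in window) / scale)
    return out


def _pattern_frequencies(series, m, c, d, mu, sigma):
    """Relative frequency of each dispersion pattern in `series`."""
    z = []
    for v in series:
        y = 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
        z.append(min(c, int(math.floor(c * y)) + 1))
    n_vectors = len(z) - (m - 1) * d
    if n_vectors <= 0:
        raise ValueError("coarse-grained series too short for m and d")
    counts = Counter(
        tuple(z[j + k * d] for k in range(m)) for j in range(n_vectors)
    )
    return {pattern: cnt / n_vectors for pattern, cnt in counts.items()}


def rcgm_dispersion_entropy(x, scale, moment=1, m=2, c=8, d=1):
    """moment=1 gives RCMDispEn; moment=2 or 3 gives RCGMDispEn2 or RCGMDispEn3."""
    if moment not in (1, 2, 3):
        raise ValueError("moment must be 1, 2, or 3")
    if scale < moment:
        # The scale starts from 2 for the variance and 3 for the skewness.
        raise ValueError("scale must be at least equal to the moment order")
    mu = sum(x) / len(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))

    # Average the pattern frequencies over the `scale` starting points.
    averaged = Counter()
    for k in range(scale):
        series = _coarse_grain(x, scale, k, moment)
        freqs = _pattern_frequencies(series, m, c, d, mu, sigma)
        for pattern, freq in freqs.items():
            averaged[pattern] += freq / scale
    return -sum(p * math.log(p) for p in averaged.values())
```

Averaging the pattern frequencies before taking the logarithm, instead of averaging the entropy values of the individual coarse-grained series, is what distinguishes the refined composite variant from a plain composite one.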
4. Analysis of a Simulated Bearing Signal
The vibration signal of a ball bearing with an outer race fault was simulated as follows:
where
and
represent the impulse series and the harmonic series, respectively, and
n(
t) denotes the noise.
Based on previous studies [
48,
49],
was modeled using Equation (24):
represents the resonance frequencies of the bearing, which are significantly higher than the fault frequency . represents a small random change in the interval between two impulses. The ball slipping effect changes the period randomly to . Hence, for every k, was considered to be a random number from a normal distribution with a zero mean and standard deviation of .
Two sinusoidal functions were employed for the harmonic part of Equation (23) [
50,
51]:
In this simulation, the characteristic frequency of the fault and the damping factor were assumed to be and . Moreover, represent the resonance frequencies of the bearing, and denotes the magnitude of the impulse amplitude, which is a measure of the damage intensity. In addition, the rotor frequency was considered to be , and and represent the amplitudes of the first and second harmonics of the rotor, respectively. The signal of a healthy bearing was modeled by eliminating the fault impulses.
Fifty independent healthy and faulty bearing signals with a length of 2048 data points and a sampling frequency of 40 kHz were simulated. Moreover, Gaussian noise was added to them with a signal-to-noise variance ratio of 0.257 [52]. Figure 2 shows an example of these signals.
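A signal of this kind can be generated with a short Python sketch. It assumes a commonly used form of the model: exponentially decaying sinusoids at a resonance frequency, repeated at the fault period with a small random jitter, plus two rotor harmonics and Gaussian noise. Only the signal length (2048) and the sampling frequency (40 kHz) are taken from the text. Every other default value, the function name, and the fixed noise SD (used in place of the stated variance ratio) are hypothetical placeholders.

```python
import math
import random


def simulate_bearing_signal(n=2048, fs=40_000.0, faulty=True,
                            fault_freq=100.0, resonance_freq=4000.0,
                            damping=800.0, impulse_amp=1.0,
                            rotor_freq=25.0, harmonic_amps=(0.3, 0.15),
                            jitter_sd_ratio=0.01, noise_sd=0.2, seed=None):
    """Simulated bearing vibration signal (all defaults are placeholders)."""
    rng = random.Random(seed)
    b1, b2 = harmonic_amps

    # Harmonic part (first and second rotor harmonics) plus Gaussian noise.
    signal = []
    for i in range(n):
        t = i / fs
        harmonic = (b1 * math.sin(2 * math.pi * rotor_freq * t)
                    + b2 * math.sin(4 * math.pi * rotor_freq * t))
        signal.append(harmonic + rng.gauss(0.0, noise_sd))

    # The healthy signal is obtained by leaving out the fault impulses.
    if not faulty:
        return signal

    period = 1.0 / fault_freq
    k = 0
    while k * period < n / fs:
        # Ball slip: each impulse is shifted by a small zero-mean random delay.
        t0 = k * period + rng.gauss(0.0, jitter_sd_ratio * period)
        first = max(0, math.ceil(t0 * fs))
        for i in range(first, n):
            dt = i / fs - t0
            envelope = math.exp(-damping * dt)
            if envelope < 1e-6:
                break
            signal[i] += (impulse_amp * envelope
                          * math.sin(2 * math.pi * resonance_freq * dt))
        k += 1
    return signal
```

Calling the function with `faulty=False` yields the healthy counterpart, and passing a `seed` makes a run reproducible.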
MDispEn, GMDispEn2, GMDispEn3, RCMDispEn, RCGMDispEn2, and RCGMDispEn3 were calculated for the simulated signals, with the results displayed in Figure 3. In this figure, p-values smaller than 0.05 are identified with asterisks. According to Figure 3, RCMDispEn, RCGMDispEn2, and RCGMDispEn3 possess a higher fault-distinguishing capability than MDispEn, GMDispEn2, and GMDispEn3, respectively, and their results have a smaller standard deviation. The figure also shows the ability of the generalized methods to distinguish the bearing faults.
Hedges’ g effect size [53] was used to evaluate the capability of these methods in discriminating the faulty from the healthy ball bearing signals. The results are shown in Table 1. As can be seen, the GMDispEn2, GMDispEn3, RCGMDispEn2, and RCGMDispEn3 algorithms effectively show the differences between the healthy and the faulty conditions, similar to MDispEn and RCMDispEn. RCMDispEn, RCGMDispEn2, and RCGMDispEn3 have larger effect sizes and a better fault separation capability than MDispEn, GMDispEn2, and GMDispEn3, respectively.
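For reference, Hedges' g between two groups of entropy values can be computed as in the following Python sketch. It uses the pooled standard deviation and the widely used approximate small-sample correction factor 1 − 3/(4(n1 + n2) − 9). Whether the paper used this approximation or the exact correction is not stated, so this choice and the function name are assumptions.

```python
import math


def hedges_g(group_a, group_b):
    """Hedges' g effect size between two samples (approximate correction)."""
    n1, n2 = len(group_a), len(group_b)
    m1 = sum(group_a) / n1
    m2 = sum(group_b) / n2
    # Unbiased sample variances of both groups.
    v1 = sum((v - m1) ** 2 for v in group_a) / (n1 - 1)
    v2 = sum((v - m2) ** 2 for v in group_b) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    cohen_d = (m1 - m2) / pooled_sd
    # Small-sample bias correction (approximate form).
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return correction * cohen_d
```

In the setting of this section, `group_a` and `group_b` would hold the entropy values of the healthy and faulty signals at one scale, so that one effect size is obtained per method and scale.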
5. Analysis of the Experiments
5.1. Analysis of the Vibration Signals Acquired from the Case Western Reserve University Dataset
This section uses datasets from Case Western Reserve University (CWRU), US [54], with a sampling frequency of 48 kHz. The experimental set-up includes a three-phase induction motor, a torque transducer, and a dynamometer. The ball-bearing vibration signals were collected using an accelerometer installed on the motor housing at the drive end of the motor.
The signals covered 10 different bearing conditions: the healthy condition plus ball, inner race, and outer race faults, each with fault intensities of 0.007″, 0.014″, and 0.021″. The shaft rotating speeds were 1772, 1750, and 1730 rpm.
A detailed description of the dataset is given in Table 2. For each condition, 180 samples with a length of 2048 were separated from the dataset signals with no overlap between any two samples.
Specifically, 72, 18, and 90 signals were used for training, validation, and testing, respectively. MDispEn, GMDispEn2, GMDispEn3, RCMDispEn, RCGMDispEn2, and RCGMDispEn3 were calculated for all the signals, and their values at 20 scales were used as features for fault detection and classification. A binary vector was used as the target vector for every bearing condition. This binary vector had a length of 10 because 10 conditions were being studied. This research employed 10 FCM-ANFIS networks, each of which detected one element of the target vector.
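The data preparation and the multiclass decision described above can be sketched in Python as follows. The helper names are hypothetical, the sequential train/validation/test split is an assumption (the text does not state how samples were assigned to each set), and the classifier outputs are placeholders standing in for the 10 FCM-ANFIS networks, which are not implemented here. The final decision uses the winner-takes-all rule mentioned in the Introduction.

```python
def split_into_samples(signal, length=2048, count=180):
    """Cut `count` non-overlapping samples of `length` points from a signal."""
    if len(signal) < length * count:
        raise ValueError("signal too short for the requested samples")
    return [signal[i * length:(i + 1) * length] for i in range(count)]


def split_sets(samples, n_train=72, n_val=18):
    """Assumed sequential split into training, validation, and test sets."""
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


def target_vector(condition, n_conditions=10):
    """Binary target vector with a single 1 at the index of the condition."""
    return [1 if i == condition else 0 for i in range(n_conditions)]


def winner_takes_all(outputs):
    """Index of the network with the largest output (predicted condition)."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


# Placeholder outputs of 10 one-element detectors for a single test sample.
example_outputs = [0.1, 0.0, 0.2, 0.9, 0.1, 0.0, 0.3, 0.1, 0.0, 0.2]
predicted = winner_takes_all(example_outputs)
```

With this layout, each network only has to learn one element of the target vector, and the combination rule resolves cases in which several networks respond at once.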
The classification of the faulty conditions using multiclass FCM-ANFIS was performed 20 times with different inputs. The results of classifying these features are displayed in Figure 4 and Table 3. In this example, RCMDispEn, RCGMDispEn2, and RCGMDispEn3 performed better at classification than MDispEn, GMDispEn2, and GMDispEn3, respectively. Moreover, the simultaneous use of RCMDispEn, RCGMDispEn2, and RCGMDispEn3 as the classifier inputs produced the most accurate classification. Table 4 presents the confusion matrix of the best performance using these inputs.
5.2. Analysis of the Signals Acquired from the PHMAP 2021 Data Challenge Dataset
Part of the PHMAP 2021 data challenge dataset [55] was used in this section. The studied equipment consists of an oil injection screw compressor, containing a 15 kW, 3600 rpm motor and a 7200 rpm screw axis. This paper used data acquired using an accelerometer installed on the motor with a sampling frequency of 10,544 samples per second.
Three conditions were examined: (1) high looseness of the V-belt, (2) faulty bearing, and (3) fault-free condition. Three hundred independent signal samples with a length of 1024 samples were separated for each condition.
MDispEn, RCMDispEn, GMDispEn2, RCGMDispEn2, GMDispEn3, and RCGMDispEn3 were calculated for all the signals, and their values at 20 scales were used as features for fault detection and classification. For each condition, 120, 30, and 150 samples were used for training, validation, and testing, respectively. These data were classified 20 times using multiclass FCM-ANFIS. The results are displayed in Figure 5 and Table 5. As can be seen, the highest accuracy was achieved by the combined use of RCMDispEn, RCGMDispEn2, and RCGMDispEn3 as inputs. However, the mean accuracy of RCMDispEn and RCGMDispEn2 as simultaneous inputs was greater than that of the other inputs. These results support the proposal of this paper to use generalized multiscale entropies together with multiscale entropies to improve the results. The best classification results are displayed in Table 6.
5.3. Analysis of Vibration Signals Acquired from the Paderborn University Dataset
The data used in this section were from the ball bearing data collected in the Mechanical Engineering Construction and Drive Technology (KAt) Research data center, Paderborn University, Germany [56,57].
The classification of the datasets used in the present work is presented in Table 7, which lists three different conditions: (1) inner race damage, (2) outer race damage, and (3) healthy. The vibration signals corresponding to the different bearing conditions under the different operating conditions shown in Table 8 were collected with a sampling frequency of 64,000 Hz.
A signal with a length of 1024 was separated from the beginning of every measured vibration signal, with 60 signals separated from each dataset, to obtain a total of 300 signals for each fault condition.
MDispEn, RCMDispEn, GMDispEn2, RCGMDispEn2, GMDispEn3, and RCGMDispEn3 were calculated for all the signals, and their values at 20 scales were used as features for fault detection and classification. For each condition, 120, 30, and 150 samples were used for training, validation, and testing, respectively. These data were classified 20 times using multiclass FCM-ANFIS. The results, displayed in Figure 6 and Table 9, support the suggestion made by the present study. Specifically, the highest classification accuracy corresponds to the features extracted by the combination of RCMDispEn, RCGMDispEn2, and RCGMDispEn3. Moreover, the smallest classification accuracy corresponds to the features extracted by RCMDispEn, RCGMDispEn2, and RCGMDispEn3 separately.