1. Introduction
Induction motors are widely employed as prime movers in power system applications such as transportation and the petrochemical industry, due to their low cost, simplicity of control, and high performance. However, because they operate for long periods, they are subject to many electrical and mechanical defects. Moreover, an incipient fault that is not detected at an early stage can result in downtime of the whole motor and increased production loss [1]. Therefore, Condition Monitoring (CM) and fault diagnosis are essential to ensure machine availability and reduce maintenance costs [2]. Overloading, abrasion, unbalanced loads, or electrical stress can cause slight damage to any of the components of the induction motor.
Condition monitoring techniques based on machine learning operate by analysing historical data collected from the machine using various sensors under various operating conditions. The output signals gathered from the sensors are time-series signals, and analysis techniques in the time domain, frequency domain, and time-frequency domain are applied to extract the initial features. Time-domain techniques, such as the root mean square method, high-order statistics, and the short impulse method, are simple to implement and require only a basic understanding of the signal [3]. Frequency-domain techniques, such as the Fourier transform [4], envelope analysis [5], and high-order spectral analysis [6], determine the signature features without any prior information [7,8]. However, they are ineffective for nonstationary signals. Therefore, time-frequency domain methods [9] such as the wavelet transform, Short-Time Fourier Transform (STFT), and Hilbert–Huang Transform (HHT) are employed to analyse nonstationary signals [4,10]. As a result, time-frequency analysis has been growing in popularity, as it performs well for both stationary and nonstationary signals [9].
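The limitation of a single spectrum for nonstationary signals can be illustrated with a small short-time analysis. The following is only a sketch under stated assumptions: the sampling rate, the two tone frequencies, the frame length, and the function names are hypothetical and are not taken from the experiments in this paper. The signal changes frequency halfway through, and a framed, Hann-windowed DFT recovers when each frequency is present.

```python
import cmath
import math


def hann(n_samples):
    """Periodic Hann window, which tapers each frame to limit spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_samples) for n in range(n_samples)]


def dominant_frequencies(signal, fs, frame_len):
    """Return the strongest frequency (Hz) in each non-overlapping frame."""
    window = hann(frame_len)
    peaks = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT over the non-negative frequency bins of this frame.
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            magnitudes.append(abs(coeff))
        k_peak = max(range(len(magnitudes)), key=magnitudes.__getitem__)
        peaks.append(k_peak * fs / frame_len)
    return peaks


FS = 1024  # hypothetical sampling rate in Hz
# Hypothetical nonstationary signal: 48 Hz for the first 256 samples, 200 Hz afterwards.
signal = [
    math.sin(2 * math.pi * (48 if n < 256 else 200) * n / FS) for n in range(512)
]
peaks = dominant_frequencies(signal, FS, frame_len=128)
print(peaks)
```

A single DFT over all 512 samples would show both tones but not the instant at which the frequency changes, which is the information a time-frequency method preserves.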
In the diagnosis of induction motor failure, three research directions can be distinguished: (1) signature extraction-based approaches; (2) model-based approaches; and (3) knowledge-based approaches [11]. Signature extraction-based techniques survey fault signatures in the time and/or frequency domain by monitoring signals, as in traditional techniques such as vibration analysis [12,13,14], electromagnetic field monitoring [15], motor current signal analysis (MCSA) [16], infrared signal analysis [17], acoustic signal analysis [18], and partial discharge measurement [19]. Model-based techniques use mathematical modelling to check the performance and predict failures under different conditions. Although model-based techniques can offer warnings and predict incipient faults, their accuracy relies mostly on explicit motor models, which are not always accessible. Knowledge-based approaches, on the other hand, overcome this issue by using learning techniques, such as machine learning, together with motor or load characteristics. With the continuous advancement of machine learning algorithms, the knowledge-based approach has emerged as a promising research topic for induction motor failure diagnostics. The stator current and the vibration signal are the signals most often used in machine learning-based fault diagnostic systems, either alone or in combination with other signals. In [20], the Short-Time Fourier Transform (STFT) algorithm was used to convert quasi-steady vibration signals into continuous spectra for training a neural network model. The experimental results demonstrated the effectiveness of the method and showed that a robust induction machine condition monitoring and diagnosis system can be achieved. A novel monitoring scheme for diagnosing bearing faults was proposed in [21]. Statistical time-domain features were first determined from the vibration signal, and the effectiveness of the scheme was verified by experimental results. A convolutional discriminative feature learning method was proposed in [22] for induction motor fault diagnosis. First, a Back-Propagation Neural Network (BPNN) was used to create local filters that capture discriminative information. Second, a feed-forward convolutional pooling architecture was created to extract the final features from these local filters. The learned features were then passed to a support vector machine classifier, which identified six classes. The experimental results indicated that the approach performs well and is effective for diagnosing induction motor faults. Alternatively, the stator current signal has gained attention in induction motor fault diagnosis. In [23], a technique based on a statistical analysis of the stator current harmonics was presented for permanent magnet synchronous machines (PMSMs). The stator current, together with wavelet and short-time Fourier transforms, was used to assess bearing deterioration. MCSA successfully diagnosed single Broken Rotor Bar (BRB) faults, as stated in [24]. A new induction motor diagnosis methodology was proposed in [25], based on a two-dimensional time-frequency plot that illustrates the time-frequency evolution of the essential components of an electrical machine's transient current. It was demonstrated that the wavelets used provide efficient filtering in the region next to the main frequency, as well as a high level of detail in the time-frequency maps. In [26], the stator current was used in a combined voltage and machine learning approach to forecast the evolution of faulty operating modes in an induction machine.
Recently, the use of Artificial Intelligence (AI) approaches for fault classification has increased. AI approaches such as expert systems, Neural Networks (NN), Fuzzy Logic (FL), Support Vector Machines (SVM), and Genetic Algorithms (GA) can be implemented for this purpose. The objective is to apply feature extraction and feature selection techniques and then, in the training stage, to teach the model with these features so that it can assign each sample to its class. Many research studies have concluded that the Feature Selection (FS) stage is a significant step in building a robust machine learning application for fault diagnosis. Selecting specific features reduces the amount of data, shortens the learning time, improves the classification, and lowers the measurement costs [13].
There are three categories of FS techniques: filter, wrapper, and embedded [27]. The embedded method, which combines the wrapper and filter methods, converges better than the wrapper method alone [28].
The process first applies the filter stage to minimize the number of features in a subset. This is followed by the wrapper stage, which uses several local or global search algorithms. Because of the benefits listed above, hybrid approaches are increasingly attracting the attention of researchers. In recent years, evolutionary algorithms have attracted a great deal of attention for their capacity to solve FS problems, including the Genetic Algorithm (GA), which is gaining popularity [29], Particle Swarm Optimization (PSO) [30], the Artificial Bee Colony algorithm (ABC) [31], the Grey Wolf Optimizer (GWO) [32], the Flower Pollination Algorithm (FP) [33], and the Differential Evolution algorithm (DE) [34].
In [35], GA was successfully implemented as a feature selection tool to reduce the data dimension. In [36], a novel feature selection technique based on bare-bones PSO (BBPSO) with mutual knowledge was suggested. The findings showed that the method obtains a superior feature subset and is a highly competitive FS algorithm. Furthermore, a multiobjective PSO-based approach that ranks features by their frequency in the archive set (RFPSOFS) was proposed in [37]; these rankings were used to fine-tune the archive set. A feature selection system based on multiobjective Grey Wolf Optimization was proposed in [38]. The system depended on the data description and the classifier used, and it was more robust and stable than common search methods such as particle swarm optimization and the genetic algorithm. Zorarpacı and Özel proposed a hybrid of Differential Evolution and Artificial Bee Colony for feature selection [34] and compared its performance with published results on the same datasets. The experimental results showed that the hybrid technique selects good features for classification tasks, improving the classifier's run-time performance and accuracy. To predict rock tensile strength, a new Artificial Neural Network (ANN)-based model with Invasive Weed Optimization (IWO) was proposed in [39]. The hybrid IWO–ANN model showed a higher prediction accuracy and was found to be a suitable alternative for robust and reliable engineering design. IWO was also applied to FTIR and NIR datasets to create a simple swarm intelligence approach to variable selection [40]; the results showed that IWO is robust and highly adaptive to environmental changes.
A feature selection technique based on the Invasive Weed Optimization (IWO) algorithm has been implemented in a few applications to decrease the number of features and achieve both a strong learning process and a low classification error. IWO is a continuous stochastic numerical algorithm proposed by Mehrabian and Lucas [41]. It is a swarm intelligence metaheuristic inspired by the colonization behaviour of invasive weeds as they search for a suitable place for growth and reproduction. The technique offers several advantages, including a simple structure that is easy to understand and program. Moreover, the results achieved with this algorithm are quite reliable.
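The colonization behaviour behind IWO can be sketched in a few lines. This is a minimal illustration of the commonly described steps (random initialization, fitness-proportional seed production, normally distributed dispersal with a shrinking standard deviation, and competitive exclusion), not the implementation used in this work. The toy cost function, all parameter values, and the function names are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Toy cost function (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def iwo(cost, dim, bounds, n_init=10, p_max=20, s_min=0, s_max=5,
        sigma_init=0.5, sigma_final=0.001, exponent=2, iters=100, seed=0):
    """Minimise `cost` with a basic Invasive Weed Optimization loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    # 1) Initialise a small population of weeds at random positions.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_init)]
    weeds = [(cost(p), p) for p in positions]
    history = []
    for it in range(iters):
        # The dispersal radius shrinks nonlinearly as the iterations progress.
        frac = (iters - it) / iters
        sigma = frac ** exponent * (sigma_init - sigma_final) + sigma_final
        costs = [c for c, _ in weeds]
        best, worst = min(costs), max(costs)
        offspring = []
        for c, w in weeds:
            # 2) Reproduction: fitter weeds (lower cost) produce more seeds.
            ratio = 1.0 if worst == best else (worst - c) / (worst - best)
            n_seeds = int(s_min + ratio * (s_max - s_min))
            for _ in range(n_seeds):
                # 3) Spatial dispersal: seeds scatter normally around the parent.
                seed_pos = [min(hi, max(lo, v + rng.gauss(0.0, sigma))) for v in w]
                offspring.append((cost(seed_pos), seed_pos))
        # 4) Competitive exclusion: only the p_max best plants survive.
        weeds = sorted(weeds + offspring, key=lambda t: t[0])[:p_max]
        history.append(weeds[0][0])
    return weeds[0][1], weeds[0][0], history


best_pos, best_cost, history = iwo(sphere, dim=5, bounds=(-5.0, 5.0))
print(best_cost)
```

Because parents compete with their offspring, the best cost never increases from one iteration to the next. For feature selection, the continuous positions would additionally have to be mapped to a binary feature mask, a step this sketch does not cover.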
In summary, the present work extends the work proposed in [42] as follows:
An effective machine learning-based fault diagnosis system for induction motors is proposed using experimental data.
Both the current and vibration motor signals are selected to be recorded simultaneously for condition monitoring.
As different motor loadings between the training and testing processes can strongly influence the fault diagnosis [15], experiments in this study were conducted for three motor loadings, namely no-load (0%), half-load (50%), and full-load (100%), to investigate the impact of the operating conditions.
Matching Pursuit (MP) and Discrete Wavelet Transform (DWT) in the time-frequency domain are applied for feature extraction.
Some statistical features such as mean, median, standard deviation, and others were calculated from the raw signals.
A comparative study of feature selection methods based on the Invasive Weed Optimization algorithm (IWO) and the Genetic Algorithm (GA) was performed to reduce the number of extracted features.
This research investigates the classification results of three different algorithms, k-Nearest Neighbour (KNN), Support Vector Machine (SVM), and Random Forest (RF), which were trained using k-fold cross-validation.
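The statistical features mentioned in the list above can be computed directly from a window of raw samples. The sketch below is illustrative only: the source names mean, median, and standard deviation, while the root mean square and peak-to-peak values are assumed stand-ins for the unspecified "others", and the function name is hypothetical.

```python
import math
import statistics


def statistical_features(samples):
    """Compute simple time-domain features from one window of raw samples."""
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        # Sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(samples),
        # Assumed extra features, not confirmed by the source:
        "rms": math.sqrt(sum(v * v for v in samples) / len(samples)),
        "peak_to_peak": max(samples) - min(samples),
    }


# Hypothetical five-sample window, used only to show the call.
features = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

Each signal window yields one row of such values, and the rows stacked over all windows form the feature matrix passed to feature selection.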
The rest of this paper is organized as follows:
Section 2 presents the materials and methods, including the experimental testbed and dataset collection in the lab;
Section 3 presents the research methodology and the suggested model in this research;
Section 4 provides the results that validate the proposed model with comparisons;
Section 5 presents the discussion; and
Section 6 draws the conclusions and outlines future work.
2. Materials and Methods
2.1. Bearing Damage
In rolling bearings, rolling elements such as balls or cylindrical rollers are placed between the inner and outer races. Pitting or flaking can occur in the bearing components due to wear or material fatigue [1]. Consequently, any early damage to the bearing produces shock pulses that appear at certain frequencies in the frequency domain. These characteristic frequencies depend on the affected section of the bearing and can be determined from the bearing geometry and the mechanical rotational frequency. The characteristic frequency of each type of fault is calculated as in [43] from the contact angle of the balls, the number of balls or cylindrical rollers, the mechanical rotational frequency, the diameter of the ball, and the ball pitch diameter.
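For orientation, the widely used textbook forms of the bearing characteristic frequencies are given below. They are stated as an assumption: they may differ in notation or detail from the expressions in [43], and the symbol names are chosen here for illustration, with $\beta$ the contact angle, $N_b$ the number of balls or rollers, $f_r$ the mechanical rotational frequency, $D_b$ the ball diameter, and $D_c$ the ball pitch diameter.

```latex
\begin{align}
f_{o} &= \frac{N_b}{2}\, f_r \left(1 - \frac{D_b}{D_c}\cos\beta\right)
  && \text{(outer race fault)} \\
f_{i} &= \frac{N_b}{2}\, f_r \left(1 + \frac{D_b}{D_c}\cos\beta\right)
  && \text{(inner race fault)} \\
f_{b} &= \frac{D_c}{2 D_b}\, f_r \left(1 - \frac{D_b^{2}}{D_c^{2}}\cos^{2}\beta\right)
  && \text{(ball spin frequency)}
\end{align}
```

The outer and inner race expressions differ only in the sign of the geometric term, so for the same bearing the inner race frequency is always the higher of the two.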
2.2. Broken Rotor Bar Damage
The rotor bar and its bearing have undergone significant alterations, whereas the stator core, stator windings, and housing structure have been left without significant adjustments. A Broken Rotor Bar (BRB) fault can occur due to thermal unbalance, overloading during starting, frequent starts at rated voltage, and thermal stress [44].
If a BRB fault occurs, the current flow in that bar is interrupted, so the bar no longer contributes to the rotor circuit. In response to this imbalance, an Unbalanced Magnetic Pull (UMP) is generated that rotates at the rotor speed. It is modulated at a frequency equal to a multiple of the slip frequency and has several poles.
The frequency components that broken rotor bars induce in the stator winding are visible around the principal slot harmonics in the current spectrum. They are expressed in terms of the supply frequency fs, an integer k, and the slip s.
These sideband frequencies are dynamic and vary with the operating condition of the motor [45].
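The dependence of the sidebands on the operating condition can be made concrete with a short calculation. The sketch below assumes the widely used sideband expression fs(1 ± 2ks), which involves only the three quantities defined above; the expression in the source may differ. The function name and the slip value of 0.08 are hypothetical.

```python
def brb_sidebands(supply_hz, slip, k_max=3):
    """Return (lower, upper) sideband pairs f_s * (1 -/+ 2*k*s) for k = 1..k_max.

    Assumes the classical broken-rotor-bar sideband relation; this is an
    illustrative choice, not necessarily the expression used in the source.
    """
    return [
        (supply_hz * (1 - 2 * k * slip), supply_hz * (1 + 2 * k * slip))
        for k in range(1, k_max + 1)
    ]


# Hypothetical operating point: 50 Hz supply, slip of 0.08.
pairs = brb_sidebands(50.0, 0.08, k_max=2)
for k, (lower, upper) in enumerate(pairs, start=1):
    print(f"k={k}: {lower:.1f} Hz and {upper:.1f} Hz")
```

Under this relation the sidebands move closer to the supply frequency as the slip falls, which is one reason light-load conditions are harder to diagnose from the current spectrum.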
2.3. Stator Damage
Stator faults can be categorized as faults in the stator winding, the stator core laminations, and the frame of the stator. The frequency components that this fault produces in the current spectrum are given in [46] in terms of the short-turn frequency, the supply frequency, the number of pole pairs p, and the indices k = 1, 3, 5 and n = 1, 2.
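A commonly cited form of the shorted-turn frequency relation is shown below for reference. It is an assumption that this matches the expression in [46]; the symbols $f_{st}$ (short-turn frequency) and $f_s$ (supply frequency) are named here for illustration, and $s$ is the slip.

```latex
\begin{equation}
f_{st} = f_s \left[ \frac{n}{p}\,(1 - s) \pm k \right],
\qquad k = 1, 3, 5, \quad n = 1, 2
\end{equation}
```

As a worked instance under these assumptions, with $f_s = 50$ Hz, $p = 2$, $s = 0.08$, $n = 1$, and $k = 1$, the bracket evaluates to $0.46 \pm 1$, giving components at 73 Hz and, taking the magnitude of the negative result, 27 Hz.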
2.4. Experimental Testbed
The dataset used in this work was obtained from a three-phase squirrel cage induction motor (Clarke motor 80B/4, Cardiff, UK). The work was carried out at the Wolfson Centre for Magnetics, Machines and Transformers Laboratory, Cardiff University, UK. The stator current and vibration signals were chosen for recording because any incipient motor fault creates an unbalance inside the motor, which is immediately reflected in the stator currents and vibration signals.
The test rig shown in Figure 1 consisted of a 4-pole, 50 Hz, 0.75 HP, 230/400 V, 1380 rev/min induction motor (Model Clarke 6430439) connected to a dynamometer, which controls the motor load by applying an opposing torque that is adjusted with the dynamometer's control knob. The dynamometer also displays the rotational speed of the motor in revolutions per minute (RPM).
To collect the motor vibration signal, an overhead laser vibrometer (model OFV-3001) was used. The vibrometer was connected to an oscilloscope that displayed the vibration signal on a screen. Two important settings, the velocity range and the displacement range, need to be calibrated before collecting the data; the former was set to 25 mm/s/V and the latter to 125 μm/V.
To record the current signal passing through the motor stator, a current transformer was connected between the motor and the data acquisition card (National Instruments NI USB-6211).
The current and vibration signals were recorded to produce a dataset that included both the healthy and faulty behaviours of the induction motor. Three categories of motor defects, artificially generated in the lab, are considered in this work: bearing defects, broken rotor bar defects, and stator defects. Including the healthy case, data were collected for eight different motor conditions, as shown in Figure 2. The ball bearing fault was created by removing one ball with its cage, as shown in Figure 3a; the outer bearing fault was made by drilling a 0.25 cm hole into the outer bearing race, as shown in Figure 3b; and the inner bearing fault was generated by drilling a hole of the same size into the inner bearing race, as shown in Figure 3c.
The One-Broken Rotor Bar (1BRB) fault was produced artificially by drilling a hole 4.2 mm in diameter and 16 mm deep in the rotor bar to cut the bar, as presented in Figure 4a. The Five-Broken Rotor Bar (5BRB) and Eight-Broken Rotor Bar (8BRB) faults were created with the same hole diameter and depth as the 1BRB fault, with the holes separated by a certain angle, as displayed in Figure 4b,c, respectively. The stator fault was an open circuit in one turn of the stator winding.
The test rig was set up and operated under various speed conditions to investigate the impact of the operating conditions on the proposed model. The load applied to the motor can be determined from its rotational speed, as the speed decreases when the load increases. Three speeds were considered in these experiments: 1480 rpm (no-load), 1450 rpm (half-load), and 1380 rpm (full-load). The full load was set at 1380 rpm because this is the rated speed of the induction motor used, and it is unlikely that this load would be exceeded under normal industrial operating conditions. At no-load, the motor ran without loading at 1480 rpm.
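The link between measured speed and load can be quantified through the slip. The following sketch assumes the standard synchronous-speed relation n_s = 120 f / P for the 4-pole, 50 Hz motor described for the test rig; the function names are illustrative and are not taken from the source.

```python
def synchronous_speed_rpm(supply_hz, poles):
    """Synchronous speed in rpm from supply frequency and number of poles."""
    return 120.0 * supply_hz / poles


def slip(sync_rpm, rotor_rpm):
    """Per-unit slip: relative difference between synchronous and rotor speed."""
    return (sync_rpm - rotor_rpm) / sync_rpm


n_sync = synchronous_speed_rpm(50.0, 4)  # 4-pole, 50 Hz motor
slips = {
    label: slip(n_sync, rpm)
    for label, rpm in [("no-load", 1480), ("half-load", 1450), ("full-load", 1380)]
}
for label, value in slips.items():
    print(f"{label}: slip = {value:.4f}")
```

With a synchronous speed of 1500 rpm, the three test speeds correspond to slips of roughly 1.3%, 3.3%, and 8%, so the fault-related frequency components that depend on slip shift noticeably between loading conditions.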
2.5. Dataset Acquisition
In this stage, the induction motor was placed inside the experimental test rig, and most of the data capture was carried out with the equipment described above. The motor phase currents and the vibration signals were acquired. Several experimental measurements, each lasting 20 s, were taken for every motor condition, as presented in Table 1. As all the stator current and vibration measurement equipment has a USB connection, the acquired signals were recorded and stored in flash memory.
The sampling frequency for the vibration measurements was 15 kHz, and each current measurement contained 20,365 sampled points with a maximum frequency of 2 kHz. In each test, the three-phase stator currents and the vibration signals were recorded simultaneously under different load conditions, with the rotational speed changed by means of an eddy-current brake.
4. Results
In this section, the proposed application is assessed based on the experimental data of the current and the vibration signals.
In order to verify whether the proposed model, combined with the proposed feature selection techniques for selecting discriminative features, benefits the fault detection procedure, the acquired signals with the same extracted features were used as inputs to the classification algorithms. When the current signal was applied, Invasive Weed Optimization (IWO) selected eight of the forty features, indexed [5, 10, 11, 16, 22, 23, 29, 34], with a best cost of 0.0006, as illustrated in Figure 8a, for creating the final feature matrix. For the vibration signal, eleven features, indexed [2, 4, 5, 7, 11, 14, 15, 17, 23, 26, 28], were selected with a best fitness of 0.0034, as shown in Figure 8b.
The same procedure was carried out with the Genetic Algorithm (GA) to assess the performance of the proposed IWO algorithm by comparison. With GA applied to the current signal, nineteen features were chosen instead of all the original features. The indices of the selected features are [3, 5, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 22, 23, 25, 26, 29, 33, 34], with the loss curve displayed in Figure 9a. Eighteen features were selected from the vibration signal, indexed [2, 3, 4, 5, 7, 9, 11, 13, 14, 15, 16, 17, 19, 20, 23, 26, 27, 28], with the loss curve shown in Figure 9b. As stated in Table 6, IWO selected a minimum of eight features from the current signal and eleven from the vibration signal, whereas GA selected a minimum of 19 and 18 features from the current and vibration signals, respectively. The classification results based on IWO and GA feature selection are shown in Table 7 and Table 8, respectively.
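Building the final feature matrix from a selected index list amounts to keeping the corresponding columns. The sketch below uses the eight indices that IWO selected for the current signal. It assumes that the indices are 1-based and that there are forty original features; the placeholder matrix values and the function name are hypothetical.

```python
# Indices reported for IWO on the current signal.
IWO_CURRENT_IDX = [5, 10, 11, 16, 22, 23, 29, 34]


def select_columns(rows, indices, one_based=True):
    """Keep only the listed feature columns from every observation row."""
    offset = 1 if one_based else 0  # assumption: reported indices start at 1
    return [[row[i - offset] for i in indices] for row in rows]


# Hypothetical matrix of 3 observations x 40 features; entry = 100 * row + column.
full = [[100 * r + c for c in range(1, 41)] for r in range(3)]
reduced = select_columns(full, IWO_CURRENT_IDX)
print(len(reduced), len(reduced[0]))
```

If the reported indices were instead 0-based, passing `one_based=False` would shift every selected column by one, so the indexing convention should be confirmed before reusing the lists.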
Next, the optimal feature subsets selected by IWO and GA were applied to three machine learning models, KNN, SVM, and RF, for classification into their respective classes. The parameters of KNN, SVM, and RF were set as follows: neighbour = 1 for KNN; C = 20 and kernel = RBF for SVM; and n_estimators = 250 for RF. Each classifier's performance was evaluated using the specificity, accuracy, sensitivity, precision, F1-score, and receiver operating characteristic (ROC) curve. Moreover, the area under the ROC curve (AUC) was considered because it gives a good indication of how well a classification model performs on a dataset. AUC values range between 0 and 1. A value around or below 0.5 means that the classifier has performed poorly, with many misclassifications, whereas a value close to 1 represents an effective model. The ROC curves of the proposed model are presented in Figure 10a,b for the current and vibration signals, respectively.
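The evaluation measures listed above follow directly from the confusion counts of one class against the rest. A minimal sketch is given below; the counts and scores are invented for illustration and are not results from this study, and the function names are hypothetical. The AUC is computed from its pairwise definition: the fraction of positive and negative pairs in which the positive sample receives the higher score.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and classifier scores, used only to show the calls.
metrics = binary_metrics(tp=8, fp=1, tn=9, fn=2)
area = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(metrics, area)
```

For a multi-class problem such as the eight motor conditions, these quantities are typically computed per class in a one-vs-rest manner and then averaged.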
It can be stated that the highest classification accuracy was obtained by the Random Forest (RF) classifier: 99.9% with the current signal and 99.7% with the vibration signal, both with 10-fold cross-validation. Furthermore, when the SVC model was trained with 5-fold cross-validation, the accuracy was 97.7%, and it rose to 98.4% with 10-fold cross-validation using the current signal. The classification accuracy of the KNN classifier was lower: 97.1% and 91.5% with 10-fold cross-validation for the current and vibration signals, respectively. The other evaluation parameters, such as specificity, precision, recall, and F1-score, followed the same pattern as the accuracy. Furthermore, the AUC score for RF was 1 for both the current and vibration signals, while it was 0.99 and 0.98 for SVC and 0.97 and 0.95 for KNN, which indicates that the RF and SVC models perform well in comparison with the KNN classifier.
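The k-fold procedure behind these figures can be sketched with the simplest of the three classifiers, a nearest-neighbour rule with one neighbour. The sketch is self-contained and illustrative: the toy data, the fold assignment by index modulo k, and the function names are assumptions, and the study's actual pipeline is not reproduced here.

```python
import math


def k_fold_indices(n_samples, k):
    """Assign sample i to fold i % k, so every sample is tested exactly once."""
    return [[i for i in range(n_samples) if i % k == f] for f in range(k)]


def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to the query."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


def cross_val_accuracy(x, y, k):
    """Fraction of samples classified correctly while held out of training."""
    correct = 0
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(x)) if i not in held_out]
        train_y = [y[i] for i in range(len(x)) if i not in held_out]
        correct += sum(predict_1nn(train_x, train_y, x[i]) == y[i] for i in test_idx)
    return correct / len(x)


# Hypothetical toy data: two well-separated one-dimensional clusters.
x = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 0.0] for i in range(10)]
y = [0] * 10 + [1] * 10
acc5 = cross_val_accuracy(x, y, k=5)
acc10 = cross_val_accuracy(x, y, k=10)
print(acc5, acc10)
```

With more folds, each model is trained on a larger share of the data (90% for 10 folds against 80% for 5 folds), which is one plausible reason for 10-fold scores being slightly higher than 5-fold scores.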
In addition, as stated in Table 9, RF achieved the highest accuracy with the current signal: 99.2% and 99.6% with 5-fold and 10-fold cross-validation, respectively. With the vibration signal, it achieved 98.9% and 99.1% with 5-fold and 10-fold cross-validation, respectively. When the model was trained using SVC, the highest accuracy was achieved with 10-fold cross-validation: 98.4% for the current signal and 99.4% for the vibration signal. Additionally, when the model was trained using KNN, the best accuracy, 93.7%, was obtained with the current signal and 10-fold cross-validation. The other evaluation parameters show the same outcome.
Moreover, it can be seen from Table 7 and Table 8 that both IWO and GA produced better results with 10-fold cross-validation than with 5-fold cross-validation. For further comparison, GA and IWO are compared against each other in Table 9 for the stator current and vibration signals. When coupled with the KNN and SVM classifiers, IWO achieved better results on the current signal data for all measurements.
Regarding the IWO with RF classifier (IWO–RF), IWO achieved better results on the vibration signal data in all evaluation measurements. For the current data, however, IWO–RF achieved similar sensitivity and better classification accuracy compared with GA–RF, while GA–RF achieved better results than IWO–RF in the precision, sensitivity, and F1-score measurements.
6. Conclusions and Future Work
In this paper, a novel hybrid model is proposed and implemented for fault diagnosis in an induction motor. To enhance the performance of the proposed application, optimization algorithm-based feature selection methods were adopted to select the discriminative features, and the model was further compared in terms of the number of selected features. Three machine learning classifiers were trained with a cross-validation strategy to detect the induction motor faults. To validate the robustness of the proposed model, the current and vibration signals from different motor states, including the healthy and seven faulty conditions, were applied. Forty initial features were extracted separately from each signal using Matching Pursuit (MP) and Discrete Wavelet Transform (DWT) in the time-frequency domain, and the redundant data were then reduced. Invasive Weed Optimization (IWO) selected a minimum of eight features from the current signal and eleven features from the vibration signal, whereas the Genetic Algorithm (GA) selected nineteen features from the current signal and eighteen features from the vibration signal. The selected features were used to train three machine learning classifiers, KNN, SVM, and RF, for fault diagnosis. The overall accuracy of the classification algorithms trained with 10-fold cross-validation was satisfactory, indicating that the proposed model is promising for this application. Additionally, to validate the effectiveness of the suggested model, the optimization algorithm-based feature selection methods were compared on the same dataset. The comparison indicated that the proposed methods, which use fewer statistical features, achieve equivalent or better accuracy. In particular, the IWO algorithm selected fewer features with greater overall classification accuracy.
The results showed that the highest classification accuracy of this model, 99.9%, was achieved by coupling the IWO algorithm with the Random Forest classifier (IWO–RF) when the model was tested with the current signal.
In future work, the applicability of the proposed methodology to other datasets will be investigated, in particular datasets in which the faults reflected in the current and vibration signals are at an early stage. Furthermore, it is recommended to enhance the IWO algorithm by modifying it or hybridizing it with another metaheuristic algorithm to produce more accurate fault diagnostic systems for induction motors. Moreover, a convolutional neural network will be used to extract the initial features from the current and vibration signals. To be used as inputs to a convolutional neural network, the current and vibration data will be converted from one-dimensional time series to images using the Gramian Angular Field (GAF) algorithm. For each current and vibration time series, two images will be created, one using the Gramian Angular Summation Field (GASF) method and the other using the Gramian Angular Difference Field (GADF) method. The obtained images will be classified using the same machine learning classifiers.