Remaining Useful Life Estimation of Rotating Machines through Supervised Learning with Non-Linear Approaches
Round 1
Reviewer 1 Report
Intelligent condition-based monitoring is an active research area for industry. The authors propose a novel and robust method to estimate the remaining useful life of rotating machines through supervised learning with non-linear approaches. As bearing failure is one of the most common causes of machine failure, it is a meaningful research subject for this manuscript. The machine learning framework is illustrated in detail, and the results show its validity and accuracy. This method is highly valuable for advanced predictive maintenance of various machines in industry.
The manuscript is scientifically sound and well organized. The PRONOSTIA bearing dataset, which is widely used in this field, was selected to support the proposed method. The figures, tables and schemes are appropriate and easy to interpret and understand. Detailed results are clearly shown and discussed to draw the conclusions, which are consistent with the arguments presented.
As this manuscript is well written and structured, I suggest that it be accepted in its present form.
Reviewer 2 Report
Comments are given in .pdf file.
Comments for author File: Comments.pdf
Author Response
Dear Reviewer,
Thank you for your time and effort reviewing our paper. We have addressed each of your comments, and our responses are given individually below.
In this paper, the authors introduce an intelligent condition-based monitoring (CbM) method to predict rolling element bearing fault modes. The method is based on feature extraction techniques and machine learning (ML) methods. The proposed method is tested and validated on the PRONOSTIA platform dataset.
The paper is well-structured in general, and the results of the paper are generally written clearly.
>> Thank you for the complimentary comments of our work.
However, there are some issues that should be addressed. Therefore, I propose a major revision of this manuscript.
Abstract, line 5: since you are using support vector machines (SVM) and k-nearest neighbors (kNN), which are two traditional, classical methods, it is not appropriate to state that you use “advanced machine learning….”.
>> We would like to thank you for your time and effort in reviewing our paper. Please see our responses to your comments below. We agree with this point, and we have changed the text in the paper’s Abstract to the following: "Machine learning algorithms are currently being investigated to accurately predict the health of machines and equipment in real time."
You should give a clear explanation and strong reasons why you have chosen Short-time Fourier Transform (STFT) and Envelope Analysis (EA) for feature extraction and SVM and kNN for classification.
>> Thank you for this comment. We agree that the rationale for choosing these approaches was not clear in the original submission, so we have added the following text to the second-to-last paragraph of the introduction: "The rationale for using Fourier and EA based feature extraction was as a result of detailed vibration signal analysis conducted on the bearing signals from the dataset. This motivated the incorporation of non-linear feature compression of the multidimensional feature space using Octave bands as the prognostic information signatures are highly concentrated in the lower portion of the spectra." AND "SVM and k-NN algorithms were chosen because of their robustness for supervised learning problems, in particular problems with datasets of limited size, which greatly limits the suitability of applying other more advanced deep learning approaches e.g. ANN and LSTM." AND "The time frequency analysis conducted on the vibration signals also motivated the investigation of non-linear wear state models as the bearing degradation typically do not follow linear trends."
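As an illustration of the octave-band feature compression mentioned in the response above, the following is a minimal sketch and not the authors' implementation. It assumes a single magnitude spectrum (for example one STFT frame) and sums its energy into bands whose edges double in frequency, so the lower part of the spectrum keeps finer resolution. The function name, the starting frequency and the number of bands are hypothetical choices made for the example.

```python
import bisect


def octave_band_energies(freqs_hz, magnitudes, f_low_hz, n_bands):
    """Sum spectral energy into bands whose edges double in frequency.

    Band i covers [f_low_hz * 2**i, f_low_hz * 2**(i + 1)).
    All parameter values are illustrative assumptions.
    """
    edges = [f_low_hz * 2 ** i for i in range(n_bands + 1)]
    energies = [0.0] * n_bands
    for f, m in zip(freqs_hz, magnitudes):
        if f < edges[0] or f >= edges[-1]:
            continue  # bin lies outside the covered frequency range
        band = bisect.bisect_right(edges, f) - 1  # index of the enclosing band
        energies[band] += m * m  # accumulate energy (squared magnitude)
    return energies


# Hypothetical spectrum: six bins compressed into three octave bands.
features = octave_band_energies(
    [100, 150, 200, 399, 400, 800], [1, 2, 3, 4, 5, 6], 100.0, 3
)
```

Because each band is twice as wide as the previous one, many high-frequency bins collapse into a single feature while low-frequency content is spread across several features.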
Figure 1, part b) – it is not clear if the data presented on this graph is yours, or taken from literature. If it is taken, please cite a source.
>> Excellent point. While Figure 1(b) is our own plot, the data were sourced from Gyftakis and Cardoso (2019), which is reference [6]. We have adjusted the caption text to cite this paper as well as an earlier paper by Tavner (2008).
I suggest that you be consistent in your usage of the terms algorithm, method and model in the manuscript.
>> Thank you for this suggestion. We have adjusted the text so that the word "model" is used only for the wear state classes (e.g. linear and nonlinear models), the word "method" describes the overall machine learning approach, and the word "algorithm" refers to the SVM and kNN supervised learning algorithms. In addition, the word "recipe" is used to differentiate the various combinations of feature sets (e.g. different variations of STFT and EA) and ML algorithms (e.g. SVM with different kernel functions, and k-NN with different values of k and distance metrics). This increases the clarity and readability of the paper. Along the same lines, we previously used the terms Experimental "Procedure", Experimental "Setup" and Experimental "Framework"; we now use only the wording "Experimental procedure" for consistency and clarity.
I suggest adding a subsection 3.1. Data, where you should describe the dataset in detail: number of instances, number of attributes, number of classes, whether the dataset is balanced, etc.
>> Thank you for this comment. The dataset is already described in detail in Section 2, Vibration Signal Analysis, and we do not want to duplicate that text. We believe it is better placed there because it serves two purposes: it introduces the dataset before the vibration signal analysis and data visualisation (for instance Figure 2, Table 1 and Figure 3), and it supports the subsequent sections of the paper. However, it is a good point, so to help the reader we have now broken Section 2 down into two subsections, "Dataset" and "Signal Analysis", which improves the accessibility and clarity of the content.
Subsection 3.4.1. Support vector machines should be reorganized: you should first state some theoretical background of SVM, together with appropriate references. The last part of the subsection should be devoted to the parameters that you aim to optimize. The same comments apply to 3.4.2. k-nearest neighbors.
>> Thank you for this comment. We have slightly adjusted the text of the SVM subsubsection (3.4.1) as suggested and added references at the beginning of the kNN subsubsection (3.4.2). Both subsubsections are now presented consistently.
Table 3: what is the justification for using these combinations of method and distance? Also, these values for k are very unusual. You should explain that.
>> Thank you for this comment. In this work we experimented with different k values and different weighting metrics to try to optimise the ML approach; similarly, for SVM we explored 6 different kernel function types. As you correctly point out, the original manuscript did not state our rationale, so we have added the following text to explain why we try different types. For SVM: "to try to optimise the ML method for this particular RUL classification problem." For kNN: "Similar to the case of SVM with experimenting different kernel function types here we have experimented with different weighting metrics to try to optimise the ML method for RUL estimation."
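To make the idea of kNN "recipes" with different k values and weighting schemes concrete, here is a small sketch in plain Python. It is an illustration under stated assumptions, not the authors' code: the two weighting options ("uniform" and "inverse"), the Euclidean distance, the toy data and the k values are all hypothetical, and the actual settings of Table 3 are not reproduced here.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, query, k, weighting="uniform"):
    """Classify `query` by a vote among its k nearest training points.

    weighting="uniform": each neighbour counts equally.
    weighting="inverse": each neighbour counts 1 / distance.
    """
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        if weighting == "uniform":
            votes[label] += 1.0
        elif weighting == "inverse":
            votes[label] += 1.0 / (distance + 1e-12)  # avoid division by zero
        else:
            raise ValueError(f"unknown weighting: {weighting}")
    return max(votes, key=votes.get)


# Hypothetical one-dimensional features with wear-state labels 0, 1, 2.
train_x = [(0.0,), (1.0,), (1.1,), (5.0,)]
train_y = [0, 1, 1, 2]

# A recipe here is one (k, weighting) pair; the grid is illustrative only.
recipes = [(k, w) for k in (1, 3) for w in ("uniform", "inverse")]
predictions = {
    recipe: knn_predict(train_x, train_y, (0.2,), *recipe) for recipe in recipes
}
```

In this toy case the recipe (3, "uniform") predicts class 1 while (3, "inverse") predicts class 0, which shows why the choice of k and weighting is worth exploring for a given problem.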
Consider moving the part of Section 5, Results, that deals with error metrics to Section 4, Experimental procedure. Also, why do you not use traditional ML metrics for classification such as recall, precision and F1?
>> Thank you for this comment. This is an excellent observation. We have made the change as requested and moved that material to the end of the experimental procedure section, Section 4. We also broke that section down into three subsections (4.1. ML Method Recipes, 4.2. Round Robin Framework, 4.3. Performance Metrics) to improve the structure; this is consistent with the other sections of the paper, where subsections are used to similar effect.
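For readers comparing the metrics discussed in this exchange, the sketch below computes classification accuracy and MAE (the two metrics the authors report) alongside the per-class precision, recall and F1 that the reviewer mentions. It assumes the wear-state classes are encoded as ordered integers, so that MAE measures how far a misclassification lands from the true class; that encoding and the toy labels are assumptions, and the record does not state whether the authors computed precision, recall or F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean distance between true and predicted ordinal class indices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """One-vs-rest precision, recall and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for three ordered wear states (0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
p1, r1, f1_1 = precision_recall_f1(y_true, y_pred, label=1)
```

Two predictions are wrong in this example, but one misses by one class and the other by two; accuracy treats them alike, whereas MAE reflects the difference in severity.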
Maybe you should present numbers along with percentages in confusion matrices presented in Fig. 7 and Fig 8.
>> Thank you for this comment, this is a very good point. As background, during the preparation of the manuscript we closely considered several ways of presenting the data in the confusion matrices, one of which used the absolute numbers. However, during the analysis and the writing of the discussion section we found that the data are much easier to read and interpret in percentage format. Tables 4, 5, 6 and 7 use the classification accuracy and MAE metrics, which summarise performance and hence do not allow delving deeper into performance at the ‘class level’. The purpose of the confusion matrices is to offer more granularity in the results for the individual classes. Hence, we feel that the confusion matrices are best left in this format, as they effectively expand on the previous results tables and complement the subsequent plots in Figure 9: the diagonal values of the confusion matrices are the points used in Figure 9, along with the mean value. On reflection, this last point may not have been clear in the text, so we have highlighted it in the Discussion section, Section 6, as follows: "These points along with a mean value correspond to the diagonal values for the confusion matrices in Figure 7 and 8."
The paper lacks a clear comparison with traditional approaches that deal with this problem. Is it possible to add some results from the literature?
>> Thank you for this comment. It is extremely difficult to compare like with like, because we tested our ML method (with various recipes) on all of the bearing samples in the dataset for condition 1, which is more challenging, whereas other papers typically test on only a subset, e.g. 5 bearings. In addition, our paper uses two metrics, the classification accuracy and the MAE, and it was not possible to find other papers which use both. However, we have added text to the discussion section which gives a ballpark comparison with 3 prior works on the same dataset: "Prior work by Sutrisno et al., Singleton et al., Lei et al., presented ML methods which achieved percentage accuracy scores of 76.2%, 67.20%, 77.44%, respectively using the PRONOSTIA bearing dataset. However, these proposed methods utilised a framework where only bearings S.01 and S.02 are used for training the algorithm, and the remaining 5 bearings are used for testing. The round-robin experimental framework presented in this paper presents the mean percentage accuracy of all 7 bearing signals whereas the prior work only presents the mean of 5. Also, the MAE performance metric was used for analysis purposes to ascertain the severity of the misclassifications."
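The relationship between absolute counts, percentage confusion matrices and their diagonal values can be sketched as follows. This is an illustrative example only: it assumes rows index the true class and that percentages are normalised per row, neither of which is stated in this record, and the labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with rows = true class, columns = predicted class."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


def row_percentages(counts):
    """Convert each row of counts to percentages of that true class."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


def diagonal_with_mean(percentages):
    """Per-class accuracy (diagonal entries) and their mean."""
    diagonal = [percentages[i][i] for i in range(len(percentages))]
    return diagonal, sum(diagonal) / len(diagonal)


# Hypothetical labels for three wear-state classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2, 2, 1]
counts = confusion_counts(y_true, y_pred, n_classes=3)
percentages = row_percentages(counts)
diagonal, mean_diagonal = diagonal_with_mean(percentages)
```

Under these assumptions the diagonal gives one per-class accuracy value per class, which is the kind of class-level detail that a single overall accuracy figure hides.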
Reviewer 3 Report
The paper is very well written. Nice research work was carried out.
Thank you.
Author Response
Dear Reviewer,
Thank you for your time and effort reviewing our paper.
The paper is very well written. Nice research work was carried out.
Thank you.
>> We would like to thank you for your time and effort in reviewing our paper. Thank you for the complimentary comments of our work.
Reviewer 4 Report
This paper has introduced a valuable machine learning (ML) approach to estimate the RUL of rolling element bearings in rotating machines. The proposed ML recipes and approaches comprise signal processing techniques and ML algorithms applied to real-world vibration signals which were acquired from the outer race of bearings degraded over time using an accelerated ageing test rig.
The research content and logical flow of the paper are quite interesting. However, the authors missed one major element: the limitations of the paper.
Before publication, it is recommended that the limitations and future work of the research be covered in depth, such as the number of datasets and the methods used for comparison.
Author Response
Dear Reviewer,
Thank you for your time and effort reviewing our paper. We have addressed and replied to each of your comments below.
This paper has introduced a valuable machine learning (ML) approach to estimate the RUL of rolling element bearings in rotating machines. The proposed ML recipes and approaches comprise signal processing techniques and ML algorithms applied to real-world vibration signals which were acquired from the outer race of bearings degraded over time using an accelerated ageing test rig.
>> We would like to thank you for your time and effort in reviewing our paper. Thank you for the complimentary comments of our work.
The research content and logical flow of the paper are quite interesting. However, the authors missed one major element: the limitations of the paper.
Before publication, it is recommended that the limitations and future work of the research be covered in depth, such as the number of datasets and the methods used for comparison.
>> Thank you for this comment. We have added further text on the limitations of the work, which outlines future avenues to explore: "Testing the versatility and robustness of the proposed ML method with the various recipes on different bearing types and sizes under different speed and load conditions. Also work could explore vibration data acquisition from research testbeds where the shaft speed changes. This will require developing extensive experimental campaigns to create more advanced datasets which better reflect typical real world operating conditions." On the method comparison, it is difficult to compare like with like: we tested our method on all the bearings in the dataset for condition 1, which is more challenging, whereas other papers typically test on only a subset, e.g. 5 bearings. In addition, our paper uses two metrics, the classification accuracy and the MAE, and it was not possible to find other papers which use both. However, we have added text to the discussion section which gives a comparison with other approaches: "Prior work by Sutrisno et al., Singleton et al., Lei et al., presented ML methods which achieved percentage accuracy scores of 76.2%, 67.20%, 77.44%, respectively using the PRONOSTIA bearing dataset. However, these proposed methods utilised a framework where only bearings S.01 and S.02 are used for training the algorithm, and the remaining 5 bearings are used for testing. The round-robin experimental framework presented in this paper presents the mean percentage accuracy of all 7 bearing signals whereas the prior work only presents the mean of 5. Also, the MAE performance metric was used for analysis purposes to ascertain the severity of the misclassifications."
Reviewer 5 Report
Are there any overfitting issues? How do you address it?
Author Response
Dear Reviewer,
Thank you for your time and effort reviewing our paper.
Are there any overfitting issues? How do you address it?
>> Thank you for this comment. Yes, overfitting has been minimised by using only out-of-sample signals to test the ML method with the different recipes. We achieved this using the round-robin approach described in the second paragraph of Section 4, Experimental Procedure. We agree that this is an important point, so we have added text to the first sentence of that paragraph noting it; this complements the references to overfitting later in that paragraph and in the last paragraph of the discussion section.
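The round-robin approach referred to in this response can be read as a leave-one-bearing-out evaluation: each bearing is held out in turn, the algorithm is trained on the remaining bearings, and accuracy is measured only on the held-out signal. The sketch below illustrates that reading under stated assumptions; the data layout, the bearing names, the toy features and the 1-nearest-neighbour stand-in classifier are all hypothetical and are not taken from the paper.

```python
def round_robin_scores(data, fit, predict):
    """Leave-one-bearing-out accuracy.

    data: dict mapping bearing id -> (list of features, list of labels).
    fit(train_x, train_y) returns a model; predict(model, x) returns a label.
    """
    scores = {}
    for held_out in data:
        train_x, train_y = [], []
        for bearing, (xs, ys) in data.items():
            if bearing != held_out:  # never train on the test bearing
                train_x.extend(xs)
                train_y.extend(ys)
        model = fit(train_x, train_y)
        test_x, test_y = data[held_out]
        hits = sum(predict(model, x) == y for x, y in zip(test_x, test_y))
        scores[held_out] = hits / len(test_y)
    return scores


# Stand-in classifier: 1-nearest neighbour on a scalar feature.
def fit_1nn(train_x, train_y):
    return list(zip(train_x, train_y))


def predict_1nn(model, x):
    return min(model, key=lambda pair: abs(pair[0] - x))[1]


# Three hypothetical bearings, each with two samples and binary wear labels.
data = {
    "B1": ([0.1, 0.9], [0, 1]),
    "B2": ([0.2, 0.8], [0, 1]),
    "B3": ([0.0, 0.4], [0, 1]),
}
scores = round_robin_scores(data, fit_1nn, predict_1nn)
mean_accuracy = sum(scores.values()) / len(scores)
```

Because every score comes from a bearing the model never saw during training, the mean over all held-out bearings is an out-of-sample estimate, which is the property the response relies on when discussing overfitting.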
Round 2
Reviewer 2 Report
The manuscript can be accepted in its present form.