The performance of the algorithm was first tested on a synthetic dataset. In this section, the simulation set-up as well as the obtained results are described.
3.1. Simulation Set-Up
The model used for synthetic data generation consists of a 5 Degrees of Freedom (DoF) lumped parameters model capable of reproducing the bearing non-stationary vibration behaviour. As previously described by the authors [
39,
40], the chosen model is made of a set of motion equations to represent the inner ring, outer ring and bearing housing displacements and a set of equations coming from the Hertzian contact theory used to compute the contact force of the touching surfaces. This model includes a non-stationary degradation profile which allows us to take into account the evolution of surface defects that affect the bearing functioning.
Figure 11 shows a scheme of the proposed model, while
Figure 12 shows the structure of the simulation loop. In the presented scheme the orange blocks represent the model input parameters, the blue blocks represent the output and the red blocks represent the coupled variables computed during the simulation.
The proposed model can simulate a great variety of simulation scenarios characterised by different defect shapes and degradation evolution profiles. In this particular case, a rectangular defect was considered on the Outer Ring whose width evolves over time with a linear profile. This very simple scenario was chosen because the objective of the synthetic data generation is not simply to produce bearing vibration signals similar to real ones but it is also to generate a meaningful dataset that can test the performance of the proposed algorithm. By choosing this simple scenario, the authors are able to generate vibration signals whose content is known in detail, reducing the dataset uncertainty level. This choice will ease the interpretation of the obtained results and will allow the identification of possible algorithm problems. A complete list of all the utilised bearing parameters is provided in
Table 2 and
Table 3.
The rotational speed and the vertical load reported in
Table 2 were selected during the model validation process [
39]. The vertical load is in the order of magnitude of previous reusable rocket engines. For example, the SSME liquid oxygen turbopump bearings were submitted to this kind of radial loads [
43]. The typical rotational speed of reusable rocket engines are much higher, ranging from 20,000 rpm to 90,000 rpm. The selected rotational speed for data generation, 600 rpm, is quite far from the classical nominal LPRE one, but given that the model has not been validated yet on higher speed simulation conditions, it was decided to first test the proposed algorithm on a lower rotational speed for which the correct vibration generation has been ensured before. Later on, with a transfer learning approach, the proposed methodology and model will be adapted to real rocket engine vibration signals.
Table 3 includes all the parameters used to model the bearing surface defect. The defect is represented by a rectangular shape of width
B and depth
H. At the beginning of the simulation, no defect is present and a healthy signal is obtained. At time
the degradation appears. The value of
H is fixed, while the value of
B increases over time as shown by the plots in
Figure 13. The time
is the moment when
B becomes wide enough so that two bearing balls fall inside the defect simultaneously. This produces an interference in the contact forces that cancel each other out or are amplified in an unpredictable way. The authors estimated that this is the time at which failure occurs and used this value as reference to asses the precision of the RUL estimation.
3.2. Computation Set-Up
The Multi-Level MB-HMM scheme from
Figure 9 needs an a priori definition of the number of branches. In this case a MB-HMM model composed of two branches was chosen as presented in
Figure 14. The available data were split in two groups of 30 signals and each branch was trained separately using the Baum–Welch algorithm. The transition matrices,
and
, are 4 × 4 matrices since four hidden states were selected. The HI discretization was performed using six intervals. Thus the emission matrices,
and
, are 4 × 12 matrices. Equation (
22) shows the initialisation matrices
and
.
The initial transition matrix
has the same structure as previously presented in Equation (
16). The values on the diagonal were set to 0.7 and the other to 0.3 following the initial assumption that it is more probable to stay in the current state than transitioning. This hypothesis is in line with the proposed left-to-right HMM structure.
A bootstrap approach was selected to conduct the algorithm training and testing. One signal is selected from the training datasets and put aside. This signal is the testing signal. The remaining ones are processed and used to train the Multi-Level MB-HMM model. When the latter is ready, the testing signal is divided in segments whose length is equal to the length of a mission. A loop was implemented to retrieve a segment, chain it to the previous segments and compute the RUL. This approach reproduces the real-life scenario in which the reusable LPRE performs multiple missions and at the end of each mission the bearing SoH is verified.
The observed computation time to perform one loop is only a few minutes. Since the proposed method is applied off-line the observed computation time is not an issue. Of course, the RUL estimation process should be as quick as possible in order to avoid delays in the maintenance planning. Given an usual turnaround times of days or weeks in the case of reusable launchers, a computational time of some minutes is acceptable. The most time-consuming step is the wavelet decomposition, which is logical since the vibration signals are quite long. The rest of the steps concerning the HI construction and the RUL computation are fast. In particular, the authors have observed very quick RUL estimation times, . This time includes the MB-HMM training, the HI decoding and the RUL estimation. From a theoretical point of view, the computational complexity of the training, decoding and RUL computation is directly linked to the size of matrix A and matrix B. The Baum–Welch training algorithm as an asymptotic complexity of × O() and the decoding algorithm has an asymptotic complexity of O(), where is the number of iterations, N is the number of states and t is the time steps. The computation of the determinant has an asymptotic computation complexity of O() where n is the matrix size. Even though the presented asymptotic complexity is not very fast, the fact that the proposed HMM model has a minimised size compensates for that and in the end the computation time has an order of magnitude of a few seconds. Indeed, thanks to the reduced size of the A and B matrices the computation speed is increased for both algorithm training and testing. This aspect highlights the computation efficiency of the proposed algorithm, which is one of its main strengths. The interesting aspect of the proposed approach is that a complex degradation phenomenon could be approached with a small model. Indeed, given and , a total of 48 parameters was estimated during the learning step, which is small compared to AI approaches characterised by hundreds of parameters.
3.3. Results and Discussion
Following the simulation set-up presented in the previous section, 116 bearing vibration signals were generated.
Figure 15 shows, as an example, the vibration signal number 10. By looking at the figure, it is possible to clearly distinguish the vibration produced during the nominal functioning of the bearing and the vibration produced during its degraded operation. Starting from
h, the signal magnitude increases in a gradual way until
h when its behaviour becomes random. When the magnitude becomes unpredictable, failure is reached.
The described signal was processed with the MODWPT and the obtained sub-signals were used to compute the family of health indicators whose plots are reported in
Figure 16a–h. The analysis of these figures reveals that the higher energy density can be found at the lower frequencies, between 0 and 3.2 kHz. The first and second frequency bands are the most affected by the degradation. The health indicators in
Figure 16a,b show clear signs of variation all along the bearing life. The other health indicators all show an abrupt decrease in energy at
and remain constant before and after. As a consequence, they are not good candidates for RUL estimation and prediction.
All the obtained HI underwent the failure detection step.
Figure 17 and
Figure 18 show the result of the DCS statistical test for the HIs from
Figure 16a and
Figure 16b, respectively. The results concerning the other frequency bands are not reported here since no failure could be detected. In both
Figure 17 and
Figure 18 the upper plot shows the concerned health indicator together with the detected failure. The lower plot reports the DCS statistical indicator (orange) and the decision function (blue). As can be observed, the DCS presents picks of different amplitude over time and the highest do not always corresponds to the time instant when failure occurs. Anyway, the failure is properly detected, showing that the combination of DCS statistical test described in
Section 2.2.4 is capable of correctly performing the detection step even in such difficult conditions.
Figure 19a,b shows the results of the HI discretization. As before, all the available health indicator were treated but only the ones concerning the lower frequency are reported since the other did not present interesting features. In both
Figure 19a,b, the upper plot shows the two discrete sequences,
and
, and the lower plot shows the final discrete sequence,
. The
indicator (blue) is the result of a quantisation of the initial health indicator considering six intervals of equal distance. As can be noticed, the relative energy increase is still noticeable but the failure instance is less visible. Increasing the number of the quantisation intervals could help, but, at the same time, it would increase the size of the HMM
B matrix. The introduction of the
secondary discrete HI allows us to construct a better final discrete sequence where the regions before and after failure are visible. Two examples of the obtained
discrete sequences are presented in green in
Figure 19a,b to prove that the proposed discretization technique enhances the degradation profile contained in the initial HI.
All the available synthetic signals were treated like Signal 10 and the obtained discrete sequences were used to perform the MB-HMM training.
Figure 20 shows some of the used training sequences. In the figure, the different degradation speeds can be easily observed. The models,
and
, were obtained by applying the Baum–Welch training algorithm and they are reported here. Equations (
23) and (
24) concern the fast dynamic branch:
Equations (
25) and (
26) concern the slow dynamic branch:
Figure 21 and
Table 4 illustrate the final results for Signal 10.
Figure 21 shows the estimated RUL (red), the real RUL (black), the error margins at +30% (cyan), −30% (brown) and the mission duration limit (green). The estimated RUL is represented with round markers while the other variables are reported with solid lines. This different graphical representation is to underline the fact that the RUL is estimated at the end of each mission and not with a real-time approach. By looking at the figure, it is possible to notice that the RUL is predicted in an accurate way all along the bearing life.
Table 4 reports the data concerning the selection of the level (frequency band presenting the best degradation profile) and the selection of the HMM model (fast or slow). Both the correct frequency band,
, and the correct branch in the MB-HMM model, “Fast”, are selected. Moreover, it is possible to notice how the posterior probability of the selected HMM model increases mission after mission. This result shows that the proposed algorithm lowers the uncertainty in the model selection step as time goes on. Indeed, the sequences retrieved after each mission are chained, thus a longer sequence containing all the bearing history is provided to the algorithm which can perform a better informed choice. The mean average percentage error was computed to give an indication of the RUL estimation accuracy. In this case, it is equal to 2.5%. This result highlights the good prediction capabilities of the proposed MB-HMM model.
The proposed algorithm was tested as well with other synthetic signals. In particular,
Figure 22 shows the low frequency
for signals 20, 30, 90 and 110. Signal 10 is reported as well as a reference case. Signals 20 and 30 are both characterised by a slow degradation dynamics. In particular, signal 31 presents the slowest fast degradation among the simulated ones. Signals 90 and 110 are characterised by slow degradation dynamics.
Figure 23 and
Table 5 report the obtained results concerning signal 20. In this case, the RUL estimation error is higher than what is observed in
Figure 21. The RUL is underestimated from mission 1 to mission 9 and respects the −30% error limit until mission 7. The computed percentage error is 21.67%.
This result can be justified by considering the HMM model from a practical point of view. In particular, the focus should be put on the transition matrix
. By definition each element is the probability of passing from one state to another and its inverse is the mean duration of the hidden state itself. The HMM model captures the average behaviour based on the training data set which is provided. As a consequence, when the
model is used for RUL prediction, it will work perfectly for signals that are very close to the average behaviour and it will be less accurate for signals that are far from the detected dynamic. Based on these remarks, the results presented in
Figure 23 can be explained by the fact that the signal is characterised by a slower dynamics than the one detected in the training dataset. As a result, the HMM underestimates the RUL. However, given a mission duration of 0.5 h, the fact that towards the end the RUL estimation is less precise does not bother the decision-making process. Indeed, the bearing would have been already discarded or sent to maintenance.
Figure 24 and
Table 6 report the obtained results concerning signal 30. This signal was generated considering the slowest of the fast degradation dynamics. As a consequence, even if the correct health indicator and HMM model are selected, the RUL estimation respects the margins at ±30% until mission 5, so only for half of the bearing life. The obtained percentage error is equal to 38.72%. As before, this result is motivated by the fact that HMM models capture the average degradation dynamics and they cannot properly decode and predict sequences that are too far from it. A possible solution to this issue could be the definition of additional branches in the MB-HMM model. Consequently, intermediate dynamics could be taken into account, improving the RUL estimation.
Figure 25 and
Table 7 report the obtained results concerning signal 110. As for signal 10, both the frequency band and the HMM branch are correctly selected. Except for a couple of hesitations, the proposed methodology is able to clearly recognise that the current degradation dynamic is the slow one.
Similarly to observations for signals 30 and 20, when the degradation rate becomes faster, the algorithm accuracy decreases. This is the case for signal 90, which is far from the training sequences.
Figure 26 and
Table 8 show that the frequency band selection is correct but that towards the bearing end-of-life, the model selection becomes critical. As shown in this table, the fast dynamic is chosen instead of the slow one, provoking a strong underestimation of the RUL. Again, it occurs when the RUL is close to the mission duration limit. Since this signal falls in the region between the two training datasets it is not surprising that the algorithm has difficulties in selecting the correct degradation branch model in the MB-HMM. In general, a conservative behaviour is observed: the model has the tendency to classify a slow dynamic as “Fast” while the opposite rarely happens. This aspect is very promising and proves that the present methodology can be considered for later application in real rocket engine data. The percentage error is directly affected by the algorithm hesitation and the computed values are 110% for signal 90 and 27.31% for signal 110. This aspect will be addressed first in future works since the method precision and performance directly depends on it. The obtained percentage errors prove it clearly.