1. Introduction
A standard for the minimum lifetime of compression springs under cyclic loading is given in EN 13906-1 [
1]. This standard is a great help for design engineers because it allows a safe design at a reasonable design cost and a reasonable utilization of the materials used. However, the potential for lightweight design is not fully exploited. In addition, EN 13906-1 [
1] is only applicable for up to 10
cycles while valve springs are loaded for up to 10
cycles in many applications, for example in automotive engines.
A new guideline for springs and spring elements [
2] is currently being developed in a cooperation between Forschungskuratorium Maschinenbau (FKM), a German research association, the Technical University of Darmstadt and the Technical University of Ilmenau. Once published, it will allow designing for higher numbers of cycles as well as a better exploitation of lightweight potential. The design process using this new FKM guideline ‘Analytical Strength Assessment of Springs and Spring Elements’ will include specifying a parameter described as the level of trust in the data used to build the underlying model. In practice, this factor incorporates the influence of a given spring producer’s production process into the fatigue design rules of the guideline. This is implemented by raising or lowering the fatigue curve to make its predictions slightly conservative, but not too conservative for compression springs produced in the producer’s given production process.
This factor should be calibrated for a given production process using fatigue data, ideally fatigue curves. In this paper, we discuss the influence of the ultimate number of cycles on the resulting fatigue curves for compression springs manufactured from VDSiCr class valve spring wire (VDSiCr). If tests up to 10 cycles produce conservative results for lifetime predictions up to 10 cycles, fatigue tests conducted only up to 10 cycles may be safely used to calibrate the factor described above. This is relevant for spring manufacturers because testing springs with wire diameters of 1 to 3 mm up to 10 cycles takes about a year of testing time and carries a six-figure euro price tag. To the authors’ knowledge, no investigation of springs with larger diameters has exceeded 10 loading cycles.
As a primer to the question why testing up to 10
, which is common practice, may not be sufficient, see
Figure 1. On the left side, a fatigue test that was interrupted after 10
cycles is presented. For each horizon, the number of run-outs, n
, and the total number of specimens tested, n
, are given. On the right side, the result of the same fatigue test continued to 5·10
cycles is presented. Looking at the diagram on the left, an engineer expecting a pronounced fatigue strength could conclude that at the lowest load level, the springs tested have an infinite lifetime. Confronted with the diagram on the right, we know that this conclusion is wrong.
In
Section 2, the statistical model used in this investigation is introduced. The numerical implementation of an algorithm that fits a model to the data under investigation is described in
Section 3. An extension to this implementation that allows an Artificial Censoring Experiment (ACE) is described there as well. The ACE is a numerical thought experiment that asks what lifetime predictions we would have obtained if we had conducted the fatigue experiment with a lower ultimate number of cycles than we actually did. This thought experiment may answer the question of whether the ultimate number of cycles actually used in our experiments influences the predicted lifetime, i.e., whether our lifetime prediction is admissible. Fatigue results of compression springs manufactured from VDSiCr and the setup for our numerical experiments are presented in
Section 4. The results are presented in
Section 5. Possible extensions to other batches of compression springs manufactured from VDSiCr are discussed in
Section 6. The whole investigation and its key results are summarized in
Section 7.
2. Utilized Statistical Model for Fatigue Events
Fatigue results are customarily plotted with logarithmic scaling in both axes,
Figure 1. In this coordinate system, fatigue curves of compression springs have historically been constructed as bilinear curves descending linearly in the High Cycle Fatigue regime and running horizontally in the Very High Cycle Fatigue regime. The transition was assumed to be somewhere between 10
and 10
cycles. This model is based on the assumption of a pronounced fatigue strength in this range. Fatigue failures of components cyclically loaded in the gigacycle regime and fatigue tests on material specimens in the 1980s and 1990s [
4] have challenged this assumption. Inspired by this challenge, researchers conducted fatigue experiments beyond 10
cycles [
5]. The results of these fatigue experiments disprove the assumption of a pronounced fatigue strength around 10
or 10
cycles for compression springs. This finding has been repeated in a multitude of tests on material specimens [
6,
7,
8,
9,
10,
11,
12,
13,
14] and compression springs [
3,
15,
16].
The model that is used by the upcoming FKM guideline for springs and spring elements [
2] utilizes a trilinear fatigue curve,
Figure 2. The fatigue curve kinks a first time at 10
cycles slowing down its descent and a second time at 10
cycles, running horizontally from there on. This is based on the assumption of a pronounced fatigue strength after 10
cycles, which has not been proven to be false.
However, there is also no experimental evidence available that supports the assumption of a pronounced fatigue strength beyond 10 cycles. Therefore, in this work the more conservative approach of a bilinear fatigue curve is used. The kinking of the fatigue curve occurs at different stroke stresses and different numbers of cycles depending on the properties of the batch of springs under investigation as well as the testing setup. The model used incorporates this through a variable kink point.
Fatigue events in metallic materials usually occur following a log-normal distribution [
17,
18,
19]. Experiments on material specimens [
20] as well as compression springs [
16] have shown that this is not the case for VDSiCr. A key reason for this is the presence of multiple competing failure mechanisms. Problems arising from this have been addressed in literature [
3,
21,
22,
23,
24,
25,
26,
27,
28]; however, consensus on a suitable solution has not yet been reached. In this investigation, a log-normal distribution in the direction of the number of cycles is assumed above and below the kink point. As evident in
Figure 1, variance is much higher at lower load levels. Therefore, two independent variances are used, one above and one below the kink point.
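The model described in this section can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in this investigation: the function names, the parameter names (`s_kink`, `n_kink`, `k_above`, `k_below`, `sd_above`, `sd_below`), and the choice of base-10 logarithms are assumptions made for the example.

```python
import math

def median_cycles(stress, s_kink, n_kink, k_above, k_below):
    """Median number of cycles to failure on a bilinear log-log fatigue curve.

    Both branches are straight lines in log-log coordinates that meet at the
    kink point (n_kink, s_kink). A larger exponent gives a flatter branch.
    All parameter names are hypothetical.
    """
    k = k_above if stress >= s_kink else k_below
    return n_kink * (stress / s_kink) ** (-k)

def survival_probability(cycles, stress, s_kink, n_kink,
                         k_above, k_below, sd_above, sd_below):
    """Probability of surviving `cycles` at `stress`.

    Assumes log10(N) is normally distributed around the median curve, with
    one standard deviation above and another below the kink point.
    """
    sd = sd_above if stress >= s_kink else sd_below
    mu = math.log10(median_cycles(stress, s_kink, n_kink, k_above, k_below))
    z = (math.log10(cycles) - mu) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Evaluating `median_cycles` at the kink stress returns `n_kink` for either branch, so the curve is continuous at the kink point by construction.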
4. Experimental Setup
Several datasets on the fatigue behavior of compression springs manufactured from VDSiCr are available. For ultimate numbers of cycles beyond 10
, fewer datasets are available. All available datasets were generated in a series of research projects at Technical University of Darmstadt [
3,
5,
16,
34].
For this investigation, fatigue data from the most recent research project on the Very High Cycle Fatigue properties of compression springs, IGF 18576 N [
3], are used. The data are presented in
Figure 1. Fatigue curves with 10, 50, and 90% survival probability generated from the original dataset and from an artificially censored dataset are presented in
Figure 5. The 50% fatigue curve on the left is used as a reference for lifetime prediction in the following investigation. It slices the results of all load levels at about the 50% quantile. This indicates it fits the data well.
Both curves look nearly the same up until 8.5· cycles. At over 8.5· cycles, the fatigue curve on the right predicts increasingly longer lifetimes than the one on the left, meaning that it is non-conservative and making it impermissible for fatigue design. The aim of the ACE is to identify this impermissibility with just the data on the right side available. For over 6·, the right side model again makes more conservative lifetime predictions compared to the model on the left, which is dangerous since looking only at high and low load levels, leaving out the middle, one might assume the model on the right side is conservative.
The numerical experiment is split into two runs. In Run 1, the procedure is conducted using the model described in
Section 3.1. In the analysis of Run 1, we observe some conservativeness due to an unrealistically fast descent of the fatigue curve below the kink point (unrealistically low
of 13.2 for censoring at 10
,
Figure 5). In Run 2, the exponent causing the unrealistically steep descent is fixed to the level provided in the new FKM guideline [
2],
.
For each run, 100 datasets are generated by artificially censoring in logarithmically evenly spaced intervals between 10
and 5·10
cycles. For these datasets, lifetime predictions with a 50% chance of survival are created at stroke stresses of 700, 800, 900, 1000, and 1100 MPa. The lifetime predictions for the original dataset are given in
Table 1. For higher load levels, lifetime predictions are similar. For lower load levels, Run 2 is more conservative.
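The artificial censoring step can be illustrated as follows. This is a sketch under stated assumptions: the dataset is represented as a list of `(cycles, failed)` tuples, and the function names are hypothetical. The bounds of the censoring grid are left as arguments rather than fixed values.

```python
import math

def artificially_censor(events, censoring_cycle):
    """Return the dataset as it would have looked had the test been stopped
    at `censoring_cycle`.

    `events` is a list of (cycles, failed) tuples. Every specimen that was
    still running at the censoring cycle becomes a run-out recorded at that
    cycle. Earlier failures and earlier run-outs are kept unchanged.
    """
    censored = []
    for cycles, failed in events:
        if cycles > censoring_cycle:
            censored.append((censoring_cycle, False))
        else:
            censored.append((cycles, failed))
    return censored

def censoring_grid(lowest, highest, count):
    """Censoring cycles evenly spaced on a logarithmic axis."""
    lo, hi = math.log10(lowest), math.log10(highest)
    step = (hi - lo) / (count - 1)
    return [10.0 ** (lo + i * step) for i in range(count)]
```

Calling `artificially_censor` once per entry of `censoring_grid(lowest, highest, 100)` would yield the 100 datasets per run described above.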
5. Results
The results of Run 1 and Run 2 are displayed in
Figure 6 and
Figure 7. In
Figure 7 and the following figures, arrows pointing upwards and downwards mark data that lie beyond the limits of the chart’s
y-axis. Results for Run 1 are displayed on the left, results for Run 2 on the right. For each diagram, one point refers to one optimized model.
The points at the same artificial censoring cycle in different diagrams of the same run refer to the same model. The log-likelihood value is higher for Run 1 because one additional parameter could be varied. The only case where the log-likelihood value of Run 1 would be equal to that of Run 2 is if for Run 1, , which implies that both models are equal. For artificial censoring cycles over 4·10, the likelihood function falls approximately linearly (log-lin). This is mostly caused by run-outs moving towards higher numbers of cycles in a mostly constant model and thereby reducing the computed residual likelihood of fracture .
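The effect of run-outs on the log-likelihood can be made concrete with a simplified sketch. It assumes a single load level with mean `mu` and standard deviation `sd` of log10(N). The function name and the data layout are hypothetical, and the full model with a kink point is not reproduced here. Failures contribute the log density, run-outs contribute the log probability of surviving beyond the recorded cycle count.

```python
import math

def log_likelihood(events, mu, sd):
    """Log-likelihood of (cycles, failed) events for a normal distribution
    of log10(cycles) with mean `mu` and standard deviation `sd`."""
    total = 0.0
    for cycles, failed in events:
        z = (math.log10(cycles) - mu) / sd
        if failed:
            # Observed failure: log of the normal density in log10(N).
            total += -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))
        else:
            # Run-out: log of the probability of surviving beyond `cycles`.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return total
```

With the model held constant, moving a run-out to a higher number of cycles lowers its survival probability and therefore the log-likelihood, which is consistent with the falling trend described above.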
In Run 1, for most models, the stress and the number of cycles in the kink point are roughly equal. At about 6.5·10 artificial censoring cycles, five models deviate strongly in their kink point. This jump in parameters does not translate to the log-likelihood value. For artificial censoring cycles below 1.5·10, the kink point differs significantly from that at higher artificial censoring cycles. For Run 1, the stress and the number of cycles in the kink point each form another plateau. For Run 2, the stress in the kink point rises approximately linearly (log-lin) as the artificial censoring cycle is reduced to 10, while the number of cycles in the kink point falls after a swift rise. These differences are systematic biases, not random errors. This is a first warning sign not to decrease the ultimate number of cycles too far.
In both runs, the exponent for the slope only differs slightly. The reason for this is that helps fit the fatigue curve above the kink point, where the observed events are not affected by the artificial censoring. Its major changes occur where the kink point changes. This is very understandable because as the kink point moves above or below load levels where events occur, the dataset fitted by changes.
For Run 1, rises with greater artificial censoring cycles. Higher exponents mean that the fatigue curve approaches a horizontal course, which implies a pronounced fatigue strength. At the given maximum of 40, it is not horizontal. Further research is necessary to properly evaluate the question of whether a pronounced fatigue strength for compression springs manufactured from VDSiCr beyond 10 cycles exists. For design purposes, one should conservatively assume it does not exist. For artificial censoring cycles around 6.5·10 cycles, very high exponents occur. For these models, lifetime predictions may be highly non-conservative and therefore must be treated with utmost care. For Run 2, is constant.
Standard deviation above the kink point, , varies between 0.13 and 0.18 for both runs. For Run 1, it establishes two plateaus with a jump at an artificial censoring cycle of around 6.5·10 cycles. For Run 2, a jump occurs earlier and a range of (log-lin) approximately linear growth occurs additionally to two plateaus at the same levels as the plateaus in Run 1.
Standard deviation below the kink point, , varies widely. One reason for its growth is that failures at the lowest load levels do not follow a log-normal distribution and have a higher scatter range than failures at medium load levels. If the data are artificially censored at a relatively low number of cycles, the lowest load levels are only factored in as run-outs. Run-outs occurring at cycle counts below the expectation value do not contribute positively to a greater standard deviation.
Lifetime predictions for the 50% quantile derived from the models in
Figure 6 and
Figure 7 for a stroke stress of 1100 MPa are displayed in
Figure 8. Again, the results derived in Run 1 are shown on the left, the results from Run 2 are shown on the right. The upper graphs show the predictions of the models derived from censored data normalized over the predictions of the model with the original dataset,
. The ideal value is 1.0. By definition, the last point has the value 1.0. A value of 2.0 means that the model based on a censored dataset predicts a lifetime of twice as many cycles as the model based on the original dataset. This is undesirable because it is non-conservative. A value of 0.5 means that the model based on a censored dataset predicts a lifetime of half as many cycles as the model based on the original dataset. This is less critical because it is conservative. However, it is not desirable either, since it prevents engineers from fully exploiting lightweight potential. The lower charts show the same data as the upper charts with different limits. Note that a value of 1.0 on the left side corresponds to a different number of cycles than a value of 1.0 on the right side,
Table 1. For
Figure 8, only the lower part is relevant. Lifetime predictions differ by less than five percent. This is acceptable.
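The normalization and its interpretation can be written down as a short sketch. The function names and the labels are hypothetical. The default tolerance of five percent mirrors the deviation treated as acceptable in the text and is otherwise an assumption.

```python
def relative_lifetime(predicted_censored, predicted_original):
    """Lifetime predicted from a censored dataset, normalized by the
    prediction of the model fitted to the original dataset."""
    return predicted_censored / predicted_original

def classify(relative, tolerance=0.05):
    """Label a relative lifetime prediction.

    Values above 1 + tolerance overestimate lifetime (non-conservative),
    values below 1 - tolerance underestimate it (conservative).
    """
    if relative > 1.0 + tolerance:
        return "non-conservative"
    if relative < 1.0 - tolerance:
        return "conservative"
    return "within tolerance"
```

For example, a censored-data prediction of twice the reference lifetime gives a relative value of 2.0 and is labeled non-conservative.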
In
Figure 9, the relative lifetime predictions for a stroke stress of 1000 MPa are displayed. Here, lifetime predictions also differ by less than five percent. The fact that up until this point, only small deviations occurred is of little surprise, considering that the lifetime prediction of the model fitted to the original dataset is 4.2·10
for both runs, which is below the lowest censoring number of cycles under investigation. The highest deviations for Run 1 occur at artificial censoring cycles under 1.5·10
, which is where the kink point lies on a different plateau than without censoring. Here, fewer events are used for fitting the part of the fatigue curve above the kink point. The difference in the underlying dataset explains the difference in this part of the model. There is no obvious answer to the question of which prediction is better here. For Run 2, at artificial censoring cycles below 1.5·10
, the same effect produces a similar behavior.
The greater differences for higher artificial censoring cycles in Run 2 compared to Run 1 can be attributed to the algorithm having a hard time fitting the part of the fatigue curve below the kink point due to the fixed exponent
. This is understandable if one considers that the maximum likelihood method with a normal distribution is identical to least squares if no run-outs are present. Basically, a lot of measured error is created in the process of fitting the curve below the kink point because the curve just cannot fit well as the slope is fixed. Some of the error is transferred from the part of the model below the kink point to the part above the kink point. This can be seen in the rise in the number of cycles in the kink point, while the stress in the kink point is mostly constant,
Figure 6. This does not occur for Run 1 because the transfer of error itself produces additional error overall.
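The equivalence between maximum likelihood and least squares in the absence of run-outs follows directly from the normal log-likelihood. The notation below is assumed for this illustration and does not appear in the source: y_i is the logarithm of the observed number of cycles of failure i, mu_i(theta) is the model's mean log-life at the corresponding load level, sigma is the standard deviation, and n is the number of failures.

```latex
% Log-likelihood of n observed failures without run-outs (assumed notation).
\[
\ln L(\theta,\sigma)
  = \sum_{i=1}^{n} \ln\!\left[\frac{1}{\sigma\sqrt{2\pi}}
      \exp\!\left(-\frac{\bigl(y_i-\mu_i(\theta)\bigr)^{2}}{2\sigma^{2}}\right)\right]
  = -n\ln\!\bigl(\sigma\sqrt{2\pi}\bigr)
    - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}\bigl(y_i-\mu_i(\theta)\bigr)^{2}
\]
% For fixed sigma, the first term is constant in theta, so maximizing the
% log-likelihood is the same as minimizing the sum of squared residuals.
\[
\hat{\theta}
  = \arg\max_{\theta}\,\ln L(\theta,\sigma)
  = \arg\min_{\theta}\sum_{i=1}^{n}\bigl(y_i-\mu_i(\theta)\bigr)^{2}
\]
```

A branch whose slope is fixed cannot reduce its squared residuals by rotating, so the optimizer can only lower the total by shifting the kink point, which matches the behavior described above.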
In
Figure 10, the relative lifetime predictions for a stroke stress of 900 MPa are displayed. For artificial censoring cycles over 2·10
, lifetime predictions are slightly conservative. Still being within five percent, this deviation is relatively small. For artificial censoring cycles below 2·10
, lifetime predictions become more non-conservative, peaking at 19% with artificial censoring at 10
cycles. The lifetime prediction generated with the original dataset was 9.0·10
for Run 1 and 9.2·10
for Run 2. It is surprising to see substantial changes in lifetime prediction even though all artificially censored tests still had ultimate numbers of cycles higher than the original lifetime prediction (although not higher than the lifetime prediction of the model based on the test with artificial censoring at 10
cycles in Run 1).
In Run 1, a jump that corresponds to the jump in the model parameters occurs around 6.5·10
cycles. This jump may also be visible in
Figure 9 although this is less obvious and may be disputed by some viewers.
In
Figure 11, the relative lifetime predictions for a stroke stress of 800 MPa are displayed. The predicted lifetime of the model based on the original data is 3.7·10
for Run 1 and 3.0·10
for Run 2. This is where we expect results to differ substantially if fatigue curves truly should not be extrapolated. Unsurprisingly, predictions with artificial censoring cycles under 1.5·10
are non-conservative, especially in Run 2, where lifetime is overestimated up to fourfold (up to 39% overestimation for Run 1).
Surprisingly, tests with artificial censoring cycles over 2·10
are never non-conservative beyond five percent for both runs. For Run 1, the singularity around 6.5·10
is more pronounced than in
Figure 10; two jumps can be identified. For both runs, the relative lifetime prediction can be approximated well by a linear function (log-lin). Run 2 shows better convergence behavior than Run 1. Note that convergence behavior alone does not mean that a method is superior here because the curves converge to different values.
In
Figure 12, the relative lifetime predictions for a stroke stress of 700 MPa are displayed. The predicted lifetime of the model based on the original data is 7.3·10
in Run 1 and 8.3·10
in Run 2. Here, even the prediction of the model based on the original dataset is inadmissible for fatigue design. The assumption of a correct value of 1.0 appears not to be reasonable here. For Run 1, we see very conservative results for low artificial censoring cycles with relative lifetime prediction more or less continuously rising to 1.0 towards higher artificial censoring cycles. An exception to this is the range around 6.5·10
cycles, where very non-conservative lifetime predictions are made. This is based on a very high exponent
which corresponds to the false assumption of a fatigue strength somewhere around 10
or 10
cycles.
For Run 2, the charts are identical to the charts in
Figure 11, except for the labels. This is caused by both load levels being below the kink point in all models and
being constant.
6. Discussion
In
Section 5, the results of an ACE for the batch of springs under investigation have been presented. In this section, we discuss which of these results may be generalized and used in the fatigue design of other batches of compression springs manufactured from VDSiCr.
The most alarming results of the investigation are the very non-conservative predictions generated by tests run until an ultimate number of cycles below 1.5·10. Formulated positively, the results were safe to a certain degree when the ultimate number of cycles exceeded a minimum ultimate number of cycles, which for this batch was about 1.5·10. We believe this pattern is related to the kinking of the fatigue curve and is therefore present in all batches of compression springs manufactured from VDSiCr. To our knowledge, there is no evidence supporting the idea that the kinking patterns of different batches of compression springs occur at the same number of cycles. Therefore, there is also no evidence supporting a fixed minimum ultimate number of cycles for all batches of compression springs manufactured from VDSiCr. This makes a simple recommendation, such as an ultimate number of cycles of at least 2·10, impossible at the current state of research.
Despite the lack of such a simple rule, the question of whether the ultimate number of cycles of a given fatigue test is greater or lower than the minimum ultimate number of cycles of the batch under investigation can still be examined. The examination is conducted by generating relative lifetime predictions for a variable exponent
and for a fixed exponent
in one ACE each and comparing the results for censoring cycles defined just below the ultimate number of cycles. Similar lifetime predictions for both ACEs at given censoring cycles indicate that the minimum ultimate number of cycles has been exceeded by the ultimate number of cycles and the results of the fatigue test may safely be used. Strongly deviating lifetime predictions at given censoring cycles indicate that this is not the case (or that some other problem with one of the models is present; this is addressed below). Looking at
Figure 11, this method would have told us that we chose the ultimate number of cycles too low if we had tested for 1.5·10
cycles.
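The comparison of the two ACEs can be sketched as a simple check. Everything in this example is an assumption made for illustration: the function name, the measure of disagreement (largest absolute difference of the relative lifetime predictions), and the `window` parameter that defines what "just below the ultimate number of cycles" means. The text does not prescribe a specific threshold.

```python
def max_disagreement(censoring_cycles, relative_variable, relative_fixed,
                     ultimate_cycles, window=0.5):
    """Largest absolute difference between the relative lifetime predictions
    of the variable-exponent ACE and the fixed-exponent ACE.

    Only censoring cycles n with window * ultimate_cycles <= n < ultimate_cycles
    are considered. Returns None if no censoring cycle falls in that range.
    """
    differences = [
        abs(a - b)
        for n, a, b in zip(censoring_cycles, relative_variable, relative_fixed)
        if window * ultimate_cycles <= n < ultimate_cycles
    ]
    return max(differences) if differences else None
```

A small value suggests that both models agree close to the ultimate number of cycles. A large value signals that the test may have been stopped too early or that one of the models has some other problem.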
Conducting ACEs with just one model may also help identify a minimum ultimate number of cycles that should be exceeded. Looking at
Figure 6 and
Figure 7, one recognizes huge gradients in different optimized parameters. If these pop up close to the ultimate number of cycles, design engineers should proceed with utmost care.
What is the advantage of the approach described above? Using this approach, we could have fitted a model to the results of an experiment with an ultimate number of cycles of 5·10, correctly predicting that we do not introduce an unreasonable bias and saving 90 percent of testing time compared to the original setup. Testing up until around 6.5·10, we would have correctly predicted that there is something wrong with the lifetime prediction at a stroke stress of 800 MPa because of the jump occurring just below our ultimate number of cycles. This analysis may even be conducted during testing, allowing an optimized ultimate number of cycles. Termination criteria must be defined before testing to prevent an inadmissibly subjective influence on the result of the fatigue test.
The results for the lowest load level deviated too much, depending on the method, to derive legitimate recommendations for the fatigue design of compression springs manufactured from VDSiCr beyond 10 cycles. Here, we strongly recommend an ultimate number of cycles that is greater than any lifetime prediction made based on the test. Anything else would be especially imprudent because the fatigue curve may start falling quickly a second time at some number of cycles higher than 10. Note that, according to the logic described above, the deviations are a stark warning sign that the ultimate number of cycles in testing is not high enough to make lifetime predictions at this level.
In literature, testing of compression springs with ultimate number of cycles of e.g.,
without additional analysis like an ACE is common practice. In the light of our findings, this practice should be evaluated critically by research institutions since results may be biased. The most obvious explanation for this bias is the mismatch between the model employed and the true fatigue behavior of the springs under investigation. This mismatch has been discussed in literature [
3,
21,
22,
23,
24,
25,
26,
27,
28] but no consensus has been reached yet. In future investigations, ACEs may be used to compare the predictive power of different models without prior assumptions regarding the true fatigue behavior. The introduction of power analysis may help differentiate between non-conservative predictions due to a low number of failure events and bad predictions due to systematically excluding failure mechanisms by censoring events beyond the ultimate number of cycles.
7. Conclusions
In this paper, three algorithms were presented. The first algorithm fits a model to a set of observed events using the maximum likelihood method. The second algorithm creates artificially censored datasets by removing all information that was not present at a given number of cycles from an original dataset.
The third algorithm uses the second algorithm to create an array of sets with different artificial censoring cycles. It fits a model to each dataset and predicts lifetimes at different load levels. The lifetime predictions may be used for investigations regarding the influence of the ultimate number of cycles on the result of tests. This is a new kind of experiment, for which we propose the name Artificial Censoring Experiment (ACE).
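The interplay of the three algorithms can be summarized in a short driver loop. This is a sketch only: `censor`, `fit`, and `predict` are hypothetical callables standing in for the second algorithm, the first algorithm, and the lifetime prediction of a fitted model. Their signatures are assumptions of this example.

```python
def run_ace(events, censoring_cycles, stresses, censor, fit, predict):
    """Artificial Censoring Experiment as a generic loop.

    For every censoring cycle, the dataset is artificially censored, a model
    is fitted, and lifetimes are predicted at each stress. Predictions are
    normalized by those of the model fitted to the original dataset.
    Returns one dict {stress: relative lifetime} per censoring cycle.
    """
    reference_model = fit(events)
    reference = {s: predict(reference_model, s) for s in stresses}
    results = []
    for n_c in censoring_cycles:
        model = fit(censor(events, n_c))
        results.append({s: predict(model, s) / reference[s] for s in stresses})
    return results
```

Passing the callables in keeps the loop independent of the chosen fatigue model, so the same driver could be reused to compare different models on the same dataset.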
In this paper, the fatigue behavior of compression springs manufactured from VDSiCr was investigated by means of the described ACE. The most important recommendations for engineers conducting fatigue design of such springs are:
Without further investigation, to create a fatigue design for n cycles, a fatigue experiment must be conducted with an ultimate number of cycles of at least n.
ACEs allow an extrapolation to a certain extent if a minimum ultimate number of cycles is exceeded.
The minimum ultimate number of cycles may or may not depend on the batch of springs under investigation. For the batch under investigation in this paper, it was around 5·10 cycles.
If confronted with a very high computed exponent , implying a pronounced fatigue strength, always conduct an ACE or use . The very high computed exponent is probably caused by an ultimate number of cycles that is too low in the underlying fatigue test. Lifetime predictions for loads below the assumed pronounced fatigue strength may be wrong by multiple orders of magnitude.
If designing for more than 10 cycles, proceed with utmost care. Due to limited data in literature, we have no knowledge of the course of the fatigue curve beyond 10 cycles. The ACEs conducted in this paper showed a dangerously high variance in predictions.
For researchers, the ACE offers an opportunity to research the influence of the ultimate number of cycles of other batches of compression springs manufactured from VDSiCr, potentially allowing a generalized rule for the extrapolation of fatigue data. Of course, this kind of research also should be conducted for other components and materials.
Further research should be conducted regarding probability distributions and potentials for lifetime predictions beyond 10 cycles. All fatigue data available for compression springs with ultimate numbers of cycles over 10 have been generated at a single institute, making all datasets codependent. Therefore, additional data should be generated by other institutes.
In this paper, a basic ACE was proposed. More sophisticated experiments should be developed. For example, an ACE setup allowing continuous interpretations in stress direction instead of interpretations at a very limited number of load levels would allow more objective investigations.