1. Introduction
In recent decades, the rapid development of sensor technologies, data transmission technologies, and monitoring systems has given additional impetus to research on controlling the aging and degradation of technical and biological objects. Several wear and aging models have been studied within the framework of shock and damage models. An excellent review of the earlier papers devoted to the topic can be found in the studies by Esary et al. [1], Kopnov and Timashev [2], and the references therein. Murphy and Iskandar [3] studied the application of a two-parameter control policy to a degradation process associated with shocks. Singpurwalla [4] introduced the notion of the hazard potential as a random life resource, and this notion was then investigated in different directions in Singpurwalla [5]. Aging and degradation models involve the study of systems with gradual failures, for which multi-state reliability models were developed (for the history and bibliography see, e.g., Lisniansky and Levitin [6]). Rykov and Dimitrov [7] proposed a model for a complex hierarchical reliability system. Rykov and Efrosinin [8] studied a controllable degradation unit as a fault-tolerant unit. Generalized birth and death processes as degradation models were considered in Rykov [9]. A degrading unit with random life resources, which operates until complete failure occurs, was analyzed by Rykov and Efrosinin [10]. Giorgio et al. [11] proposed a Markov model for a gradual deterioration process that progresses until a complete failure occurs. A regenerative variant of the splitting method for estimating the failure probability in a controllable degradation process was provided in Borodina et al. [12].
Let a mechanical or biological system contain a degrading unit. The degradation processes under study are assumed either to have observable states or to admit some observable measured parameter that can be associated with the process, e.g., an acoustic emission signal, ultrasonic detection of hidden defects, gravimetric measurements, or electromagnetic flaw detection. The degradation process is broken down into discrete stages with random dwell times. Two types of systems are of special interest. In the first case the degrading unit has a random time to an instantaneous failure, while in the second case the degrading unit has a random time to a preventive repair. In both cases the unit is reusable, i.e., after a complete failure it can be repaired. After the repair, the degradation process starts again in an absolutely new state. The unit is equipped with a monitoring system and a controller. The monitoring system reports the current degradation state and, based on this information, the controller decides whether to perform a preventive repair before the last degradation state is reached. The control problem is studied in a stationary regime. The control principles of such degradation systems can be explained by the following examples, some of which were introduced in Kopnov [13].
Corrosion process of a unit with protective covering. Let a unit subjected to corrosion have a protective covering whose thickness decreases during operation and can be traced, see Figure 1. The problem is to find the optimal initial thickness and the thickness at which the covering should be renewed.
Damage process due to fatigue crack growth. Let a unit fail due to fatigue crack growth, as illustrated in Figure 2. The unit can be equipped with a fatigue gauge, e.g., in the form of a plane notched pattern. This gauge works with the unit and reflects its damage accumulation process. The problem is to find the optimal size of the gauge’s crack at which the unit is stopped and replaced. Such a unit can also be recovered by welding.
Wear of machine tools. The wear of a machine tool is correlated with the thermo-emf of the “cutter-blank” pair. A monitoring system can be installed, and the optimal value of the measured voltage can be determined to prevent failures and poor-quality output.
Wear of a plain bearing. The bearing is a critical unit of a piece of metallurgical equipment, hence the optimal maintenance problem of a plain bearing must be discussed. The restoration cost of the failed system is higher than the inspection and preventive replacement cost. In Ref. [14], the authors investigate the bearing shell wear process as a degradation process.
Discharge of an external load. If the degradation process is associated with external loads, and a unit fails when the load exceeds a failure level, then partial or full discharge of loads is another example of the controllable damage process.
In this paper we consider degradation systems operating under a threshold-based policy with threshold levels (m, n). Here, the first threshold level m specifies the state in which the controller receives a signal that, after a random time, an instantaneous failure may occur (for the first type of degradation process) or a preventive repair may take place (for the second type). The second threshold n is the maximal number of gradual failure states before a complete failure state is reached. In corrosion processes the pair (m, n) defines the appropriate thickness of the protective covering and the thickness at which a signal should be generated. In a damage process the pair (m, n) defines the maximal size of the crack and the corresponding signal size. The proposed approach to optimizing the degradation process has several advantages. The optimal control policy depends on only two parameters, which can be calculated relatively easily. The performance and reliability characteristics of the degraded units are obtained explicitly. Moreover, if the input random variables belong to parametric families of distributions, then for a degradation process with discrete phases the unknown distribution parameters can be readily estimated from statistical data.
The paper is organized as follows. In Section 2, we describe the mathematical model of the degrading unit with a random time to an instantaneous failure and derive the performance and reliability characteristics. In Section 3, we develop the analysis of the mean losses and the reliability function for the model of the degrading unit with a random time to a preventive repair. Some illustrative numerical examples are discussed in Section 4.
To make the description of the mathematical models and the corresponding results easier to follow, we have summarized the main notation together with its description in Table 1.
3. Degrading Unit with a Partial Preventive Repair
In this section we consider a deteriorating unit that degrades according to a degradation process
, whose trajectory is illustrated in
Figure 7.
The notation for this model is the same as before, and the degradation process has the same state space. After reaching the signal state m, two events are possible: either there is a partial preventive repair in a random time V, which occurs in a degradation state with a state-dependent repair time , or in time , where there is a transition to a complete failure state, from which the unit can be completely recovered in a random time . After a preventive repair in state , we assume that the unit may not be as good as new, and it returns to the gradual failure state , where .
3.1. Regenerative Process with Costs
The proposed degradation process
can be treated again as a regenerative one, where the hitting times of the signal state
m are the regenerative moments. In a similar way as before, we distinguish between two types of cycles, with and without a complete failure. The random variables
and
are then of the form,
Let us denote by the probability of a complete failure in a regenerative cycle. Then, the following statement holds.
Proposition 5. The average duration and average cost in a regenerative cycle satisfy the relations
where
and
Proof. The proof of this statement is based on relations (13) and (14) and is similar to that for the first model with instantaneous failures. The details are omitted here. □
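To illustrate why the average duration and average cost of a regenerative cycle are the quantities of interest, the following Python sketch applies the renewal-reward theorem: the long-run cost per unit time equals the mean cost per cycle divided by the mean cycle duration. It is only a sketch under simplifying assumptions. The names `q`, `mean_len_ok`, `mean_len_fail`, `cost_ok`, and `cost_fail`, the fixed cost per cycle type, and the exponential cycle durations are hypothetical and are not the expressions of Proposition 5.

```python
import random


def long_run_cost_rate(q, mean_len_ok, mean_len_fail, cost_ok, cost_fail):
    """Renewal-reward ratio: E[cost per cycle] / E[cycle duration].

    q is the (assumed) probability that a cycle contains a complete failure.
    """
    mean_cost = q * cost_fail + (1.0 - q) * cost_ok
    mean_len = q * mean_len_fail + (1.0 - q) * mean_len_ok
    return mean_cost / mean_len


def simulate_cycle(rng, q, mean_len_ok, mean_len_fail, cost_ok, cost_fail):
    """Draw (duration, cost) of one cycle; durations are exponential by assumption."""
    if rng.random() < q:
        return rng.expovariate(1.0 / mean_len_fail), cost_fail
    return rng.expovariate(1.0 / mean_len_ok), cost_ok


def estimate_cost_rate(n_cycles, seed=0, **params):
    """Monte Carlo estimate: total cost over total time across n_cycles cycles."""
    rng = random.Random(seed)
    total_time = 0.0
    total_cost = 0.0
    for _ in range(n_cycles):
        duration, cost = simulate_cycle(rng, **params)
        total_time += duration
        total_cost += cost
    return total_cost / total_time


if __name__ == "__main__":
    params = dict(q=0.2, mean_len_ok=5.0, mean_len_fail=8.0,
                  cost_ok=1.0, cost_fail=10.0)
    print(long_run_cost_rate(**params))
    print(estimate_cost_rate(100_000, seed=1, **params))
```

The simulated ratio should approach the analytic one as the number of cycles grows, which is the property that allows a threshold policy to be compared through per-cycle averages alone.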
3.2. Mean Time to Failure
As there are preventive repairs in this model, the random time
T to complete failure will consist of other components. Let us introduce the following notations:
—the random time from the beginning of the regenerative cycle until the moment of a complete failure if it occurs,
—the random duration of the regenerative cycle without a complete failure, and
N—the random number of cycles to complete failure. Since the complete failure occurs in each regenerative cycle with the probability
, the number of cycles
N has a geometric distribution, i.e.,
Then the mean time to the first complete failure is of the form.
Proposition 6. The mean time to the first complete failure given that the initial state is satisfies the relation
Proof. Due to the structure of the random variable
T and the law of the total probability, we have,
Consistent with the definition introduced for random variables
and
, we obtain
and
. By substituting these expectations into the last expression and then taking into account that
after some simple algebra, we obtain relation (15). □
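The role of the geometric number of cycles can be illustrated with a small Python sketch. It assumes, hypothetically, that the unit starts at a regeneration moment, that each cycle fails independently with probability `q`, that a cycle without failure lasts `mean_ok_cycle` on average, and that the time to failure inside the failing cycle has mean `mean_time_to_fail_in_cycle`. These names and the exponential durations in the simulation are illustrative assumptions. The formula below is the generic decomposition over a geometric number of cycles and is not relation (15), which also accounts for the initial state.

```python
import random


def mean_time_to_failure(q, mean_ok_cycle, mean_time_to_fail_in_cycle):
    """E[T] = (E[N] - 1) * E[cycle without failure] + E[time to failure in last cycle].

    For N geometric on {1, 2, ...} with success probability q, E[N] - 1 = (1 - q) / q.
    """
    expected_ok_cycles = (1.0 - q) / q
    return expected_ok_cycles * mean_ok_cycle + mean_time_to_fail_in_cycle


def simulate_time_to_failure(rng, q, mean_ok_cycle, mean_time_to_fail_in_cycle):
    """Sample T by running cycles until the first one that contains a failure."""
    elapsed = 0.0
    while rng.random() >= q:  # this cycle ends without a complete failure
        elapsed += rng.expovariate(1.0 / mean_ok_cycle)
    return elapsed + rng.expovariate(1.0 / mean_time_to_fail_in_cycle)


if __name__ == "__main__":
    rng = random.Random(42)
    samples = [simulate_time_to_failure(rng, 0.1, 2.0, 3.0) for _ in range(50_000)]
    print(mean_time_to_failure(0.1, 2.0, 3.0), sum(samples) / len(samples))
```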
Now we will briefly discuss the problem of calculating the reliability function
. Let us assume that the complete failure of the system is a rare event. Further, we introduce the following notation:
—the random time from the beginning of the regenerative cycle up to the time moment of the first complete failure within the
Nth cycle. Due to the limit theorem for the sojourn time of ergodic regenerative processes (see, e.g., [16]), the following statement holds.
Proposition 7. Let the failure in one regenerative cycle be a rare event, i.e., the number of regeneration cycles N is large. If for some , then for each the following asymptotic property can be derived,
where
Under the Markov assumption, the reliability function of the degradation unit with an arbitrary can be calculated explicitly. The calculation is performed by means of the conditional Laplace transforms , , of the probability density function of the residual lifetime given that the initial state is i. Obviously, in the absorbing state F the conditional Laplace transform .
Proposition 8. The Laplace transform of the reliability function satisfies the relation,
where
Proof. The time
T is the duration that starts in the initial state 0 of the degradation process and ends when the degrading unit enters the complete failure state
F. The first-step analysis is employed for the Markov chain with an absorption in state
F to get the system of equations for the conditional Laplace transforms
,
.
In (18), the equation for state i includes the term at
, which represents the transition with a rate
to a preventive maintenance state, from which the unit moves to a newer gradual failure state
after an exponentially distributed time with a parameter
. Solving equations (18) recursively and using for convenience the notations
,
and
from the statement, we obtain explicit solutions for this system. In particular for the function
after some simple algebra we have
The statement further follows from the fact that the Laplace transform
and the conditional Laplace transform
satisfy the relation
□
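The first-step analysis used in the proof can be sketched numerically for a generic continuous-time Markov chain with one absorbing state F. The Python code below is a minimal illustration under stated assumptions and does not reproduce system (18). The arguments `rates` (transition rates between transient states) and `absorb` (rates into F) are hypothetical inputs, and the mean lifetime is obtained from a finite-difference derivative of the transform at zero rather than from a closed-form expression.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def absorption_transforms(rates, absorb, s):
    """Conditional transforms phi_i(s) = E[exp(-s T) | start in state i].

    First-step equations, with phi_F(s) = 1 in the absorbing state:
    (s + total outflow of i) * phi_i - sum_j rates[i][j] * phi_j = absorb[i].
    """
    n = len(absorb)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        outflow = absorb[i] + sum(rates[i][j] for j in range(n) if j != i)
        for j in range(n):
            a[i][j] = s + outflow if j == i else -rates[i][j]
    return solve_linear(a, list(absorb))


def reliability_transform(rates, absorb, s, start=0):
    """Laplace transform of R(t) = P(T > t), i.e. (1 - phi_start(s)) / s."""
    return (1.0 - absorption_transforms(rates, absorb, s)[start]) / s


def mean_time_to_absorption(rates, absorb, start=0, h=1e-6):
    """E[T] = -d/ds phi_start(s) at s = 0, via a central difference."""
    up = absorption_transforms(rates, absorb, h)[start]
    down = absorption_transforms(rates, absorb, -h)[start]
    return -(up - down) / (2.0 * h)


if __name__ == "__main__":
    # Toy chain: 0 -> 1 at rate 1; from 1, repair back to 0 at rate 3 or failure at rate 1.
    rates = [[0.0, 1.0], [3.0, 0.0]]
    absorb = [0.0, 1.0]
    print(mean_time_to_absorption(rates, absorb))
```

In the toy chain the transition from state 1 back to state 0 plays the role of a preventive repair that returns the unit to a newer state, which is why the repair rate lengthens the mean lifetime.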
The inverse transformation of the function
is required to get
. Differentiation of the function
with respect to the parameter
s at point
results in the mean life time to a complete failure, i.e.,
3.3. Numerical Examples
We next present the numerical examples involving the optimal policy for different optimization problems and the numerical inversion of the transform . In the following examples we assume that , , and . Moreover, , , , , where , and the costs , , and take the same values as for the previous model. We further fix .
Example 4. In Figure 8, we calculate the optimal policy as the cost (the figure labeled “a”) and the cost (the figure labeled “b”) vary. For example, the figure for the repair cost shows that if , then the optimal policy is , and if , then we obtain . From the figure for the operating cost , we can see that if , then the optimal policy is respectively equal to . Thus, the values of the optimal thresholds decrease monotonically with an increase in the cost , while we no longer observe such monotonicity with an increase in the cost . Moreover, at high operating costs, it becomes advantageous to limit the number of intermediate degradation states to a small number, i.e., by the policy , so that the unit can be repaired to an almost new state after a complete failure. The influence of the repair costs and is illustrated in Figure 9 (the figures labeled “a” and “b”, respectively). Here we observe, for example, that if , the policies are optimal, and if , we then get the optimal policies . In both cases the changes are monotonic, although the optimal control appears to be less sensitive to an increase in the cost .
Example 5. Table 3 summarizes the results of the calculations of optimal threshold levels for the different optimization criteria proposed in (10). In this case, just as in the previous degradation model, we obtained completely different optimal thresholds for different optimization criteria. The highest mean time to failure is naturally given by optimizing the function , i.e., . However, we obtained the shortest mean time to failure by minimizing the steady-state probability of a complete failure state , i.e., , which is not an obvious result.
Example 6. In Figure 10, we take (the figure labeled “a”) and (the figure labeled “b”). The control policy in both cases is . The other parameters take the values defined above. The figures display the real reliability function, obtained by numerical inversion of the Laplace transform from (17), together with the asymptotic function (16). The asymptotic function is defined respectively by with , , and with , . We observe that in the figure corresponding to the lower probability of complete failure in a regeneration cycle , the asymptotic curve approximates the corresponding real reliability function better.
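The text does not state which numerical inversion method was used. As one possible approach, the Python sketch below implements the Gaver-Stehfest algorithm, a common choice for smooth, non-oscillating functions such as reliability functions. The function names and the test transform 1/(s + rate), which corresponds to an exponential lifetime with reliability exp(-rate * t), are illustrative assumptions and are not taken from the model.

```python
from math import exp, factorial, log


def stehfest_coefficients(n):
    """Gaver-Stehfest weights V_1, ..., V_n; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    half = n // 2
    coeffs = []
    for k in range(1, n + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += (j ** half * factorial(2 * j)
                      / (factorial(half - j) * factorial(j) * factorial(j - 1)
                         * factorial(k - j) * factorial(2 * j - k)))
        coeffs.append((-1) ** (k + half) * total)
    return coeffs


def invert_laplace(transform, t, n=12):
    """Approximate f(t) from its Laplace transform evaluated at real points.

    f(t) ~ (ln 2 / t) * sum_k V_k * F(k * ln 2 / t), valid for t > 0.
    """
    a = log(2.0) / t
    weights = stehfest_coefficients(n)
    return a * sum(v * transform(k * a) for k, v in enumerate(weights, start=1))


if __name__ == "__main__":
    rate = 0.5  # hypothetical failure rate of an exponential lifetime
    for t in (0.5, 1.0, 2.0, 4.0):
        approx = invert_laplace(lambda s: 1.0 / (s + rate), t)
        print(t, approx, exp(-rate * t))
```

In double precision the number of terms should stay moderate (around 10 to 14), because the weights alternate in sign and grow quickly, so larger values of `n` lose accuracy to cancellation.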