5.1. Development of ANFIS Model
The ANFIS model shown and described below was developed in the Python programming language using the PyCharm 2023.2.1 editor (Community Edition, JetBrains), which is freely available.
ANFIS systems combine artificial neural networks with fuzzy logic (a fuzzy inference system). Their advantage lies in combining the strengths of both: the ability of artificial neural networks to learn and the use of expert knowledge in fuzzy logic.
The architecture of the ANFIS system resembles that of artificial neural networks: based on the set of input–output data, a corresponding fuzzy inference system is formed, and the parameters of the membership functions that transform the input data are calculated. The general structure of the ANFIS model consists of five layers (Figure 3). Below is a brief description of the layers.
In the first layer, the input data are transformed into a system of appropriate fuzzy sets:
$$O_{1,i} = \mu_{A_i}(x),$$
where $x$ is the input argument of the first layer, and $\mu_{A_i}$ is the membership function of the corresponding linguistic variable $A_i$.
In the second layer of the ANFIS model, the outputs of the preceding layer for the different variables are combined. The output is determined as:
$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y),$$
where $x$ and $y$ are two different variables.
In the third layer, the values obtained in the second layer are normalized:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_j w_j}.$$
In the fourth layer, the normalized values from the preceding layer are combined with first-order polynomials:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right),$$
where $p_i$, $q_i$ and $r_i$ are the parameters of the fourth layer model.
In the fifth and final layer, the values from the preceding layer are summed using the following formula:
$$O_5 = \sum_i \bar{w}_i f_i.$$
Figure 4 shows the general architecture of the ANFIS model described above.
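The five layers can be illustrated with a minimal Python sketch of a forward pass through a two-input, first-order Sugeno-type ANFIS. This is only a sketch under stated assumptions: Gaussian membership functions, a product operator in the second layer, and all function names and parameter values are illustrative choices, not the implementation used for the model.

```python
import math

def gaussian_mf(x, c, sigma):
    """Gaussian membership function with centre c and width sigma."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, first-order Sugeno ANFIS (illustrative).

    premise:    one ((c_A, sigma_A), (c_B, sigma_B)) pair per rule
    consequent: one (p, q, r) triple per rule
    """
    # Layer 1: fuzzification of both inputs
    mu = [(gaussian_mf(x, *a), gaussian_mf(y, *b)) for a, b in premise]
    # Layer 2: firing strength of each rule (product operator, an assumption)
    w = [mu_a * mu_b for mu_a, mu_b in mu]
    # Layer 3: normalization of the firing strengths
    total = sum(w)
    w_bar = [w_i / total for w_i in w]
    # Layer 4: normalized strengths times first-order polynomials
    layer4 = [wb * (p * x + q * y + r)
              for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: summation into a single output
    return sum(layer4), w_bar

# Hypothetical parameters for two rules ("low" and "high" grades).
premise = [((1.0, 1.0), (1.0, 1.0)), ((5.0, 1.0), (5.0, 1.0))]
consequent = [(0.0, 0.0, 0.2), (0.0, 0.0, 0.9)]
output, w_bar = anfis_forward(3.0, 3.0, premise, consequent)
```

With the inputs placed halfway between the two rule centres, both rules fire equally, so the output is the mean of the two constant consequents.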
The training of a neuro-fuzzy system is typically performed by back-propagation, which minimizes an error function computed from the actual values $y_1, y_2, \ldots, y_n$ and the values $\hat{y}_1, \ldots, \hat{y}_n$ predicted by the ANFIS model.
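As a rough illustration of an error function of this kind, the sketch below implements two common squared-error criteria, the mean squared error (MSE) and the root mean squared error (RMSE). Treating either of them as the criterion used for this model is an assumption, and the function names `mse` and `rmse` are illustrative.

```python
import math

def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))
```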
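When the input membership function parameters are fixed, the output of the ANFIS model is calculated as follows (shown for two rules, where $w_i$ is the firing strength of rule $i$ and $f_i = p_i x + q_i y + r_i$ is its first-order polynomial):
$$f = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2.$$
Using $\bar{w}_1 = w_1/(w_1 + w_2)$ and $\bar{w}_2 = w_2/(w_1 + w_2)$, the following equality is obtained:
$$f = \bar{w}_1 f_1 + \bar{w}_2 f_2 = (\bar{w}_1 x)\,p_1 + (\bar{w}_1 y)\,q_1 + \bar{w}_1 r_1 + (\bar{w}_2 x)\,p_2 + (\bar{w}_2 y)\,q_2 + \bar{w}_2 r_2.$$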
Model training adjusts the parameter values based on the provided training data. It relies on the back-propagation method, an algorithm designed to minimize the error between the network’s output and the desired output.
The availability of continuous systems and its partial indicators was determined from two sources: questionnaires in which experts assessed the partial indicators of availability, and historical data on downtime and operation covering the period from 2016 to 2019.
The availability of the ECC system depends on two synthetic indicators: reliability and maintainability. These synthetic indicators in turn depend on a number of independent parameters (sub-indicators) (Figure 5), all of which are treated as variables in this ANFIS model.
Within this model, availability decomposes into partial sub-indicators that are assessed by experts via a questionnaire. Each component of the I ECC system, including the bucket wheel excavator, beltwagon, belt conveyors, and crushing plant, undergoes evaluation.
In the expert evaluation, 10 experts specializing in continuous systems within surface mining were interviewed. They offered assessments for the sub-indicators of availability during specific quarters, encompassing the timeframe from 2016 to 2019, for each component of the ECC system.
Data from 2016–2018 were used to train the ANFIS model (480 data points—training data set), while data from 2019 (160 data points—test data set) were used to test the obtained model. The experts gave grades in the questionnaire ranging from
(the worst grade) to
(the best grade). The layout of the questionnaire is shown in Figure 6; in this questionnaire, each expert made assessments at the quarterly level over a predetermined period for each part of the I ECC system. The scores obtained in this way were used as input data for the model.
Prior to model development, a database was established containing the durations of mechanical, electrical, and other failures within the ECC system over four years (2016, 2017, 2018, 2019). Information from this database was used to calculate the historical availability on a quarterly basis, which serves as the output data for the ANFIS model. The availability for each quarter was computed using Formula (1).
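A minimal sketch of a quarterly availability calculation from downtime records is given below. It assumes the common definition of availability as the share of calendar time in which the system is running; whether this matches Formula (1) term by term is an assumption, and the downtime figures are hypothetical.

```python
def quarterly_availability(calendar_hours, downtime_hours):
    """Share of the quarter in which the system was running."""
    if calendar_hours <= 0:
        raise ValueError("calendar_hours must be positive")
    if not 0 <= downtime_hours <= calendar_hours:
        raise ValueError("downtime must lie between 0 and calendar_hours")
    return (calendar_hours - downtime_hours) / calendar_hours

# Hypothetical downtime (hours) by failure type for one 91-day quarter.
downtime = {"mechanical": 150.0, "electrical": 80.0, "other": 97.6}
a_q = quarterly_availability(91 * 24, sum(downtime.values()))
```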
Table 1 shows part of the database. The data were obtained from the Electric Power Company of Serbia and contain information on the downtimes of the specific system in the specified time period.
The system’s availability was assessed on a quarterly basis (based on the available data), and the resulting values are presented in Table 2.
The resulting ANFIS model was given the survey results for all nine partial sub-indicators for each part of the I ECC system as input parameters, while the output represents the corresponding availability in the quarter to which the survey results refer, which was obtained based on historical data taken from the Electric Power Company of Serbia.
In the first step of the model, fuzzification was performed, which represents the transformation of partial indicator scores, using membership functions, into the corresponding -scale for = 10. Predefined fuzzy sets are not used for probability functions, but membership functions are used instead, the parameters of which are estimated within the model training process. The utilized membership functions include the Bell-shaped membership function, the Gaussian membership function, and the Sigmoid membership function.
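The three families of membership functions named above can be sketched in Python using their standard textbook forms. The parameter names (`a`, `b`, `c`, `sigma`) are the conventional ones and are assumptions here; in the model these parameters are estimated during training.

```python
import math

def bell_mf(x, a, b, c):
    """Generalized bell: width a, slope b, centre c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def gaussian_mf(x, c, sigma):
    """Gaussian: centre c, width sigma."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)

def sigmoid_mf(x, a, c):
    """Sigmoid: slope a, crossover point c (membership 0.5 at x = c)."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))
```

The bell and Gaussian functions peak at 1 for x = c, while the sigmoid is monotone and suits the lowest or highest grade of a scale.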
Using IF-THEN rules that are pre-defined, the synthetic indicator is determined based on the partial sub-indicators , and , and the synthetic indicator is determined based on the partial sub-indicators , , , , and .
In the following, we will illustrate the determination of the synthesis indicator
using IF-THEN rules based on sub-indicators
,
and
. Let the IF-THEN rule be defined by IF
AND
AND
THEN
, where
i,
j and
k are in the set {
}, and
is in the set {
}. Then, the fuzzy sets come together:
where
,
and
are the input values of grades
,
and
, respectively. For partial indicators
,
and
, we assign the value
. The fuzzy set corresponding to the rating
of the indicator
is the sum of all fuzzy sets assigned the value
. In a similar way, on the basis of sub-indicators
,
,
,
,
and
, the synthesis indicator
is calculated.
In the next step, using the IF-THEN rules, as described in the previous paragraph, the availability indicator
is determined by synthetic indicators
and
. Then, the Euclidean distance of the obtained fuzzy sets from the fuzzy sets assigned to the availability indicator
is determined based on the corresponding membership functions whose parameters we estimate within this ANFIS model. The distances
,
,
,
and
determined in this way can be joined by the normalized reciprocal values of the relative distances, determined by:
These values belong to the appropriate set of grades that determine the indicator of availability, i.e.,
Finally, the linguistic description is transformed into a numerical designation:
Dividing by 5 gives the predicted value of availability, which is compared with the realized value of availability, calculated on a quarterly basis.
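The step from distances to a predicted availability can be sketched as follows. The sketch rests on several assumptions: the relative distance is taken as each distance divided by the smallest one, the weights are the normalized reciprocals of these relative distances, and the numerical designation is the weighted mean of the grades 1 to 5. The function names are illustrative.

```python
def grade_weights(distances):
    """Normalized reciprocals of relative distances (assumed: d_i / d_min)."""
    d_min = min(distances)
    if d_min == 0.0:
        # Exact match with one or more grades: share the weight among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    reciprocals = [d_min / d for d in distances]  # 1 / (d_i / d_min)
    total = sum(reciprocals)
    return [r / total for r in reciprocals]

def predicted_availability(distances):
    """Weighted mean grade (grades 1..n) divided by the top grade n."""
    weights = grade_weights(distances)
    score = sum(grade * w for grade, w in enumerate(weights, start=1))
    return score / len(distances)

# Hypothetical distances to the five grade fuzzy sets.
a_pred = predicted_availability([4.0, 2.0, 1.0, 2.0, 4.0])
```

For these symmetric distances the weights are 0.1, 0.2, 0.4, 0.2 and 0.1, the weighted grade is 3, and the predicted availability is 0.6.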
The IF-THEN rules used in this ANFIS model are shown in Table 3, Table 4 and Table 5. For example, the values shown in the first row of the table are interpreted as follows:
If the partial sub-indicator is (the working environment conditions typically do not align with the requirements for the equipment in use), and if the partial sub-indicator is (write-off machine, very high level of failure) and the partial sub-indicator is (underdeveloped basic engineering), then indicator is unreliable, .
A summary of the models considered for predicting availability is given in Table 6.
5.2. Development of Simulation Model
During the creation of the simulation model, all failures were classified into one of three types of failure (mechanical, electrical, and others). As in the case of the ANFIS model, the simulation model used data from three years (2016, 2017 and 2018) to obtain results.
Table 7 gives the experimental and theoretical frequencies of mechanical failures by interval.
The distributions of mechanical failure times, considered in the 96th percentile of the data, conform to the Weibull distribution, with parameters
,
and
. More precisely, the empirical distribution function is determined by:
The model was developed based on a total of 1238 instances of mechanical failures.
The hypothesis regarding the distribution of the data was tested with the Kolmogorov–Smirnov test. The value of the test statistic is 1.7944, so at a significance level of 0.001 we cannot reject the null hypothesis that the data follow the Weibull distribution. Figure 7 shows the experimental and theoretical distribution functions of mechanical failures.
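A minimal sketch of how a Kolmogorov–Smirnov statistic can be computed for a fitted Weibull distribution is shown below, using only the Python standard library. The two-parameter Weibull form, the parameter values and the synthetic sample are assumptions made for illustration; they are not the fitted values or the failure data of this study.

```python
import math
import random

def weibull_cdf(t, scale, shape):
    """Two-parameter Weibull distribution function."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0

def ks_statistic(sample, cdf):
    """Largest gap D between the empirical and the theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

rng = random.Random(42)
SCALE_H, SHAPE = 2.0, 0.9  # hypothetical scale (hours) and shape
durations = [rng.weibullvariate(SCALE_H, SHAPE) for _ in range(1238)]
d = ks_statistic(durations, lambda t: weibull_cdf(t, SCALE_H, SHAPE))
lam = math.sqrt(len(durations)) * d  # scaled statistic, lambda = sqrt(n) * D
```

The scaled form `lam` is the one that is compared with the critical values of the Kolmogorov distribution.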
Table 8 gives the experimental and theoretical frequencies of electrical failures by interval.
The distribution of the duration of electrical failures, considered in the 98.5th percentile of the data, is in accordance with the Weibull distribution, with parameters
,
and
. More precisely, the empirical distribution function is determined by:
The model was developed on 908 electrical failures. The hypothesis regarding the distribution of the data was tested with the Kolmogorov–Smirnov test. The value of the test statistic is 1.2804, so at a significance level of 0.05 we cannot reject the null hypothesis that the data follow the Weibull distribution. Figure 8 shows the experimental and theoretical distribution functions of electrical failures.
Table 9 gives the experimental and theoretical frequencies of other failures by interval.
The distribution of the durations of other failures, considered in the 100th percentile of the data, is in accordance with the exponential distribution, with parameters
and
. More precisely, the empirical distribution function is determined by:
The model was developed on 2030 other failures. The hypothesis regarding the distribution of the data was tested with the Kolmogorov–Smirnov test. The value of the test statistic is 1.5761, so at a significance level of 0.01 we cannot reject the null hypothesis that the data follow the exponential distribution. Figure 9 shows the experimental and theoretical distribution functions of other failures.
Table 10 gives the experimental and theoretical frequencies of the times between failures by interval.
The distribution of the durations between failures, considered in the 95th percentile of the data, is in accordance with the Erlang distribution, with parameters
,
and
. More precisely, the empirical distribution function is determined by:
The model was developed on 5212 times between failures. The hypothesis regarding the distribution of the data was tested with the Kolmogorov–Smirnov test. The value of the test statistic is 1.0192, so at a significance level of 0.2 we cannot reject the null hypothesis that the data follow the Erlang distribution. Figure 10 shows the experimental and theoretical distribution functions of the times between failures.
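The four test decisions reported in this section can be cross-checked with a short Python sketch. It assumes that the reported statistic is the scaled form λ = √n·D (the values exceed 1, which the unscaled D cannot) and uses the asymptotic critical value of the Kolmogorov distribution, c(α) = √(−½ ln(α/2)); both points are assumptions about how the values were obtained.

```python
import math

def kolmogorov_critical(alpha):
    """Asymptotic critical value for the scaled K-S statistic sqrt(n) * D."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0))

# (reported statistic, significance level) as given in the text.
reported = {
    "mechanical failures (Weibull)": (1.7944, 0.001),
    "electrical failures (Weibull)": (1.2804, 0.05),
    "other failures (exponential)": (1.5761, 0.01),
    "times between failures (Erlang)": (1.0192, 0.2),
}

# True means the null hypothesis is not rejected at the stated level.
not_rejected = {
    name: statistic < kolmogorov_critical(alpha)
    for name, (statistic, alpha) in reported.items()
}
```

Under these assumptions every reported statistic lies below its critical value (about 1.95, 1.36, 1.63 and 1.07, respectively), which agrees with the stated conclusions.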
For the durations of mechanical and electrical failures, the shape parameter (β) of the Weibull distribution is close to, but less than, 1. This indicates that the mechanical and electrical equipment and parts have an approximately constant maintenance intensity, as is already the case for other failures (exponential distribution). The total maintenance intensity of the ECC system is therefore approximately constant as t→∞; that is, the maintainability function of the entire ECC system can be considered roughly exponential, and it corresponds to a Poisson recovery process. Figure 11 shows the maintainability function for the different types of failures.
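The effect of a shape parameter slightly below 1 can be illustrated with a short Python sketch of the Weibull maintainability function and the corresponding repair rate. The two-parameter form and the numerical values are assumptions chosen for illustration only.

```python
import math

def maintainability(t, scale, shape):
    """Probability that a repair is completed within time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def repair_rate(t, scale, shape):
    """Weibull repair intensity; constant (1 / scale) when shape == 1."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

# Hypothetical scale of 2 hours; shape 0.95 versus the exponential case 1.0.
ratio_095 = repair_rate(10.0, 2.0, 0.95) / repair_rate(1.0, 2.0, 0.95)
ratio_100 = repair_rate(10.0, 2.0, 1.0) / repair_rate(1.0, 2.0, 1.0)
```

For a shape of 0.95 the repair rate falls by only about 11% over a tenfold increase in time (the ratio is 10^(-0.05)), whereas for a shape of 1 it does not change at all.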
The obtained distribution of times between failures, of Erlang order k = 2, indicates that the failure intensity increases with time, i.e., the ECC system is at the end of the “exploitation” period and the beginning of the “obsolescence” period (periods II and III of the “bathtub” curve). For k = 1 (exponential distribution), the system would be in period II, “exploitation”. Figure 12 shows the failure intensity.
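The difference between Erlang orders k = 1 and k = 2 can be seen from the failure intensity (hazard) function of the Erlang distribution, sketched below in Python. The rate value is hypothetical, and the function name `erlang_hazard` is illustrative.

```python
import math

def erlang_hazard(t, k, rate):
    """Failure intensity h(t) = f(t) / R(t) of an Erlang(k, rate) distribution.

    The common factor exp(-rate * t) in f(t) and R(t) is cancelled
    analytically to avoid underflow for large t.
    """
    numerator = rate ** k * t ** (k - 1) / math.factorial(k - 1)
    denominator = sum((rate * t) ** n / math.factorial(n) for n in range(k))
    return numerator / denominator

RATE = 1.0  # hypothetical rate parameter
h_k1 = [erlang_hazard(t, 1, RATE) for t in (0.5, 1.0, 3.0)]
h_k2 = [erlang_hazard(t, 2, RATE) for t in (0.5, 1.0, 3.0)]
```

For k = 1 the intensity is constant and equal to the rate, while for k = 2 it equals rate²·t / (1 + rate·t), which starts at zero and rises towards the rate.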
Figure 13 shows the frequency distributions of the considered failure types.
The algorithm of the developed simulation model is shown in Figure 14. The theoretical distributions confirmed by the K-S test are used to generate failure durations and times between failures as follows. For mechanical and electrical failures, the Weibull distribution is used (Figure 7 and Figure 8), while the exponential distribution is used for other failures (Figure 9). The type of failure is generated from the empirical distribution shown in Figure 13. Times between failures are generated from the Erlang distribution, as shown in Figure 10.
The simulation experiment is performed for a time period of one year (tsim), while the simulation time (t) is measured in seconds. The number of simulation experiments (Nosim) is one hundred.
At the beginning, for t = 0, the initial state of the system is defined as the state of the ECC system “running” (State = “1”). At this point, the time (tbf) and type (VRflr) of first failure are also generated.
While the simulation experiment is being performed, the model alternately compares simulation time (t) with the times of failure beginning (tbf) and failure end (tef).
If the simulation time (t) is equal to the time of failure beginning (tbf), the state of the ECC system is changed to “downtime” (State = “0”), and the time needed for repair is generated according to the current type of failure and the distribution of failure time duration. In other words, the time when failure ends (tef) is generated.
If the simulation time (t) is equal to the time when the failure ends (tef), the state of the ECC system is changed to “running” (State = “1”). Also, the type of failure (VRflr) and the time of next failure are generated, meaning that the time of the beginning of the next failure (tbf) is generated.
After that, the availability of the ECC system is checked. If the state of the ECC system is “1” (“running”), the variable AECC is increased by one. The current simulation time and the corresponding state of the ECC system are also written to a file.
When all Nosim simulation experiments are executed, the average availability of the ECC system AECC and the stationary value of the ECC system availability ka are calculated, as are the changes in the ECC system’s availability in time.
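The logic of the algorithm can be sketched in Python as follows. This is a simplified, event-driven variant: instead of advancing the simulation time second by second and incrementing a counter, it accumulates running time between events, which yields the same availability ratio. All distribution parameters are hypothetical, and using the failure counts of the three failure types as weights for the empirical type distribution is an assumption.

```python
import random

YEAR_S = 365 * 24 * 3600   # simulated period: one year in seconds
HOUR_S = 3600.0

# Hypothetical parameters, chosen only for illustration.
ERLANG_K, ERLANG_SCALE_S = 2, 3.0 * HOUR_S   # time between failures
WEIBULL = {1: (1.2 * HOUR_S, 0.9),           # mechanical: (scale, shape)
           2: (0.8 * HOUR_S, 0.9)}           # electrical: (scale, shape)
MEAN_OTHER_S = 0.5 * HOUR_S                  # other failures (exponential)
TYPE_WEIGHTS = [1238, 908, 2030]             # assumed weights of types 1, 2, 3

def draw_repair_time(rng, failure_type):
    """Failure duration drawn according to the failure type."""
    if failure_type == 3:
        return rng.expovariate(1.0 / MEAN_OTHER_S)
    scale, shape = WEIBULL[failure_type]
    return rng.weibullvariate(scale, shape)

def simulate_once(rng, t_sim=YEAR_S):
    """One simulation experiment; returns the share of time spent running."""
    t, uptime = 0.0, 0.0                     # system starts in state "running"
    while t < t_sim:
        tbf = rng.gammavariate(ERLANG_K, ERLANG_SCALE_S)  # Erlang via gamma
        uptime += min(tbf, t_sim - t)        # running until failure or end
        t += tbf                             # failure begins
        if t >= t_sim:
            break
        failure_type = rng.choices((1, 2, 3), weights=TYPE_WEIGHTS)[0]
        t += draw_repair_time(rng, failure_type)          # failure ends
    return uptime / t_sim

def simulate(n_runs=100, seed=0):
    """Average availability over n_runs independent experiments."""
    rng = random.Random(seed)
    runs = [simulate_once(rng) for _ in range(n_runs)]
    return sum(runs) / len(runs), runs
```

Calling `simulate()` returns the mean availability over one hundred experiments together with the individual results; with real fitted parameters in place of the hypothetical ones, the mean plays the role of the average availability of the ECC system.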
Glossary:
tsim: duration of the simulation (s);
Nosim: number of simulations;
State: state of the ECC system (1 = “running”; 0 = “downtime”);
rnumber: random number generated from the uniform distribution on the interval [0, 1];
cs: current simulation;
t: simulation time (s);
TBF: time between failures (current);
DT: downtime, i.e., the duration of the current failure;
tef: failure completion time (in simulation);
tbf: failure start time (in simulation);
VRflr: type of failure (1 = mechanical; 2 = electrical; 3 = other);
AECC: availability of the ECC system;
A(t): availability of the system at a given time t;
ka: stationary availability value.
Figure 15 shows the dependence of availability on time, obtained as a result of the simulation model.
According to the simulation model, the mean availability is 0.8513 and the stationary value is 0.8489, i.e., approximately 85% in both cases.