In this section, the impact of different atmospheric factors on the corrosion rate is analyzed from the viewpoints of qualitative analysis, statistical quantitative analysis, and information fusion-based quantitative analysis.
3.3. Evidence Fusion-Based Quantitative Analysis
From the viewpoint of evidence fusion, this subsection develops a new machine learning model to quantify the hidden influence of atmospheric factors. The model construction process is described below using the Stage 1 data; for Stage 2, only the results of the proposed model are given.
The corrosion rate data in Stage 1 were classified into three corrosion degrees: fast, medium, and slow. To avoid the subjectivity of manual classification, the commonly used unsupervised K-means method (the KMeans class in the Python package scikit-learn) [33] was applied. The results are shown in Figure 5. For the real-time corrosion rate at exposure time t, the corresponding corrosion degree was defined as follows:
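The labeling step can be illustrated with a minimal sketch; it is not the definition used in the study. It assumes that the three K-means clusters are ranked by their centers, so that the cluster with the lowest center is labeled slow and the one with the highest center is labeled fast. The function name `label_corrosion_degrees` and the sample rates are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def label_corrosion_degrees(rates, random_state=0):
    """Cluster 1-D corrosion rates into 'slow', 'medium', and 'fast'."""
    x = np.asarray(rates, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=3, n_init=10, random_state=random_state).fit(x)
    # K-means cluster ids are arbitrary, so rank the clusters by their centers
    # (assumption: lowest center = slow, highest center = fast).
    order = np.argsort(km.cluster_centers_.ravel())
    names = {int(c): name for c, name in zip(order, ("slow", "medium", "fast"))}
    return [names[int(c)] for c in km.labels_]


# Hypothetical corrosion-rate readings (arbitrary units), not data from the study.
rates = [0.10, 0.12, 0.11, 0.50, 0.55, 0.52, 1.20, 1.25, 1.18]
labels = label_corrosion_degrees(rates)
```

Ranking the clusters by center keeps the labels stable across runs, since the integer ids returned by K-means carry no ordering of their own.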
Because atmospheric corrosion results from the combined behavior of multiple factors, the factors RH, T, SO2, NO2, PM2.5, and PM10 can be regarded as independent pieces of evidence, each providing support information for the current degree of corrosion. It should be noted that AQI is not treated as an independent piece of evidence because it is a synthetic index related to the other investigated pollutants. The framework of the proposed model is shown in Figure 6.
Step 1: Evidence construction. Each atmospheric factor data sample carries a label giving its degree of corrosion, and the samples of Stage 1 together form the atmospheric factor dataset. The frame of discernment in this degree-of-corrosion classification problem consists of the three degrees fast, medium, and slow. The dataset is divided into a training set and a test set. For each atmospheric factor and each degree of corrosion, consider the data of that factor belonging to that degree, together with the maximum and minimum values of the factor in the training set. Given how common the Gaussian distribution is in nature, the Gaussian kernel density estimator [34] was employed to calculate the probability density function (PDF) of these data in the training set:
Here, the estimator is parameterized by the standard deviation of the factor data in the training set and the total number of samples in the training set [34]. It should be noted that a positive constant close to zero is included so as to avoid generating completely conflicting evidence. In this study, this constant is set to 0.001.
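A minimal sketch of such a density estimate is given below. It rests on two assumptions that the text does not confirm: the bandwidth follows the common rule of thumb h = 1.06 σ n^(-1/5), and the small constant acts as a lower bound on the estimated density. The function names and the humidity readings are hypothetical.

```python
import math


def rule_of_thumb_bandwidth(samples):
    """Bandwidth h = 1.06 * sigma * n^(-1/5) (assumed, not taken from the study)."""
    n = len(samples)
    mean = sum(samples) / n
    sigma = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * sigma * n ** (-0.2)


def gaussian_kde_pdf(x, samples, eps=0.001):
    """Gaussian kernel density estimate at x, floored at eps."""
    h = rule_of_thumb_bandwidth(samples)
    n = len(samples)
    kernel_sum = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    density = kernel_sum / (n * h * math.sqrt(2.0 * math.pi))
    # Assumed role of the constant: keep the density strictly positive far from
    # the data, so that no degree of corrosion ever receives exactly zero support.
    return max(density, eps)


# Hypothetical relative-humidity readings (%) for one degree of corrosion.
rh_fast = [78.0, 81.5, 80.2, 83.1, 79.4, 82.0]
near = gaussian_kde_pdf(80.0, rh_fast)  # inside the observed range
far = gaussian_kde_pdf(20.0, rh_fast)   # far outside it, so the floor applies
```

Without the floor, a reading far from all training data would give one class zero density, and two such pieces of evidence could contradict each other completely.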
As shown in Figure 6, the support that a piece of evidence provides for a given degree of corrosion is proportional to its intersection with the corresponding PDF model [35]. Accordingly, the basic probability assignment function of each piece of evidence can be calculated by the two rules given in reference [35].
Step 2: Evidence discount. Because different pieces of evidence differ in importance, an evidence discounting operation is necessary to obtain reasonable fusion results. Each piece of evidence is assigned an importance coefficient whose initial value equals 1 before training. For each piece of evidence, the discounted evidence can be calculated as follows:
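One common form of discounting can be sketched as follows, assuming Shafer's classical rule: every mass is scaled by the importance coefficient and the remainder is transferred to the whole frame of discernment. Whether the study uses exactly this rule is an assumption; the function name `discount` and the mass values are hypothetical.

```python
def discount(mass, weight, frame):
    """Shafer discounting: scale each mass by weight, move the rest to the frame."""
    frame = frozenset(frame)
    discounted = {a: weight * m for a, m in mass.items() if a != frame}
    # The mass removed from the other focal sets is reassigned to total ignorance.
    discounted[frame] = 1.0 - weight + weight * mass.get(frame, 0.0)
    return discounted


FRAME = frozenset({"fast", "medium", "slow"})
# Hypothetical mass function supplied by one atmospheric factor.
m_rh = {frozenset({"fast"}): 0.6, frozenset({"medium"}): 0.3, FRAME: 0.1}
m_rh_discounted = discount(m_rh, weight=0.5, frame=FRAME)
```

With a coefficient of 1 the evidence is unchanged, and with a coefficient of 0 it becomes fully uninformative, which is why the coefficients start at 1 before training.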
Steps 3 and 4: Evidence fusion and corrosion prediction. The different pieces of discounted evidence can be fused by Dempster’s rule [20].
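Dempster's rule for two pieces of evidence can be sketched as follows. The dictionary representation (focal set to mass), the function name `dempster_combine`, and the mass values are illustrative choices rather than the study's implementation.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Products of masses on disjoint sets make up the conflict.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("Completely conflicting evidence cannot be combined.")
    # Renormalize by the non-conflicting mass.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


FRAME = frozenset({"fast", "medium", "slow"})
# Hypothetical mass functions from two atmospheric factors.
m1 = {frozenset({"fast"}): 0.6, frozenset({"medium"}): 0.3, FRAME: 0.1}
m2 = {frozenset({"fast"}): 0.5, frozenset({"slow"}): 0.2, FRAME: 0.3}
fused = dempster_combine(m1, m2)
```

Because the rule is associative and commutative, more than two pieces of evidence can be fused by applying the function repeatedly, for example with `functools.reduce`.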
Then, the predicted degree of corrosion can be derived by the pignistic probability transformation [36], in which the mass assigned to each set is shared equally among its elements according to the cardinality of the set.
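The transformation and the resulting prediction can be sketched as follows, assuming a normalized mass function with no mass on the empty set and taking the degree with the largest pignistic probability as the prediction. The function name `pignistic` and the fused masses are hypothetical.

```python
def pignistic(mass):
    """Pignistic transformation: split each mass equally among its elements."""
    betp = {}
    for focal_set, m in mass.items():
        share = m / len(focal_set)  # len() is the cardinality of the focal set
        for element in focal_set:
            betp[element] = betp.get(element, 0.0) + share
    return betp


FRAME = frozenset({"fast", "medium", "slow"})
# Hypothetical fused mass function.
fused = {frozenset({"fast"}): 0.7, frozenset({"fast", "medium"}): 0.2, FRAME: 0.1}
betp = pignistic(fused)
predicted_degree = max(betp, key=betp.get)
```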
To determine appropriate importance coefficients for the atmospheric factor evidence, the importance vector was optimized on the training set. The objective function minimizes the error between the predicted degree of corrosion and the true degree of corrosion on the training set.
Using the commonly used simulated annealing optimization method [37], the optimal importance vector was derived and then used to evaluate the proposed model on the test set.
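The optimization can be illustrated with a small, generic simulated annealing loop. This is a sketch under stated assumptions, not the study's configuration: the coefficients are restricted to [0, 1], the cooling schedule is geometric, and the objective is a made-up stand-in for the training-set error. All names and parameter values are hypothetical.

```python
import math
import random


def simulated_annealing(objective, dim, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimize objective over vectors in [0, 1]^dim by simulated annealing."""
    rng = random.Random(seed)
    current = [1.0] * dim  # importance coefficients start at 1, as in the text
    current_cost = objective(current)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(steps):
        # Propose a neighbor by perturbing one coefficient, clipped to [0, 1].
        candidate = list(current)
        i = rng.randrange(dim)
        candidate[i] = min(1.0, max(0.0, candidate[i] + rng.gauss(0.0, 0.1)))
        cost = objective(candidate)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cost <= current_cost or rng.random() < math.exp(
            (current_cost - cost) / temperature
        ):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_cost


def toy_error(weights, target=(0.9, 0.7, 0.4)):
    """Stand-in objective: squared distance to a made-up target vector.

    In the setting of the text, this would be the training-set error of the
    fused prediction for the given importance vector.
    """
    return sum((w - t) ** 2 for w, t in zip(weights, target))


weights, error = simulated_annealing(toy_error, dim=3)
```

Accepting occasional worse moves at high temperature lets the search escape local minima, which matters when the objective is a discrete classification error rather than a smooth function.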
In this study, 80% of the data in each stage were randomly selected as the training set and the remainder as the test set. The widely used ANN and SVM models were also tested. The average prediction precision of the three models on the test set over five random experiments was compared. The importance of each atmospheric factor given by the proposed model is also reported.
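The evaluation protocol can be sketched as a repeated random hold-out. The sketch assumes that prediction precision means the fraction of correctly classified test samples; the helper names, the majority-class baseline, and the toy data are hypothetical stand-ins for the actual models and measurements.

```python
import random


def repeated_holdout(samples, labels, fit_predict, repeats=5, train_fraction=0.8, seed=0):
    """Average test accuracy over several random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    cut = int(round(train_fraction * len(indices)))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        predicted = fit_predict(
            [samples[i] for i in train],
            [labels[i] for i in train],
            [samples[i] for i in test],
        )
        hits = sum(p == labels[i] for p, i in zip(predicted, test))
        accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)


def majority_class(train_x, train_y, test_x):
    """Hypothetical baseline: always predict the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(test_x)


# Hypothetical toy data: 10 samples, 8 labeled 'slow' and 2 labeled 'fast'.
xs = list(range(10))
ys = ["slow"] * 8 + ["fast"] * 2
mean_accuracy = repeated_holdout(xs, ys, majority_class)
```

Any classifier wrapped as a `fit_predict` callable can be compared under the same splits by reusing the same seed.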
The experimental results of Stage 1 are given in Table 5 and Table 6. The corresponding analyses are as follows: (1) According to Table 5, from the viewpoint of evidence fusion, T is the greatest contributor to atmospheric corrosion in Stage 1, followed by RH and SO2. (2) Compared with RH and T, the pollutants SO2, NO2, PM10, and PM2.5 have less influence in the interactions driving atmospheric corrosion during the initial corrosion process. (3) Compared with the statistical correlation coefficients in Table 3, the proposed model found higher correlations between the atmospheric factors and the corrosion rate. (4) According to Table 6, the proposed model outperformed ANN and SVM in predicting the degree of corrosion of Q235 steel.
Similarly, the experimental results of Stage 2 are summarized in Table 7 and Table 8. The corresponding analyses are as follows: (1) According to Table 7, T still contributes most to atmospheric corrosion among all of the investigated factors in Stage 2, followed by NO2 and SO2. (2) According to Table 2, the mean RH in Stage 2 is clearly lower than in Stage 1. Accordingly, the proposed model derived a lower impact of RH in Stage 2 than in Stage 1. (3) As shown in Table 2, the test period of Stage 2 suffered more serious air pollution. Accordingly, compared with Stage 1, the proposed model found higher correlations between the corrosion rate and the pollutants NO2, PM2.5, and PM10 in Stage 2. (4) According to Table 7, the pollutants SO2 and NO2 have more influence on atmospheric corrosion than RH. As shown in Figure 3b, the corrosion behavior in Stage 2 gradually weakened over time. A possible reason is that the corrosion products generated in Stage 1 isolated the metal surface from the atmosphere, which affected the adhesion of water droplets on the metal surface. Therefore, the impact of RH was weakened in Stage 2, while some pollutants contributed more to the specimen’s corrosion because they can damage the existing corrosion products. (5) According to Table 8, the proposed model outperformed SVM and ANN in terms of corrosion prediction.
To sum up, the qualitative analysis and the statistical quantitative analysis of the exposure test data of Q235 steel at Qingdao, China, showed that most statistical correlation coefficients were poorly suited to the outdoor coupled corrosion data. Therefore, a new evidence fusion-based model was proposed. According to the results of the evidence fusion-based quantitative analysis, the proposed model can reveal the influence of different environmental factors on carbon steel corrosion in different exposure test periods, and can accurately predict the corrosion rate.