Article

Support Vector Machine with Robust Low-Rank Learning for Multi-Label Classification Problems in the Steelmaking Process

1 Liaoning Engineering Laboratory of Data Analytics and Optimization for Smart Industry, Shenyang 110819, China
2 Key Laboratory of Data Analytics and Optimization for Smart Industry, Northeastern University, Ministry of Education, Shenyang 110819, China
3 National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Northeastern University, Shenyang 110819, China
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(15), 2659; https://doi.org/10.3390/math10152659
Submission received: 22 June 2022 / Revised: 25 July 2022 / Accepted: 26 July 2022 / Published: 28 July 2022
(This article belongs to the Special Issue Mathematical Modeling and Optimization of Process Industries)

Abstract:
In this paper, we present a novel support vector machine learning method for multi-label classification in the steelmaking process. The steelmaking process involves complicated physicochemical reactions. The end-point temperature is the key to the steelmaking process. According to the initial furnace condition information, the end-point temperature can be predicted using a data-driven method. Based on the setting value of the temperature before tapping, multi-scale predicted errors of the end-point temperature can be calculated and divided into different ranges. The quality evaluation problem can thus be cast as a multi-label classification problem of molten steel quality. To solve the classification problem, considering that it is difficult to capture nonlinear relationships between the input and output in linear models, we propose a novel support vector machine with robust low-rank learning, which handles class imbalance even in the absence of explicit label correlations; a low-rank constraint is used to capture high-order label correlations in a low-dimensional space. Furthermore, we derive an accelerated proximal gradient algorithm and then extend it to handle nonlinear multi-label classifiers. To validate the proposed model, experiments are conducted with real data from a practical steelmaking problem. The results show that the proposed model can effectively solve the multi-label classification problem in industrial production. To evaluate the proposed approach as a general classification approach, we test it on multi-label classification benchmark datasets. The results illustrate that the proposed approach performs better than other state-of-the-art approaches across different scenarios.

1. Introduction

The steel industry is the foundation of all industrialized countries worldwide. Steel output is a common metric used by economists to assess the economic strength of different countries. The industrial scale, product quality, economic benefit, and distribution direction of the iron and steel industry all have a significant impact on the development of the country. In order to ensure that metal products have high elasticity, deep drawability, and surface quality, the smelting process must precisely manage the impurities, content, temperature, and composition of molten steel [1,2].
Iron is made in a blast furnace, and steel is made in the converter furnace during the smelting process of the steel plant. The blast furnace is an iron-making installation that performs the initial step in the smelting of iron and steel. The smelting of iron ore into pig iron in a blast furnace is a high-temperature operation. Iron ore, coke, and slag-forming flux (limestone) are loaded from the top of the furnace, then preheated air is blown along the furnace perimeter from the tuyere in the lower half. The carbon in the coke reacts with the oxygen in the air at high temperatures to produce reducing gases (some blast furnaces additionally inject powdered coal, heavy oil, natural gas, and other auxiliary fuels). During the rising process, these reducing gases heat the slowly descending furnace charge and convert the iron oxide in the iron ore to metallic iron. The blast furnace provides the molten iron required in converter steelmaking.
The converter steelmaking process is a batch process with excellent production effectiveness, rapid reaction response times, and quick smelting. Throughout the steelmaking process, it is crucial to increase the temperature of the molten steel to the necessary level. In addition, a certain amount of alloy needs to be added. Figure 1 shows the smelting process. First, the blast furnace’s molten iron is poured into a torpedo car, then it is transferred to the ladle at the reload station. Moreover, at the desulfurization station, activities for desulfurization and slagging-off are also carried out. After processing, oxygen is blown into the converter furnace so that it directly oxidizes with the high-temperature molten iron to remove impurities. Then the molten iron is changed into molten steel.
The composition and temperature are indicative of the quality of the molten steel. The “cold brittleness” and “thermal brittleness” of steel are caused by the high content of phosphorus and sulfur in molten steel, respectively. During the steelmaking process, phosphorus and sulfur should be removed from the molten steel. When excess oxygen is present in molten steel, the thermal brittleness of the steel is exacerbated and a large number of oxide inclusions form, necessitating deoxidation. The presence of inclusions will disrupt the continuity of the steel matrix, degrading the steel’s mechanical characteristics. Therefore, they should be removed. To fulfill the tapping criteria, the temperature must be raised in the steelmaking process, and a specific kind and amount of alloy must be added to ensure the steel composition meets the specifications of the target steel grade. In summary, the primary goals of steelmaking are to achieve decarburization, dephosphorization, desulfurization, and deoxidation by using oxygen, slag, alloy addition, and stirring. Inclusions are removed by raising the temperature and adjusting the composition, and then the required liquid steel is obtained and cast into suitable ingots or billets.
Pure oxygen is blown into the molten iron in a basic oxygen furnace (BOF) steelmaking process. This process must be continuously monitored to obtain high-quality molten steel. Based on the physicochemical reactions that take place in the BOF, the process can be split into three stages: the silicon and manganese reaction stage, the carbon oxidation stage, and the blowing stage. The reaction process in BOF steelmaking is summarized as follows. First, steel scrap and hot metal are charged into a converter furnace. Second, a water-cooled lance is used to blast high-purity oxygen into the molten steel. The temperature rises from around 1200 °C to 1700 °C as a result of carbon oxidation reactions occurring in the molten steel, which also produce carbon dioxide and carbon monoxide. Third, supplementary ingredients (such as limestone) are added to the molten steel to meet the quality requirements. Meanwhile, a mixture of inert gases (nitrogen or argon) is blown into the furnace from the bottom to ensure a sufficient reaction. Finally, a sublance is used to complete the sampling process. The continuous oxygen-blowing operation increases the consumption of metallic iron and decreases the life of the furnace lining if the molten steel’s quality falls below the necessary standard. Furthermore, the blowing time is fixed in actual operation. When the appropriate blowing time is reached, the furnace is opened to measure temperature and carbon content, and the refining process is selected. If blowing continues beyond this point, productivity suffers significantly, and quality is difficult to ensure.
Because of the short smelting period of the converter (from a few minutes to more than ten minutes), it is difficult to manually manage the temperature to ensure that the target composition of molten steel is achieved. Moreover, due to the black-box smelting conditions, the steelmaking process can only be controlled based on operator experience. Iron and steel smelting, however, is a multiphase, high-temperature reaction process. Traditional research methods based on classic interfacial reaction theory can only describe and explain the process; they fall short of addressing the needs of industrial production. The study of complex metallurgical reactions has become a low-cost and high-efficiency process due to the development of artificial intelligence (AI). Many studies have shown that AI models detect and examine metallurgical reactions more accurately than traditional metallurgical theory [3,4]. The establishment of an AI model is extremely beneficial to researchers and operators. As a result, we construct a steelmaking process anomaly detection model that can learn effective information from the steelmaking process to judge whether there is an anomaly in the future smelting process, allowing operators to detect and correct problems in real time.
Considering that the BOF steelmaking process involves high-temperature and complex physical and chemical reactions, it is difficult to establish a precise quality evaluation model for molten steel. We therefore cast the quality evaluation as a multi-label classification problem. To solve this problem, data-driven methods are applied to the steelmaking process. Overall, the contributions of the current study are as follows:
  • We present a novel multi-label classification model called the support vector machine with robust low-rank learning (SVM-RL) and derive a kernelization SVM-RL to capture nonlinear relationships between the input and output.
  • In the kernel classifier case, the surrogate least-squares hinge loss is replaced with a margin-based loss to make it smooth for efficient optimization. This is expected to avoid some of the over-fitting problems that may otherwise be encountered, and it serves as an intermediate step for discovering the structure of a dataset.
  • A multi-label classification problem is derived from the practical steelmaking process, which has a black-box property. The proposed approach effectively solves this difficulty, and different benchmark problems are used to verify the performance of SVM-RL.
The rest of this paper is organized as follows. Section 2 summarizes the related literature. Our proposed method is introduced in Section 3. Section 4 shows the experimental results for the practical problem and benchmark problems. Finally, Section 5 provides conclusions and discusses future work.

2. Related Literature

In anticipating and analyzing complicated metallurgical models, AI algorithms can considerably improve the understanding of metallurgical processes. Some researchers have recently presented a variety of valuable algorithms for solving difficult metallurgical problems. These comprise regression analyses for specific quantitative descriptions as well as classification analyses for abnormal-condition judgment. Azadi et al. [5] addressed issues such as large-scale monitoring of blast furnace operation status, multi-phase and multi-scale physicochemical changes, and long computation times, among others. They proposed a hybrid dynamic model to predict the hot metal silicon content and the slag basicity in the blast furnace process. The suggested model demonstrated good performance using actual production data from a blast furnace. Cardoso et al. [6] developed a committee machine containing 108 independent artificial neural networks to anticipate the impurity concentration of cast iron made in a blast furnace. The results of their experiments showed that this integrated method accurately estimated the impurity concentration in blast furnaces and has practical application value. Han et al. [7] proposed a bidirectional recursive multiscale convolution deep neural network algorithm based on the continuous spectral information of the furnace mouth flame in the steelmaking process. On this basis, a dynamic prediction model for the later stages of steelmaking was established, and the effectiveness of the algorithm was demonstrated through trials. Saigo et al. [8] proposed the Einstein–Roscoe expression and a transfer learning framework based on the Gaussian process to assist in measuring and estimating regression parameters for the viscosity prediction of the steelmaking process. Experiments showed the suggested technique was effective and that it outperformed other machine learning algorithms in viscosity prediction. Deng et al. [9] employed FactSage technology to construct a dynamic model of different oxides in the oxidation process of ultra-low carbon steel and carried out a detailed analysis of slag impurity oxides, which provided a significant contribution to the control of the steelmaking process. In light of the challenges of scarce data and large data fluctuations in blast furnace ironmaking, Gao et al. [10] developed a migration learning approach for robust fault identification and demonstrated through tests that the algorithm can solve the abnormal diagnosis of blast furnace ironmaking. Li et al. [11] developed a nonparallel hyperplane-based fuzzy classifier model to coordinate their model’s accuracy and interpretability based on the closed smelting and hysteresis features of a blast furnace system and tested the classification impact utilizing blast furnace data. On the topic of industrial failure detection, Rippon et al. [12] created a new industrial prediction classification problem and designed a machine learning methodology. They proposed a complete representation learning prediction classification framework by comparing conventional and contemporary representation learning approaches with data from an electric arc furnace. Zhang et al. [13] constructed a non-contact intelligent prediction model based on an internet of things system for predicting carbon content in molten steel and used the support vector machine method to forecast the provided carbon content criterion. According to the experimental data, the model accurately predicted and classified the steelmaking process. Zhou et al. [14] proposed a fault-detection approach based on deep learning and multi-information fusion to handle the problems of overheating, abnormal exhaust, partial melting, and other issues. Their experiments demonstrated the algorithm’s capability. In light of the coexistence of Gaussian and non-Gaussian blast furnace ironmaking data, Zhou et al. [15] suggested an integrated technique using PCA and ICA to detect and diagnose aberrant furnace conditions in the blast furnace ironmaking process.
With better knowledge of the metallurgical mechanism model, the correlation of multiple indicators in multi-output and multi-input data has become a focus of current studies on the metallurgical process. Feng et al. [16] presented a multichannel diffusion graph convolutional network using the correlation of molten steel element concentrations to predict the end-point composition of molten steel. Experiments showed that by mining the data correlation, the model can successfully increase end-point prediction performance. Li et al. [17] presented a novel multi-input multi-output Takagi–Sugeno fuzzy model, which has been applied to the blast furnace ironmaking process. In order to evaluate the aging status of steel ladles in steelmaking plants, Vannucci et al. [18] suggested a multi-classification approach for imbalanced datasets. Experiments showed the model could predict when a ladle was nearing the end of its useful life. In light of the critical role of oxygen demand trends for businesses in the iron and steel sector, Zhou et al. [19] suggested an oxygen-demand-prediction approach based on multi-output Gaussian process regression. The usefulness and performance of their algorithm were demonstrated by comparing it with other algorithms in real-world scenarios.

3. Problem Formulation and Optimization

A significant development across several industries is machine learning (ML), an artificial intelligence method that can predict outcomes. One central task in predictive modeling is classification. The goal of classification is to assign a class label to a given set of input data. In scientific research, classification models have acted as helpful AI tools in a variety of fields, such as financial credit, risk assessment [20], signal processing, and pattern recognition [21]. The main objective of these methods is to accurately forecast diverse circumstances. The support vector machine (SVM) is a well-known and widely used classification algorithm that has sufficient capacity to generalize, fewer local minima, and minimal dependency on a small number of factors [22], and it has seen success in applications as a strong, flexible classifier with excellent accuracy. However, the conventional formulation cannot determine the relative relevance of distinct characteristics, and its performance may be seriously compromised when redundant variables are employed to determine the decision rule, even though they contribute little to prediction, owing to the buildup of random noise, especially in a high-dimensional space [23].
When samples are not linearly separable, the kernel form of the SVM is created through the projection of input space to higher dimensional space. Although several characteristics are accessible, not all of them will be used in the classifier’s creation. If all the available input characteristics are incorporated into the model, redundant features and excessive noise may reduce the classifier’s accuracy while increasing its complexity. Obtaining a suitable solution for the kernel SVM is therefore vital since it is the most representative of the margin-based classifiers. The objective function of SVM is:
$$\min_{w,b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i\left(w^\top x_i + b\right) \ge 1$$
where $w$ is the weight vector, $b$ is the bias term, and $y_i$ is the class label of the $i$th sample.
However, computing the original SVM directly is typically difficult as it is non-smooth. In order to convert the constrained primal problem into a form that can be solved by a standard optimization approach, the SVM algorithm employs the Lagrange multiplier method to transform the original problem into its dual problem. In the current state of machine learning algorithms, this is also the preferred solution technique. It builds an appropriate objective function for the optimization method without altering its original goal and simplifies or approximates problems that cannot be solved directly.
$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 + \sum_{i=1}^{m} \alpha_i \left[1 - y_i\left(w^\top x_i + b\right)\right]$$
where α is the Lagrange multiplier, and the second term is called the cost function term which denotes hinge loss.
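To make the cost function term concrete, the hinge loss in its unconstrained form, $\max(0, 1 - y_i(w^\top x_i + b))$ averaged over samples, can be evaluated directly. The sketch below uses made-up toy data purely for illustration; it is not the paper's code.

```python
import numpy as np

def hinge_loss(w, b, X, y):
    # Mean hinge loss: max(0, 1 - y_i (w^T x_i + b)) averaged over samples.
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins).mean()

# Toy separating hyperplane and two linearly separable points.
w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0], [0.0, 2.0]])
y = np.array([1.0, -1.0])
print(hinge_loss(w, b, X, y))  # both points have margin 2, so the loss is 0.0
```

A zero loss here corresponds to every sample satisfying the primal constraint $y_i(w^\top x_i + b) \ge 1$ with slack to spare.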
The SVM algorithm has achieved excellent results for single-label classification. However, many practical production-classification problems demonstrate the importance of multiple label groups. In image classification, for example, an image frequently contains rich semantic information, such as multiple scenes, targets, and behaviors. As a result, multi-label classification, the generalization of single-label classification, is a more practical and universal approach. Many researchers have improved the objective function of the SVM, and one line of work modifies its cost function term. Specifically, the hinge loss function of the SVM is replaced with a margin-based loss. This method aims to minimize the ranking loss when the margin is large and deals with the nonlinear case using the kernel technique [24].
Based on this investigation, the conventional cost function of SVMs is further replaced: the surrogate least-squares hinge loss is substituted with a margin-based loss. To leverage high-order label correlations under the assumption of a low-dimensional label space, we explicitly place a low-rank restriction on the kernel parameter matrix, so that the features chosen to build the decision rule are few, if not sparse, and simultaneous categorization is possible in practice.
Mathematically, the problem becomes the following.
$$\min_{W}\ \frac{1}{2}\left\| L_\delta\!\left( \left|E - Y \circ \left(\phi(X)W\right)\right|_+ \right) \right\|_1 + \frac{\lambda_1}{2}\|W\|_F^2 \quad \text{s.t.} \quad \operatorname{Rank}(W) \le k$$
where $\phi(X) = [\phi(x_1); \ldots; \phi(x_n)]$ maps the low-dimensional sample data $X$ to a high-dimensional feature space, $\left(\left|E - Y \circ (\phi(X)W)\right|_+\right)$ is the margin-based loss employed to approximate the thresholding 0–1 loss, $E = \{1\}^{n \times l}$ denotes the matrix with each element equal to 1, $\circ$ denotes the Hadamard product of matrices, $\|\cdot\|_1$ denotes the $\ell_1$-norm, $\|W\|_F^2 = \sum_{j=1}^{l} \|W_j\|^2$ is the reciprocal of the soft-margin distance, and $\operatorname{Rank}(W)$ imposes a low-rank restriction on the parameter matrix under the supposition that the label space is low-dimensional.
It is obvious that the restricted optimization problem in Equation (3) may be turned into the following unconstrained optimization problem by substituting the trace norm for the rank regularization.
$$\min_{A}\ \frac{1}{2}\left\| L_\delta\!\left( \left|E - Y \circ (KA)\right|_+ \right) \right\|_1 + \frac{\lambda_1}{2}\operatorname{Tr}\!\left(A^\top K A\right) + \lambda_2 \|A\|_*$$
In recent research, Zou [25] introduced a new loss function penalty formula. The experiment demonstrated that the loss penalty effect enhanced computation performance and was smoother when dealing with categorization issues. Based on this work, we propose a novel support vector machine with robust low-rank learning (SVM-RL). SVM-RL aims to minimize the margin-based loss and maximize the soft spacing distance at the same time.
For any $\delta > 0$, we define a $\delta$-smoothed hinge loss function
$$L_\delta(u) = \begin{cases} 0, & u \ge 1 + \delta, \\[2pt] \dfrac{1}{4\delta}\left[u - (1+\delta)\right]^2, & 1 - \delta < u < 1 + \delta, \\[2pt] 1 - u, & u \le 1 - \delta. \end{cases}$$
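The piecewise definition above translates directly into code; the helper below is our own illustration (not the authors' implementation) and can be used to check that the three branches join continuously at $u = 1 \pm \delta$.

```python
import numpy as np

def smoothed_hinge(u, delta):
    # delta-smoothed hinge loss L_delta(u):
    # 0 for u >= 1 + delta, the linear hinge 1 - u for u <= 1 - delta,
    # and a quadratic joint [u - (1 + delta)]^2 / (4 delta) in between.
    u = np.asarray(u, dtype=float)
    return np.where(u >= 1 + delta, 0.0,
           np.where(u <= 1 - delta, 1.0 - u,
                    (u - (1 + delta)) ** 2 / (4 * delta)))
```

At $u = 1 + \delta$ the quadratic branch reaches 0, and at $u = 1 - \delta$ it equals $\delta$, matching $1 - u$; the smoothing therefore changes the hinge only inside a width-$2\delta$ band around the margin.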
$L_\delta(\cdot)$ is a new convex margin-based loss, and the corresponding classifier in the RKHS $\mathcal{H}_K$ is
$$\alpha_k = \arg\min_{\alpha \in \mathbb{R}^n} \left[ \frac{1}{n}\sum_{i=1}^{n} L_\delta\!\left(y_i K_i^\top \alpha\right) + \lambda_1 \alpha^\top K \alpha + \lambda_2\, \mathrm{Loss}(\alpha) \right]$$
Proposition 1.
Let $Q(\alpha) = \frac{1}{n}\sum_{i=1}^{n} L_\delta\!\left(y_i K_i^\top \alpha\right) + \lambda_1 \alpha^\top K \alpha + \lambda_2\,\mathrm{Loss}(\alpha)$; then we have $Q(\alpha_k) - \frac{\delta}{4} \le Q(\alpha_{SVM}) \le Q(\alpha_k)$.
Lemma 1.
For any $0 < \delta' < \delta$, define $\tilde{E}_0^{\delta'} = \left\{ i : \left|1 - y_i K_i^\top \alpha_k\right| < \delta' \right\}$ and $\tilde{\alpha}_0^{\delta'} = \alpha_k$. For $k = 1, 2, \ldots$, let
$$\tilde{\alpha}_k^{\delta'} = \arg\min_{\alpha \in \mathbb{R}^n} \left[ \frac{1}{n}\sum_{i=1}^{n} L_{\delta'}\!\left(y_i K_i^\top \alpha\right) + \lambda_1 \alpha^\top K \alpha + \lambda_2\,\mathrm{Loss}(\alpha) \right]$$
subject to $1 = y_i K_i^\top \tilde{\alpha}_k^{\delta'}$ for $i \in \tilde{E}_{k-1}^{\delta'}$, where $\tilde{E}_k^{\delta'} = \left\{ i : \left|1 - y_i K_i^\top \tilde{\alpha}_k^{\delta'}\right| < \delta' \right\}$. Then there exists a finite $k$ such that $\tilde{E}_k^{\delta'} = \tilde{E}_{k-1}^{\delta'}$ and $\tilde{\alpha}_{k+1}^{\delta'} = \tilde{\alpha}_k^{\delta'}$. We denote $\tilde{E}^{\delta'} = \tilde{E}_k^{\delta'}$ and $\tilde{\alpha}^{\delta'} = \tilde{\alpha}_k^{\delta'}$.
Lemma 2 states that $\tilde{\alpha}^{\delta'}$ will actually equal $\alpha_{SVM}$ once $\delta'$ is small enough.
Lemma 2.
Suppose there exists some $i$ such that $y_i K_i^\top \alpha_{SVM} \ne 1$. It holds that $\tilde{\alpha}^{\delta'} = \alpha_{SVM}$ as long as $\delta' < \delta^*$, where $\delta^* = \min\{\delta_0, 4\eta\}$.
Remark 1.
All proofs are similar to those in reference [25], so we omit them here owing to space limitations.
Remark 2.
Due to the superiority of the APG, Equation (3) can be guaranteed to converge to a global optimum with an $O(1/t^2)$ convergence rate.
The main flow is outlined in Algorithm 1, which summarizes the entire APG algorithm for training and tuning SVM-RL.
Algorithm 1: Accelerated Proximal Gradient Method for Equation (5)
Input: $X \in \mathbb{R}^{n \times m}$, $Y \in \{-1, 1\}^{n \times l}$, the kernel matrix $K \in \mathbb{R}^{n \times n}$, tradeoff hyper-parameters $\lambda_1$, $\lambda_2$.
Output: $W \in \mathbb{R}^{n \times l}$.
1: Initialize $t = 1$, $b_1 = 1$, $\delta$. Define the smooth function $L_\delta$.
2: Initialize $G_1 = W_0 \in \mathbb{R}^{n \times l}$ as a zero matrix.
3: while Equation (3) has not converged do
4:   Compute the gradient $\nabla_{G_t} f(G_t)$.
5:   $W_t = \operatorname{prox}_{(\lambda_2 / L_f)}\!\left( G_t - \frac{1}{L_f} \nabla_{G_t} f(G_t) \right)$.
6:   $b_{t+1} = \frac{1 + \sqrt{1 + 4 b_t^2}}{2}$.
7:   $G_{t+1} = W_t + \frac{b_t - 1}{b_{t+1}} \left(W_t - W_{t-1}\right)$.
8:   $t = t + 1$.
9:   until the KKT conditions of all SVM models are satisfied
10: end
11: $W = W_{t-1}$.
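Algorithm 1 follows the standard FISTA-style accelerated proximal gradient template: a gradient step on the smooth part, a proximal step for the trace-norm regularizer (singular value thresholding), and a momentum extrapolation driven by the $b_t$ sequence. The sketch below is a generic reconstruction of that template, not the authors' implementation; `grad_f`, `L_f`, and the problem sizes are placeholders.

```python
import numpy as np

def prox_nuclear(W, tau):
    # Singular value thresholding: proximal operator of tau * ||W||_* .
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def apg(grad_f, L_f, lam2, shape, n_iter=100):
    # Accelerated proximal gradient (FISTA-style) for a smooth term f with
    # Lipschitz-gradient constant L_f plus the regularizer lam2 * ||W||_* ,
    # mirroring the update scheme of Algorithm 1.
    W_prev = np.zeros(shape)
    G = W_prev.copy()
    b = 1.0
    for _ in range(n_iter):
        # Gradient step on the smooth part, then the nuclear-norm prox.
        W = prox_nuclear(G - grad_f(G) / L_f, lam2 / L_f)
        # Momentum sequence b_{t+1} = (1 + sqrt(1 + 4 b_t^2)) / 2.
        b_next = (1.0 + np.sqrt(1.0 + 4.0 * b * b)) / 2.0
        # Extrapolation G_{t+1} = W_t + ((b_t - 1) / b_{t+1})(W_t - W_{t-1}).
        G = W + (b - 1.0) / b_next * (W - W_prev)
        W_prev, b = W, b_next
    return W_prev
```

As a sanity check, for the toy smooth term $f(W) = \frac{1}{2}\|W - M\|_F^2$ with $L_f = 1$, the minimizer of $f + \lambda_2 \|W\|_*$ is exactly `prox_nuclear(M, lam2)`, and the iteration reaches it immediately.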

4. Numerical Results

To demonstrate the model’s practicality, we used actual data from the steelmaking process to solve multi-label classification tasks. Furthermore, we performed a significant number of experimental evaluations on four different sets of public data. The model’s performance was consistently high, and the results outperformed existing advanced approaches, demonstrating that the model has excellent performance and universality in multi-label classification.

4.1. Experimental Details

This section provides an overview of the experimental equipment and evaluation criteria.

4.1.1. Experimental Setting

  • Experimental platform: The experiment was performed on a computer with an Intel® Core™ i5-9400F 2.90 GHz CPU, 16 GB RAM and an NVIDIA GeForce GTX 1060 6 GB GPU, with a 64-bit Windows 11 operating system. The software was developed using MATLAB 2016.
  • Parameter setup: Grid searching was used to find the model parameters in this paper. The parameters were as follows: $\lambda_1, \lambda_2, \sigma \in \{10^{-4}, 10^{-3}, \ldots, 10^{2}\}$. A total of 60 percent of each dataset was chosen at random as training data, while the remaining 40 percent was used as test data. To avoid aberrant interference, each dataset was subjected to 10 independent experiments, with the results being averaged.
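The parameter search described above amounts to a plain grid search over the three hyperparameters combined with a random 60/40 split. The sketch below illustrates that procedure; the `score` function is a placeholder standing in for training and evaluating the model at each grid point.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def split_60_40(n):
    # Random 60% train / 40% test split, as used in the experiments.
    idx = rng.permutation(n)
    cut = int(0.6 * n)
    return idx[:cut], idx[cut:]

# Grid {10^-4, 10^-3, ..., 10^2} shared by the three hyperparameters.
grid = [10.0 ** p for p in range(-4, 3)]

def grid_search(score):
    # score(lam1, lam2, sigma) is a placeholder for training the model with
    # those hyperparameters and returning a validation score to maximize.
    return max(itertools.product(grid, grid, grid), key=lambda p: score(*p))
```

In practice the `score` placeholder would train SVM-RL on the training split and return, e.g., the Average Precision on held-out data, averaged over the 10 independent runs.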

4.1.2. Evaluation Metrics

  • Single-label classification indicators: to assess the model’s performance, we employed Precision, Recall, F1-Score, and Accuracy as indicators; Table 1 is the confusion matrix of single-label classification.
1. “Precision” indicates the fraction of predicted positives that are truly positive:
$$\text{Precision} = \frac{TP}{TP + FP}$$
2. “Recall” indicates the fraction of actual positives that are correctly predicted:
$$\text{Recall} = \frac{TP}{TP + FN}$$
3. “F1-Score” is a combination of two contradictory indicators, the precision rate and the recall rate:
$$\text{F1-Score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$
4. “Accuracy” is the overall categorization index, the fraction of valid classifications:
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$$
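All four single-label indicators follow in a few lines from the confusion-matrix counts; the helper below is a simple illustration (the function name is our own).

```python
def classification_metrics(tp, fp, fn, tn):
    # Precision, Recall, F1-Score and Accuracy from confusion-matrix counts.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```

For a balanced toy case with TP = TN = 8 and FP = FN = 2, all four metrics come out to 0.8.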
  • Multi-label classification indicators: we adopted Hamming Loss, Subset Accuracy, F1-Example, Ranking Loss, Coverage, and Average Precision [26] as the multi-label classification evaluation indices. Assume a given test dataset $u = \{ (x_i, Y_i) \mid 1 \le i \le t,\ Y_i \in \{-1, 1\}^q \}$, where $f(x_i)$ denotes the prediction label set of the multi-label classifier for $x_i$ and $q$ is the label count.
  • Hamming Loss (Hal) calculates the binary average error ratio of the labels, where $\|\cdot\|_1$ denotes the $\ell_1$-norm and $\Delta$ the symmetric difference:
$$\text{Hal} = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{q} \left\| f(x_i)\, \Delta\, Y_i \right\|_1$$
  • Ranking Loss (Ral) calculates the ratio of reversely ordered label pairs:
$$\text{Ral} = \frac{1}{t} \sum_{i=1}^{t} \frac{|SetR_i|}{|Y_i^+|\,|Y_i^-|}, \quad SetR_i = \left\{ (p, q) \mid f_p(x_i) \le f_q(x_i),\ (p, q) \in Y_i^+ \times Y_i^- \right\}$$
  • Coverage (Cov) calculates the number of steps it takes to go down the label ranking until all the ground-truth labels are covered:
$$\text{Cov} = \frac{1}{q} \left( \frac{1}{t} \sum_{i=1}^{t} \max_{l_k \in Y_i} rank(x_i, l_k) - 1 \right)$$
    where $rank(x_i, l_k)$ stands for the rank of the label $l_k$ in the ranking list based on $F(x_i)$, sorted in descending order.
  • Subset Accuracy (Sa) calculates the ratio of predicted label subsets that exactly match the actual data labels, where $F(x_i)$ is the total multi-label classifier prediction label set for $x_i$:
$$\text{Sa} = \frac{1}{t} \sum_{i=1}^{t} \left[\!\left[ F(x_i) = Y_i \right]\!\right]$$
  • F1-Example (F1e) indicates the average of the F1-Score over instances:
$$\text{F1e} = \frac{1}{t} \sum_{i=1}^{t} \frac{2\left| f(x_i)^+ \cap Y_i^+ \right|}{\left| f(x_i)^+ \right| + \left| Y_i^+ \right|}$$
  • Average Precision (Ap) calculates the average fraction of relevant labels ranked above a particular relevant label:
$$\text{Ap} = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{|Y_i^+|} \sum_{l_j \in Y_i^+} \frac{|SetP_{ij}|}{rank_F(x_i, l_j)}, \quad SetP_{ij} = \left\{ l_k \in Y_i^+ \mid rank_F(x_i, l_k) \le rank_F(x_i, l_j) \right\}$$
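Two of these indicators, Hamming Loss and Subset Accuracy, reduce to simple matrix operations when predictions and ground truth are stored as $\pm 1$ matrices; a minimal sketch (function names are our own):

```python
import numpy as np

def hamming_loss(Y_pred, Y_true):
    # Average fraction of individual label slots predicted incorrectly.
    return float(np.mean(Y_pred != Y_true))

def subset_accuracy(Y_pred, Y_true):
    # Fraction of instances whose whole predicted label vector matches exactly.
    return float(np.mean(np.all(Y_pred == Y_true, axis=1)))
```

The ranking-based metrics (Ranking Loss, Coverage, Average Precision) additionally require the real-valued scores $f(x_i)$ rather than only the thresholded labels, so they are omitted from this sketch.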

4.2. Multi-Label Classification Problems in the Steelmaking Process

4.2.1. Experimental Data and Settings

The information utilized in this experiment came from an actual BOF steelmaking process. Multi-source sensors were utilized to collect the data: a flame analyzer was used to monitor real-time temperature, while sublance and tossing probes sampled the quality components in the molten steel. The above means were used to collect data from a steel plant for one year; after screening out missing and abnormal values, the number of data samples was 2755 in total. The input data included the following: (1) steel scrap data: addition amount of steel scrap, etc.; (2) molten iron data: addition amount of molten iron, molten iron composition, etc.; (3) oxygen lance data: height of oxygen lance, etc.; (4) auxiliary material data: types of auxiliary material, addition amount of auxiliary material, etc.; (5) oxygen blowing data: flow of oxygen, etc.; (6) alloy data: types of alloy addition, addition amount of alloy, etc. The training data comprised 1629 samples with 20 features, and the test data comprised 1086 samples with 20 features. The output data comprised the difference between the predicted end-point molten steel temperature and the actual end-point molten steel temperature before tapping. If the predicted error was larger than 15 °C or 10 °C, respectively, the corresponding label was set to −1; on the contrary, it was set to 1 if the error was less than or equal to 15 °C or 10 °C. The SVM was the comparison method. We used a grid search to optimize the parameters of the SVM to make the results more believable. The best outcomes are highlighted.
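The labeling rule described above (error above the threshold maps to −1, otherwise to +1, for the 15 °C and 10 °C thresholds) can be sketched as follows; the function name and array layout are our own illustration, not the paper's code.

```python
import numpy as np

def temperature_labels(pred_error, thresholds=(15.0, 10.0)):
    # For each threshold (15 C and 10 C), assign label -1 when the predicted
    # end-point temperature error exceeds the threshold, and +1 otherwise.
    err = np.asarray(pred_error, dtype=float)
    return np.stack([np.where(err > t, -1, 1) for t in thresholds], axis=1)

# Errors of 12, 20 and 5 degrees give label rows [1, -1], [-1, -1], [1, 1].
print(temperature_labels([12.0, 20.0, 5.0]))
```

Each sample thus receives a two-dimensional label vector, one label per tolerance band, which is exactly the multi-label structure the classifier is trained on.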

4.2.2. Experimental Results

The suggested model’s judgment performance on the steelmaking dataset is shown in Figure 2, Figure 3 and Table 2. The single-label classification indices were used to analyze each task in the steelmaking process. The model produced 100% correct predictions for the 15 °C label problem. At the same time, the model’s accuracy for the 10 °C problem was 80%, which was sufficient to fulfill the site’s actual requirements. Because of the significant nonlinear interaction between converter input features, the SVM/SVR has always been an efficient approach for dealing with steelmaking challenges. Comparison trials demonstrated that the proposed model improves accuracy by at least 20%. The results of the comparison are shown in Figure 2 and Table 2. The model’s precision was more than 20% higher than that of the SVM in each category. The correlation of input and output features can often be leveraged to improve performance in multi-label task prediction. Simultaneously, to compare the proposed model’s convergence performance, we examined the algorithm’s loss-reduction process; the comparison of the loss function curves is shown in Figure 3. As expected, the suggested model converged more smoothly and quickly, and the final loss function value was smaller. Furthermore, the model’s curve had essentially converged as the loss function was updated, and the loss function value approached 0 after 60 iterations. However, the RBRL convergence was unstable, and the function did not converge until the maximum number of iterations was reached. This demonstrates that the proposed model is more efficient than the RBRL.

4.2.3. Sensitivity Analysis

Finally, we examined the sensitivity of the steelmaking data to the suggested model parameters $\{\lambda_1, \lambda_2, \sigma\}$ in a single-variable setting. Among these, $\lambda_1$ controls the model’s complexity: the larger its value, the higher the model’s robustness. $\lambda_2$ controls the impact of the output labels’ relevance on the objective function. $\sigma$ is the width of the RBF kernel function; the higher its value, the better the nonlinearity; however, it is then easy to overfit. The optimal hyperparameters were determined by cross-validation in the experiment, and the performance effects of the other two hyperparameters were assessed once one hyperparameter was fixed. The variation range of each parameter was $[10^{-4}, 10^{2}]$. The sensitivity analysis findings for each evaluation index are depicted in Figure 4. The hyperparameter influence was focused mostly on the kernel function width coefficient because of the significant nonlinear link between the input properties of steelmaking data. When the kernel function width coefficient was small, the model could not fit the data adequately and performed poorly. The model performed best at the midway value of the kernel width coefficient, which was in good agreement with the theory. For $\lambda_1$ and $\lambda_2$, the model’s performance change trend was not evident. This suggests that $\lambda_1$ and $\lambda_2$ had little interaction effect: $\lambda_1$ controlled the complexity of the model, and $\lambda_2$ adjusted the weight of the correlation between the model outputs in the objective function. These two parameters had little bearing on the overall evaluation of each task.

4.3. Benchmark Test Problems

4.3.1. Benchmark Dataset

To establish the algorithm's generalization ability, we conducted multiple comparative experiments on four datasets covering a wide range of scales and fields. The details of the benchmark datasets are listed in Table 3. Six metrics were used to evaluate the multi-label classification problems.
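The "Cardinality" and "Density" statistics reported in Table 3 can be reproduced from a 0/1 label matrix in a few lines; this sketch uses a toy matrix rather than the benchmark data:

```python
import numpy as np

def label_stats(Y):
    """Cardinality (average labels per example) and density for a 0/1 label matrix."""
    Y = np.asarray(Y)
    cardinality = Y.sum(axis=1).mean()      # mean number of active labels per row
    density = cardinality / Y.shape[1]      # normalized by the number of labels
    return cardinality, density

# toy 4-example, 3-label matrix
Y = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
card, dens = label_stats(Y)  # cardinality 6/4 = 1.5, density 1.5/3 = 0.5
```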

4.3.2. Experimental Results

The multi-label classification experiments were also conducted with several state-of-the-art methods—RBRL [27], Rank-SVM [24], Rank-SVMz [28], BR [29], ML-KNN [30], CLR [31], RAKEL [32], CPNL [33] and MLFE [34]—to show the effectiveness of the SVM-RL. The comparison results, together with the number of best results, are presented in Table 4. From the table, we can see that the proposed model has an obvious advantage, with a score of 17 in total. On almost every metric across all the datasets, SVM-RL was superior to the other models and showed powerful generalization. Specifically, in terms of the AP, the proposed model far exceeds every other comparison method. For the “emotions” and “scene” datasets, however, some results were not the best. The nuclear norm is the tightest convex approximation of the rank function, so the low-rank matrix can be recovered with high probability under certain conditions; unfortunately, when the rank of the low-rank matrix is larger, the low-rank solution is biased. Although some improved SVM models can achieve good performance, they are prone to over-fitting on small and medium training samples with a single label. This problem is alleviated by the regularization function of the proposed model. Because SVM-RL minimizes an approximation of the Hamming loss with less model complexity, the generalization error is easy to control by this mechanism; thus, the generalization ability of SVM-RL is improved. Overall, the experimental results verify the effectiveness of SVM-RL.
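The nuclear-norm reasoning above can be illustrated with singular value thresholding, the proximal operator of the nuclear norm. This sketch (not the paper's exact solver) shows how the penalty shrinks singular values and hence induces a low-rank weight matrix:

```python
import numpy as np

def svt(W, tau):
    """Singular value thresholding: the prox of tau * nuclear norm.

    Each singular value is shrunk toward zero by tau; values below tau
    vanish, which is how the nuclear-norm penalty lowers the rank of W.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
W_low = svt(W, tau=1.0)
# thresholding never increases the rank
assert np.linalg.matrix_rank(W_low) <= np.linalg.matrix_rank(W)
```

The shrinkage by a constant `tau` is also the source of the bias mentioned above: large singular values of a genuinely high-rank matrix are reduced along with the small ones.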

5. Conclusions

This paper was motivated by the challenging problem of multi-label classification in the steelmaking process. To address it, we proposed a novel support vector machine with robust low-rank learning, which uses a low-rank constraint to capture high-order label correlations in a low-dimensional space; an accelerated proximal gradient algorithm with a smooth loss function was applied to the kernel classifier to handle nonlinear multi-label classifiers. To verify the practicability of the proposed approach, we conducted experiments on a real steelmaking production process. The accuracy of the proposed model surpassed that of the SVM by more than 20% in each category. The results illustrate that SVM-RL can effectively solve the multi-label classification problem in the steelmaking process. In addition, to further validate the proposed approach, multi-label classification benchmark problems were used to test the performance of SVM-RL. The proposed model had an obvious advantage, with a score of 17 in total. The results indicate that SVM-RL outperforms other state-of-the-art approaches across different scenarios. Possible future work could involve exploiting sparse features in the input space and employing non-convex penalty functions and data-adaptive multi-class kernel functions when constructing the SVM in the ironmaking process.
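The accelerated proximal gradient scheme mentioned above can be sketched in a generic FISTA-style form. The interfaces (`grad_f`, `prox`) and the toy quadratic problem are hypothetical illustrations, not the paper's exact algorithm:

```python
import numpy as np

def apg(grad_f, prox, W0, step, iters=60):
    """Accelerated proximal gradient (FISTA-style) sketch.

    Minimizes f(W) + g(W), where grad_f is the gradient of the smooth
    loss and prox is the proximal operator of the penalty term.
    """
    W, Z, t = W0.copy(), W0.copy(), 1.0
    for _ in range(iters):
        W_next = prox(Z - step * grad_f(Z), step)       # proximal gradient step
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2       # momentum schedule
        Z = W_next + (t - 1) / t_next * (W_next - W)    # extrapolation
        W, t = W_next, t_next
    return W

# toy problem: f(W) = 0.5 * ||W - A||_F^2 with an l1 prox as a stand-in penalty
A = np.array([[3.0, -0.2], [0.1, 2.0]])
soft = lambda W, s: np.sign(W) * np.maximum(np.abs(W) - s, 0.0)
W_star = apg(lambda W: W - A, soft, np.zeros_like(A), step=1.0, iters=60)
```

Replacing the l1 prox with the nuclear-norm prox (singular value thresholding) yields the low-rank variant discussed in this paper.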

Author Contributions

Conceptualization, Q.L. and C.L.; methodology, Q.L. and C.L.; software, C.L.; validation, Q.L. and Q.G.; formal analysis, Q.L.; investigation, Q.G.; resources, Q.L.; writing—original draft preparation, Q.L. and C.L.; writing—review and editing, Q.L. and Q.G.; supervision, Q.G.; project administration, Q.L. and C.L.; funding acquisition, C.L. and Q.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Major Program of the National Natural Science Foundation of China (72192830, 72192835), the National Natural Science Foundation of China (72101052), the National Key Research and Development Program of China (2021YFC2902403), Postdoctoral Science Foundation of China under Grant (2021M700720), the 111 Project (B16009), and the Program for Innovative Talents in University of Liaoning Province of China (LR2020045).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, C.; Tang, L.; Liu, J. A stacked autoencoder with sparse Bayesian regression for end-point prediction problems in steelmaking process. IEEE Trans. Autom. Sci. Eng. 2019, 17, 550–561.
  2. Liu, C.; Tang, L.; Liu, J.; Tang, Z. A Dynamic Analytics Method Based on Multistage Modeling for a BOF Steelmaking Process. IEEE Trans. Autom. Sci. Eng. 2019, 16, 1097–1109.
  3. Tang, L.; Liu, C.; Liu, J.; Wang, X. An estimation of distribution algorithm with resampling and local improvement for an operation optimization problem in steelmaking process. IEEE Trans. Syst. Man Cybern. Syst. 2020; in press.
  4. Liu, C.; Tang, L.; Liu, J. Least squares support vector machine with self-organizing multiple kernel learning and sparsity. Neurocomputing 2019, 331, 493–504.
  5. Azadi, P.; Winz, J.; Leo, E.; Klock, R.; Engell, S. A hybrid dynamic model for the prediction of molten iron and slag quality indices of a large-scale blast furnace. Comput. Chem. Eng. 2022, 156, 107573.
  6. Cardoso, W.; Di Felice, R. A novel committee machine to predict the quantity of impurities in hot metal produced in blast furnace. Comput. Chem. Eng. 2022, 163, 107814.
  7. Han, Y.; Zhang, C.-J.; Wang, L.; Zhang, Y.-C. Industrial IoT for intelligent steelmaking with converter mouth flame spectrum information processed by deep learning. IEEE Trans. Ind. Inform. 2019, 16, 2640–2650.
  8. Saigo, H.; Kc, D.B.; Saito, N. Einstein-Roscoe regression for the slag viscosity prediction problem in steelmaking. Sci. Rep. 2022, 12, 6541.
  9. Deng, A.; Xia, Y.; Dong, H.; Wang, H.; Fan, D. Prediction of re-oxidation behaviour of ultra-low carbon steel by different slag series. Sci. Rep. 2020, 10, 9423.
  10. Gao, D.; Zhu, X.Z.; Yang, C.; Huang, X.; Wang, W. Deep weighted joint distribution adaption network for fault diagnosis of blast furnace ironmaking process. Comput. Chem. Eng. 2022, 162, 107797.
  11. Li, J.; Wei, X.; Hua, C.; Yang, Y.; Zhang, L. Double-hyperplane fuzzy classifier design for tendency prediction of silicon content in molten iron. Fuzzy Sets Syst. 2022, 426, 163–175.
  12. Rippon, L.D.; Yousef, I.; Hosseini, B.; Bouchoucha, A.; Beaulieu, J.-F.; Prévost, C.; Ruel, M.; Shah, S.; Gopaluni, R.B. Representation learning and predictive classification: Application with an electric arc furnace. Comput. Chem. Eng. 2021, 150, 107304.
  13. Zhang, C.-J.; Zhang, Y.-C.; Han, Y. Industrial cyber-physical system driven intelligent prediction model for converter end carbon content in steelmaking plants. J. Ind. Inf. Integr. 2022, 28, 100356.
  14. Zhou, P.; Gao, B.; Wang, S.; Chai, T. Identification of Abnormal Conditions for Fused Magnesium Melting Process Based on Deep Learning and Multisource Information Fusion. IEEE Trans. Ind. Electron. 2021, 69, 3017–3026.
  15. Zhou, P.; Zhang, R.; Xie, J.; Liu, J.; Wang, H.; Chai, T. Data-driven monitoring and diagnosing of abnormal furnace conditions in blast furnace ironmaking: An integrated PCA-ICA method. IEEE Trans. Ind. Electron. 2020, 68, 622–631.
  16. Feng, L.; Zhao, C.; Li, Y.; Zhou, M.; Qiao, H.; Fu, C. Multichannel diffusion graph convolutional network for the prediction of endpoint composition in the converter steelmaking process. IEEE Trans. Instrum. Meas. 2020, 70, 3000413.
  17. Li, J.; Hua, C.; Qian, J.; Guan, X. Low-rank based Multi-Input Multi-Output Takagi-Sugeno fuzzy modeling for prediction of molten iron quality in blast furnace. Fuzzy Sets Syst. 2021, 421, 178–192.
  18. Vannucci, M.; Colla, V.; Chini, M.; Gaspardo, D.; Palm, B. Artificial Intelligence Approaches for the Ladle Predictive Maintenance in Electric Steel Plant. IFAC-PapersOnLine 2022, 55, 331–336.
  19. Zhou, P.; Xu, Z.; Peng, X.; Zhao, J.; Shao, Z. Long-term prediction enhancement based on multi-output Gaussian process regression integrated with production plans for oxygen supply network. Comput. Chem. Eng. 2022, 107844.
  20. Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
  21. Wang, L.; Zhu, J.; Zou, H. The doubly regularized support vector machine. Stat. Sin. 2006, 589–615.
  22. Zou, H. An improved 1-norm svm for simultaneous classification and variable selection. In Proceedings of the Artificial Intelligence and Statistics, San Juan, Puerto Rico, 21–24 March 2007; pp. 675–681.
  23. Zou, H.; Yuan, M. The F∞-norm support vector machine. Stat. Sin. 2008, 18, 379–398.
  24. Elisseeff, A.; Weston, J. A kernel method for multi-labelled classification. Adv. Neural Inf. Process. Syst. 2001, 14, 681–687.
  25. Wang, B.; Zou, H. Fast and Exact Leave-One-Out Analysis of Large-Margin Classifiers. Technometrics 2021, 1–8.
  26. Wu, X.-Z.; Zhou, Z.-H. A unified view of multi-label performance measures. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 10–11 August 2017; pp. 3780–3788.
  27. Wu, G.; Zheng, R.; Tian, Y.; Liu, D. Joint ranking SVM and binary relevance with robust low-rank learning for multi-label classification. Neural Netw. 2020, 122, 24–39.
  28. Xu, J. An efficient multi-label support vector machine with a zero label. Expert Syst. Appl. 2012, 39, 4796–4804.
  29. Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning multi-label scene classification. Pattern Recognit. 2004, 37, 1757–1771.
  30. Zhang, M.-L.; Zhou, Z.-H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007, 40, 2038–2048.
  31. Fürnkranz, J.; Hüllermeier, E.; Loza Mencía, E.; Brinker, K. Multilabel classification via calibrated label ranking. Mach. Learn. 2008, 73, 133–153.
  32. Tsoumakas, G.; Katakis, I.; Vlahavas, I. Random k-labelsets for multilabel classification. IEEE Trans. Knowl. Data Eng. 2010, 23, 1079–1089.
  33. Wu, G.; Tian, Y.; Liu, D. Cost-sensitive multi-label learning with positive and negative label pairwise correlations. Neural Netw. 2018, 108, 411–423.
  34. Zhang, Q.-W.; Zhong, Y.; Zhang, M.-L. Feature-induced labeling information enrichment for multi-label learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
Figure 1. Smelting process of the iron and steel production.
Figure 2. Comparison of classification accuracy of steelmaking problem models.
Figure 3. Comparison of loss function curve of steelmaking problem between (a) SVM-RL and (b) RBRL.
Figure 4. Sensitivity analysis of the hyperparameters of the model on the steelmaking dataset. (af) The sensitivity results of each evaluation with λ1 fixed. (gl) The sensitivity results of each evaluation with λ2 fixed. (mr) The sensitivity results of each evaluation with σ fixed.
Table 1. The confusion matrix of a single label classification.
| Total Population | Predicted Condition Positive | Predicted Condition Negative |
|---|---|---|
| True Condition: Condition Positive | True Positive (TP) | False Negative (FN) |
| True Condition: Condition Negative | False Positive (FP) | True Negative (TN) |
Table 2. The experimental results (mean) of comparison between RBRL, SVM, and SVM-RL on the steelmaking process dataset.
| 15 °C | | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|---|
| RBRL | −1 | 0.92 | 0.80 | 0.86 | 0.77 |
| | 1 | 0.30 | 0.55 | 0.39 | |
| SVM | −1 | 0.98 | 0.78 | 0.87 | 0.78 |
| | 1 | 0.16 | 0.71 | 0.26 | |
| SVM-RL | −1 | 1.00 | 1.00 | 1.00 | 1.00 |
| | 1 | 1.00 | 1.00 | 1.00 | |

| 10 °C | | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|---|
| RBRL | −1 | 0.41 | 0.64 | 0.50 | 0.58 |
| | 1 | 0.71 | 0.49 | 0.57 | |
| SVM | −1 | 0.47 | 0.66 | 0.55 | 0.57 |
| | 1 | 0.68 | 0.51 | 0.58 | |
| SVM-RL | −1 | 1.00 | 0.74 | 0.85 | 0.80 |
| | 1 | 0.55 | 1.00 | 0.71 | |
Table 3. Statistics of the benchmark experimental datasets. (“Cardinality” indicates the average number of labels per example. “Density” normalizes the “Cardinality” by the number of possible labels. These datasets can be obtained from the reference [27]).
| Dataset | Instance | Feature | Label | Cardinality | Density | Domain |
|---|---|---|---|---|---|---|
| Emotions | 593 | 72 | 6 | 1.869 | 0.331 | music |
| Image | 2000 | 294 | 5 | 1.240 | 0.248 | image |
| Scene | 2407 | 294 | 6 | 1.074 | 0.179 | image |
| Yeast | 2417 | 103 | 14 | 4.239 | 0.303 | biology |
Table 4. Experimental results of each benchmark approach (mean) on four multi-label datasets. ↓ (↑) indicates the smaller (larger) the better.

| Dataset | Metric | Rank-SVM | Rank-SVMz | BR | ML-kNN | CLR | RAKEL | CPNL | MLFE | RBRL | SVM-RL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| emotions | Hal (↓) | 0.189 | 0.201 | 0.183 | 0.200 | 0.182 | 0.177 | 0.183 | 0.186 | 0.181 | 0.167 |
| | Ral (↓) | 0.155 | 0.149 | 0.246 | 0.169 | 0.149 | 0.192 | 0.139 | 0.142 | 0.138 | 0.133 |
| | Cov (↓) | 0.294 | 0.291 | 0.386 | 0.306 | 0.283 | 0.338 | 0.277 | 0.282 | 0.277 | 0.284 |
| | Sa (↑) | 0.291 | 0.292 | 0.313 | 0.285 | 0.318 | 0.356 | 0.324 | 0.291 | 0.334 | 0.361 |
| | F1e (↑) | 0.645 | 0.675 | 0.620 | 0.605 | 0.624 | 0.679 | 0.684 | 0.621 | 0.666 | 0.682 |
| | Ap (↑) | 0.808 | 0.819 | 0.760 | 0.796 | 0.813 | 0.801 | 0.828 | 0.822 | 0.828 | 0.835 |
| image | Hal (↓) | 0.161 | 0.177 | 0.156 | 0.175 | 0.157 | 0.154 | 0.150 | 0.156 | 0.149 | 0.075 |
| | Ral (↓) | 0.143 | 0.142 | 0.220 | 0.180 | 0.144 | 0.173 | 0.132 | 0.142 | 0.133 | 0.072 |
| | Cov (↓) | 0.171 | 0.170 | 0.227 | 0.198 | 0.168 | 0.191 | 0.157 | 0.165 | 0.160 | 0.115 |
| | Sa (↑) | 0.451 | 0.411 | 0.482 | 0.393 | 0.477 | 0.527 | 0.533 | 0.463 | 0.552 | 0.686 |
| | F1e (↑) | 0.631 | 0.670 | 0.623 | 0.503 | 0.627 | 0.680 | 0.698 | 0.593 | 0.688 | 0.815 |
| | Ap (↑) | 0.823 | 0.826 | 0.772 | 0.786 | 0.826 | 0.813 | 0.839 | 0.826 | 0.836 | 0.930 |
| scene | Hal (↓) | 0.092 | 0.113 | 0.077 | 0.091 | 0.078 | 0.075 | 0.077 | 0.083 | 0.073 | 0.0884 |
| | Ral (↓) | 0.065 | 0.072 | 0.128 | 0.083 | 0.061 | 0.087 | 0.059 | 0.063 | 0.058 | 0.047 |
| | Cov (↓) | 0.068 | 0.075 | 0.119 | 0.084 | 0.064 | 0.089 | 0.064 | 0.067 | 0.062 | 0.055 |
| | Sa (↑) | 0.563 | 0.500 | 0.655 | 0.615 | 0.650 | 0.696 | 0.699 | 0.617 | 0.735 | 0.541 |
| | F1e (↑) | 0.664 | 0.756 | 0.717 | 0.678 | 0.718 | 0.756 | 0.802 | 0.685 | 0.803 | 0.595 |
| | Ap (↑) | 0.882 | 0.874 | 0.834 | 0.858 | 0.887 | 0.875 | 0.893 | 0.885 | 0.895 | 0.905 |
| yeast | Hal (↓) | 0.203 | 0.207 | 0.188 | 0.195 | 0.188 | 0.195 | 0.192 | 0.194 | 0.187 | 0.184 |
| | Ral (↓) | 0.170 | 0.172 | 0.308 | 0.170 | 0.158 | 0.244 | 0.158 | 0.166 | 0.157 | 0.134 |
| | Cov (↓) | 0.446 | 0.458 | 0.627 | 0.451 | 0.436 | 0.543 | 0.445 | 0.452 | 0.436 | 0.412 |
| | Sa (↑) | 0.156 | 0.179 | 0.190 | 0.177 | 0.194 | 0.248 | 0.179 | 0.172 | 0.192 | 0.077 |
| | F1e (↑) | 0.632 | 0.643 | 0.623 | 0.615 | 0.625 | 0.647 | 0.630 | 0.607 | 0.628 | 0.635 |
| | Ap (↑) | 0.755 | 0.765 | 0.680 | 0.762 | 0.773 | 0.727 | 0.775 | 0.769 | 0.777 | 0.814 |
| Score | | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 0 | 4 | 17 |
Li, Q.; Liu, C.; Guo, Q. Support Vector Machine with Robust Low-Rank Learning for Multi-Label Classification Problems in the Steelmaking Process. Mathematics 2022, 10, 2659. https://doi.org/10.3390/math10152659