Deep Semi-Supervised Just-in-Time Learning Based Soft Sensor for Mooney Viscosity Estimation in Industrial Rubber Mixing Process
Abstract
1. Introduction
- A stacked autoencoder-based deep learning technique is used to extract the latent feature information from the process data of industrial rubber mixing, which is superior to traditional feature selection and extraction methods in handling high-dimensional, complex process data.
- An evolutionary pseudo-labeling optimization approach is proposed to expand the modeling database with limited labeled data by obtaining high-confidence pseudo-labeled data. The basic idea of this approach is to first formulate an explicit pseudo-labeling optimization problem and then solve this problem using an evolutionary approach.
- By integrating JIT learning, semi-supervised learning and deep learning into an online modeling framework, DSSJITGPR can provide much better prediction accuracy than traditional soft sensors for Mooney viscosity estimation in industrial rubber-mixing process.
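The SAE-based feature extraction named above amounts to greedy layer-wise training: each autoencoder learns to reconstruct the output of the previous layer, and only its encoder is kept as a feature extractor. A minimal numpy sketch of this idea (the surrogate data, layer sizes, and training hyperparameters below are illustrative toy values, not the paper's 140-70-30-5 configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, n_hidden, lr=0.05, epochs=300):
    """Train one autoencoder layer by gradient descent; return the
    encoder weights, bias, and the encoded features."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)      # encode
        Xr = H @ W2 + b2              # linear decode
        E = (Xr - X) / n              # gradient of 0.5*||Xr - X||^2 / n
        dH = (E @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ E;  b2 -= lr * E.sum(0)
        W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(0)
    return W1, b1, sigmoid(X @ W1 + b1)

# Stack autoencoders greedily: each layer encodes the previous layer's output.
X = rng.normal(size=(200, 20))        # surrogate high-dimensional process data
H = X
for n_hidden in (10, 5, 2):           # illustrative sizes only
    _, _, H = train_autoencoder(H, n_hidden)
print(H.shape)                        # latent features for downstream GPR modeling
```

The final `H` plays the role of the latent features fed to the regression model; in the paper the stacked network is additionally fine-tuned end to end.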
2. Preliminaries
2.1. Just-in-Time Learning
2.2. Stacked Autoencoder
2.3. Gaussian Process Regression
3. Proposed DSSJITGPR Soft Sensor Method for Mooney Viscosity Estimation
3.1. Latent Feature Extraction
3.2. Acquisition of High-Confidence Pseudo-Labeled Data
3.2.1. Formulation of Pseudo-Labeling Optimization Problem
3.2.2. Solving of Pseudo-Labeling Optimization Problem
- Select an unlabeled subset whose pseudo-labels serve as the decision variables.
- Using real-number coding, each candidate pseudo-label vector is coded as a chromosome, as shown in Figure 8.
- An initial population of individuals is randomly generated within the specified pseudo-label ranges.
- Evaluate the fitness of each individual in the population as the reciprocal of its objective function value.
- Generate a new population by performing selection, crossover and mutation operations, and return to step (4).
- If the stopping condition for GA optimization is satisfied, the optimal solution with the highest fitness is selected and decoded as the pseudo-label estimates of the unlabeled samples; consequently, a pseudo-labeled data set is obtained.
- Merge the labeled and pseudo-labeled data to form an enlarged labeled data set; subsequently, an enhanced GPR model is built on it.
- Evaluate the performance enhancement ratio (PER) of the enhanced model over the original one on the validation set.
3.3. Implementation Procedure
4. Application to an Industrial Rubber Mixing Process
- PLS: the global PLS model.
- GPR: the global GPR model.
- ELM [55]: the global ELM model.
- SSELM [56]: the global semi-supervised ELM model.
- CoGPR: the co-training based GPR model using two sets of randomly selected input variables as different views.
- JITGPR: the JIT learning based GPR model using CWD similarity measure.
- DPLS: the deep learning based PLS model using SAE for latent feature extraction.
- DGPR: the deep learning based GPR model using SAE for latent feature extraction.
- DCoGPR: the deep learning based CoGPR model using SAE for latent feature extraction.
- DJITGPR: the deep learning based JITGPR model using SAE for latent feature extraction.
- DSSGPR: the deep learning based semi-supervised GPR model using SAE for latent feature extraction and including the pseudo-labeled data to the labeled training set.
- DSSJITGPR (the proposed method): the deep semi-supervised JITGPR model.
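The JITGPR variants in the list above share one mechanism: for each query sample, the L most similar historical samples are retrieved and a local GPR model is fit on them, used once, and discarded. A minimal numpy sketch with a plain RBF kernel and Euclidean similarity (the paper uses a CWD similarity measure; the data, kernel hyperparameters, and L here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def gpr_predict(Xtr, ytr, xq, length=0.5, sigma_n=1e-2):
    """Zero-mean GP regression with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length**2)
    K = k(Xtr, Xtr) + sigma_n**2 * np.eye(len(Xtr))
    ks = k(Xtr, xq[None, :])[:, 0]
    return ks @ np.linalg.solve(K, ytr)

def jit_gpr(X, y, xq, L=15):
    """Just-in-time step: select the L nearest samples, fit a local GPR,
    and predict for the single query xq."""
    idx = np.argsort(((X - xq) ** 2).sum(1))[:L]
    return gpr_predict(X[idx], y[idx], xq)

X = rng.uniform(-2, 2, (300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]     # illustrative nonlinear process
xq = np.array([0.3, -0.4])
print(jit_gpr(X, y, xq, L=15))          # local prediction for the query point
```

Because the local model is rebuilt for every query, JIT learning adapts naturally to time-varying process behavior at the cost of per-query computation.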
4.1. Process Description
4.2. Prediction Results and Analysis
- The numbers of principal components for PLS and DPLS are nine and five, respectively.
- The number of hidden layer neurons in ELM is 455.
- The number of hidden layer neurons in SSELM is 170, and the weight coefficient of Laplacian regularization is 0.6.
- The number of iterations for CoGPR and DCoGPR is 70, and the number of high-confidence pseudo-labeled samples selected in each iteration is five.
- The number of local modeling samples for JITGPR, DJITGPR and DSSJITGPR is L = 15.
- In our proposed DSSJITGPR method, the GA optimization parameters for pseudo-label estimation and the remaining model parameters are set to their selected optimal values.
- With the SAE network structure determined as 140-70-30-5-1, the detailed parameter settings are listed in Table 1.
- Linear model-based soft sensors, i.e., PLS and DPLS, are obviously inferior to the nonlinear methods due to their failure to capture the nonlinear characteristics of the process.
- Despite the use of nonlinear modeling techniques, the prediction performance of traditional global modeling methods such as GPR, DGPR, and ELM is still very poor. Compared with global modeling, the local-learning soft sensors such as JITGPR and DJITGPR have achieved a certain degree of prediction performance improvement.
- In general, whether global or local, supervised or semi-supervised, the prediction performance of the different soft sensors is greatly improved after introducing SAE-based feature extraction. These results reveal the necessity and effectiveness of SAE-based feature extraction for high-dimensional process data.
- In most cases, introducing semi-supervised learning helps enhance the prediction accuracy of supervised soft sensors. In this case study, however, SSELM does not achieve a significant performance enhancement, mainly because of improper unlabeled data introduction. Similarly, compared with GPR and DGPR, the prediction performance of CoGPR and DCoGPR becomes worse due to the improper construction of different views or unreliable pseudo-label estimation. Compared with DGPR, the performance of DSSGPR is improved thanks to the expanded database, indicating that the high-confidence pseudo-labeled data obtained by the proposed pseudo-labeling optimization approach are reliable. Furthermore, when local learning is introduced, DSSJITGPR provides much better prediction results than DSSGPR.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhang, Z.; Song, K.; Tong, T.-P.; Wu, F. A novel nonlinear adaptive Mooney-viscosity model based on DRPLS-GP algorithm for rubber mixing process. Chemom. Intell. Lab. Syst. 2012, 112, 17–23.
- Liu, Y.; Gao, Z. Real-time property prediction for an industrial rubber-mixing process with probabilistic ensemble Gaussian process regression models. J. Appl. Polym. Sci. 2015, 132, 41432.
- Jin, H.; Li, J.; Wang, M.; Qian, B.; Yang, B.; Li, Z.; Shi, L. Ensemble just-in-time learning-based soft sensor for Mooney viscosity prediction in an industrial rubber mixing process. Adv. Polym. Technol. 2020, 2020, 1–12.
- Jin, W.; Liu, Y.; Gao, Z. Fast property prediction in an industrial rubber mixing process with local ELM model. J. Appl. Polym. Sci. 2017, 134, 45391.
- Zheng, W.; Liu, Y.; Gao, Z.; Yang, J. Just-in-time semi-supervised soft sensor for quality prediction in industrial rubber mixers. Chemom. Intell. Lab. Syst. 2018, 180, 36–41.
- Zheng, W.; Gao, X.; Liu, Y.; Wang, L.; Yang, J.; Gao, Z. Industrial Mooney viscosity prediction using fast semi-supervised empirical model. Chemom. Intell. Lab. Syst. 2017, 171, 86–92.
- Zheng, S.; Liu, K.; Xu, Y.; Chen, H.; Zhang, X.; Liu, Y. Robust soft sensor with deep kernel learning for quality prediction in rubber mixing processes. Sensors 2020, 20, 695.
- Pan, B.; Jin, H.; Wang, L.; Qian, B.; Chen, X.; Huang, S.; Li, J. Just-in-time learning based soft sensor with variable selection and weighting optimized by evolutionary optimization for quality prediction of nonlinear processes. Chem. Eng. Res. Des. 2019, 144, 285–299.
- Jin, H.; Pan, B.; Chen, X.; Qian, B. Ensemble just-in-time learning framework through evolutionary multi-objective optimization for soft sensor development of nonlinear industrial processes. Chemom. Intell. Lab. Syst. 2019, 184, 153–166.
- Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
- Guo, R.; Liu, H. Semisupervised dynamic soft sensor based on complementary ensemble empirical mode decomposition and deep learning. Measurement 2021, 183, 109788.
- Chai, Z.; Zhao, C.; Huang, B.; Chen, H. A deep probabilistic transfer learning framework for soft sensor modeling with missing data. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–12.
- Yuan, X.; Li, L.; Shardt, Y.A.W.; Wang, Y.; Yang, C. Deep learning with spatiotemporal attention-based LSTM for industrial soft sensor model development. IEEE Trans. Ind. Electron. 2020, 68, 4404–4414.
- Liu, C.; Wang, K.; Ye, L.; Wang, Y.; Yuan, X. Deep learning with neighborhood preserving embedding regularization and its application for soft sensor in an industrial hydrocracking process. Inf. Sci. 2021, 567, 42–57.
- Zhu, X.; Rehman, K.U.; Bo, W.; Shahzad, M.; Hassan, A. Data-driven soft sensor model based on deep learning for quality prediction of industrial processes. SN Comput. Sci. 2021, 2, 1–10.
- Yuan, X.; Huang, B.; Wang, Y.; Yang, C.; Gui, W. Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE. IEEE Trans. Ind. Inform. 2018, 14, 3235–3243.
- Yuan, X.; Ou, C.; Wang, Y.; Yang, C.; Gui, W. A novel semi-supervised pre-training strategy for deep networks and its application for quality variable prediction in industrial processes. Chem. Eng. Sci. 2020, 217, 115509.
- Sun, Q.; Ge, Z. Gated stacked target-related autoencoder: A novel deep feature extraction and layerwise ensemble method for industrial soft sensor application. IEEE Trans. Cybern. 2020, 1–12.
- Jin, H.; Li, Z.; Chen, X.; Qian, B.; Yang, B.; Yang, J. Evolutionary optimization based pseudo labeling for semi-supervised soft sensor development of industrial processes. Chem. Eng. Sci. 2021, 237, 116560.
- Fujino, A.; Ueda, N.; Saito, K. Semisupervised learning for a hybrid generative/discriminative classifier based on the maximum entropy principle. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 424–437.
- Yarowsky, D. Unsupervised word sense disambiguation rivaling supervised methods. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, USA, 26–30 June 1995; pp. 189–196.
- Blum, A.; Mitchell, T. Combining labeled and unlabeled data with co-training. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, Madison, WI, USA, 24–26 July 1998; pp. 92–100.
- Sindhwani, V.; Niyogi, P.; Belkin, M. Beyond the point cloud: From transductive to semi-supervised learning. In Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, 7–11 August 2005; pp. 824–831.
- Vapnik, V.N. An overview of statistical learning theory. IEEE Trans. Neural Netw. 1999, 10, 988–999.
- Kaneko, H.; Funatsu, K. Ensemble locally weighted partial least squares as a just-in-time modeling method. AIChE J. 2016, 62, 717–725.
- Zhou, Z.-H. When semi-supervised learning meets ensemble learning. In Proceedings of the International Workshop on Multiple Classifier Systems, Reykjavik, Iceland, 10–12 June 2009; pp. 529–538.
- Zhang, M.-L.; Zhou, Z.-H. Exploiting unlabeled data to enhance ensemble diversity. Data Min. Knowl. Discov. 2011, 26, 98–129.
- Sun, Q.; Ge, Z. A survey on deep learning for data-driven soft sensors. IEEE Trans. Ind. Inform. 2021, 17, 5853–5866.
- Luo, Y.; Zhu, J.; Li, M.; Ren, Y.; Zhang, B. Smooth neighbors on teacher graphs for semi-supervised learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8896–8905.
- Yin, X.; Niu, Z.; He, Z.; Li, Z.; Lee, D.-H. Ensemble deep learning based semi-supervised soft sensor modeling method and its application on quality prediction for coal preparation process. Adv. Eng. Inform. 2020, 46, 101136.
- Yan, W.; Xu, R.; Wang, K.; Di, T.; Jiang, Z. Soft sensor modeling method based on semisupervised deep learning and its application to wastewater treatment plant. Ind. Eng. Chem. Res. 2020, 59, 4589–4601.
- Aha, D.W. Lazy Learning, 1st ed.; Springer Science & Business Media: Dordrecht, The Netherlands, 1997; ISBN 9789401720533.
- Yin, S.; Xie, X.; Sun, W. A nonlinear process monitoring approach with locally weighted learning of available data. IEEE Trans. Ind. Electron. 2016, 64, 1507–1516.
- Kim, S.; Kano, M.; Hasebe, S.; Takinami, A.; Seki, T. Long-term industrial applications of inferential control based on just-in-time soft-sensors: Economical impact and challenges. Ind. Eng. Chem. Res. 2013, 52, 12346–12356.
- Liu, Y.; Gao, Z. Industrial melt index prediction with the ensemble anti-outlier just-in-time Gaussian process regression modeling method. J. Appl. Polym. Sci. 2015, 132, 41958.
- Zheng, J.; Shen, F.; Ye, L. Improved Mahalanobis distance based JITL-LSTM soft sensor for multiphase batch processes. IEEE Access 2021, 9, 72172–72182.
- Li, L.; Dai, Y. Soft sensor modeling method for time-varying and multi-target chemical processes based on improved ensemble learning. Przem. Chem. 2019, 98, 1811–1816.
- Li, L.; Dai, Y. An adaptive soft sensor deterioration evaluation and model updating method for time-varying chemical processes. Chem. Ind. Chem. Eng. Q. 2020, 26, 135–149.
- Huang, H.; Peng, X.; Jiang, C.; Li, Z.; Zhong, W. Variable-scale probabilistic just-in-time learning for soft sensor development with missing data. Ind. Eng. Chem. Res. 2020, 59, 5010–5021.
- Hazama, K.; Kano, M. Covariance-based locally weighted partial least squares for high-performance adaptive modeling. Chemom. Intell. Lab. Syst. 2015, 146, 55–62.
- Yuan, X.; Zhou, J.; Wang, Y.; Yang, C. Multi-similarity measurement driven ensemble just-in-time learning for soft sensing of industrial processes. J. Chemom. 2018, 32, e3040.
- Kim, S.; Okajima, R.; Kano, M.; Hasebe, S. Development of soft-sensor using locally weighted PLS with adaptive similarity measure. Chemom. Intell. Lab. Syst. 2013, 124, 43–49.
- Alakent, B. Online tuning of predictor weights for relevant data selection in just-in-time-learning. Chemom. Intell. Lab. Syst. 2020, 203, 104043.
- Yuan, X.; Gu, Y.; Wang, Y.; Yang, C.; Gui, W. A deep supervised learning framework for data-driven soft sensor modeling of industrial processes. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 4737–4746.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning, 1st ed.; MIT Press: Cambridge, MA, USA, 2016; ISBN 9780262337373.
- Williams, C.K.; Rasmussen, C.E. Gaussian Processes for Machine Learning, 1st ed.; MIT Press: Cambridge, MA, USA, 2006; ISBN 026218253X.
- Jin, H.; Shi, L.; Chen, X.; Qian, B.; Yang, B.; Jin, H. Probabilistic wind power forecasting using selective ensemble of finite mixture Gaussian process regression models. Renew. Energy 2021, 174, 1–18.
- Zhou, Z.-H.; Li, M. Tri-training: Exploiting unlabeled data using three classifiers. IEEE Trans. Knowl. Data Eng. 2005, 17, 1529–1541.
- Li, M.; Zhou, Z.-H. Improve computer-aided diagnosis with machine learning techniques using undiagnosed samples. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2007, 37, 1088–1098.
- Hady, M.F.A.; Schwenker, F. Co-training by committee: A new semi-supervised learning framework. In Proceedings of the 2008 IEEE International Conference on Data Mining Workshops, Pisa, Italy, 15–19 December 2008; pp. 563–572.
- Gu, S.; Jin, Y. Multi-train: A semi-supervised heterogeneous ensemble classifier. Neurocomputing 2017, 249, 202–211.
- Zhou, Z.-H.; Li, M. Semisupervised regression with cotraining-style algorithms. IEEE Trans. Knowl. Data Eng. 2007, 19, 1479–1493.
- Bansal, J.C.; Singh, P.K.; Pal, N.R. Evolutionary and Swarm Intelligence Algorithms, 1st ed.; Springer: Berlin, Germany, 2019; ISBN 9783030082291.
- Whitley, D. A genetic algorithm tutorial. Stat. Comput. 1994, 4, 65–85.
- Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
- Liu, J.; Chen, Y.; Liu, M.; Zhao, Z. SELM: Semi-supervised ELM with application in sparse calibrated location estimation. Neurocomputing 2011, 74, 2566–2572.
Symbol | Description | Value |
---|---|---|
Hnode1 | Number of nodes in the first layer | 70 |
Hnode2 | Number of nodes in the second layer | 30 |
Hnode3 | Number of nodes in the third layer | 5 |
Lrate1 | Pre-training learning rate | 0.05 |
Nepoch1 | Epoch number in pre-training | 300 |
Bsize1 | Sample batch size in pre-training | 20 |
Lrate2 | Fine-tuning learning rate | 0.07 |
Nepoch2 | Epoch number in fine-tuning | 300 |
Bsize2 | Sample batch size in fine-tuning | 20 |
No. | Method | RMSE | R² |
---|---|---|---|
1 | PLS | 7.4703 | 0.7889 |
2 | GPR | 5.1270 | 0.9006 |
3 | ELM | 6.7405 | 0.8279 |
4 | SSELM | 6.6534 | 0.8319 |
5 | CoGPR | 5.9398 | 0.8666 |
6 | JITGPR | 6.4427 | 0.8430 |
7 | DPLS | 5.3602 | 0.8913 |
8 | DGPR | 4.5700 | 0.9210 |
9 | DCoGPR | 4.7218 | 0.9157 |
10 | DJITGPR | 4.4269 | 0.9259 |
11 | DSSGPR | 4.5087 | 0.9231 |
12 | DSSJITGPR | 4.0415 | 0.9382 |
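The RMSE and R² columns follow the standard definitions, which can be verified with a few lines of numpy (the sample Mooney viscosity values below are hypothetical, used only to exercise the formulas):

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return np.sqrt(np.mean((y - yhat) ** 2))

def r2(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y    = [40.0, 42.5, 39.0, 41.2]   # hypothetical measured viscosities
yhat = [40.5, 42.0, 39.3, 41.0]   # hypothetical soft-sensor predictions
print(round(rmse(y, yhat), 4), round(r2(y, yhat), 4))  # → 0.3969 0.9083
```

Lower RMSE and higher R² both indicate better agreement with the laboratory measurements, which is the ordering reflected in the table.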
Share and Cite
Zhang, Y.; Jin, H.; Liu, H.; Yang, B.; Dong, S. Deep Semi-Supervised Just-in-Time Learning Based Soft Sensor for Mooney Viscosity Estimation in Industrial Rubber Mixing Process. Polymers 2022, 14, 1018. https://doi.org/10.3390/polym14051018