An Expandable Yield Prediction Framework Using Explainable Artificial Intelligence for Semiconductor Manufacturing
Abstract
1. Introduction
- The yield prediction framework accepts various types of fabrication data, so the input data can be expanded;
- An XAI technique, SHAP, is applied to the best-performing model to make its predictions more explainable;
- A demonstration on a real-world dataset is analyzed with SHAP values, including the identification of factors that affect yield.
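
The SHAP method named in the list above builds on Shapley values, which split a prediction among the input features. The sketch below is a minimal illustration of that idea and not the authors' implementation: it enumerates every feature coalition exactly and replaces absent features with a single baseline value, whereas SHAP for tree models averages over a background dataset. The names `shapley_values`, `toy_yield_model`, and `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at sample `x`.

    Features outside a coalition are set to their `baseline` value
    (an assumption made for this sketch).
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_yield_model(z):
    # Hypothetical stand-in for a trained yield model:
    # one linear term plus one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_yield_model, x, baseline)

# The attributions add up to the gap between the prediction and the baseline.
gap = toy_yield_model(x) - toy_yield_model(baseline)
print(phi, sum(phi), gap)
```

In this toy case the linear feature receives its full effect and the two interacting features share the interaction term equally, which is the additivity property that makes such attributions convenient for ranking yield factors. Exact enumeration grows exponentially with the number of features, so it is only practical for small examples like this one.
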
2. Proposed Framework
2.1. Data Preparation with Preprocessing
2.2. Model Optimization and Selection
2.3. Prediction and Explanation
3. Results and Discussion
3.1. Model Selection and Prediction
3.2. Explanation of the Model Using the SHAP Value Method
3.3. Discussion and Limitation
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Preprocessing | Numerical Parameters | Categorical Parameters | Total Parameters |
---|---|---|---|
Original dataset | 176 | 176 | 352 |
One-hot encoding | - | 1276 | - |
Dimension reduction | 62 | 921 | 983 |

Dataset symbol | Brief explanation | Type | Original: Steps | One-Hot Encoding: Max Labels/Step | Dimension Reduction: Steps | Dimension Reduction: Features |
---|---|---|---|---|---|---|
R | Operating Condition | Categorical | 142 | 7 | 37 | 75 |
U | Equipment Unit | Categorical | 34 | 58 | 34 | 846 |
T | Process Time | Numerical | 142 | - | 28 | 28 |
P | Sensor Parameter | Numerical | 34 | - | 34 | 34 |

Preprocessed Target Data | Training Data | Test Data |
---|---|---|
Count | 261 | 66 |
Average | 0.012 | −0.048 |
Standard deviation | 1.021 | 0.926 |

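
The growth in categorical parameters reported above comes from one-hot encoding, where each distinct label at a process step becomes its own binary feature. The following is a minimal sketch of that step under stated assumptions: the column names (`U_step1`, `U_step2`), the equipment labels, and the helper `one_hot_encode` are hypothetical, and the dimension-reduction step is not shown because its method is not described here.

```python
def one_hot_encode(rows, columns):
    """One-hot encode the given categorical columns.

    Returns (feature_names, encoded_rows); every distinct label seen in a
    column becomes one binary feature.
    """
    levels = {c: sorted({row[c] for row in rows}) for c in columns}
    feature_names = [f"{c}={v}" for c in columns for v in levels[c]]
    encoded = [
        [1 if row[c] == v else 0 for c in columns for v in levels[c]]
        for row in rows
    ]
    return feature_names, encoded


# Hypothetical wafers: which equipment unit processed each step.
wafers = [
    {"U_step1": "EQ_A", "U_step2": "EQ_X"},
    {"U_step1": "EQ_B", "U_step2": "EQ_X"},
    {"U_step1": "EQ_C", "U_step2": "EQ_Y"},
]

names, X = one_hot_encode(wafers, ["U_step1", "U_step2"])
print(names)
print(X)
```

Two categorical steps with three and two labels expand to five binary features, which is the same mechanism by which many steps with many labels per step produce a much larger feature count.
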
Models | MAE | RMSE ** | Tuned Hyper-Parameters |
---|---|---|---|
RF | 0.520 | 0.648 | n_estimators = 400, min_samples_leaf = 2, max_features = ‘sqrt’, max_depth = 12 |
KNN | 0.542 | 0.653 | n_neighbors = 8, p = 1, weights = ‘distance’ |
SVR | 0.531 | 0.682 | kernel = ‘sigmoid’, gamma = ‘auto’, coef0 = 0, C = 0.5 |
LightGBM | 0.557 | 0.692 | colsample_bytree = 0.4, learning_rate = 0.01, max_depth = 5, n_estimators = 200 |
GPR | 0.549 | 0.693 | kernel = RationalQuadratic(alpha = 1, length_scale = 1), alpha = 0.5 |
XGBoost | 0.559 | 0.696 | learning_rate = 0.03, n_estimators = 100, subsample = 0.25 |
AdaBoost | 0.566 | 0.716 | learning_rate = 0.1, n_estimators = 300 |
Lasso | 0.567 | 0.726 | alpha = 0.1, tol = 0.001, max_iter = 2000, selection = ‘random’ |
RANSAC | 0.614 | 0.800 | stop_probability = 0.999, min_samples = 5, max_trials = 500 |
MLP | 0.667 | 0.825 | max_iter = 200, hidden_layer_sizes = (200, 2), activation = ‘logistic’ |
Statistical estimator * | 0.744 | 0.919 | - |
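
The two error metrics in the table above can be computed as in the sketch below, which also ranks a few of the tabulated models by RMSE. The sample `y_true` and `y_pred` values are invented for illustration only; the entries of `reported_rmse` are copied from the table.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Invented standardized yield values, for illustration only.
y_true = [0.5, -1.0, 1.5, 0.0]
y_pred = [0.0, -0.5, 1.0, 1.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred))

# RMSE values copied from the table; the lowest RMSE selects the model.
reported_rmse = {"RF": 0.648, "KNN": 0.653, "SVR": 0.682, "MLP": 0.825}
best_model = min(reported_rmse, key=reported_rmse.get)
print(best_model)
```

RMSE penalizes large errors more heavily than MAE, which is why the two columns can order models differently (SVR has a lower MAE than KNN in the table but a higher RMSE).
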
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, Y.; Roh, Y. An Expandable Yield Prediction Framework Using Explainable Artificial Intelligence for Semiconductor Manufacturing. Appl. Sci. 2023, 13, 2660. https://doi.org/10.3390/app13042660