Multi-Objective Instance Weighting-Based Deep Transfer Learning Network for Intelligent Fault Diagnosis
Abstract
1. Introduction
- We present a fault diagnosis framework based on instance-based transfer learning with multi-objective instance weighting, designed for cases in which few labeled data are available for the target task. The framework achieves high diagnosis accuracy and robustness even when the dissimilarity between and within domains is high because of distinct operating conditions. The proposed method uses instance weights derived from two complementary dissimilarity indicators to minimize the inter-domain discrepancy that degrades model training. Through this domain optimization process, knowledge from source instances relevant to the target task is transferred, improving the performance of the target diagnosis model.
- The diagnosis accuracy obtained with various instance optimization techniques used in instance-based transfer learning is compared in detail. This comparison confirms the domain optimization process and the effectiveness of the proposed method.
- The accuracy of the diagnosis model built with the proposed method and transfer learning is monitored in detail while the number of labeled target data is varied. A case study is also conducted on a testbed replicating an actual industrial site, verifying the applicability of the proposed method when few labeled target data are available in the field.
2. Proposed Method
2.1. Data Preprocessing (Time-Frequency Domain Imaging)
2.2. Deep Residual Learning Network
2.3. Transfer Learning and Fine-Tuning Strategy
2.4. Instance-Based Transfer Learning
2.5. Multi-Objective Instance-Weighting Strategy
2.5.1. Kullback–Leibler Divergence
2.5.2. Maximum Mean Discrepancy
2.5.3. Multi-Objective Instance Weighting
2.6. Detailed Procedure of the Proposed Method
3. Case Study: Experimental Verification and Comparison Results
3.1. Experimental Setup Description
3.2. Comparison Studies
3.2.1. Comparison of Model Performance by Signal Processing Methods and Learning Algorithms
3.2.2. Comparison between the Proposed Method and Non-Transfer Learning Method
- S1 for the 130 to 180 data range, with an average accuracy of 90% or more;
- S2 for the 70 to 120 range, with an average accuracy of 80%;
- S3 for the 20 to 60 range;
- S4 for the entire 20 to 180 range.
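As a sanity check, the scenario averages reported in the results tables can be reproduced from the per-trial accuracies of the proposed method (values below are taken from the results tables; the script itself is only an illustrative reconstruction):

```python
# Per-trial accuracies (%) of the proposed method from the results tables,
# indexed by the number of labeled target training data.
acc = {
    180: 98.33, 170: 98.89, 160: 96.44, 150: 97.78, 140: 97.33, 130: 96.22,
    120: 96.00, 110: 95.56, 100: 95.44,  90: 91.89,  80: 91.11,  70: 93.11,
     60: 88.56,  50: 85.11,  40: 83.22,  30: 75.56,  20: 75.00,
}

# Scenario definitions: inclusive ranges of training-data counts.
scenarios = {"S1": (130, 180), "S2": (70, 120), "S3": (20, 60), "S4": (20, 180)}

def scenario_mean(lo, hi):
    """Average accuracy over all trials whose data count falls in [lo, hi]."""
    vals = [v for n, v in acc.items() if lo <= n <= hi]
    return round(sum(vals) / len(vals), 2)

for name, (lo, hi) in scenarios.items():
    print(name, scenario_mean(lo, hi))
```

The averages produced this way match the scenario-level values reported for the multi-objective instance weights column.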
3.2.3. Comparison of Model Performance According to Domain Optimization Methods
4. Discussion
- (1) The proposed method performs instance weighting with multi-objective instance weights so that transfer learning effectively minimizes the discrepancy between target and source domain instances. Each source instance is assigned a weight that quantifies its relation to the target domain, curtailing the negative transfer caused by source instances with a large domain discrepancy. These weights are obtained from two dissimilarity indicators, the MMD and the KLD. The method considers both the discrepancy between the source and target domains and the discrepancy within the source domain, and it comprises a search for the optimal multi-objective instance weights followed by the final transfer learning step. In this way, the effect of knowledge transfer between the source and target domains is maximized.
- (2) It was confirmed that performance on the target task improves when a different but related source domain is used. However, some source instances are unrelated to the target domain and exhibit a large discrepancy, so the performance gain varies greatly with how the source domain is used. In this paper, we compared the performance of the domain optimization techniques applied for this purpose. Using two different indicators as instance weights yields a complementary effect that combines the advantages of both.
- (3) A lack of labeled target data is a prevalent situation in intelligent fault diagnosis, whereas data from similar working environments are usually available. When few labeled target data exist, the proposed method and transfer learning offer an excellent alternative. The comparison results show robust performance even with very few data, confirming promising applicability in actual industrial fields.
5. Conclusions
- (1) The proposed method outperforms standard diagnosis models trained without transfer learning.
- (2) It is verified that our multi-objective instance weighting strategy achieves higher performance than the other optimization strategies used in transfer learning. The strategy copes with the deterioration of model performance caused by the dissimilarity between and within domains inherent in the transfer learning process.
- (3) In particular, as the number of training data in the target domain decreases, the improvement in diagnosis accuracy and stability provided by the proposed method increases.
- (4) An experimental setup for industrial spot welding, as actually carried out in automobile factories, was used. The case study relied on data collected from a single accelerometer under realistic experimental conditions, such as differences in operating conditions and in the number of data instances. This confirms that the proposed method has remarkable applicability for fault diagnosis in industrial sites.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
| Layer | Type | Data Dimension |
|---|---|---|
| Input | | #,224,224,3 |
| Conv1 | ConvolutionalLayer | #,112,112,64 |
| bn_conv1 | BatchNormalizationLayer | #,112,112,64 |
| conv1_relu | Ramp | #,112,112,64 |
| pool1_pad | PaddingLayer | #,113,113,64 |
| pool1 | PoolingLayer | #,56,56,64 |
| 2a | ResidualBlock2 (12 nodes) | #,56,56,256 |
| 2b | ResidualBlock2 (10 nodes) | #,56,56,256 |
| 2c | ResidualBlock2 (10 nodes) | #,56,56,256 |
| 3a | ResidualBlock3 (12 nodes) | #,28,28,512 |
| 3b | ResidualBlock3 (10 nodes) | #,28,28,512 |
| 3c | ResidualBlock3 (10 nodes) | #,28,28,512 |
| 3d | ResidualBlock3 (10 nodes) | #,28,28,512 |
| 4a | ResidualBlock4 (12 nodes) | #,14,14,1024 |
| 4b | ResidualBlock4 (10 nodes) | #,14,14,1024 |
| 4c | ResidualBlock4 (10 nodes) | #,14,14,1024 |
| 4d | ResidualBlock4 (10 nodes) | #,14,14,1024 |
| 4e | ResidualBlock4 (10 nodes) | #,14,14,1024 |
| 4f | ResidualBlock4 (10 nodes) | #,14,14,1024 |
| 5a | ResidualBlock5 (12 nodes) | #,7,7,2048 |
| 5b | ResidualBlock5 (10 nodes) | #,7,7,2048 |
| 5c | ResidualBlock5 (10 nodes) | #,7,7,2048 |
| pool5 | PoolingLayer | #,1,1,2048 |
| flatten_0 | FlattenLayer | #,2048 |
| Fc2 | LinearLayer | #,2 |
| prob | SoftmaxLayer | #,2 |
| Output | | class |
| Specimen Material | Electrode Force | Weld Current | Weld Time | Hold Time | Hold Time |
|---|---|---|---|---|---|
| Mild steel | 250 kgf | 5.0 kA | 0.167 s | 0.250 s | 0.250 s |
| GA steel | 385 kgf | 5.5 kA | 0.167 s | 0.250 s | 0.250 s |
| GI steel | 360 kgf | 6.0 kA | 0.167 s | 0.250 s | 0.251 s |
| Input Shape | Algorithm | Labeled Target Data | Source Data | Classification Accuracy (%) |
|---|---|---|---|---|
| Handcrafted features | DNN (fully connected) | O | X | 73.80 |
| | DNN (fully connected) | O | O | 50.10 |
| STFT image | ResNet | O | X | 92.67 |
| | ResNet | O | O | 94.44 |
| | Proposed method | O | O | 98.33 |
| WPD image | ResNet | O | X | 94.00 |
| | ResNet | O | O | 95.00 |
| | Proposed method | O | O | 97.22 |
| WPD image (spectral subtraction) | ResNet | O | X | 92.56 |
| | Proposed method | O | O | 96.67 |
| Number of Target Training Data | Proposed Method Accuracy (%) | Proposed Method Max. Accuracy Epoch | ResNet50 (Non-Transfer) Accuracy (%) | ResNet50 (Non-Transfer) Max. Accuracy Epoch |
|---|---|---|---|---|
| 180 | 98.33 | 148 | 92.67 | 497 |
| 170 | 98.89 | 212 | 93.78 | 382 |
| 160 | 96.44 | 96 | 91.78 | 253 |
| 150 | 97.78 | 138 | 91.11 | 200 |
| 140 | 97.33 | 237 | 91.78 | 218 |
| 130 | 96.22 | 173 | 88.67 | 403 |
| 120 | 96.00 | 277 | 89.22 | 232 |
| 110 | 95.56 | 211 | 90.00 | 237 |
| 100 | 95.44 | 230 | 86.22 | 218 |
| 90 | 91.89 | 169 | 87.11 | 181 |
| 80 | 91.11 | 282 | 84.67 | 193 |
| 70 | 93.11 | 121 | 82.33 | 309 |
| 60 | 88.56 | 191 | 79.44 | 319 |
| 50 | 85.11 | 295 | 78.11 | 235 |
| 40 | 83.22 | 186 | 74.22 | 353 |
| 30 | 75.56 | 62 | 61.67 | 431 |
| 20 | 75.00 | 90 | 61.00 | 484 |
| Scenario | Number of Training Data | No Optimization | Target Dropout | KLD Instance Weights | MMD Instance Weights | Multi-Objective Instance Weights (Ours) |
|---|---|---|---|---|---|---|
| S1 | 180 | 92.11 | 96.33 | 97.11 | 97.33 | 98.33 |
| S1 | 170 | 96.33 | 97.11 | 97.44 | 97.44 | 98.89 |
| S1 | 160 | 89.33 | 92.78 | 95.22 | 92.56 | 96.44 |
| S1 | 150 | 92.56 | 93.00 | 95.89 | 95.89 | 97.78 |
| S1 | 140 | 94.11 | 94.67 | 95.67 | 96.56 | 97.33 |
| S1 | 130 | 93.33 | 95.78 | 90.56 | 94.78 | 96.22 |
| S2 | 120 | 88.89 | 92.11 | 93.00 | 91.78 | 96.00 |
| S2 | 110 | 92.33 | 93.44 | 94.56 | 92.78 | 95.56 |
| S2 | 100 | 90.44 | 92.67 | 90.67 | 93.00 | 95.44 |
| S2 | 90 | 86.67 | 86.22 | 86.78 | 90.45 | 91.89 |
| S2 | 80 | 88.22 | 85.00 | 83.89 | 87.11 | 91.11 |
| S2 | 70 | 89.22 | 86.89 | 85.22 | 90.00 | 93.11 |
| S3 | 60 | 83.22 | 85.33 | 81.33 | 86.44 | 88.56 |
| S3 | 50 | 77.78 | 81.44 | 77.44 | 82.56 | 85.11 |
| S3 | 40 | 78.78 | 81.89 | 75.44 | 81.89 | 83.22 |
| S3 | 30 | 69.00 | 75.56 | 72.78 | 73.89 | 75.56 |
| S3 | 20 | 65.56 | 70.00 | 69.44 | 75.00 | 75.00 |
| Transfer Scenario (Number of Training Data) | No Optimization | Target Dropout | KLD Instance Weights | MMD Instance Weights | Multi-Objective Instance Weights (Ours) |
|---|---|---|---|---|---|
| S1 (from 130 to 180) | 92.96 | 94.95 | 95.32 | 95.76 | 97.50 |
| S2 (from 70 to 120) | 89.30 | 89.39 | 89.02 | 90.85 | 93.85 |
| S3 (from 20 to 60) | 74.87 | 78.84 | 75.29 | 79.96 | 81.49 |
| S4 (from 20 to 180) | 86.35 | 88.25 | 87.20 | 89.38 | 91.50 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).