Long-Term Prediction Model for NOx Emission Based on LSTM–Transformer
Abstract
1. Introduction
2. Model Object Description
2.1. Continuous Emission Monitoring System
2.2. Statement of Existing Problems
3. The Proposed NOx Prediction Model
3.1. LSTM
3.2. Self-Attention Mechanism
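Section 3.2 builds on the standard scaled dot-product self-attention of Vaswani et al. For reference, a minimal PyTorch sketch of that mechanism (all dimensions and the single-head form are illustrative, not the paper's configuration):

```python
# Scaled dot-product self-attention: softmax(Q K^T / sqrt(d_k)) V.
# Single head, illustrative dimensions only.
import torch

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (batch, seq_len, d_model); w_*: (d_model, d_k) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)  # (B, L, L)
    return torch.softmax(scores, dim=-1) @ v                 # (B, L, d_k)

d_model, d_k = 8, 8
x = torch.randn(2, 36, d_model)          # e.g., 36 time steps
w = [torch.randn(d_model, d_k) for _ in range(3)]
out = self_attention(x, *w)              # (2, 36, 8)
```

Because every time step attends directly to every other, the signal path between any two positions has length one, which is the property the analysis in Section 5.4 appeals to.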
3.3. LSTM–Transformer
- Long-term dependencies are modeled with the self-attention mechanism and short-term dependencies with LSTM, so the model captures repetitive patterns in the time series at both long and short ranges.
- The sin–cos positional encoding captures only the distance relationship between positions, not the direction relationship, yet both are equally important for time series prediction. Because LSTM takes in and passes on information sequentially along the time axis, it can learn both the distance and the direction information of the input data.
- LSTM encoding preserves the temporal continuity of the time series, reducing the loss of model accuracy caused by the attention mechanism disrupting that continuity.
- The parallel design of the structure improves computational efficiency. (A minimal code sketch of the overall idea follows the figure below.)
[Figure: LSTM–Transformer architecture. (a) Encoder; (b) Decoder; (c) Output layer.]
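The PyTorch sketch below illustrates the core idea: an LSTM consumes the sequence in time order, so its hidden states carry distance and direction information in place of sin–cos positional encoding, and a self-attention layer on top models long-term dependencies. All sizes, layer counts, and the single-path layout are illustrative assumptions; the paper's actual model is an encoder–decoder with a parallel design.

```python
# Minimal sketch of the LSTM-before-attention idea; not the authors'
# exact architecture or hyperparameters.
import torch
import torch.nn as nn

class LSTMTransformer(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # LSTM branch: sequential processing supplies order (position)
        # information and fine-grained short-term trends.
        self.lstm = nn.LSTM(n_features, d_model, batch_first=True)
        # Self-attention branch: short signal paths for long-term dependencies.
        self.attn = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.head = nn.Linear(d_model, 1)  # next NOx concentration value

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features)
        h, _ = self.lstm(x)         # hidden states encode order + short-term trend
        z = self.attn(h)            # attention over the LSTM-encoded sequence
        return self.head(z[:, -1])  # predict from the final time step

# Usage: 36 past steps of 5 process variables -> one NOx estimate per sample.
model = LSTMTransformer(n_features=5)
y_hat = model(torch.randn(8, 36, 5))  # shape: (8, 1)
```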
4. Data Preprocessing
4.1. Datasets
4.2. Outlier Detection and Missing Values Handling
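The citation of Williamson et al.'s box-plot paper in the reference list suggests interquartile-range (box-plot) fences for outlier detection. A minimal sketch, assuming the standard 1.5 × IQR rule and linear interpolation for the resulting gaps and missing values (both details are assumptions):

```python
# Box-plot (IQR) outlier removal plus interpolation of missing values.
import numpy as np
import pandas as pd

def clean_series(s: pd.Series) -> pd.Series:
    """Flag values outside the 1.5*IQR box-plot fences, then interpolate."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    outlier = (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)
    return s.mask(outlier).interpolate(method="linear")

nox = pd.Series([5.0, 5.1, 99.0, 5.2, np.nan, 5.3])  # one outlier, one gap
print(clean_series(nox))  # 99.0 and NaN replaced by interpolated values
```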
4.3. Feature Variable Selection
4.4. Data Standardization
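A minimal sketch of the standardization step, assuming the common z-score form x' = (x − mean) / std with statistics fitted on the training split only (the exact scheme is an assumption based on the section title):

```python
# Z-score standardization fitted on training data, applied to both splits.
import numpy as np

def standardize(train: np.ndarray, test: np.ndarray):
    mu, sigma = train.mean(axis=0), train.std(axis=0)
    return (train - mu) / sigma, (test - mu) / sigma
```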
5. Experiments
5.1. Evaluation Metrics
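The result tables below report root-mean-square error (RMSE) and mean absolute percentage error (MAPE); their standard definitions, with $y_i$ the measured and $\hat{y}_i$ the predicted NOx concentration over $n$ test points, are:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|
```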
5.2. Baselines
5.3. Implementation Details
5.4. Results and Analysis
5.4.1. NOx Emission Concentration Prediction
- (1) In the long-term prediction task, LSTM–Transformer significantly improves prediction performance on both datasets with different sampling intervals. This demonstrates that the proposed model enhances long-term time series prediction capability.
- (2) LSTM–Transformer achieves better prediction accuracy than Transformer because LSTM provides fine-grained short-term trend information as well as position information. This demonstrates the effectiveness of the structure we designed.
- (3) Increasing the sampling interval may discard changes that occur in the data within the longer interval, causing information loss; this is the main reason for the degradation in model performance. Notably, LSTM–Transformer retains better prediction accuracy as the sampling interval increases, indicating better robustness, which is meaningful for accurate long-term prediction of NOx emission concentration.
- (4) The Transformer-based models have better prediction accuracy. This demonstrates the advantage of self-attention in capturing long-term dependencies, since self-attention keeps the signal path between any two positions as short as possible.
- (5) CEEMDAN-AM-LSTM performs best among the LSTM-based models and shows prediction accuracy similar to Transformer. This demonstrates the effectiveness of the CEEMDAN method for time series preprocessing; we speculate that combining CEEMDAN with the Transformer might also yield good results.
- (6) We also find that the prediction accuracy of LSTM–Transformer, Transformer, CEEMDAN-AM-LSTM, and S2S-AM-LSTM gradually deteriorates as the prediction distance increases. This stems from a limitation of the encoder–decoder architecture, which accumulates errors during dynamic decoding inference (see the sketch below).
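To illustrate point (6): in dynamic (autoregressive) decoding, each predicted value is fed back as input for the next step, so an early error is consumed by every later step and compounds over the horizon. A minimal sketch, reusing the hypothetical `LSTMTransformer` from Section 3.3 (the window handling is illustrative, not the paper's exact inference procedure):

```python
# Autoregressive ("dynamic") decoding over a multi-step horizon.
import torch

@torch.no_grad()
def dynamic_decode(model, history: torch.Tensor, horizon: int) -> torch.Tensor:
    """history: (1, seq_len, 1). Each prediction is appended to the input
    window, so any error it carries propagates into all later steps."""
    window = history.clone()
    preds = []
    for _ in range(horizon):
        y = model(window)                                  # (1, 1) next step
        preds.append(y.squeeze())
        window = torch.cat([window[:, 1:], y.view(1, 1, 1)], dim=1)
    return torch.stack(preds)                              # (horizon,)

# e.g.: model = LSTMTransformer(n_features=1)
#       dynamic_decode(model, torch.randn(1, 36, 1), horizon=48)
```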
5.4.2. Analysis of Generalization Capacity
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Ministry of Ecology and Environment of the PRC. Annual Statistics Report on Ecology and Environment in China 2020; China Environmental Publishing Group: Beijing, China, 2022.
- Ministry of Ecology and Environment of the PRC. Technical Guideline for the Development of National Air Pollutant Emission Standards; Ministry of Ecology and Environment of the PRC: Beijing, China, 2019.
- Ministry of Ecology and Environment of the PRC. Emissions Standard of Air Pollutants for Thermal Power Plants; Ministry of Ecology and Environment of the PRC: Beijing, China, 2011.
- Wei, Z.; Li, X.; Xu, L.; Cheng, Y. Comparative study of computational intelligence approaches for NOx reduction of coal-fired boiler. Energy 2013, 55, 683–692.
- Wei, L.G.; Guo, R.T.; Zhou, J.; Qin, B.; Chen, X.; Bi, Z.X.; Pan, W.G. Chemical deactivation and resistance of Mn-based SCR catalysts for NOx removal from stationary sources. Fuel 2022, 316, 123438.
- Yang, T.; Ma, K.; Lv, Y.; Bai, Y. Real-time dynamic prediction model of NOx emission of coal-fired boilers under variable load conditions. Fuel 2020, 274, 117811.
- Xie, P.; Gao, M.; Zhang, H.; Niu, Y.; Wang, X. Dynamic modeling for NOx emission sequence prediction of SCR system outlet based on sequence to sequence long short-term memory network. Energy 2020, 190, 116482.
- Zhou, H.; Zhao, J.P.; Zheng, L.G.; Wang, C.L.; Cen, K.F. Modeling NOx emissions from coal-fired utility boilers using support vector regression with ant colony optimization. Eng. Appl. Artif. Intell. 2012, 25, 147–158.
- Lv, Y.; Yang, T.; Liu, J. An adaptive least squares support vector machine model with a novel update for NOx emission prediction. Chemom. Intell. Lab. Syst. 2015, 145, 103–113.
- Wang, G.; Awad, O.I.; Liu, S.; Shuai, S.; Wang, Z. NOx emissions prediction based on mutual information and back propagation neural network using correlation quantitative analysis. Energy 2020, 198, 117286.
- Zhou, H.; Cen, K.; Fan, J. Modeling and optimization of the NOx emission characteristics of a tangentially fired boiler with artificial neural networks. Energy 2004, 29, 167–183.
- Ilamathi, P.; Selladurai, V.; Balamurugan, K.; Sathyanathan, V. ANN–GA approach for predictive modeling and optimization of NOx emission in a tangentially fired boiler. Clean Technol. Environ. Policy 2013, 15, 125–131.
- Elman, J.L. Finding structure in time. Cogn. Sci. 1990, 14, 179–211.
- Arsie, I.; Cricchio, A.; De Cesare, M.; Lazzarini, F.; Pianese, C.; Sorrentino, M. Neural network models for virtual sensing of NOx emissions in automotive diesel engines with least square-based adaptation. Control Eng. Pract. 2017, 61, 11–20.
- Arsie, I.; Marra, D.; Pianese, C.; Sorrentino, M. Real-Time Estimation of Engine NOx Emissions via Recurrent Neural Networks. IFAC Proc. Vol. 2010, 43, 228–233.
- Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Tan, P.; He, B.; Zhang, C.; Rao, D.; Li, S.; Fang, Q.; Chen, G. Dynamic modeling of NOX emission in a 660 MW coal-fired boiler with long short-term memory. Energy 2019, 176, 429–436.
- Yang, G.; Wang, Y.; Li, X. Prediction of the NOx emissions from thermal power plant using long-short term memory neural network. Energy 2020, 192, 116597.
- He, W.; Li, J.; Tang, Z.; Wu, B.; Luan, H.; Chen, C.; Liang, H. A Novel Hybrid CNN-LSTM Scheme for Nitrogen Oxide Emission Prediction in FCC Unit. Math. Probl. Eng. 2020, 2020, 8071810.
- Kim, T.Y.; Cho, S.B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019, 182, 72–81.
- Wang, X.; Liu, W.; Wang, Y.; Yang, G. A hybrid NOx emission prediction model based on CEEMDAN and AM-LSTM. Fuel 2022, 310, 122486.
- Kalchbrenner, N.; Blunsom, P. Recurrent continuous translation models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, WA, USA, 18–21 October 2013; pp. 1700–1709.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
- Cho, K.; Van Merriënboer, B.; Bahdanau, D.; Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. arXiv 2014, arXiv:1409.1259.
- Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27.
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
- Luong, M.T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. arXiv 2015, arXiv:1508.04025.
- Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; Volume 35, pp. 11106–11115.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Yan, H.; Deng, B.; Li, X.; Qiu, X. TENER: Adapting transformer encoder for named entity recognition. arXiv 2019, arXiv:1911.04474.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
- Williamson, D.F.; Parker, R.A.; Kendrick, J.S. The Box Plot: A Simple Visual Method to Interpret Data. Ann. Intern. Med. 1989, 110, 916–921.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Table: Prediction results of each model on the first dataset (input length I = 36; O is the prediction length; lower is better).

| Model | RMSE (O = 6) | MAPE (O = 6) | RMSE (O = 12) | MAPE (O = 12) | RMSE (O = 24) | MAPE (O = 24) | RMSE (O = 48) | MAPE (O = 48) |
|---|---|---|---|---|---|---|---|---|
| LSTM–Transformer | 0.455 | 1.397 | 0.450 | 1.344 | 0.512 | 1.476 | 0.523 | 1.528 |
| Transformer | 0.571 | 1.732 | 0.681 | 2.241 | 0.699 | 2.388 | 0.766 | 2.442 |
| CEEMDAN-AM-LSTM | 0.694 | 2.331 | 0.726 | 2.379 | 0.746 | 2.468 | 0.814 | 2.542 |
| S2S-AM-LSTM | 1.667 | 6.511 | 1.706 | 6.514 | 1.756 | 6.628 | 1.779 | 6.713 |
| CNN-LSTM | 1.177 | 4.154 | 1.225 | 4.306 | 1.274 | 4.639 | 1.363 | 4.992 |
| LSTM | 2.044 | 7.502 | 2.070 | 7.525 | 1.650 | 6.234 | 2.565 | 9.238 |
| BPNN | 2.125 | 8.135 | 2.064 | 7.667 | 2.072 | 8.144 | 2.175 | 8.378 |
| SVM | 2.109 | 8.244 | 2.136 | 8.270 | 2.160 | 8.642 | 2.325 | 9.097 |
Table: Prediction results of each model on the second dataset (input length I = 36; O is the prediction length; lower is better).

| Model | RMSE (O = 6) | MAPE (O = 6) | RMSE (O = 12) | MAPE (O = 12) | RMSE (O = 24) | MAPE (O = 24) | RMSE (O = 48) | MAPE (O = 48) |
|---|---|---|---|---|---|---|---|---|
| LSTM–Transformer | 0.702 | 1.370 | 0.727 | 1.419 | 0.970 | 2.514 | 1.077 | 2.713 |
| Transformer | 0.834 | 1.773 | 0.896 | 2.147 | 1.269 | 3.595 | 1.369 | 4.098 |
| CEEMDAN-AM-LSTM | 1.039 | 2.758 | 1.098 | 2.872 | 1.219 | 3.230 | 1.365 | 3.963 |
| S2S-AM-LSTM | 2.688 | 7.549 | 2.806 | 8.090 | 2.829 | 8.133 | 2.852 | 8.179 |
| CNN-LSTM | 1.653 | 4.649 | 1.731 | 4.758 | 1.782 | 5.160 | 1.829 | 5.313 |
| LSTM | 2.885 | 8.046 | 2.647 | 7.189 | 2.947 | 8.703 | 3.657 | 9.965 |
| BPNN | 3.289 | 9.159 | 3.000 | 8.786 | 3.105 | 9.424 | 3.416 | 10.056 |
| SVM | 3.276 | 10.035 | 3.480 | 9.868 | 4.017 | 11.954 | 5.975 | 15.270 |
Table: Results for input length 36 under the 18 h, 24 h, and 36 h settings (labels as in the original; O is the prediction length).

| Predict-O | RMSE (18 h) | MAPE (18 h) | RMSE (24 h) | MAPE (24 h) | RMSE (36 h) | MAPE (36 h) |
|---|---|---|---|---|---|---|
| 6 | 0.713 | 1.394 | 0.690 | 1.336 | 0.755 | 1.692 |
| 12 | 0.718 | 1.391 | 0.731 | 1.559 | 0.797 | 1.729 |
| 24 | 0.873 | 2.113 | 0.864 | 1.991 | 0.949 | 2.468 |
| 48 | 0.996 | 2.677 | 1.059 | 2.738 | 1.137 | 2.872 |