Article

Using XGBoost Regression to Analyze the Importance of Input Features Applied to an Artificial Intelligence Model for the Biomass Gasification System

1 Department of Aeronautical Engineering, Chaoyang University of Technology, Taichung 413, Taiwan
2 Department of Mechanical Engineering, Lunghwa University of Science and Technology, Taoyuan 333, Taiwan
* Author to whom correspondence should be addressed.
Inventions 2022, 7(4), 126; https://doi.org/10.3390/inventions7040126
Submission received: 13 November 2022 / Revised: 8 December 2022 / Accepted: 12 December 2022 / Published: 13 December 2022
(This article belongs to the Special Issue Data Analytics in the Energy Sector)

Abstract

Recently, artificial intelligence models have been developed to simulate biomass gasification systems. Existing models use different input features, such as the carbon, hydrogen, nitrogen, sulfur, oxygen, and moisture content, in addition to ash, reaction temperature, volatile matter (VM), lower heating value (LHV), and equivalence ratio (ER). In this study, the importance of these input features for artificial intelligence models is analyzed, and an XGBoost regression model is used to simulate a biomass gasification system and investigate its performance. According to the results, the top-four features are ER, VM, LHV, and carbon content. The coefficient of determination (R2) was highest (0.96) when all eleven input features noted above were selected, while a model using only the top-three features still achieved an R2 of 0.93. The XGBoost model was thereby validated again and observed to outperform previous studies, with a mean-squared error lower by 1.55. The error in the hydrogen gas composition predicted for gasification at a temperature of 900 °C and ER = 0.4 was 0.07%.

1. Introduction

Alam and Qiao presented a review on the management of municipal solid waste (MSW) in Bangladesh; their study of energy recovery from MSW showed that waste could be minimized by employing an appropriate MSW management system, and such a project was estimated to reduce disposal costs by approximately USD 15.29 million annually [1]. Furthermore, Malkow described several pyrolysis and gasification technologies that could improve fuel utilization and combustion efficiency in Europe [2]. Basu explored the design, analysis, and operational aspects of biomass gasification from the perspective of thermochemical conversion [3]. Baruah and Baruah reviewed the optimum parameters of gasifier design for achieving the best model performance during gasification [4]. Fixed-bed gasifiers are the simplest type; an updraft gasifier is shown in Figure 1.
Recently, machine learning has been widely adopted as an approach for modeling complex nonlinear systems. Ullah et al. used gene expression programming (GEP) to develop a mix-design model for lightweight foamed concrete (LWFC) whose R2 value reached 0.95 [5]. Machine learning techniques such as random forest regression and GEP have also been used to simulate the depth of wear of eco-friendly concrete [6]. Furthermore, a support vector machine (SVM) and random forest have been employed to predict the compressive strength of LWFC [7]. A combination of artificial neural networks (ANNs), SVMs, and GEP has been used to model the mechanical properties of self-compacting concrete [8]. Bagasse-ash-based geopolymers have been investigated with different mixture rates of propylene fibers [9]. Individual and ensemble approaches for predicting the compressive strength of fly-ash-based concrete have also been proposed [10], and such models have been compared and optimized using individual and ensemble machine learning algorithms [11].
Puig-Arnavat et al. developed an ANN to model the biomass gasification process in a fluidized bed reactor and obtained successful predictions of the main producer gas components during the complex chemical reactions [12]. Souza et al. presented an ANN-based method to perform regression calculations for different kinds of biomass under given operating conditions and investigated the maximum amounts of gas produced under different operating conditions [13].
ANNs and gradient boosting regression models were developed by Wen et al. to predict rice husk syngas compositions, using the equivalence ratio (ER), bottom temperature, and steam flow rate as the model input features [14]. Baruah and Hazarika developed an ANN model for biomass gasification using six input features, namely the moisture content (MC), ash, C, H, O, and reaction temperature (Tg); the R2 value of the model for H2 production was 0.9855 [15]. Furthermore, Ozonoh et al. estimated the gasification efficiency using ANN models with two different sets of input features: the first model involved eleven features, namely the ash content, MC, volatile matter (VM), C, H, N, O, and S, in addition to the lower heating value (LHV), ER, and Tg; the second model used only three features, namely C, VM, and Tg. The R2 value of both models for H2 production was 0.95 [16].
Wen et al. applied feature importance analysis to NOx and CO2 artificial intelligence models for diesel vehicles to investigate the model accuracy with different numbers of features and, in terms of model performance, determined the best prediction model [17]. In addition, the XGBoost algorithm has been widely utilized in detection, disease diagnosis, classification, and prediction tasks [18,19,20,21,22].
The XGBoost algorithm has previously been shown to deliver excellent performance, which has also been confirmed and validated in other studies. However, the model performance is often affected by the number of input features used. This study aims to investigate the importance of the input features applied to an artificial intelligence model for biomass gasification; furthermore, the development of an XGBoost regression model for the biomass gasification system is presented.

2. Materials and Methods

2.1. Data

The data were collected from the extant research on biomass, coal, and blends of coal and biomass [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52], giving a total of 315 samples. The proximate analysis data included the ash content, MC, and VM, and the ultimate analysis data included eight additional variables, namely the C, H, N, O, and S content, as well as the LHV, ER, and Tg. The statistical relationships between the model input features and the target are visualized in Figure 2. For instance, Figure 2a shows the univariate and bivariate distributions of the independent variable (N) versus the dependent variable (H2); plotting the univariate and bivariate distributions simultaneously helps in understanding the relationships within the dataset.
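A minimal sketch of how such a joint univariate/bivariate plot can be produced is given below. It assumes, hypothetically, that the collected samples are stored in a CSV file named gasification_data.csv with columns labeled N and H2; the file name and column labels are placeholders, not part of the original study.

```python
# Sketch: joint univariate/bivariate distribution of one input feature (N)
# versus the target (H2), as in Figure 2a. File name and column names are
# hypothetical placeholders.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("gasification_data.csv")  # assumed layout: one column per feature plus H2

# jointplot draws the bivariate scatter in the centre and the univariate
# marginal distributions along the top and right axes.
sns.jointplot(data=df, x="N", y="H2", kind="scatter")
plt.show()
```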

2.2. Feature Importance

The permutation technique was used to analyze the rank and score of the feature importance. This analysis was implemented in Python using permutation_importance directly. The procedure corrupts the values of one input feature at a time by random shuffling and measures the resulting drop in the model score, which indicates the relative predictive power of that feature in the model [53]. The algorithm is as follows:
  • Inputs: a fitted predictive model m and a tabular dataset D (training or validation);
  • Compute the reference score s of the model m on the data D (for instance, the accuracy for a classifier or R2 for regression);
  • For each feature j (column of D):
    • For each repetition k in 1, …, K:
      • Randomly shuffle column j of the data D to generate a corrupted version of the data, denoted \tilde{D}_{k,j};
      • Compute the score s_{k,j} of the model m on the corrupted data \tilde{D}_{k,j};
    • Compute the importance i_j of feature f_j, defined as

i_j = s - \frac{1}{K} \sum_{k=1}^{K} s_{k,j}
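A minimal, self-contained sketch of this procedure, using scikit-learn's permutation_importance as stated above, is shown below. The synthetic regression data and the random forest model are placeholders standing in for the study's 315-sample dataset and fitted model.

```python
# Sketch of the permutation-importance procedure; synthetic placeholder data only.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=5, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)  # fitted model m

result = permutation_importance(
    model, X_val, y_val,   # tabular dataset D (validation features and target)
    scoring="r2",          # reference score s: R2 for regression
    n_repeats=10,          # K repetitions of the random shuffle per column j
    random_state=0,
)

# importances_mean[j] corresponds to i_j = s - (1/K) * sum_k s_{k,j}
for j, importance in enumerate(result.importances_mean):
    print(f"feature {j}: {importance:.3f}")
```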

2.3. XGBoost Model

Chen and Guestrin proposed a novel sparsity-aware algorithm called XGBoost, which is available as an open-source package for approximate tree learning. The XGBoost algorithm has been shown to reduce the computational cost while providing high model performance [54], and it has been widely used in many fields [55,56,57]. In this study, the data were split into three subsets using the Python sklearn library, such that 70% was used as the training set, 10% for validation, and 20% for testing. The XGBoost regression model was therefore trained on the training set, evaluated on the validation set, and then applied to the test set to verify the model performance.
When building the XGBoost regression model, several parameters must be assigned in the Python code. The main parameters of the XGBoost model are the following: max_depth denotes the maximum depth of each tree (default 3) and was set to 15 in this study; n_estimators denotes the number of trees used for boosting and was set to 100; learning_rate determines the step size at each iteration while moving toward the minimum of the loss function and was set to 0.2; and colsample_bytree subsamples the columns within a range of 0 to 1 and was set to 0.3. Figure 3 shows the flowchart of the proposed method for the composition gas prediction during gasification.
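A minimal sketch of this set-up, with the stated hyperparameters and the 70/10/20 split, is given below. The synthetic data stand in for the 11 collected input features and the H2 target and are not the study's dataset.

```python
# Sketch: 70/10/20 train/validation/test split and an XGBoost regressor with
# the hyperparameters stated in the text. Synthetic placeholder data only.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=315, n_features=11, noise=5.0, random_state=0)

# First hold out 20% for testing, then take 10% of the full set for validation
# (0.10 / 0.80 = 0.125 of the remaining data).
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.125, random_state=0)

model = XGBRegressor(
    max_depth=15,          # maximum tree depth (default 3)
    n_estimators=100,      # number of boosting trees
    learning_rate=0.2,     # step size per boosting iteration
    colsample_bytree=0.3,  # fraction of columns sampled per tree
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)

print("Validation R2:", model.score(X_val, y_val))
print("Test R2:", model.score(X_test, y_test))
```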
Figure 4 shows the violin plots depicting the summary statistics and densities of C, VM, LHV, and H2. The broader areas of the violin plots represent higher probabilities that the members of the population will take on the given values; correspondingly, the narrower areas represent the lower probabilities.
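The violin plots in Figure 4 can be reproduced along the lines of the sketch below, again assuming a hypothetical CSV file with columns named C, VM, LHV, and H2; these names and the file path are placeholders.

```python
# Sketch: violin plots of selected features and the H2 target, as in Figure 4.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("gasification_data.csv")  # hypothetical file

fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for ax, col in zip(axes, ["C", "VM", "LHV", "H2"]):
    sns.violinplot(y=df[col], ax=ax)  # wider regions indicate higher sample density
    ax.set_title(col)
plt.tight_layout()
plt.show()
```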
The model performance reflects the accuracy of the model, and the mean absolute error (MAE), root mean-squared error (RMSE), and coefficient of determination (R2) are the metrics used in this study to assess it. The statistical definitions of these metrics are given in Equations (1)–(3), where Pi is the value predicted by the model, Ti is the corresponding measured value from the collected dataset, and P̄ is the average of the predicted values over the entire dataset.
MAE = \frac{1}{n} \sum_{i=1}^{n} \left| T_i - P_i \right|    (1)

RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( T_i - P_i \right)^{2} }    (2)

R^{2} = 1 - \frac{ \sum_{i=1}^{n} \left( P_i - T_i \right)^{2} }{ \sum_{i=1}^{n} \left( \bar{P} - T_i \right)^{2} }    (3)
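The three metrics can be computed directly with NumPy, as in the short sketch below; the measured and predicted arrays are illustrative placeholders, and Equation (3) is implemented in the form given above.

```python
# Sketch: MAE, RMSE, and R2 as in Equations (1)-(3). T and P are placeholder arrays.
import numpy as np

T = np.array([10.2, 12.5, 9.8, 11.1])   # measured H2 values (example numbers)
P = np.array([10.0, 12.9, 9.5, 11.4])   # predicted H2 values (example numbers)

mae = np.mean(np.abs(T - P))                                   # Equation (1)
rmse = np.sqrt(np.mean((T - P) ** 2))                          # Equation (2)
r2 = 1 - np.sum((P - T) ** 2) / np.sum((P.mean() - T) ** 2)    # Equation (3), as written above

print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}, R2 = {r2:.3f}")
```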

3. Results

3.1. Features Importance Analysis

The detailed score rankings of the feature importance for the H2 model are shown in Figure 5 and Table 1; the feature scores sum to 1. The top-four features are ER, VM, LHV, and C content, each of which contributes more than 10%. According to the permutation importance analysis, the most important input feature is ER, which contributes 29% to the H2 model, whereas the lowest-scoring feature is the N content, which contributes 1%.

3.2. Model Performance

In this study, a total of 11 input features are available for the H2 model, and the target is the H2 content (% volume). The R2 value is essential for evaluating how well the model fits the raw data. After the feature importance analysis, different numbers of input features were selected according to the ranks in Figure 5 to construct different H2 models and determine the best one. For the H2 model, the R2 values of the training data lie between 0.93 and 0.97 depending on the number of selected features; the more input features are assigned, the higher the R2 value. For the test data, the R2 values lie between 0.93 and 0.95, with a trend similar to that of the training data: using more input features yields a higher R2. For the validation data, however, the R2 values vary over a broader range, from 0.81 to 0.92, across the different numbers of input features. The R2 value of the entire model follows the same trend as those of the training and test data. These details are shown in Figure 6. It is therefore expected that more characteristics of the data can be captured with a larger number of features.
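A minimal, self-contained sketch of this experiment, retraining the regressor on the top-k features ranked as in Table 1 and recording R2 on each split, is shown below; the synthetic data frame and target are placeholders for the collected dataset.

```python
# Sketch: retrain the XGBoost regressor on the top-k ranked features and
# report R2 per split. Synthetic placeholder data; feature names follow Table 1.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

ranked = ["ER", "VM", "LHV", "C", "Ash", "S", "H", "MC", "O", "Tg", "N"]
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(315, 11)), columns=ranked)   # placeholder features
y = 3 * X["ER"] + X["VM"] + rng.normal(scale=0.5, size=315)    # placeholder target

X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.125, random_state=0)

for k in range(3, len(ranked) + 1):
    cols = ranked[:k]
    m = XGBRegressor(max_depth=15, n_estimators=100, learning_rate=0.2, colsample_bytree=0.3)
    m.fit(X_train[cols], y_train)
    print(f"top-{k:2d} features: "
          f"train R2 = {m.score(X_train[cols], y_train):.3f}, "
          f"val R2 = {m.score(X_val[cols], y_val):.3f}, "
          f"test R2 = {m.score(X_test[cols], y_test):.3f}")
```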
The other statistical results for the H2 model performance are listed in Table 2, which gives the RMSE and MAE values of the models using different numbers of features, from the top-three features to all 11 features. These two performance statistics show trends similar to the R2 results.

4. Discussion

The study by El-Shafay et al. [58] showed that the percentage of the constituent gas H2 increases with increasing gasification temperature. In this study, H2 XGBoost regression models using different numbers of input features were built successfully, and their performance in predicting the hydrogen gas composition after gasification was excellent. The accuracy of the H2 model was also validated by comparison with results from the literature.
Table 3 shows the performance comparison between the H2 model and the ANN model of Ozonoh et al. [16] using 11 input features. The results show that the R2 values of the XGBoost model are higher than those of Ozonoh's ANN model, except for the test data; the reason is that the percentages of test data used by the two algorithms differ. The results also show that the H2 XGBoost regression model performs better than the ANN model, because a lower MSE value indicates a better model performance, although the opposite is true for the R2 value.
Furthermore, the predicted hydrogen gas composition was compared with the results of El-Shafay et al. [58] to verify the H2 XGBoost regression model. The proximate and ultimate analyses of the sawdust pellets, taken from [58], are listed in Table 4. Figure 7a,b show how the H2 volume percentages vary with ER and temperature; the black squares represent the results of El-Shafay et al., and the red circles represent the results of the H2 model. In Figure 7a, the deviations between the two models at a temperature of 900 °C are relatively small, but the differences are more significant at 600 and 800 °C for ER = 0.3; increasing the temperature also appears to raise the H2 production approximately linearly. Figure 7b shows that the deviations between the two models are significant at ER = 0.4; however, the trend of the H2 model is similar to the results of El-Shafay et al.

5. Conclusions

In this study, a feature importance analysis of the H2 production model for biomass gasification was carried out successfully. The top-four features, according to importance, were the equivalence ratio (ER), volatile matter (VM), lower heating value (LHV), and carbon (C) content. The model performance obtained when training the H2 model with different numbers of input features was also investigated, and the results show that selecting all 11 input features produced the best performance, with an R2 value of 0.96, because more data characteristics could be captured. To reduce the simulation cost, one may instead use the top-three input features, namely ER, VM, and LHV, in the model training while still obtaining an excellent performance (R2 = 0.93). Furthermore, a performance comparison between the XGBoost regression model and Ozonoh's ANN model showed that the XGBoost regression model outperforms the ANN model, validating the application of the XGBoost regression model once again. Finally, the deviations between the H2 production model and the findings of El-Shafay et al. are small, especially at a temperature of 900 °C and ER = 0.4.

Author Contributions

Conceptualization, H.-T.W. and H.-Y.W.; methodology, H.-T.W.; software, H.-T.W.; validation, H.-T.W. and K.-C.L.; writing—original draft preparation, H.-T.W.; writing—review and editing, H.-T.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Alam, O.; Qiao, X. An in-depth review on municipal solid waste management, treatment and disposal in Bangladesh. Sustain. Cities Soc. 2020, 52, 101775. [Google Scholar] [CrossRef]
  2. Malkow, T. Novel and innovative pyrolysis and gasification technologies for energy efficient and environmentally sound MSW disposal. Waste Manag. 2004, 24, 53–79. [Google Scholar] [CrossRef] [PubMed]
  3. Basu, P. Biomass Gasification and Pyrolysis: Practical Design and Theory; Academic Press: New York, NY, USA, 2010. [Google Scholar]
  4. Baruah, D.; Baruah, D. Modeling of biomass gasification: A review. Renew. Sustain. Energy Rev. 2014, 39, 806–815. [Google Scholar] [CrossRef]
  5. Ullah, H.S.; Khushnood, R.A.; Ahmad, J.; Farooq, F. Predictive modelling of sustainable lightweight foamed concrete using machine learning novel approach. J. Build. Eng. 2022, 56, 104746. [Google Scholar] [CrossRef]
  6. Khan, M.A.; Farooq, F.; Javed, M.F.; Zafar, A.; Ostrowski, K.A.; Aslam, F.; Malazdrewicz, S.; Maślak, M. Simulation of depth of wear of eco-friendly concrete using machine learning based computational approaches. Materials 2021, 15, 58. [Google Scholar] [CrossRef]
  7. Ullah, H.S.; Khushnood, R.A.; Farooq, F.; Ahmad, J.; Vatin, N.I.; Ewais, D.Y.Z. Prediction of compressive strength of sustainable foam concrete using individual and ensemble machine learning approaches. Materials 2022, 15, 3166. [Google Scholar] [CrossRef]
  8. Farooq, F.; Czarnecki, S.; Niewiadomski, P.; Aslam, F.; Alabduljabbar, H.; Ostrowski, K.A.; Śliwa-Wieczorek, K.; Nowobilski, T.; Malazdrewicz, S. A comparative study for the prediction of the compressive strength of self-compacting concrete modified with fly ash. Materials 2021, 14, 4934. [Google Scholar] [CrossRef]
  9. Akbar, A.; Farooq, F.; Shafique, M.; Aslam, F.; Alyousef, R.; Alabduljabbar, H. Sugarcane bagasse ash-based engineered geopolymer mortar incorporating propylene fibers. J. Build. Eng. 2021, 33, 101492. [Google Scholar] [CrossRef]
  10. Ahmad, A.; Farooq, F.; Niewiadomski, P.; Ostrowski, K.; Akbar, A.; Aslam, F.; Alyousef, R. Prediction of compressive strength of fly ash based concrete using individual and ensemble algorithm. Materials 2021, 14, 794. [Google Scholar] [CrossRef]
  11. Farooq, F.; Ahmed, W.; Akbar, A.; Aslam, F.; Alyousef, R. Predictive modeling for sustainable high-performance concrete from industrial wastes: A comparison and optimization of models using ensemble learners. J. Clean. Prod. 2021, 292, 126032. [Google Scholar] [CrossRef]
  12. Puig-Arnavat, M.; Hernández, J.A.; Bruno, J.C.; Coronas, A. Artificial neural network models for biomass gasification in fluidized bed gasifiers. Biomass Bioenergy 2013, 49, 279–289. [Google Scholar] [CrossRef]
  13. De Souza, M.; Couceiro, L.; Barreto, A.; Quitete, C.P.B. Neural network based modeling and operational optimization of biomass gasification processes. In Gasification for Practical Applications; Books on Demand: Norderstedt, Germany, 2012; pp. 297–312. [Google Scholar]
  14. Wen, H.-T.; Lu, J.-H.; Phuc, M.-X. Applying artificial intelligence to predict the composition of syngas using rice husks: A comparison of artificial neural networks and gradient boosting regression. Energies 2021, 14, 2932. [Google Scholar] [CrossRef]
  15. Baruah, D.; Baruah, D.; Hazarika, M. Artificial neural network based modeling of biomass gasification in fixed bed downdraft gasifiers. Biomass Bioenergy 2017, 98, 264–271. [Google Scholar] [CrossRef]
  16. Ozonoh, M.; Oboirien, B.; Higginson, A.; Daramola, M. Dataset from estimation of gasification system efficiency using artificial neural network technique. Chem. Data Collect. 2020, 25, 100321. [Google Scholar] [CrossRef]
  17. Wen, H.-T.; Lu, J.-H.; Jhang, D.-S. Features Importance Analysis of Diesel Vehicles’ NOx and CO2 Emission Predictions in Real Road Driving Based on Gradient Boosting Regression Model. Int. J. Environ. Res. Public Health 2021, 18, 13044. [Google Scholar] [CrossRef]
  18. Jiang, H.; He, Z.; Ye, G.; Zhang, H. Network intrusion detection based on PSO-XGBoost model. IEEE Access 2020, 8, 58392–58401. [Google Scholar] [CrossRef]
  19. Ogunleye, A.; Wang, Q.-G. XGBoost model for chronic kidney disease diagnosis. IEEE/ACM Trans. Comput. Biol. Bioinform. 2019, 17, 2131–2140. [Google Scholar] [CrossRef]
  20. Wang, C.; Deng, C.; Wang, S. Imbalance-XGBoost: Leveraging weighted and focal losses for binary label-imbalanced classification with XGBoost. Pattern Recognit. Lett. 2020, 136, 190–197. [Google Scholar] [CrossRef]
  21. Osman, A.I.A.; Ahmed, A.N.; Chow, M.F.; Huang, Y.F.; El-Shafie, A. Extreme gradient boosting (Xgboost) model to predict the groundwater levels in Selangor Malaysia. Ain Shams Eng. J. 2021, 12, 1545–1556. [Google Scholar] [CrossRef]
  22. Ramaneswaran, S.; Srinivasan, K.; Vincent, P.; Chang, C.-Y. Hybrid inception v3 XGBoost model for acute lymphoblastic leukemia classification. Comput. Math. Methods Med. 2021, 2021, 2577375. [Google Scholar] [CrossRef]
  23. Yang, S.; Li, B.; Zheng, J.; Kankala, R.K. Biomass-to-Methanol by dual-stage entrained flow gasification: Design and techno-economic analysis based on system modeling. J. Clean. Prod. 2018, 205, 364–374. [Google Scholar] [CrossRef]
  24. Pei, H.; Wang, X.; Dai, X.; Jin, B.; Huang, Y. A novel two-stage biomass gasification concept: Design and operation of a 1.5 MWth demonstration plant. Bioresour. Technol. 2018, 267, 102–109. [Google Scholar] [CrossRef] [PubMed]
  25. La Villetta, M.; Costa, M.; Cirillo, D.; Massarotti, N.; Vanoli, L. Performance analysis of a biomass powered micro-cogeneration system based on gasification and syngas conversion in a reciprocating engine. Energy Convers. Manag. 2018, 175, 33–48. [Google Scholar] [CrossRef]
  26. Rodriguez, R.; Mazza, G.; Fernandez, A.; Saffe, A.; Echegaray, M. Prediction of the lignocellulosic winery wastes behavior during gasification process in fluidized bed: Experimental and theoretical study. J. Environ. Chem. Eng. 2018, 6, 5570–5579. [Google Scholar] [CrossRef] [Green Version]
  27. De Sales, C.A.V.B.; Maya, D.M.Y.; Lora, E.E.S.; Jaén, R.L.; Reyes, A.M.M.; González, A.M.; Andrade, R.V.; Martínez, J.D. Experimental study on biomass (eucalyptus spp.) gasification in a two-stage downdraft reactor by using mixtures of air, saturated steam and oxygen as gasifying agents. Energy Convers. Manag. 2017, 145, 314–323. [Google Scholar] [CrossRef]
  28. Konstantinou, E.; Marsh, R. Experimental study on the impact of reactant gas pressure in the conversion of coal char to combustible gas products in the context of Underground Coal Gasification. Fuel 2015, 159, 508–518. [Google Scholar] [CrossRef] [Green Version]
  29. He, W.; Park, C.S.; Norbeck, J.M. Rheological study of comingled biomass and coal slurries with hydrothermal pretreatment. Energy Fuels 2009, 23, 4763–4767. [Google Scholar] [CrossRef] [Green Version]
  30. Lv, P.; Yuan, Z.; Wu, C.; Ma, L.; Chen, Y.; Tsubaki, N. Bio-syngas production from biomass catalytic gasification. Energy Convers. Manag. 2007, 48, 1132–1139. [Google Scholar] [CrossRef]
  31. Rapagnà, S.; Jand, N.; Kiennemann, A.; Foscolo, P.U. Steam-gasification of biomass in a fluidised-bed of olivine particles. Biomass Bioenergy 2000, 19, 187–197. [Google Scholar] [CrossRef]
  32. Gai, C.; Dong, Y. Experimental study on non-woody biomass gasification in a downdraft gasifier. Int. J. Hydrog. Energy 2012, 37, 4935–4944. [Google Scholar] [CrossRef]
  33. Loha, C.; Chatterjee, P.K.; Chattopadhyay, H. Performance of fluidized bed steam gasification of biomass–modeling and experiment. Energy Convers. Manag. 2011, 52, 1583–1588. [Google Scholar] [CrossRef]
  34. Ong, Z.; Cheng, Y.; Maneerung, T.; Yao, Z.; Tong, Y.W.; Wang, C.H.; Dai, Y. Co-gasification of woody biomass and sewage sludge in a fixed-bed downdraft gasifier. AIChE J. 2015, 61, 2508–2521. [Google Scholar] [CrossRef]
  35. Serrano, D.; Kwapinska, M.; Horvat, A.; Sánchez-Delgado, S.; Leahy, J.J. Cynara cardunculus L. gasification in a bubbling fluidized bed: The effect of magnesite and olivine on product gas, tar and gasification performance. Fuel 2016, 173, 247–259. [Google Scholar] [CrossRef] [Green Version]
  36. Miccio, F.; Piriou, B.; Ruoppolo, G.; Chirone, R. Biomass gasification in a catalytic fluidized reactor with beds of different materials. Chem. Eng. J. 2009, 154, 369–374. [Google Scholar] [CrossRef]
  37. Thakkar, M.; Makwana, J.; Mohanty, P.; Shah, M.; Singh, V. In bed catalytic tar reduction in the autothermal fluidized bed gasification of rice husk: Extraction of silica, energy and cost analysis. Ind. Crops Prod. 2016, 87, 324–332. [Google Scholar] [CrossRef]
  38. Karmakar, M.; Mandal, J.; Haldar, S.; Chatterjee, P. Investigation of fuel gas generation in a pilot scale fluidized bed autothermal gasifier using rice husk. Fuel 2013, 111, 584–591. [Google Scholar] [CrossRef]
  39. Subramanian, P.; Sampathrajan, A.; Venkatachalam, P. Fluidized bed gasification of select granular biomaterials. Bioresour. Technol. 2011, 102, 1914–1920. [Google Scholar] [CrossRef]
  40. Mansaray, K.; Ghaly, A.; Al-Taweel, A.; Hamdullahpur, F.; Ugursal, V. Air gasification of rice husk in a dual distributor type fluidized bed gasifier. Biomass Bioenergy 1999, 17, 315–332. [Google Scholar] [CrossRef]
  41. Salam, P.A.; Bhattacharya, S. A comparative study of charcoal gasification in two types of spouted bed reactors. Energy 2006, 31, 228–243. [Google Scholar]
  42. Arena, U.; Di Gregorio, F. Energy generation by air gasification of two industrial plastic wastes in a pilot scale fluidized bed reactor. Energy 2014, 68, 735–743. [Google Scholar] [CrossRef]
  43. Lahijani, P.; Zainal, Z.A. Gasification of palm empty fruit bunch in a bubbling fluidized bed: A performance and agglomeration study. Bioresour. Technol. 2011, 102, 2068–2076. [Google Scholar] [CrossRef] [PubMed]
  44. Hu, J.; Shao, J.; Yang, H.; Lin, G.; Chen, Y.; Wang, X.; Zhang, W.; Chen, H. Co-gasification of coal and biomass: Synergy, characterization and reactivity of the residual char. Bioresour. Technol. 2017, 244, 1–7. [Google Scholar] [CrossRef] [PubMed]
  45. Campoy, M.; Gomez-Barea, A.; Vidal, F.B.; Ollero, P. Air–steam gasification of biomass in a fluidised bed: Process optimisation by enriched air. Fuel Process. Technol. 2009, 90, 677–685. [Google Scholar] [CrossRef]
  46. George, J.; Arun, P.; Muraleedharan, C. Assessment of producer gas composition in air gasification of biomass using artificial neural network model. Int. J. Hydrog. Energy 2018, 43, 9558–9568. [Google Scholar] [CrossRef]
  47. Thengane, S.K.; Gupta, A.; Mahajani, S.M. Co-gasification of high ash biomass and high ash coal in downdraft gasifier. Bioresour. Technol. 2019, 273, 159–168. [Google Scholar] [CrossRef]
  48. Nipattummakul, N.; Ahmed, I.I.; Kerdsuwan, S.; Gupta, A.K. Hydrogen and syngas production from sewage sludge via steam gasification. Int. J. Hydrog. Energy 2010, 35, 11738–11745. [Google Scholar] [CrossRef]
  49. Dascomb, J.; Krothapalli, A. Hydrogen-Enriched Syngas from Biomass Steam Gasification for Use in Land-Based Gas Turbine Engines. In Novel Combustion Concepts for Sustainable Energy Development; Springer: New York, NY, USA, 2014; pp. 89–110. [Google Scholar]
  50. Lv, P.; Xiong, Z.; Chang, J.; Wu, C.; Chen, Y.; Zhu, J. An experimental study on biomass air–steam gasification in a fluidized bed. Bioresour. Technol. 2004, 95, 95–101. [Google Scholar] [CrossRef]
  51. Guo, F.; Dong, Y.; Dong, L.; Guo, C. Effect of design and operating parameters on the gasification process of biomass in a downdraft fixed bed: An experimental study. Int. J. Hydrog. Energy 2014, 39, 5625–5633. [Google Scholar] [CrossRef]
  52. Wang, L.Q.; Chen, Z.S. Experimental studies on H2-rich gas production by Co-gasification of coal and biomass in an intermittent fluidized bed reactor. In Advanced Materials Research; Trans Tech Publications Ltd.: Stafa-Zurich, Switzerland, 2013; pp. 1127–1131. [Google Scholar]
  53. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  54. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  55. Chen, R.-C.; Caraka, R.E.; Arnita, N.E.G.; Pomalingo, S.; Rachman, A.; Toharudin, T.; Tai, S.-K.; Pardamean, B. An end to end of scalable tree boosting system. Sylwan 2020, 165, 1–11. [Google Scholar]
  56. Lee, C.; Lee, S. Exploring the Contributions by Transportation Features to Urban Economy: An Experiment of a Scalable Tree-Boosting Algorithm with Big Data. Land 2022, 11, 577. [Google Scholar] [CrossRef]
  57. Nalluri, M.; Pentela, M.; Eluri, N.R. A Scalable Tree Boosting System: XG Boost. Int. J. Res. Stud. Sci. Eng. Technol. 2020, 7, 36–51. [Google Scholar]
  58. El-Shafay, A.; Hegazi, A.; Zeidan, E.; El-Emam, S.; Okasha, F. Experimental and numerical study of sawdust air-gasification. Alex. Eng. J. 2020, 59, 3665–3679. [Google Scholar] [CrossRef]
Figure 1. Fixed-bed updraft gasifier.
Figure 2. Statistical analyses between the model input features and the target.
Figure 3. Flowchart of the proposed method for composition gas prediction in gasification.
Figure 4. Violin plots of the model input features and target H2.
Figure 5. Feature importance detailed scores of the H2 model.
Figure 6. Performance R2 statistics results of the H2 model.
Figure 7. Comparisons of the results of the H2 [%] model prediction with those of El-Shafay et al. [58] for different ER and temperature.
Table 1. Feature importance rank of the H2 model.

Feature   Rank
ER        1
VM        2
LHV       3
C         4
Ash       5
S         6
H         7
MC        8
O         9
Tg        10
N         11
Table 2. Performance RMSE and MAE statistic results of the H2 model.

Selected Features   RMSE   MAE
All features        2.64   1.51
Top 10 features     2.74   1.52
Top 9 features      3.13   1.82
Top 8 features      3.17   1.86
Top 7 features      3.13   1.81
Top 6 features      3.79   2.52
Top 5 features      3.8    2.51
Top 4 features      3.75   2.45
Top 3 features      3.73   2.39
Table 3. Performance comparison between the H2 model and that of Ozonoh et al. [16].

Model           R2 (Train)   R2 (Test)   R2 (Validation)   R2 (All)   MSE
H2 model        0.97         0.95        0.92              0.96       6.96
Ozonoh et al.   0.97         0.97        0.96              0.95       8.51
Table 4. Proximate and ultimate analyses of the sawdust pellets [58].

Proximate Analysis   (wt %)
Moisture             8.8
Ash                  0.58
Volatile             74.61
Fixed carbon         16.01

Ultimate Analysis    (wt %)
Carbon               47.37
Hydrogen             6.3
Oxygen               42
Nitrogen             0.12
Sulfur               0.0
Heating value        17.95 MJ/kg
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
