Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios

Plankenbühler, Thomas; Kolb, Sebastian; Grümer, Fabian; Müller, Dominik; Karl, Jürgen

doi:10.3390/pr8060728

Open AccessArticle

Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios

by

Thomas Plankenbühler

^1,*

,

Sebastian Kolb

¹

,

Fabian Grümer

²,

Dominik Müller

¹ and

Jürgen Karl

¹

Chair of Energy Process Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Fürther Straße 244f, D-90429 Nürnberg, Germany

²

Independent Researcher, D-90439 Nürnberg, Germany

^*

Author to whom correspondence should be addressed.

Processes 2020, 8(6), 728; https://doi.org/10.3390/pr8060728

Submission received: 30 May 2020 / Revised: 12 June 2020 / Accepted: 15 June 2020 / Published: 23 June 2020

(This article belongs to the Special Issue Progress in Energy Conversion Systems and Emission Control)

Download

Browse Figures

Versions Notes

Abstract

:

This article focuses on fuel quality in biomass power plants and describes an online prediction method based on image analysis and regression modeling. The main goal is to determine the mixture fraction from blends of two wood chip species with different qualities and properties. Starting from images of both fuels and different mixtures, we used two different approaches to deduce feature vectors. The first one relied on integral brightness values while the latter used spatial texture information. The features were used as input data for linear and non-linear regression models in nine training classes. This permitted the subsequent prediction of mixture ratios from prior unknown similar images. We extensively discuss the influence of model and image selection, parametrization, the application of boosting algorithms and training strategies. We obtained models featuring predictive accuracies of R² > 0.9 for the brightness-based model and R² > 0.8 for the texture based one during the validation tests. Even when reducing the data used for model training down to two or three mixture classes—which could be necessary or beneficial for the industrial application of our approach—sampling rates of n < 5 were sufficient in order to obtain significant predictions.

Keywords:

biomass; fuel quality; regression modeling; machine learning; image analysis; biomass power plant

Graphical Abstract

1. Introduction

Solid biomass combustion plants—from kilowatt-sized furnaces for domestic heating to combined heat and power plants in the megawatt range—suffer from a decisive drawback in comparison to many other (mainly fossil) alternatives due to the low quality and the challenging properties of the utilized feedstock. Inappropriate fuel quality drastically affects a biomass furnace’s behavior: Subpar biomass reduces the lifespan of a furnace and leads to operational challenges, such as corrosion, ash deposition [1,2,3,4,5] as well as increased gaseous and particulate emissions [6]. The majority of industrial solid biomass combustion plants rely on woody biomass (commonly known as wood chips) as feedstock. In general, wood chips and their properties are well regulated and subject to standards, such as ISO 17225 (replacing the outdated European Standard EN 14961-4 and the still wide-spread Austrian ÖNORM M7133 used among European plant operators) which defines distinct fuel quality classes based on several parameters, such as particle size distributions, moisture and ash content [7]. These norms are often used for definitions, agreements and contract management along the wood chip supply chain. However, in the biomass plant operator’s daily practice, even the defined limits (e.g., regarding fine particle content or ash content) are regularly breached and/or not controlled. Biomass boilers running on wood chips are thus challenged by different feedstock qualities due to a multitude of different suppliers and fluctuations between individual batches. Furthermore, economic needs often act as an important driving force towards cheaper biomass resources, such as residual or waste wood, shrubs, bark or alternatives, such as straw or short rotation crops. As a consequence, operators often use mixtures of “high” and “low” quality feedstock [8,9]. Finally, there are also seasonal fluctuations in the moisture content, which may vary between 20 and 45 weight-% from summer to winter.

All these variations constitute a major difference between biomass furnaces and plants running on coal. The latter are typically tailored or at least adapted to a certain narrow range of fuel qualities and properties. Thus, the variety of potential operational points is by far higher in biomass combustion. Many operators use empiric/experience-based parameters, such as estimated (or guessed) moisture or fine particle contents, as input parameters for the process control system and hope for stable operation. Yet, most of the time, biomass plants react to changes in feedstock quality after the operator observes a change in measured values in their control system.

This highlights the demand for reproducible methods in order to assess fuel quality parameters prior to the combustion process itself. Such an approach would be possible to proactively control and adapt a biomass furnace to future conditions. As a result, this mitigates the risk or the amount of slag and ash deposits and enhances the plant’s performance both in efficiencies and economic parameters.

A promising method for the pre-evaluation of biomass feedstock is optical analysis (in the most basic case, image capturing through photography) followed by image evaluation. Such a technology is capable of running continuously without any need for human intervention or interaction (e.g., by observing fuel conveyor belts, stokers or chain conveyors). Moreover, it is a non-intrusive measurement method which can be easily retrofitted into the fuel feeding systems of biomass plants. This potentially enables an a priori determination of feedstock properties.

The contact-free analysis of biomass feedstock has been a subject of more and more interest over recent years. Back in 2001, Swedish researchers stated that continuous image analysis in order to characterize biomass fuels was too time consuming and “tedious” [10]. However, due to the increase in computational resources and novel technology, this has changed drastically. The literature shows a variety of image evaluation techniques for multiple purposes and applications. The authors in [11] give a good overview on the optical analysis of particle properties; however, they focus on individual particles instead of those in bulk. Rezaei et al. examine single wood particles with microscopes and scanners in [12], focusing on particle properties after a grinding/milling process. Igathinathane et al. present a methodology in order to understand, improve and potentially replace the sieving of biomass particles via computer vision in [13]. The authors state that image analysis potentially hurdles the drawbacks of sieving irregularly shaped particles, such as fibers. In [14], Ding et al. describe an approach in order to characterize wood chips (as a resource for paper production) from images using artificial neural networks and regression models based on color features, size determination by means of granulometry and near infrared (NIR) sensors for moisture detection. Since then, NIR became a commonly used technique, especially for online moisture measurements [15,16,17,18,19]. These approaches are often combined with regression modeling or artificial neural networks/deep learning approaches. Tao et al. [20], for instance, describe a methodology based on IR spectra and regression modelling in order to characterize fuels regarding their chemical constituents and heating value. In [21,22,23], the authors aim at determining biomass heating values by means of artificial neural networks, support vector machines and random forest models, based on measured particle properties as input parameters. Machine learning models can further be used in order to tackle and forecast the operational problems of biomass furnaces, identifying stable and beneficial operational points [24].

Ongoing research at FAU aims at the development of an online fuel quality estimator, based on images of raw feedstock prior to the combustion process. The system targets industrial combustion plants of all sizes up to the megawatt range. In order to address small-scale (i.e., in the 100 kW range) furnaces as well, we based our methodology on consumer-grade hardware.

This study aims to examine the viability of an image-based assessment of mixing ratios of two distinct varieties of woody biomass. Storing fuel in at least two stockpiles of feedstock labelled as “high” and “low” quality, respectively, is a common practice for biomass plant operators. As mentioned above, this allows for the mixture of different grades of feedstock in order to tailor fuel with acceptable, yet economic, parameters and properties. Our goal is to determine the mixture fraction of two fuels prior to its feeding into a combustion chamber. For this task, we used a machine learning/regression modeling approach, based on feature vectors derived from image data taken in a lab-scale environment. We discuss two different approaches for feature vector generation—one based on spatially-resolved, one relying on integral information—and several linear and non-linear regression models. The modeling results are discussed with respect to their statistical accuracies, parametrization and the training strategy. Eventually, since the approach ought to be applicable for biomass furnaces in the megawatt range, we evaluate several simplified training approaches based on few input parameters which can be applied during the ongoing operation of biomass power plants. Since the modeling accuracy suffers from these simplifications, we estimate necessary sampling sizes in order to maintain an appropriate predictive quality. This study is, therefore, a first and a major step towards the development of a proactive biomass furnace control system with respect to incoming feedstock.

2. Materials and Methods

2.1. Fuel Selection and Classification

For this work, we used feedstock from the Bad Mergentheim biomass combined heat and power (CHP) plant in Southern Germany running on wood chips (WC) and forest residual wood (FR). The operator intended to run on a 50/50 mixture; however, since “mixing” happens with a bulldozer on the yard, the actual composition was subject to fluctuations. We took samples for the optical analysis from the WC and FR stockpiles in August 2019. WC principally consist of pine stem wood from the German Tauberfranken region with low shares of bark and branches. In contrast, FR is a mixture of soft and hardwood featuring a high amount of branches, leaves, needles and shrubs. Both fractions can be visually distinguished. Due to the summertime sampling and the preceding yard storage, both of the fuels exhibited low water content in the range of 18.9 ± 0.9% (WC) and 22.6 ± 1.0% (FR), based on nine measurements over the batch size. WC featured a lower ash content of 3.8 ± 1.2% compared to FR (5.9 ± 0.2%). For both fuels, a sieve analysis revealed that between 88% and 97% of the fuel mass originated from particles larger than 5 mm, and 98.8 to 99.5% of the mass from particles larger than 1 mm. A major difference between WC and FR lay in their bulk densities, which we determined according to ISO 17828. With a bulk density of 222 ± 23 kg/m³, WC featured an approximately 75% higher value than FR (125 ± 12 kg/m³). Measuring the bulk density of a mixture (1:1) of these fuels led to 176 ± 9 kg/m³, a value lying close to the calculated average. The lower value of FR is not surprising, as it results from the generally higher aspect ratios and irregular shapes of individual particles, due to the high amounts of branches and bark. Table 1 sums up the determined fuel properties.

For this study, we created a total of nine different mixtures based on distinct volumetric ratios. Volumetric ratio was chosen due to the aforementioned real-life comparison process of mixing with a bulldozer. Besides the 0/100 and 100/0 mixture classes, we prepared ratios of 25/75, 33/66, 40/60, 50/50, 60/40, 66/33 and 75/25. Thus, there was an intended higher amount of data for mixtures close to the operator’s target mixture of 50/50. Figure 1 displays exemplary representatives from five of the fuel classes. In this article, we use the term “fuel quality” in order to address certain mixture ratios: 0% fuel quality refers to pure FR; 50% means a volumetric ratio of 1:1 (FR:WC); 100% is solely WC (i.e., the high quality feedstock).

2.2. Image Capturing

In order to capture representative fuel images, we used a closed box with two planar LED panels for artificial lighting. This ensured the absence of ambient light, thus enabling a representative image capturing process. The color of the LED could be altered; however, we observed no influence on the regression modelling results, especially since the evaluation happened on greyscale images anyway. Thus, all images were taken with green light from above as incident light. The camera was a SONY SNC-VB600B IP camera with a maximum resolution of 1280 × 1024 pixels. The decision towards a robust consumer-type camera was due to the project’s goal of implementing a similar setup into biomass-fired power plants and furnaces, and developing an economically feasible fuel analysis setup, even for small combustion plants. All of the pictures were taken through a window made of acrylic glass from a distance of 450 mm, capturing a total area of 480 × 360 mm. Figure 2 depicts the experimental setup. Image capturing happened automatically by means of a script written in the Python programming language. For each of the presented fuel classes, we took between 32 and 40 different photographs (325 in total), shuffling the samples between each shot. In order to avoid three-dimensional mixing effects due to different particle shape, the feedstock was spread in a layer of only approximately 5 cm height.

2.3. Image Processing

2.3.1. Preprocessing

Image preprocessing started with the application of a smoothing filter by means of a bilateral filter. This operation reduced unwanted noise while maintaining sharp edges. In comparison to other smoothing filters, such as the Linear or Gaussian Blur convolution, bilateral filtering recognizes and ignores high gradients in pixel intensities—especially occurring at edges—during smoothing by combining both a similarity/intensity and a distance/coordinate kernel [25]. Subsequently, the RGB (red, green, blue) images were transformed into 8-bit greyscale images.

2.3.2. Feature Vector Formulation

In order to successfully apply regression modelling to the image matrices, it was necessary to reduce the amount of information drastically by identifying, describing and quantifying one or several characteristic features. The so-called feature vector included the entirety of features and was used for model training.

In this case, we examined two different approaches in order to formulate feature vectors for their applicability. The first method relied on the greyscale (8-bit) histograms (i.e., representing the images’ pixel brightness distribution densities). In contrast to that, the second approach recognized the spatial distribution of the pixels by calculating textural features, known as Haralick features [26]. After the calculation of both features, all of the data were normalized. Data standardization (i.e., dividing the data by the standard deviation after subtracting the respective mean value) was also evaluated, but it was not applied since it did not enhance the results, but rather increased the noise.

2.3.3. Histogram-Based Feature Vector

The 8-bit greyscale image potentially consisted of 2⁸ = 256 different values, so calculating the histogram reduced the amount of information from width x height (pixels) to 256. The feature vector’s size could decrease even further by applying different techniques, such as histogram equalization (i.e., neighboring brightness values aggregated into bigger classes) or simply ignoring low-intensity histogram areas (in photographs often close to 0 or 255; i.e., pure white and dark black).

Figure 3 depicts the averaged normalized histograms of the 0, 50 and 100% fuel quality classes. The fuel images exhibit close to no pixel values in the highest and lowest value ranges, so these were neglected for the feature vector formulation. Subsequently, the histogram was reduced to nine features representing pixel values from 90 to 180 in equidistant steps (with feature 1 at 90 pixels and feature 9 at 180 pixels); these data were exported to the image’s feature vector. This approach is the result of an examination (which is not discussed in detail here) of the behavior different possible feature vectors extracted from the histogram; the chosen method leads to the most promising results. Figure 4 displays the resulting mean values for the nine fuel quality classes and their standard deviations. The latter thus indicates the typical deviations from the average values. Please note that the y-axis contains the data (with pixel counts as unit) before normalization for reasons of better visibility. The feature vector training takes place with normalized data, though.

2.3.4. Texture-Based Feature Vector

The calculation of the Haralick texture features happened analogous to [26], based on greyscale images and the co-occurrence matrix G:

G = [\begin{matrix} p (1, 1) & \dots & p (1, N_{g}) \\ ⋮ & ⋱ & ⋮ \\ p (N_{g}, 1) & \dots & p (N_{g}, N_{g}) \end{matrix}]

(1)

where N_g is the total number of an image’s grey values and p(i,j) describes the probability that a pixel i is located in adjacency to a pixel with a value of j.

Haralick et al. propose a total set of 14 different features which are a measure of the spatial relationship of the pixels [26]. Most of them are obtained by calculating the sum of characteristic information over the co-occurrence matrix p(i,j). This includes statistical information, such as the variance, average, correlation coefficients and the sum of squares of pixel intensities, as well as moments and the image’s entropy. [26]

We calculated the Haralick features in this work by means of its implementation in the “mahotas” Python package [27]. For stability reasons, this module included only the first 13 Haralick features (as is a common approach in the literature [28,29]), which are subsequently used as feature vectors in normalized form. Figure 5 depicts the resulting 13 normalized Haralick features for the different fuel classes, highlighting their mean values and the corresponding standard deviations.

2.4. Regression Model Selection and Setup

Although the training data fall into up to nine distinct classes, we regard the fuel mixture analysis as a regression problem instead of a task for classification. This is due to the circumstance that the real-world application will challenge the model with data which might lie in-between distinct classes. For this reason, interpolation between individual classes might become appropriate. In order to train a regression model, the data need to be subdivided into two subsets, a set used for training and model generation, and a test/holdout set in order to validate the resulting model with “unknown” data. For this work, we split the images into 80% for training and 20% for testing. To make the best use of the data set, consisting of 325 images, the 5-fold cross validation, a commonly used form of the k-fold cross validation, found application in the training process [30].

In the course of this work, we evaluated a total of seven different regression models under varying parametrization. Although the extracted features did not behave linearly, there was no very prominent nonlinearity either. Consequently, three different linear regression models were tested [30]:

Ordinary least squares Linear Regression
Lasso Linear Regression featuring L1 regularization
Ridge Linear Regression with L2 regularization

L1 and L2 regularizations refer to the exponent k = 1 or 2 in an additional penalty term in the loss function L, which potentially avoids overfitting (L1) or reduces the impact of irrelevant features (L2):

L = \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2} + α \sum_{i = 1}^{n} {| β_{i} |}^{k}

(2)

where y_i and

{\hat{y}}_{i}

are the actual and predicted data, respectively. For

α

= 0, Equation (2) equals the ordinary least squares loss function. The model generally seeks to minimize L by identifying (“learning”) appropriate coefficients

β_{i}

in an iterative process for the target function:

y_{i} = β_{0} + β_{1} x_{1} + β_{2} x_{2} + \dots + β_{i - 1} x_{i - 1} + β_{i} x_{i}

(3)

Furthermore, we applied four non-linear regression algorithms, more precisely four kinds of tree-based methods [30]:

Decision Tree Regression (DT)
Random Forest Regression (RF)
Gradient Boosting Regression (GB) and
Extra Tree Regression (ET).

These modeling algorithms rely on the development of one or several tree-shaped structures, starting from a root node and subsequent decision nodes (“parent nodes”) of a different hierarchy from where further branches (“child nodes”) originate until the respective section terminal node. The predicted value depends on the reached final node after finding a pathway from the root to a terminal node. Tree-type models are parametrized by defining or limiting the number of layers, branches or nodes, as well as the minimum sample number needed at a certain node in order to provoke the formulation of child nodes. The DT algorithm results in a single tree, while the RF and GB models consist of a large number of combined trees (with the difference between RF and GB lying in the moment of combination). Eventually, the ET algorithm is a RF derivative that does not split at certain optimized ratios of features at a point, but randomly. This may prevent overfitting and is often computationally cheaper due to the lack of one optimization step. The RF, GB and ET algorithms lead to so-called ensemble models, since they combine a group of weak learners in order to create one model with high predictive accuracy. For this purpose, they apply the bagging and boosting techniques. The first (i.e., bagging) refers to the parallel development of a high number of different modelling manifestations; the final predictions originate from the aggregated individual predictions. Boosting, on the other hand, is the sequential training of several weak regressors, which influence and correct upstream learners.

RF and ET apply bagging as an ensemble model. GB obviously uses boosting per se; regarding the other models, we trained and evaluated several boosted models using the AdaptiveBoosting (adaboost) algorithm [31] with the respective base estimator.

The models were used as implemented in the scikit-learn Python programming language module [32].

2.4.1. Model Evaluation and Error Estimation

The evaluation of the models’ performances relied on metrics commonly used in statistical analysis (cf. Kuhn and Johnson [30]). For the following mathematical representations, the data set n consisted of true or observed values y_i and its corresponding prediction or modelled value

{\hat{y}}_{i}

of the ith sample

{\bar{y}}_{i}

.

In order to assess the predictive quality, we used the coefficient of determination R²:

R^{2} (\hat{y}, y) = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - {\bar{y}}_{i})}^{2}} w i t h y_{i} = \frac{1}{n} \sum_{i = 1}^{n} y_{i}

(4)

R² is the squared correlation coefficient, yielding information about the amount of data points being well represented by the model; R² is consequently highly affected by outliers. Since the model training happens with 5-fold cross validation, the accuracy can be denoted as a mean value of R² and its standard deviation.

The mean squared error (MSE) and root mean squared error (RMSE) both indicate the scattering of predicted values around the real value. Since the dimension of RMSE is similar to the evaluated data, it is easy to assess and a quite intuitive parameter:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(5)

Besides the irreducible error (noise), MSE includes both the model’s bias and variance:

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2} = B i a s {(\hat{y}, y)}^{2} + V a r (y) + N o i s e

(6)

Low MSE values thus indicate some scattering and a lower risk of unwanted overfitting. Generally, there is a trade-off between bias and variance, which makes it impossible to minimize both parameters. The method of bias–variance decomposition can be used in order to evaluate the contribution of variance and bias by solving a quadratic equation, including the expected error E:

E [{(y - \hat{y})}^{2}] = {(y - E [\hat{y}])}^{2} + E [{(E [\hat{y}] - \hat{y})}^{2}] = B i a s {(\hat{y}, y)}^{2} + V a r (y)

(7)

2.4.2. Parameter Variations

In order to identify the most suitable model with respect to the modelling accuracy, we carried out a variation of numerical parameters, i.e., hyperparameter tuning. Optimizing with regard to the computational cheapness of a model is not of paramount relevance, since the calculation times for all models are generally <1 s and, therefore, are appropriate for the target application. However, as expected, the linear regression models were considerably faster than the tree-type models, with calculation times in the millisecond range on standard office computers.

For the hyperparameter tuning of the adaboost models, we varied the number of estimators and the learning rate, identifying optimal combinations. For the tree-type models, we further tuned the maximum depth. For the Ridge Linear regression, we varied the regularization strength in the range of 10-5 to 10-1.

Parameter variation happened in two steps: The first hyperparameter tuning (without extensive results plotting), focusing on solely the (mean) R²-value, revealed that a linear loss function during adaboost regression (the function used for updating adaboost-internal weights) never led to worse results than the corresponding square or exponential approach. Subsequently, due to the trade-off between the number of estimators and the learning rate, the second optimization happened with a fixed learning rate. After the intrinsic hyperparameter tuning, a total of 201 different parameter sets were used for model training.

2.5. Training Strategy

As reported above, the data set consisted of 9 distinct fuel quality classes. We followed three different approaches in order to train the regression models: First of all, we used all nine data sets, well aware of the fact that several classes lay in proximity to each other (e.g., 60% and 66%, 33% and 40%). This approach covered the broadest range of data; however, model training may have become flawed due to irregularities and stochastic effects during image capturing. A second and third training data set consisted of the images from the 0, 25, 50, 75, 100 and 0, 50, 100 classes only, respectively. This reduced the number of training images, but potentially mitigated overfitting. Eventually, we trained models with solely the 0 and 100 classes, interpolating between those two points. This reduction seemed rather odd, but followed an idea with regard to future applications: For model training in full-scale environments (e.g., megawatt boilers), providing distinct and reproducible fuel mixtures is an ambitious—if not impossible—task due to the high amount of material required and the challenging mixing process. In this case, it could be an option to train the optical system with pure low and high quality feedstock, respectively, while interpolating in between. Therefore, the 2- and 3-class models were used to test this hypothesis.

3. Results and Discussion

Figure 6 displays the result of the training process for both feature vectors (histogram- and texture-based) using different regression models and model parameters for models based on all 9 fuel mixture classes. The average 5-fold cross validation R² and the RMSE serve as evaluation criteria. The plot is divided into model-specific subsections, which are internally sorted by the R² value based on the validation data set. In general, both feature vectors are, in combination with a properly trained model, capable of reaching high predictive accuracy with R² > 0.8. The corresponding RMSE values lie below 10% (in units of mixture fraction or fuel quality). The best histogram-based model leads to a RMSE of 3.8% (DT model) for unknown validation data and the best texture-based one reaches 5.1% (Linear Lasso). Overall, the histogram-based approach exhibits higher accuracies and lower errors. It furthermore shows a lower discrepancy between the best and worst models of similar kind. This will be discussed in the following sections with respect to the model parametrization, input data selection and the training strategy.

3.1. Influence of Regression Model and Parametrization

Although most of the examined models display potentially high predictive accuracy, there are substantial differences. These, and their origin, will be discussed in this section, based on the training process with all nine fuel-mixture classes.

For all of the regression algorithms, there are both accurate and flawed models, as displayed in Figure 6. At first glance, this could be due to the model parametrization (e.g., boundary conditions, such as the adaboost algorithm, the deepness of the tree models or the regularization factor α). However, the evaluation shows that the parametrization does not explain most of the deviances (albeit it plays an important role for some models, which will be discussed later). Most of the models are robust to parametrization. Instead, these models’ performances heavily rely on the training process itself. This can be assessed by comparing a model’s training and validation accuracy.

Figure 7 displays this relation. Unsurprisingly, models with low predictive quality after the model training fail during the validation process as well. However, there are dysfunctional models with high training accuracy as well, especially for the histogram-based models with tree-type regression.

The training accuracies of the linear models are generally lower than one of the best performing tree-type models, yet they are more stable, with the R² lying in the range of 0.87 ± 0.006 for the histogram-based approach and 0.90 ± 0.007 for the texture-based approach. However, regarding the validation accuracy, many of these models are outranked by the best performing nonlinear models. When well-trained linear models exhibit low validation accuracy, this is due to a non-representative (or insufficient) provision of training data or due to an underfitted model. Indeed, the bias–variance decomposition shows a trend towards lower variances and higher bias for the underperforming linear models during validation. Consequently, the model either encounters unknown (as in “never seen before”) features during validation or features that did not find representation in the model parameters β_i. The first factor can be addressed by increasing the amount or quality of training data (or at least the share used for training); the latter by using more complex regression models, such as the tree-type models.

For these, including their boosted and bagged derivates, Figure 6 also depicts a big discrepancy regarding the prediction accuracy. There are both high and low quality predictions for each individual model, depending on the training process. The histogram-based approach leads to higher accuracies on average, as well as for the individual best model in comparison to the texture-based approach. The difference in the R² value amounts for 0.06–0.1. The adaboost algorithm improves the median accuracy by 0.03–0.10 for histogram-based regression. For texture-based regression, the median improvement is higher up to ΔR² = 0.20 for the DT model. This is due to the DT model being the only non-ensemble model, thus gaining the largest benefit from the boosting algorithm. In all cases, there is also no correlation of R² and the number of estimators. Increasing the trees deepness affects the prediction accuracy only for values up to 4. Higher values increase the computational needs without improving the accuracy. We consequently suggest choosing parameters leading to shorter calculation times (e.g., a limited number of tree layers (up to 4) and adaboost estimators (up to 200)).

Consequently, most of the examined regression models could theoretically be used for the real-world application of fuel recognition. Parametrization and, in particular, training, require an individual adaption and testing of the underlying model. Table 2 and Table 3 give an overview of the results obtained by the different regression algorithms.

3.2. Influence of the Training Data

As displayed, the high discrepancy between the best and the worst models—for both linear and tree-type models—is not the result of improved hyperparameter tuning, but the result of the images/input data used for training. The most influential factor for an individual model is the (random) selection of training and validation data. This becomes apparent when taking the standard deviation of the k-fold cross-validation R² into account (cf. the error bars in Figure 6). Depending on the (random) choice of the training folds, the regression leads to models of different predictive capabilities. While some trained subsets reach standard deviations of as low as 0.02 regarding R², certain models feature values of 0.2 or more; in these cases, the training results in flawed and insufficient models.

There are two major issues regarding the input data, leading to underperforming models:

First of all, the amount of training and validation data may be insufficient for this kind of problem. Depending on the random samples used for training and testing, the model then encounters either simple/known or unknown data/outliers. This could be addressed by drastically increasing the amount of input images and utilization of more complex models. However, this also risks overfitting and highly biased models.

On the other hand, the input images—more precisely, the applied image labels—might be misleading, and the source of errors as well. The manual fuel preparation, mixing and, most importantly, the image capturing process definitely affect the training process. The camera’s limited point-of-view probably results in images (and, subsequently, histograms and Haralick features) that are not perfect representatives of a certain class, albeit they are labelled as such. Besides the inhomogeneity, the fuel’s lumpiness also plays a role. For instance, large chunks of bright wood chips—that appear in the low-quality FR as well—will affect the predictions, shifting the results towards higher predicted quality, while patches with typically darker fine particles shift the prediction towards lower qualities. In addition to that, certain training classes lie close together (e.g., 25/75, 33/66 and 40/60 with only 7–8% deviation in fuel quality) so that the statistic error of sampling and image capturing lies in the same order of magnitude as the applied labels. Figure 8 shows some examples of prominent model failures for irregular images and respective photos with high accuracies. The flawed prediction of the 25/75 photograph in Figure 8 (left) visually appears to be darker than the correctly assigned prediction. The source of the error thus lies in the sampling process, not in the modelling itself.

Yet, with regard to the intended use, these kinds of errors must be accepted. Due to their size and a flawed mixing process, the targeted application thus faces similar challenges (i.e., fluctuating qualities, difficult validation and a limited field-view and point-of-view onto the surface of the feedstock material). Training with images taken from megawatt-scale industrial furnaces will consequently always suffer from similar drawbacks.

3.3. Influence of the Training Strategy

As mentioned, the training process in full-scale environments is challenging. Due to the yard-“mixing” of several cubic meters of feedstock with bulldozers in biomass power plants, training a model with exactly determined pairs of images and data labels (i.e., quantified defined mixture classes) is an impossible task. As described above, for this reason, we performed model training with a reduced amount of data and labels. The most elementary model relies on a two-point training process with pure WC and FR, respectively. This means that the whole range of possible mixtures relies on the interpolation capabilities of the underlying regressor.

This section’s analysis relies on the histogram-based feature vector. As models, we used a reduced selection of algorithms: GB due to its robustness, ET due to its enhanced interpolation capabilities (because of the random splits) and the adaboosted linear model for simplicity.

Figure 9 displays the corresponding results by comparing predicted and actual mixture fractions. Graph (a) shows a GB reference case, in which all 9 classes were used for training. In contrast, graph (b) relies on five distinct classes (shown in white), while the other categories (dark dots) are subject to interpolation. It becomes apparent that this model performs better for known classes. The standard deviation for the predictions lies in the range of σ = 0.22–5.60% (for fuel quality) around their average value for these classes. Unknown ones lie in the range of σ = 4.68–8.32%. However, the confidence of predictions can be enhanced by providing a sampling size n > 1. The sample size n could be increased—even in the envisaged application in industrial furnaces—by capturing and evaluating several different photos of an unknown fuel mixture. This mitigates the risk of capturing a non-representative sample and, furthermore, reduces the model’s prediction uncertainty.

The necessary minimum sample size of images n can be approximately estimated from the standard deviation σ according to Equation (8) under the assumption of normally distributed values [33]:

n \geq {(\frac{σ \cdot z}{e})}^{2}

(8)

z depends on the desired level of confidence. For 90% confident predictions, z equals z = 1.645 and it rises for higher levels, (e.g., z (95%) = 1.96). e represents the acceptable (margin of) error. This means that n different photos of a distinct fuel are necessary in order to estimate its fuel quality with a confidence of 90% (z = 1.645), while allowing an estimation error of up to e = 10 [%].

If all 9 classes are used for training, only a few samples are required: In order to reach a 90% confident prediction, this corresponds to sample sizes of up to 5 (for the worst case σ = 8.32 and an acceptable error of 6 [%]). However, for σ = 5 and an acceptable error in fuel quality deciles (e = 10 [%]), one sample is still sufficient for 95% confident predictions.

For a reduced number of classes, as shown in graphs (c) and (d), as models trained with three and two classes, respectively, the error significantly increased. Although these algorithms (GB and ET) tend to deliver predictions scattering around the suggested real values, standard deviations of σ > 15 hinder accurate predictions, even for higher samples sizes n > 10. Since one has to keep the other sources of errors and uncertainties mentioned above in mind, only trends can be assessed by these approaches.

However, graphs (e) and (f) show the results obtained from a similar training strategy, but this time using the linear regression. In these cases, predicted values scatter more densely around the real mixture values, the maximum standard deviation of graph (e) (three-class training) equals σ = 11.99 around their mean value for the 50% fuel quality class. This corresponds to a sample size of n ≥ 3.47 in order to obtain a 90% confidence for a certain fuel quality decile (e = 10 [%]). The two-class training approach requires only slightly higher sample sizes. However, in this case, the interpolation does not follow the desired linear relation between 0 and 100% fuel quality. Consequently, fitting the underlying trend (cf. the dotted line in graph (f), a fitting curve according to the Equation y = a + bx^c) ought to be applied in order to obtain real mixture values.

3.4. Repercussions for the Industrial Application

Although not perfectly accurate, the two- and three-class training, based on the linear regression algorithm sound promising in order to continuously assess the fuel (mixture) quality, even in full-scale environments. Consequently, the training process of such a system could even be performed in the ongoing operations of a biomass furnace. The system’s calibration may happen during two short-term runs with unmixed, pure fuels of both kinds, potentially complemented by training data from a typical 1:1 mixture. Subsequently, the required sample sizes for approximate predictions lie in a range of n < 5, which can be easily obtained by continuously monitoring sliding floors, conveyor belts or chain conveyors in biomass plants with camera systems and appropriate lighting. Since capturing and evaluating an individual photograph of feedstock on current standard office hardware requires significantly less than 1 s, the system can be used in order to capture live data. When special attention is paid to the hardware and optical systems, higher sampling rates are certainly possible as well.

Finally, Figure 10 displays the potential application of the methodology in order to calculate fuel properties from predicted mixture ratios by either applying linear interpolation (which, as discussed, is appropriate when using more than two fuel classes of training images) or a modified relation between the predicted and actual fuel quality. The latter has been demonstrated in Figure 9, graph (vi) for the two-point calibration. The underlying assumption of a linear behavior of fuel properties (e.g., bulk density), seems valid. As shown in Section 2.1, a mixture (1:1) of WC and FR exhibits approximately the mean value of both fuels.

4. Summary and Conclusions

This article examines and demonstrates a method for the assessment of fuel properties to identify the volumetric mixture fractions of two types of wood biomass. The approach is designed with respect to an envisaged application in industrial biomass furnaces and power plants. We assess the influence of different sources of errors and uncertainties during our lab-scale examinations in order to gain knowledge about the system’s properties and repercussions for operation in full-scale environments. We apply two different kinds of image processing techniques to generate feature vectors (a spatial/texture-based and an integral/histogram-based one) for regression modelling. Wood chips and forest residues in nine distinct mixture fractions serve as feedstock. Subsequently, we examined seven regression algorithms with different parametrization, including ensemble models, such as boosting and bagging.

This approach leads to a series of conclusions:

Both histogram- and texture-based modelling approaches are capable of identifying fuel mixture qualities with an acceptable error. Especially for the first, the obtainable accuracy (as R²) values are larger than 0.90. Furthermore, models based on histogram features are more robust to subpar training compared to texture information. The generally most reliable results were obtained by the gradient boosting (GB) models and the linear model for both feature vectors, as mirrored in the low RMSE values of their predictions.
The application of the adaboosting algorithm enhances the median model accuracies by values of ΔR² = 0.02–0.04 for the histogram-based and up to ΔR² = 0.15–0.20 for the texture-based feature vector. L1/2 regularization improves the linear models’ accuracies, especially for the texture-based regression (for 10⁻⁵ < α < 10⁻⁴) and has close to no effect on the histogram-based models.
The data preparation and training process are the principal reasons for underperforming models and require the most attention during model preparation. Especially the image labelling (i.e., assigning a class to images and/or providing training images with exactly determined properties) is challenging for lab-scale considerations and even more so for the target purpose in industrial furnaces.
It is possible to train models relying on a reduced amount of distinct classes, interpolating in between. Although the predictive accuracy significantly shrinks when using three instead of nine mixture classes during the training process, we find that it is still possible to obtain reliable predictions, even for sample sizes n < 5. This enables us to reliably distinguish fluctuations of 10% in the fuel mixture quality. If not otherwise possible, even a two-point calibration is applicable to continuously monitor feedstock quality, consequently, still revealing otherwise unobtainable information. In this case, a linear model should be used due to its better interpolation capabilities.

All in all, this work describes and examines a series of steps, considerations and prerequisites in order to develop and train an optical regression-based system for the online detection of fuel quality prior to the actual combustion process. This method is thus potentially capable of improving the furnace’s process control with respect to fluctuations in feedstock quality by behaving proactively, rather than reactively. More precisely, the gathered information (here as “fuel quality” correlating with the feedstock’s volumetric density) can be used to optimize a grate furnace’s fuel stoker. The developed methodology will be subsequently implemented into a biomass heat and power plant in the megawatt range in order to gain further operational knowledge, identifying and quantifying its potential during long-term operation.

Future work should address other relevant fuel properties, potentially derivable from images, such as particle size (distributions), fine particle content or the content of mineral soil and further contaminations.

Author Contributions

T.P., D.M. and J.K. conceived the method. D.M. and T.P. designed and erected the experimental setup, performed the fuel sampling, evaluation and the image capturing process. T.P. and F.G. wrote the source code for the image analysis and crafted the models. S.K. and T.P. analyzed and interpreted the results. T.P., D.M., F.G., S.K. and J.K. wrote the paper, T.P. visualized the article. T.P., D.M. and J.K. acquired funding. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the German Federal Ministry for Economic Affairs and Energy (BMWi) under the Framework 7th Energy Research Programme (Biomass Energy Use) within the “FuelBand2” project under the funding code 03KB145A. The authors greatly appreciate the financial support. Responsibility for this publication lies with the authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

Reid, W.T. The relation of mineral composition to slagging, fouling and erosion during and after combustion. Prog. Energy Combust. Sci. 1984, 10, 159–169. [Google Scholar] [CrossRef]
Plankenbühler, T.; Müller, D.; Karl, J. Influence of Fine Fuel Particles on Ash Deposition in Industrial-Scale Biomass Combustion: Experiments and Computational Fluid Dynamics Modeling. Energy Fuels 2019, 33. [Google Scholar] [CrossRef]
Plankenbühler, T.; Müller, D.; Karl, J. Slagging prevention and plant optimisation by means of numerical simulation. In Proceedings of the 25th European Biomass Conference & Exhibition, Stockholm, Sweden, 12–15 June 2017; pp. 653–659. [Google Scholar] [CrossRef]
Baxter, L.L.; Miles, T.R.; Miles, T.R., Jr.; Jenkins, B.M.; Milne, T.; Dayton, D.; Bryers, R.W.; Oden, L.L. The behavior of inorganic material in biomass-fired power boilers: Field and laboratory experiences. Fuel Process. Technol. 1998, 54, 47–78. [Google Scholar] [CrossRef]
Niu, Y.; Tan, H.; Hui, S. Ash-related issues during biomass combustion: Alkali-induced slagging, silicate melt-induced slagging (ash fusion), agglomeration, corrosion, ash utilization, and related countermeasures. Prog. Energy Combust. Sci. 2016, 52, 1–61. [Google Scholar] [CrossRef]
Schön, C.; Kuptz, D.; Mack, R.; Zelinski, V.; Loewen, A.; Hartmann, H. Influence of wood chip quality on emission behaviour in small-scale wood chip boilers. Biomass Conv. Bioref. 2019, 9, 71–82. [Google Scholar] [CrossRef]
Gendek, A.; Aniszewska, M.; Chwedoruk, K. Bulk density of forest energy chips. Ann. Warsaw Univ. Life Sci. SGGW Agric. 2016, 67, 101–111. [Google Scholar]
Lokare, S.S.; Dunaway, J.D.; Moulton, D.; Rogers, D.; Tree, D.R.; Baxter, L.L. Investigation of ash deposition rates for a suite of biomass fuels and fuel blends. Energy Fuels 2006, 20, 1008–1014. [Google Scholar] [CrossRef]
Eriksson, G.; Hedman, H.; Boström, D.; Pettersson, E.; Backman, R.; Öhman, M. Combustion characterization of rapeseed meal and possible combustion applications. Energy Fuels 2009, 23, 3930–3939. [Google Scholar] [CrossRef]
Paulrud, S.; Erik, J.; Nilsson, C. Particle and handling characteristics of wood fuel powder: Effects of different mills. Fuel Process. Technol. 2002, 76, 23–39. [Google Scholar] [CrossRef]
Rodriguez, J.M.; Edeskär, T.; Knutsson, S. Particle Shape Quantities and Measurement Techniques—A Review. EJGE 2013, 18, 169–198. [Google Scholar]
Rezaei, H.; Lim, C.J.; Lau, A.; Sokhansanj, S. Size, shape and flow characterization of ground wood chip and ground wood pellet particles. Powder Technol. 2016, 301, 137–746. [Google Scholar] [CrossRef] [Green Version]
Igathinathane, C.; Pordesimo, L.O.; Columbus, E.P.; Batchelor, W.D.; Sokhansanj, S. Sieveless particle size distribution analysis of particulate materials through computer vision. Comput. Electron. Agric. 2009, 66, 147–158. [Google Scholar] [CrossRef]
Ding, F.; Benaoudia, M.; Bédard, P.; Lanouette, R.; Lejeune, C.; Gagné, P. Wood chip physical and measurement. Pulp Pap. Canada 2005, 2, 27–32. [Google Scholar]
Rhe, C. Multivariate NIR spectroscopy models for moisture, ash and calorific content in biofuels using bi-orthogonal partial least squares regression. Analyst 2005, 130, 1182–1189. [Google Scholar] [CrossRef]
Fridh, L.; Volpé, S.; Eliasson, L.; Fridh, L. A NIR machine for moisture content measurements of forest biomass in frozen and unfrozen conditions unfrozen conditions. Int. J. For. Eng. 2017, 28, 42–46. [Google Scholar] [CrossRef]
Daugbjerg, P.; Hartmann, H.; Bo, T.; Temmerman, M. Moisture content determination in solid biofuels by dielectric and NIR reflection methods. Biomass Bioenergy 2006, 30, 935–943. [Google Scholar] [CrossRef]
Pan, P.; Mcdonald, T.P.; Via, B.K.; Fulton, J.P.; Hung, J.Y. Predicting moisture content of chipped pine samples with a multi-electrode capacitance sensor. Biosyst. Eng. 2016, 145, 1–9. [Google Scholar] [CrossRef] [Green Version]
Tomas, E.; Andersson, T. A Machine Learning Approach for Biomass Characterization. Energy Procedia 2019, 158, 1279–1287. [Google Scholar] [CrossRef]
Tao, J.; Liang, R.; Li, J.; Yan, B.; Chen, G. Fast characterization of biomass and waste by infrared spectra and machine learning models Junyu. J. Hazard. Mater. 2019, 387, 121723. [Google Scholar] [CrossRef]
Nieto, P.G.; Garcia-Gonzalo, E.; Paredes-Sánchez, J.P.; Sánchez, A.B.; Fernández, M.M. Predictive modelling of the higher heating value in biomass torrefaction for the energy treatment process using machine-learning techniques. Neural Comput. Appl. 2019, 31, 8823–8836. [Google Scholar] [CrossRef]
Xing, J.; Luo, K.; Wang, H.; Gao, Z.; Fan, J. A comprehensive study on estimating higher heating value of biomass from proximate and ultimate analysis with machine learning approaches. Energy 2019, 188, 116077. [Google Scholar] [CrossRef]
Samadi, S.H.; Ghobadian, B.; Nosrati, M. Environmental Effects Prediction of higher heating value of biomass materials based on proximate analysis using gradient boosted regression trees method. Energy Sources Part A Recover. Util. Environ. Effect 2019. [Google Scholar] [CrossRef]
Gatternig, B.; Karl, J. Prediction of ash-induced agglomeration in biomass-fired fluidized beds by an advanced regression-based approach. Fuel 2015, 161, 157–167. [Google Scholar] [CrossRef]
Nixon, M.; Aguado, A.S. Feature Extraction & Image Processing, 2nd ed.; Academic Press, Inc.: Cambridge, MA, USA, 2008. [Google Scholar]
Haralick, R.M.; Dinstein, I.; Shanmugam, K. Textural features for image classification. IEEE Trans. Syst. Man Cybern 1973, 3, 610–621. [Google Scholar] [CrossRef] [Green Version]
Coelho, L.P. Mahotas: Open source software for scriptable computer vision. J. Open Res. Softw. 2013. [Google Scholar] [CrossRef]
Tasdemir, S.B.Y.; Tasdemir, K.; Aydin, Z. ROI Detection in Mammogram Images Using Wavelet-Based Haralick and HOG Features. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 105–109. [Google Scholar] [CrossRef]
Lishani, A.; Boubchir, L.; Khalifa, E.; Bouridane, A. Human gait recognition based on Haralick features. Signal Image Video Process. 2017. [Google Scholar] [CrossRef]
Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013; Volume 26. [Google Scholar]
Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef] [Green Version]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
von der Lippe, P.M. Induktive Statistik: Formeln, Aufgaben, Klausurtraining; Oldenbourg-Verlag: München, Germany, 1998. [Google Scholar]

Figure 1. Exemplary photos of different fuel mixture classes (taken with green ambient lighting).

Figure 2. Experimental setup for fuel photography at FAU-EVT (with removed front cover).

Figure 3. Averaged normalized histograms of 0%, 50% and 100% fuel quality classes, based on the greyscale image. Gray lines indicate the positions used for the feature vector formulation, exemplarily highlighting feature 3.

Figure 4. Extracted histogram-based features as mean values with respect to the fuel class; axis is not normalized for the purpose of better visibility only. Error bars indicate the standard deviation.

Figure 5. Extracted texture-based Haralick features as mean values with respect to the fuel quality class; error bars indicate the standard deviation.

Figure 6. Overview on the models’ average performances for the histogram (top), and texture-based models (bottom) as R² (left axis) and RMSE values (right axis); error bars indicate the standard deviation of the average 5-fold cross validation training.

Figure 7. Relation between training and validation accuracy (as R²) of the predicted fuel quality for both feature vector types.

Figure 8. Exemplary high and low accuracy predictions for different fuel classes (Settings: GB regressor, R² (validation set) = 0.885, RMSE = 6.91).

Figure 9. Results for the fuel quality obtained from different training strategies. (a): reference case with all classes used for training; (b): training using 0, 25, 50, 75 and 100%; (c): 0, 50 and 100% class; (d): 0 and 100%; (e) and (f): as (b) and (d), but with linear model. Grey shade indicates the range of ± 10% error.

Figure 10. (left): Estimating fuel properties from calculated mixture ratios; (right): Interpolation of bulk density and ash content from calculated mixture ratios for two and three training classes.

Table 1. Selected fuel properties of the utilized feedstock.

Parameter	Standard	Wood Chips WC	Forest Residues FR
water content	DIN EN 14774-3	18.9 ± 0.9%	22.6 ± 1.0%
ash content	DIN EN 14775	3.8 ± 1.2 wt.%	5.9 ± 0.2 wt.%
bulk density	ISO 17828	222 ± 23 kg/m³	125 ± 12 kg/m³
fine particles < 5 mm	-	3–12 wt.%	3–12 wt.%
fine particles < 1 mm	-	0.5–1.2 wt.%	0.5–1.2 wt.%

Table 2. Overview of average predictive accuracy as R² of the predicted fuel quality for the different methods and algorithms.

Model		Histogram-Based Regression			Texture-Based Regression
		accuracy as R²			accuracy as R²
		Best	Average	Median	Best	Average	Median
Tree-type models
DT		0.875 ± 0.05	0.771 ± 0.09	0.780 ± 0.09	0.650 ± 0.31	0.479 ± 0.29	0.472 ± 0.33
DT	with adaboost	0.909 ± 0.04	0.842 ± 0.09	0.848 ± 0.09	0.825 ± 0.12	0.669 ± 0.18	0.672 ± 0.17
ET		0.869 ± 0.03	0.773 ± 0.08	0.801 ± 0.08	0.690 ± 0.12	0.576 ± 0.19	0.592 ± 0.17
ET	with adaboost	0.936 ± 0.02	0.853 ± 0.08	0.879 ± 0.06	0.869 ± 0.09	0.700 ± 0.11	0.743 ± 0.12
RF		0.900 ± 0.02	0.807 ± 0.10	0.811 ± 0.11	0.778 ± 0.11	0.670 ± 0.14	0.700 ± 0.13
RF	with adaboost	0.934 ± 0.01	0.798 ± 0.09	0.797 ± 0.09	0.839 ± 0.06	0.683 ± 0.14	0.670 ± 0.12
GB		0.892 ± 0.06	0.783 ± 0.14	0.800 ± 0.11	0.773 ± 0.06	0.609 ± 0.21	0.645 ± 0.17
GB	with adaboost	0.913 ± 0.03	0.885 ± 0.11	0.868 ± 0.08	0.819 ± 0.09	0.713 ± 0.16	0.722 ± 0.13
Linear models
Linear		0.884 ± 0.05	0.821 ± 0.08	0.829 ± 0.09	0.895 ± 0.04	0.795 ± 0.11	0.815 ± 0.08
Linear-Lasso		0.813 ± 0.03	0.748 ± 0.11	0.781 ± 0.09	0.911 ± 0.03	0.698 ± 0.19	0.755 ± 0.13
Linear-Ridge		0.890 ± 0.04	0.818 ± 0.07	0.833 ± 0.06	0.880 ± 0.10	0.678 ± 0.11	0.720 ± 0.08

Table 3. Overview of average predictive accuracy as RMSE of the predicted fuel quality for the different methods and algorithms.

Model		Histogram-Based Regression			Texture-Based Regression
		RMSE			RMSE
		Best	Average	Median	Best	Average	Median
Tree-type models
DT		6.95	10.78	10.23	9.77	13.85	12.64
DT	with adaboost	5.91	10.56	8.59	8.16	12.37	11.80
ET		5.68	10.55	9.77	11.04	12.79	13.91
ET	with adaboost	6.61	10.85	9.13	9.12	13.60	12.13
RF		6.27	9.86	10.04	8.49	12.94	12.06
RF	with adaboost	6.51	8.56	8.89	8.32	12.42	11.38
GB		5.66	8.45	8.60	7.94	10.87	10.97
GB	with adaboost	5.29	7.52	6.93	7.56	10.16	9.78
Linear models
Linear		8.90	11.93	12.17	5.56	7.17	7.12
Linear-Lasso		6.21	8.27	7.30	8.27	10.19	8.52
Linear-Ridge		6.03	7.79	7.80	7.57	8.72	8.84

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Plankenbühler, T.; Kolb, S.; Grümer, F.; Müller, D.; Karl, J. Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios. Processes 2020, 8, 728. https://doi.org/10.3390/pr8060728

AMA Style

Plankenbühler T, Kolb S, Grümer F, Müller D, Karl J. Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios. Processes. 2020; 8(6):728. https://doi.org/10.3390/pr8060728

Chicago/Turabian Style

Plankenbühler, Thomas, Sebastian Kolb, Fabian Grümer, Dominik Müller, and Jürgen Karl. 2020. "Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios" Processes 8, no. 6: 728. https://doi.org/10.3390/pr8060728

APA Style

Plankenbühler, T., Kolb, S., Grümer, F., Müller, D., & Karl, J. (2020). Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios. Processes, 8(6), 728. https://doi.org/10.3390/pr8060728

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios

Abstract

1. Introduction

2. Materials and Methods

2.1. Fuel Selection and Classification

2.2. Image Capturing

2.3. Image Processing

2.3.1. Preprocessing

2.3.2. Feature Vector Formulation

2.3.3. Histogram-Based Feature Vector

2.3.4. Texture-Based Feature Vector

2.4. Regression Model Selection and Setup

2.4.1. Model Evaluation and Error Estimation

2.4.2. Parameter Variations

2.5. Training Strategy

3. Results and Discussion

3.1. Influence of Regression Model and Parametrization

3.2. Influence of the Training Data

3.3. Influence of the Training Strategy

3.4. Repercussions for the Industrial Application

4. Summary and Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI