3.1. Samples Collected in Qufu
Reference values for the content of CaO, SiO2, Al2O3, and Fe2O3 in the cement raw meal samples were measured by the XRF method, and the results are shown in Table 1. The content of CaO is distributed between 40.97 and 44.26% and is concentrated in the range of 42–42.5%; the content of SiO2 is distributed between 11.22 and 13.70% and is concentrated in the range of 12.25–13%; the content of Al2O3 is distributed between 2.47 and 3.65% and is concentrated in the range of 3–3.3%; the content of Fe2O3 is distributed between 1.63 and 2.25% and is concentrated in the range of 1.9–2.1%.
The 202 cement raw meal samples were split into two groups at a ratio of approximately 3:1 using the SPXY method, giving 152 samples in the calibration set and 50 samples in the prediction set; the statistical results for the four ingredients are shown in Table 2. For each of the four oxides, the samples with the highest and lowest contents were included in the calibration set, so the content range of the calibration set covered that of the prediction set; therefore, the modeling criteria were met.
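The partitioning step can be illustrated with a minimal sketch of SPXY (sample set partitioning based on joint x–y distances), written here with NumPy only. It is not the code used in this work: the function name `spxy_split`, the helper `_pairwise`, and the choice to normalise each distance matrix by its maximum are assumptions made for illustration.

```python
import numpy as np

def _pairwise(a):
    """Euclidean distance matrix between the rows of a (memory-friendly form)."""
    sq = (a * a).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (a @ a.T)
    return np.sqrt(np.maximum(d2, 0.0))

def spxy_split(X, y, n_cal):
    """Return (calibration indices, prediction indices) chosen by SPXY."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    dx, dy = _pairwise(X), _pairwise(y)
    # Joint distance: spectral and reference-value terms, each scaled to [0, 1]
    d = dx / dx.max() + dy / dy.max()
    # Start from the two samples that are farthest apart
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_cal:
        # Distance from each remaining sample to its nearest selected sample
        nearest = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]  # max-min criterion
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)
```

With 202 spectra, calling `spxy_split(X, y, 152)` would return 152 calibration indices and 50 prediction indices. Because samples at the edges of the joint space are picked first, extreme contents tend to fall in the calibration set, although the sketch does not enforce this.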
The cross-validation-absolute-deviation-F-test (CVADF) algorithm was applied to identify and reject outliers for the four ingredients in the calibration set. In the cross-validation process, the calibration set samples were divided into five subsets, and the results are shown in Table 3. Five outliers were identified for CaO (sample numbers 29, 33, 42, 54, and 69); seven for SiO2 (sample numbers 9, 52, 54, 69, 71, 105, and 107); five for Al2O3 (sample numbers 30, 52, 54, 69, and 137); and seven for Fe2O3 (sample numbers 5, 6, 30, 31, 105, 120, and 129).
The optimal spectral preprocessing method for each oxide was chosen as described in Reference [38]. To enhance the useful information, the near-infrared spectra of the calibration set samples were preprocessed with Savitzky–Golay convolution smoothing (SG) for CaO, multiplicative scatter correction (MSC) combined with the 1st derivative (1D) for SiO2, the 1st derivative (1D) for Al2O3, and standard normal variate transformation (SNV) for Fe2O3. The resulting spectra, used for the subsequent modeling of the four oxides, are shown in Figure 7.
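Three of these transformations can be written compactly, as in the NumPy sketch below. It shows the textbook form of each operation and is not the implementation used in this work: the function names, the mean spectrum as the MSC reference, and the finite-difference derivative are assumptions. SG smoothing is omitted; a library routine such as `scipy.signal.savgol_filter` could be used for it.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Scatter correction: regress each spectrum on a reference, then undo the fit."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum serves as the reference when none is given
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)  # s ~ intercept + slope * ref
        corrected[i] = (s - intercept) / slope
    return corrected

def first_derivative(spectra, step=1.0):
    """Finite-difference first derivative along the wavenumber axis."""
    return np.gradient(np.asarray(spectra, dtype=float), step, axis=1)
```

For a combination such as MSC with 1D, the two functions would simply be chained, for example `first_derivative(msc(spectra))`; the order of the two steps is itself an assumption here.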
Subsequently, backward interval partial least squares (biPLS) combined with a genetic algorithm (GA) was applied to the four components to select the characteristic wavenumber variables. First, the feature intervals of each oxide were selected by the biPLS algorithm. The entire spectrum (4000–10,000 cm−1) of the cement raw meal was split into 5, 10, 20, 30, 40, and 50 equal intervals in turn, giving six biPLS models, and the model with the lowest RMSEP value was selected. The number of intervals was then refined: denoting the number of intervals of the selected model as m, the full spectrum was divided into m − 5, …, m − 1, m + 1, …, m + 4, and m + 5 equal intervals, and 10 further biPLS models were established. The intervals of the model with the lowest RMSEP and highest Rp were then taken as the characteristic intervals. For CaO, the range of 4000–10,000 cm−1 was divided into 30 subintervals, and 8 subintervals (1st, 2nd, 3rd, 4th, 5th, 12th, 14th, and 22nd) were selected; for SiO2, the range was split into 10 subintervals, and 2 subintervals (1st and 5th) were chosen; for Al2O3, the range was partitioned into 5 subintervals, and the 1st subinterval was picked; for Fe2O3, the range was divided into 40 subintervals, and a total of 25 subintervals (2nd, 3rd, 4th, 6th, 8th, 9th, 10th, 11th, 13th, 15th, 16th, 17th, 19th, 20th, 21st, 23rd, 24th, 25th, 26th, 27th, 28th, 30th, 31st, 32nd, and 33rd) were selected.
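The bookkeeping of this two-stage interval search can be sketched as follows. The code only generates the candidate interval counts and maps chosen subintervals to variable indices; fitting the biPLS models is left out. The helper names, the 1500-point spectrum in the usage note, and the assumption that subintervals are numbered from the first variable are all illustrative.

```python
import numpy as np

# Coarse stage: the six interval counts tried first
COARSE_GRID = [5, 10, 20, 30, 40, 50]

def refined_grid(m, half_width=5):
    """Interval counts m-5 .. m+5 without m itself (10 candidates when m > 5)."""
    return [k for k in range(m - half_width, m + half_width + 1)
            if k != m and k >= 1]

def interval_indices(n_variables, n_intervals):
    """Split variable indices into n_intervals contiguous, near-equal blocks."""
    return np.array_split(np.arange(n_variables), n_intervals)

def selected_variables(n_variables, n_intervals, chosen):
    """Variable indices covered by the chosen subintervals (1-based numbering)."""
    blocks = interval_indices(n_variables, n_intervals)
    return np.concatenate([blocks[i - 1] for i in chosen])
```

For example, with a hypothetical spectrum of 1500 wavenumber points, `selected_variables(1500, 30, [1, 2, 3, 4, 5, 12, 14, 22])` would return the 400 variables covered by the eight CaO subintervals, and `refined_grid(30)` lists the ten interval counts from 25 to 35 without 30.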
Then, the genetic algorithm was used to select effective variables from the feature intervals obtained by the biPLS method. The GA parameters employed in this paper were as follows: (1) the number of chromosomes was 30; (2) the probability of mutation was 0.01; (3) the probability of crossover was 0.5; (4) the maximum number of variables selected in a chromosome was 30; (5) the number of runs was 100; and (6) the termination criterion was 100 iterations. The characteristic variables obtained for CaO, SiO2, Al2O3, and Fe2O3 are shown in the shaded portions of Figure 8.
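To make the role of these parameters concrete, the sketch below runs a simple binary-mask genetic algorithm with the same population size, mutation and crossover probabilities, variable cap, and iteration count. It is an illustration only and differs from the procedure used in this work in several assumed respects: the fitness is the hold-out error of an ordinary least-squares fit rather than a PLS model, selection keeps the better half of the population, and only a single run is performed instead of 100.

```python
import numpy as np

def ga_select(X, y, n_chrom=30, p_mut=0.01, p_cross=0.5, max_vars=30,
              n_gen=100, seed=0):
    """Return a boolean mask of selected variables (single GA run)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2  # first half fits the stand-in model, second half scores it

    def repair(mask):
        # Keep between 1 and max_vars variables switched on
        on = np.flatnonzero(mask)
        if on.size > max_vars:
            mask[rng.choice(on, on.size - max_vars, replace=False)] = False
        if not mask.any():
            mask[rng.integers(p)] = True
        return mask

    def fitness(mask):
        # Stand-in fitness (assumption): hold-out RMSE of a least-squares fit
        A = np.c_[np.ones(half), X[:half][:, mask]]
        coef, *_ = np.linalg.lstsq(A, y[:half], rcond=None)
        B = np.c_[np.ones(n - half), X[half:][:, mask]]
        return np.sqrt(np.mean((B @ coef - y[half:]) ** 2))

    pop = np.array([repair(rng.random(p) < max_vars / (2 * p))
                    for _ in range(n_chrom)])
    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[: n_chrom // 2]]  # better half survives
        children = []
        while len(children) < n_chrom - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = a.copy()
            if rng.random() < p_cross:          # single-point crossover
                cut = rng.integers(1, p)
                child[cut:] = b[cut:]
            child ^= rng.random(p) < p_mut      # bit-flip mutation
            children.append(repair(child))
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m) for m in pop])
    return pop[int(np.argmin(scores))]
```

In practice the columns of `X` would be the variables inside the biPLS feature intervals, and the masks from repeated runs could be pooled, for instance by keeping the variables that are selected most often.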
Based on the remaining samples in the calibration set and the selected characteristic wavenumber variables, PLS models were developed for CaO, SiO2, Al2O3, and Fe2O3, and the models were externally validated using the samples of the prediction set; the model parameters are listed in Table 4.
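The two validation statistics named in the interval selection step, RMSEP and Rp, together with the mean absolute prediction error, can be computed as in the sketch below. The definitions are the usual ones and are assumed to match those used here; the function names are illustrative.

```python
import numpy as np

def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

def rp(reference, predicted):
    """Correlation coefficient between reference and predicted values."""
    return float(np.corrcoef(reference, predicted)[0, 1])

def mean_abs_error(reference, predicted):
    """Mean absolute prediction error, in the same unit as the inputs."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))
```

Each function takes the XRF reference values and the NIR predictions of the prediction set for one oxide.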
The reference and predicted values for the 50 samples in the prediction set are plotted as a line graph in Figure 9. The predicted content of each oxide obtained by NIR spectroscopy is generally consistent with the reference value obtained by X-ray fluorescence spectroscopy, with small errors.
The standard methods for chemical analysis of cement (GB/T 176-2017) require that the measurement errors for CaO, SiO2, Al2O3, and Fe2O3 be within 0.40%, 0.30%, 0.20%, and 0.15%, respectively. The mean prediction errors of the 50 samples in the prediction set for CaO, SiO2, Al2O3, and Fe2O3 were 0.171%, 0.193%, 0.069%, and 0.032%, respectively. All four values lie within the corresponding limits, which satisfies the requirements for feedback control in a cement production line.
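The comparison with the limits is simple arithmetic, shown below for the values quoted above. The dictionary and function names are illustrative, and the margin is simply the limit minus the mean prediction error.

```python
# Limits quoted from GB/T 176-2017 and mean prediction errors (both in %)
TOLERANCE = {"CaO": 0.40, "SiO2": 0.30, "Al2O3": 0.20, "Fe2O3": 0.15}
MEAN_ERROR = {"CaO": 0.171, "SiO2": 0.193, "Al2O3": 0.069, "Fe2O3": 0.032}

def margins(errors, tolerance=TOLERANCE):
    """Remaining margin (limit minus error) for each oxide."""
    return {oxide: round(tolerance[oxide] - err, 3)
            for oxide, err in errors.items()}

for oxide, margin in margins(MEAN_ERROR).items():
    status = "within limit" if margin >= 0 else "exceeds limit"
    print(f"{oxide}: margin {margin:.3f} ({status})")
```

The smallest margin is that of SiO2, at about 0.107 percentage points.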
3.2. Samples Collected in Linyi
As listed in Table 5, the reference values for the content of CaO, SiO2, Al2O3, and Fe2O3 in the raw meal samples produced in Linyi were determined by the XRF method. Although the NIR spectra of the samples in the two groups showed significant differences, the average values of the four ingredients were similar, the content ranges were essentially the same, and the deviation values differed only slightly.
The 119 cement raw meal samples were split into two groups at a ratio of approximately 3:1 with the SPXY algorithm, resulting in 89 samples in the calibration set and 30 samples in the prediction set, and the statistical results for CaO, SiO2, Al2O3, and Fe2O3 are shown in Table 6. For each of the four oxides, the highest and lowest content samples were included in the calibration set and the content range of the prediction set was contained within that of the calibration set, in accordance with the modeling criteria.
Outliers were identified and eliminated by the CVADF method for the four oxides in the calibration set samples. In the cross-validation process, samples in the calibration set were divided into five subsets, and the results are shown in Table 7. Five outliers were found for CaO (Nos. 1, 12, 20, 62, and 81); three outliers were detected for SiO2 (Nos. 1, 7, and 9); three outliers were identified for Al2O3 (Nos. 2, 11, and 33); and three outliers were discovered for Fe2O3 (Nos. 2, 33, and 36).
The optimal spectral preprocessing method for each oxide was chosen as described in Reference [38]. To enhance the useful information, the NIR spectra of the calibration set samples were preprocessed with Savitzky–Golay convolution smoothing (SG) for CaO, SG combined with multiplicative scatter correction (SG with MSC) for SiO2, MSC for Al2O3, and SG for Fe2O3. The resulting spectra, used for the subsequent modeling of the four oxides, are shown in Figure 10.
Afterwards, the GA-biPLS method was employed to select the feature wavenumber variables for each of the four oxides. The feature intervals of each oxide were selected by the biPLS algorithm in the same way as for the samples from Qufu. For CaO, the range of 4000–10,000 cm−1 was split into 8 subintervals, and a total of 3 subintervals (2nd, 4th, and 5th) were selected; for SiO2, the range was divided into 4 subintervals, and the 1st subinterval was selected; for Al2O3, the range was partitioned into 8 subintervals, and a total of 5 subintervals (1st, 3rd, 4th, 5th, and 8th) were chosen; for Fe2O3, the range was divided into 30 subintervals, and 10 subintervals (1st, 6th, 10th, 11th, 12th, 13th, 15th, 21st, 25th, and 26th) were picked. Then, the genetic algorithm was employed to select effective variables from the feature intervals obtained by the biPLS method, with the same parameters as for the samples from Qufu. The characteristic variables obtained for CaO, SiO2, Al2O3, and Fe2O3 are shown in the shaded portions of Figure 11.
Based on the remaining samples in the calibration set and the determined characteristic wavenumber variables, PLS models were established for CaO, SiO2, Al2O3, and Fe2O3; the model parameters are shown in Table 8. The models were then externally validated using the samples in the prediction set.
The prediction errors for the 30 samples in the prediction set are plotted as a line graph in Figure 12. The quantitative calibration models show high predictive power and low prediction errors. For CaO, the prediction error fluctuates around 0.15%, with a maximum error of 0.38%; for SiO2, the prediction error fluctuates around 0.10%, with a maximum error of 0.28%; for Al2O3, the prediction error fluctuates around 0.02%, with a maximum error of 0.10%; and for Fe2O3, the prediction error fluctuates around 0.02%, with a maximum error of 0.06%.
The average prediction errors for the 30 samples in the prediction set are 0.154%, 0.100%, 0.022%, and 0.018% for CaO, SiO2, Al2O3, and Fe2O3, respectively, which are adequate for the feedback control of a cement production line.
Although the types and sources of the raw materials used in the Linyi and Qufu samples differ, the models obtained through the proposed modeling process showed good predictive accuracy in both cases, demonstrating that the modeling procedure proposed in this paper has good generality and is capable of meeting the detection requirements of different production lines.