1. Introduction
Online or/and on-site elemental analysis is of high interest in modern industries like alloy production and additive manufacturing [
1,
2,
3,
4,
5,
6,
7,
8]. Laser-induced breakdown spectroscopy (LIBS) has become a “superstar” technique for online analysis due to its ability to analyze “photon-reachable” targets in almost any environment [
9]. LIBS had poor sensitivity and accuracy compared to conventional laboratory-based techniques in atomic emission spectroscopy. However, the requirement for super-fast analysis has made LIBS a highly demanded technique in different fields, including industrial applications [
2,
6,
10,
11], agriculture, ecological monitoring [
12,
13,
14] and medicine [
15].
Online analysis requires continuous data acquisition, so single-shot or multiple-shot averaging is typically used in LIBS. Generally, in atomic emission spectroscopy (optical emission spectroscopy with spark or arc discharge, optical emission spectroscopy of inductively coupled plasma), the analytical signal follows a normal distribution, so the limits of detection can be estimated using “three-sigma” criteria. However, in some cases, the analytical signal does not follow a normal distribution. For example, K. Behrends experimentally demonstrated for different analytical methods that the deviations in analytical results do not always obey the Gaussian distribution [
16]. In the case of LIBS measurements, the analytical signal distribution function should depend on different parameters, including the laser pulse energy variation, sample homogeneity, etc. Shot-to-shot LIBS signal fluctuations have been studied in detail by many research teams [
17,
18,
19,
20,
21,
22] supposing that LIBS signals follow the normal distribution. However, in 2007, A. Michel and A.D. Chave [
23] clearly demonstrated that the atomic line signal follows a generalized extreme value distribution (GEVD) rather than a normal distribution. The first published results on the non-Gaussian distribution function of LIBS signals were presented by V. Lazic et al. [
24], but only raw data were presented without any discussion on this issue. Surprisingly, however, the LIBS community did not pay much attention to the non-Gaussian distribution of LIBS signals, and 10 years later, only two papers have followed. Klus et al. [
25,
26] studied the atomic and ionic line emission statistics and demonstrated that some lines were distributed according to the GEVD while other signals could be described using the normal distribution. The same team simultaneously measured atomic line intensities, gated plasma images, shockwaves and acoustic waves and demonstrated that the distribution functions for all the signals did not follow a normal distribution [
27]. The authors indicated that the non-Gaussian distribution of LIBS signals will have an impact on analysis results but have not published any proof.
In the current manuscript, we made the next step and systematically studied the LIBS signal statistics’ (distribution functions) influence on the analysis sensitivity (limits of detection). Single-variable calibration curves were plotted for the non-Gaussian distributed signals with different spectra averaging and the limits of detection were estimated. It is generally accepted in analytical atomic spectroscopy methods that atomic/ionic line emission follows a Gaussian distribution, so the limit of detection (LOD) is estimated supposing the normal distribution condition is fulfilled. However, in LIBS, blank samples may not always be available, so low-contamination samples (pseudo-blanks) can be used. In such cases, histograms of blank (pseudo-blank) samples should be compared to quantify the LOD. Signal averaging is a widely utilized procedure in the LIBS community to improve the measurement reproducibility. The second aim of the study is to estimate the impact of spectra averaging on LIBS signal distribution functions and measured LODs.
2. Experiment
The LIBS signal statistics were studied with a conventional experiment setup (
Figure 1) which is based on a flash-lamp-pumped solid-state Nd:YAG laser [
28]. A laser beam of 1064 nm wavelength was focused using a quartz lens (F = 100 mm) onto the sample surface (laser spot 120 μm) at an angle of 90 degrees. The laser beam waist was located 1 mm below the sample surface to improve the measurement reproducibility. The laser pulse duration was measured as 10 ns using a fast photodiode (11HSP-FS, <1 ns, Standa Ltd., Vilnius, Lithuania) and digital oscilloscope (XSD3302, Owon, 300 MHz, 2.5 GS/s, Owon Ltd., Zhangzhou, China). The laser pulse energy was controlled using a thermal energy meter (QE25LP-S-MB, Gentec, Gentec Electro-Optics, Inc., Quebec City, QC, Canada). The laser beam quality was estimated as M
2 = 2. The laser ran at a 5 Hz rate. The plasma emission was collected from the side using a quartz lens located at double focal length from the plasma (F = 45 mm), so the plasma image was transferred onto the spectrograph slit at a scale of 1:1. A spectrograph (2400 grooves/mm, λ/δλ = 3500 Shamrock 303i, Andor, Andor Technology Ltd, Belfast, UK) was equipped with an intensified CCD camera (ICCD) (iStar, Andor, Andor Technology Ltd, Belfast, UK). An additional CMOS camera (acA1920-40um, Basler, Basler AG, Ahrensburg, Germany) was utilized to capture the time- and spectra-integrated plasma images (exposure time 500 µs, 350–800 nm spectral range). In order to monitor the shot-to-shot laser pulse energy repeatability, we installed a glass plate and measured the laser beam fleck using a CMOS camera (acA1920-40um, Basler, Basler AG, Ahrensburg, Germany). The laser and intensified CCD (ICCD) and CMOS cameras were synchronized using a digital pulse generator (DG-535, Stanford Research, Sunnyvale, CA, USA). The computer controlled the data acquisition with custom-written software developed in the LabVIEW 2014 environment (National Instruments, Austin, TX, USA). The sample was installed in a holder so that the lens-to-sample distance was the same throughout the experiments. The holder was built into a two-axis motorized stage (8MT173-30, Standa Ltd., Vilnius, Lithuania) to ensure the same lens-to-sample distance during sample movement. Low-alloy steel samples (see
Table 1) were chosen for their suitability in estimating LIBS sensitivity in terms of the limits of detection (LODs). Before measurements, a sample surface was polished using sandpaper (grit P2400, Dexter Pro, Gdansk, Poland) to remove the slag layers and then washed with ethanol.
3. Results and Discussion
The choice of analytical lines in LIBS measurements depends on several factors, such as the absence of spectral interference and self-absorption as well as the high signal-to-noise ratio of the selected line. Typically, the shot-to-shot fluctuations in LIBS measurements can be compensated by an analytical line normalization procedure [
29]: the analytical line is normalized on major component line intensity or plasma background emission. According to the current study goal, we focused on measuring LIBS signal distribution functions and the impact on analysis results. To study the non-Gaussian LIBS signal distribution impact on analysis results, we utilized low-alloy steels and quantified Si I 288.16 (analyte) and the Fe I 281.33 line (matrix) intensities in the spectral range of 280–290 nm (
Figure 2). The Si I 288.16 nm line is a resonant line so its intensity is the largest between silicon atomic lines in the plasma spectrum. Additionally, the Si I 288.16 nm line has no spectral interference with the iron atomic lines. The Fe I 281.33 nm line had high intensity and did not spectrally interfere with the other lines so it is a good choice. The atomic line integral with background correction was defined as the corresponding line LIBS signal. In order to measure limits of detection, the background signal was defined as continuous wave plasma emission with the same spectral width as the analyte line at FWHM but shifted toward the free-of-lines spectral region. Laser pulses triggered all the processes in laser ablation so it was convenient to monitor the laser pulse energy during experiments. To do so, a glass plate was installed after the laser output and a neutral density glass filter was installed before the CMOS camera which digitized the laser beam fleck profile (
Figure 2a). The resulting laser beam image was used to define the laser pulse energy as the rectangular area where pixel intensity drops to 1/10 from the beam maximum (white dashed line in
Figure 2a). The spectra- and time-integrated plasma images have also been captured by another CMOS camera since plasma imaging can be a simple but effective way for atomic/ionic line normalization. The plasma imaging signal was defined as the sum of the pixel counts in the rectangular area at the 1/10 level from the maximum amplitude (white dashed line in
Figure 2b).
The following sampling strategy was used in the experiments: 10 laser pulses were used to ablate the same spot and every spectrum (or plasma image) was captured; then, the sample was moved to another spot for the next 10 single-shot LIBS measurements; a matrix of 20 × 20 spots was sampled so 4000 LIBS spectra and plasma images were acquired. A 300 μm distance was set between laser spot centers to avoid any possible influence of reablation at the crater border between the measurements. The Nd:YAG laser generated pulses at a 5 Hz rate, thus 2 s were needed for single-spot measurements but a few hundred milliseconds were needed to move the stage so the duration of a single sample scan took almost 20 min. Supposing that cleaning procedures (grinding, polishing and ethanol cleaning) can pollute the sample surface, it would be convenient to study the impact of the pulse number on the captured LIBS spectrum. To do so, we measured Si I 288.16 and Fe I 281.33 signals (corresponding integrals with the background correction were quantified by a self-written program in the LabVIEW environment) for the first, the second, etc. until the tenth pulse for every spot and then the signal distribution functions were plotted for all the scans (400 points) corresponding to the shot number in the spot (
Figure 3). Supposing that silicon pollution will have a greater impact on low-Si samples, we have measured the samples for the lowest (0.013% wt.) and highest (0.67% wt.) concentrations available. For the low-Si-concentration sample, the Si I 288.16 signal increased for the first and the second laser shots but stabilized after the fourth shot. For samples with high Si concentration, the first shot signal was comparable to the fifth and following shot spectra but the second and the third shot spectra deviated a lot. In the case of the matrix element, the first-pulse Fe I 281.33 nm signal was systematically lower than the next shot signals: the second shot signal was slightly greater and then it stabilized. The systematical increase in the Si I 288.16 signal for the sample with the lowest concentration should be attributed to contamination. According to these results, we skipped the LIBS data for the first five laser shots in a spot and took the next shots (from the 6th to the 10th shots) so a 1000-spectra dataset per sample was processed.
The datasets for laser pulse energy (in counts), plasma imaging signal, Si I 288.16, Fe I 281.33 and plasma background emission signals are presented in
Figure 4 and the corresponding distributions in
Figure 5. The laser energy signal was rather stable during the measurements and the relative standard deviation (RSD) was better than 1.2%. The plasma imaging signal reproducibility was 6-fold poorer (RSD = 7%) compared to the laser energy signal. The Si I 288.16, Fe I 281.33 and plasma background emission signals were rather randomly distributed with the largest variations (RSD ~ 20%).
Interestingly, the distributions for laser energy signals and LIBS signals were rather different. The signals were not distributed normally except the plasma imaging signal (space-, spectra- and time-integrated signal). The laser energy mean was non-symmetric with the median shifted toward larger values (Weibull distribution function). LIBS signal (Si I 288.16, Fe I 281.33, plasma background emission) distribution functions were also non-symmetric with the median shifted to the lower values (Frechet distribution function). The different vectors of the median biases for the laser pulse energy signal and LIBS signals are the indicator of low impact of the laser pulse energy variation on the gated plasma emission. The distribution functions were tested for normality by the Anderson–Darling test [
30] to quantitatively compare the biased and the Gaussian distributions. The Kolmogorov–Smirnov normality test can provide unreliable results so we did not use it [
31]. According to the presented results, the only signal which follows the normal distribution was the plasma imaging signal (space, spectra and time integrated). All other signals were significantly biased from the Gaussian distributions (
p-value smaller than 0.05).
The similarity of the distribution functions for Si I 288.16, Fe I 281.33 and plasma background emission signals is a good indicator of the signals’ correlation. Indeed, cross-correlation analysis of all the datasets revealed that the strongest correlation (the largest Spearman correlation coefficient) was measured for the Si I 288.16, Fe I 281.33 and plasma background emission signals (ρ = 0.8 and 0.9, respectively). For better normalization, we plotted the calibration curves for the Si I 288.16 signal as well as Si I 288.16 normalized on Fe I 281.33 and plasma background emission signals.
A good question is how the analytical signal distribution functions will be changed after the normalization procedure: the normalized signal can have a normal distribution function due to the similarity of the bias in distribution functions of original signals (Si I 288.16 and Fe I 281.33 signals; Si I 288.16 and plasma background signals). Another question is how spectra averaging will influence the LIBS signal distribution functions: the signal-to-noise ratio will increase as the square root of the measurement number so it will have an impact on LIBS signal distribution function. To answer this, we plotted the distribution functions for the Si I 288.16 signal and corresponding normalized signals in the case of the sample with the lowest Si content (0.014% wt. Si). We also carried out spectra averaging for 5, 20 and 100 shots, calculated the signals and plotted the corresponding distribution functions in
Figure 6. The normalization procedure did not have a large impact on the normalized signal distribution function—the distribution did not pass the Anderson–Darling test. However, the
p-value was greater for the normalized signals, thus normalization had a tendency to provide “more normally distributed” signals.
Spectra averaging changed the LIBS signal distribution functions (
Figure 6). First, the median value increased as the number of averaged spectra enlarged due to the signal-to-noise ratio increment. For example, the Si I 288.16 signal median increased by almost 20% if 100 spectra were averaged. Second, spectra averaging had a much greater tendency to transform the distribution function profile closer to the Gaussian one. In the case of the Si I 288.16 signal, the distribution function did not pass the Anderson–Darling test (
p-value smaller than 0.05) until 100 spectra were averaged though only 20 data points were available for distribution function construction. For normalized LIBS signals, the distribution functions passed the Anderson–Darling test for the data obtained from the averaging of 20 spectra.
In order to quantify the impact of non-Gaussian distribution functions and spectra averaging on LIBS analysis performance, we plotted the calibration curves for the Si I 288.16 nm signal, as well as for analytical signals after normalizing on the Fe I 281.33 nm line and background signals (
Figure 7). For each sample, we obtained 2000 single-shot LIBS measurements or 10,000 data points in total for the calibration curve. Drawing all the data points in the calibration plot resulted in total confusion of the data so we have not presented data points in
Figure 7. The calibration curves were plotted for all the data points. In the
Supplementary Materials (
Figure S1), we present calibration curves with the data points (100 per concentration value) in the case of 100 averaged spectra. The calibration curve linearity was poor for the Si I 288.16 signal and after normalizing on the plasma background signal (R
2 = 0.693 and 0.621, respectively). However, normalizing on the Fe I 281.33 nm signal significantly improved the calibration plot linearity and R
2 reached values of 0.922. Interestingly, linear fits were almost the same for the averaged spectra signals. For example, in
Figure 7a four linear fits were plotted but only the “single-shot signal” line differed visually from the “5, 20 and 100 averaged” linear functions. Limits of detection (LODs) were determined for each calibration curve by “3σ criteria”, in accordance with the recommendations of the IUPAC [
32,
33]. The LOD was calculated using the formula
LOD =
3σ/s, where σ represents the standard deviation of the background signal for the sample with the lowest analyte content, and
s denotes the sensitivity, which corresponds to the slope of the calibration curve. The obtained LOD values for different analytical signals and different amounts of spectral averaging are presented in
Table 2. The LOD values range from 0.11 to 0.22 wt.% Si and the smallest one was obtained for the calibration curve based on the Si I 288.16 signal.
The spectra averaging (5, 20 and 100) had a small impact on the calibration curves. We found out that averaging improves the calibration curve performance in case of linearity (R2), but it also makes the lines merge and it was not possible to detect the difference between the averaging of 5 pulses and 100 pulses. The linearity (R2) of the calibration curve for the Si I 288.16 nm signal increases smoothly from 0.693 to 0.813 with an increasing number of averaged spectra. The best results were achieved for Si I 288.16 normalized on Fe I 281.33 signals. The LODs were also improved 3 times as the averaged spectra signals were utilized for silicon line intensity as well as after the normalizing procedure.
According to the IUPAC recommendation, the practice of accurately metering the LOD involves distribution plot construction for both blank and signal data and considering the probabilities of false-positive and false-negative decisions [
34,
35]. In this paper, we focus on how the non-Gaussian distributed signal will impact on LOD estimation. Since spectra averaging makes the analytical signal closer to the Gaussian distribution (see above), then it will be interesting to plot blank and analytical signal histograms to estimate the LOD for averaged spectra as well. In the context of LIBS, obtaining a signal from a blank sample can be quite challenging because analyte-free samples may not always be available. In such a case, the blank signal can be determined for the sample with the lowest analyte concentration as continuous wave plasma emission with the same spectral width as the analyte line at FWHM but shifted toward the region without any spectral lines. Generally in LIBS, the atomic/ionic line integral distribution function sets are not checked for normality but a priori assumed to follow a normal distribution, so mean values are plotted in a calibration curve rather than data points of the parallel measurements. In our experiment, we found out that LIBS signals followed a normal distribution only for a large number of averaged spectra (see
Figure 6). Therefore, except for a few tens to hundreds of averaged spectra, the non-Gaussian fitting functions like GEVD should be used. The signals (without and with normalization) for blank and analyte for the sample with the lowest concentration (0.014% wt. Si) are shown in
Figure 8. In the case of single-shot measurements for Si I 288.16 and normalized signals, the blank and analyte distribution function curves overlap so the 5% confidence limit requirement is not fulfilled [
35]. In other words, the Si LOD is greater than 0.014% wt. for single-shot LIBS measurements. As the number of averaged spectra increased, the analyte signal distribution function profile changed to the Gaussian profile and its FWHM decreased. The blank signal distribution width also decreased as the spectra were averaged but the mean value did not change, nor did the distribution profile. Consequently, the blank and analyte signal distribution functions became less overlapped as the number of averaged spectra increased. According to
Figure 8a,c,e, the fulfillment of the 5% confidence limit was not achieved for single-shot LIBS spectra, i.e., the limit of detection for Si was greater than the 0.014% wt. value. However, spectra averaging improves the situation and LOD values should almost reach the 0.014% wt. values.
In accordance with the IUPAC recommendation [
32] (“detection limits cannot be derived in the absence of known (or assumed) distributions”), we carried out the following. First, the critical value should be determined for false positives (“type I” error) based on the blank signal distribution function. Then, the signal distribution should be fitted with the appropriate function and the fit curve should be shifted to higher signal values in such a way that only 5% of the area is located below the critical value, therefore setting the false negative threshold (“type II” error) of 0.05 according to IUPAC [
32]. The new centroid positions of the shifted curve were determined (see
Figure 8b,d,f), and its values were converted to silicon concentration by using the corresponding calibration curves. For a better view, we present distribution functions for the LIBS signal and corresponding blank only for single-shot and 100-shot averaging while we made a calculation for 5- and 20-shot averaging as well and present LOD values in
Table 2. For single-shot LIBS signals (largest bias from the Gaussian distribution function), the difference between the LOD determined by calibration curves and histogram plotting was significant. Careful metering of the LOD according to histogram plotting provided systematically lower values (up to 3-fold) compared to the LODs determined from the calibration curves. Such a difference can be easily understood by fitting the Frechet distribution with the Gaussian function—the Gaussian function fit FWHM will be greater for non-symmetric distribution so the σ-value will be greater than its true value due to poor choice of the fit function. Plotting histograms and fitting them with appropriate functions provided more reliable results for LOD determination in accordance with the IUPAC definition discussed above. Spectra averaging transforms the LIBS signal distribution functions and makes them more similar to the Gaussian distribution function so the difference between the LOD determined by “calibration curves” and “histogram plotting” became smaller as the number of averaged spectra grew. LODs are improved up to 20-fold as the number of averaged spectra increases due to a better signal-to-noise ratio in the averaged spectra (20-fold improvement for the 100 spectra averaged). It should be noted that the LOD improvement by spectra averaging was different for “calibration curve” and “histogram plotting” approaches: 3- and 20-fold improvement, respectively.