1. Introduction
The limit of detection (LOD) of a given element or compound is a key parameter in analytical chemistry, representing the lowest concentration of an analyte that an analytical technique can detect with reasonable certainty. The limit of detection depends on the matrix in which the analyte is present, but also on the specific experimental setup used for the analysis. For this reason, the LOD is often used to compare the performances of different techniques, or variations of the same technique, in terms of their analytical capability.
In this paper, we will discuss the issues related to the calculation of the LODs in applications of the Laser-Induced Breakdown Spectroscopy (LIBS) technique [1].
LIBS is a powerful spectroscopic technique which is based on the analysis of the light emitted by a laser-induced plasma (LIP). One of its most interesting features is its ability to work on untreated samples, which makes it particularly suited for in situ analysis [2,3,4,5]. In the laboratory applications of LIBS, the possibility of working without solvents or gases is also appreciated, as well as the short time needed for the analysis and the simplicity of the experimental apparatus [6,7,8,9,10].
However, the same characteristics described above sometimes make it difficult to build LIBS calibration curves of good quality. The ability of LIBS to work without sample preparation has as a counterpart the occurrence of strong matrix effects (the signal of the analyte depends on the composition of the sample) [11], which may negatively affect its sensitivity, as well as the trueness of the LIBS analytical results.
According to the current narrative, LIBS is characterized by high LODs for many of the elements of interest (typically in the range of parts per million or higher). The LOD of a given analyte is usually derived from a calibration curve, which reports the measured signal as a function of the analyte concentration in suitable samples of known composition (standards) [12,13]. To reduce matrix effects, it is necessary to build calibration curves using standards of known composition, having the same matrix as the samples to be analyzed.
Depending on the application, the availability of suitable matrix-matched standards may be limited. It is not uncommon to see LIBS calibration curves built with fewer than 10 samples, sometimes characterized by concentrations of the analyte that are very different from the concentrations of the unknown sample. In some cases, the change in the concentration of the analyte among the standards may produce non-linearities in the calibration curve [14]. Finding suitable blanks (reference samples where the analyte is absent) with the same matrix as the samples to be analyzed can be problematic, too.
Besides that, it must be stressed that the growing use of multivariate analytical tools [15] for the quantitative determination of samples’ composition calls for a reconsideration of the formulas traditionally used for the calculation of the LODs [16]. A similar effort is needed for extending the calculation of LODs in the framework of standardless/calibration-free analytical approaches [17,18].
Despite the abundant literature on the topic [14,16,19,20,21,22,23,24], LODs in LIBS are often miscalculated [1], leading to over-optimistic estimations of the analytical capabilities of the technique. Moreover, as already mentioned, the calculation of the LOD is usually limited to the case of univariate analysis, after the construction of a suitable calibration curve [14]. To the best of our knowledge, the problem of calculating the elemental LODs in multivariate analysis has not been previously discussed within the framework of LIBS analysis.
The issues associated with the building of a calibration curve in LIBS have been extensively discussed in the past [14,25,26]; as we have seen, the number of available standards is usually small, and sometimes, it can be hard to obtain a suitable blank. Moreover, in many LIBS applications, the analyte concentrations of interest are of the order of one percent or higher, a condition that may produce non-negligible self-absorption effects [27,28]. These effects can be easily compensated, but the impact of self-absorption can be subtle, introducing non-linearities in the calibration curve that may not be easy to recognize.
In the next sections, we will discuss the derivation of the commonly used ‘3-sigma over slope’ [29] rule for the determination of the LOD of a given analyte from a univariate calibration curve and how this old definition, almost universally adopted in LIBS analysis, should be modified to consider the results of recent research in the field. Subsequently, we will discuss how the current definition of LOD in univariate calibration can be extended to the case of multivariate analysis. Finally, we will introduce a simple practical method for the estimation of the LOD in multivariate analysis, recently proposed by Oleneva et al. [23].
2. Materials and Methods
The practical case that we chose for exemplifying the different methods used for the determination of the LOD using both univariate and multivariate approaches is the determination of copper concentration in cast iron (Fe-C-Si alloy, with carbon concentration between 1.8% and 4% and silicon concentration around 1–3%) samples. The ten cast iron standards were provided by the Bundesanstalt für Materialforschung und -prüfung (BAM, Berlin, Germany) to the participants in the LIBS proficiency test held in 2016 at the LIBS Conference in Chamonix, France. The Cu concentration in the samples is reported in Table 1.
The samples were provided to the LIBS proficiency test participants in the form of powder or small chips. For the analysis, the samples were placed on double-sided tape on a glass substrate.
The LIBS spectra were acquired using the micro-Modì instrument by Marwan Instruments, Pisa [30,31,32]. The instrument is equipped with a double-pulse laser, emitting two collinear pulses of about 20 mJ each in 20 ns Full-Width at Half Maximum (FWHM), delayed by 1 microsecond; the laser pulses are focused on the sample by a Zeiss Axio microscope. The laser spot on the sample had a diameter of about 50 μm; the irradiance on the sample was, thus, about 50 GW/cm² per pulse. The laser-induced plasma emission is collected by an optical fiber, placed at about 1 cm from the laser spot, at an angle of about 45 degrees with the vertical axis, and analyzed by a wideband Avantes Avaspec-2048-2 USB spectrometer covering the spectral range between 190 and 900 nm (0.1 nm resolution from 190 to 450 nm, 0.3 nm resolution from 450 to 900 nm, for a total of 3745 spectral points) with a delay of 1 microsecond after the second pulse. The acquisition time is 2 milliseconds (time-integrated measurement).
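The irradiance quoted above can be checked with a short calculation. The sketch below assumes that the peak power is simply the pulse energy divided by the FWHM duration and that the energy is spread uniformly over a circular spot; the function name is illustrative and not part of the original work.

```python
import math

def irradiance_gw_per_cm2(energy_j, pulse_fwhm_s, spot_diameter_cm):
    """Average irradiance of one pulse over a circular spot, in GW/cm^2."""
    # Crude estimate (assumption): peak power = pulse energy / FWHM duration
    peak_power_w = energy_j / pulse_fwhm_s
    spot_area_cm2 = math.pi * (spot_diameter_cm / 2.0) ** 2
    return peak_power_w / spot_area_cm2 / 1e9

# 20 mJ in 20 ns on a 50 micrometer spot (50e-4 cm), as stated in the text
value = irradiance_gw_per_cm2(20e-3, 20e-9, 50e-4)
print(f"{value:.0f} GW/cm^2")
```

With the stated parameters, the estimate is of the order of 50 GW/cm² per pulse, in agreement with the value reported in the text.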
For each sample, ten spectra were acquired at 12 different points, for a total of 120 spectra per sample. The instrument was manually refocused on the sample surface at each point. The LIBS spectra were independently saved for each point of analysis and each sample. The dataset of 1200 spectra, each one corresponding to 3745 wavelengths, was analyzed using the Matlab® software (version R2022a). Although the laser was carefully refocused on each point of analysis, the LIBS signal was still affected by the irregularity of the sample surface, which produced large fluctuations in the intensity of the spectra. With the aim of removing the outlier spectra, as a preliminary treatment the spectra with integral intensities larger or smaller by a factor of two with respect to the integral of the average spectrum of the same sample were removed from the dataset (typically around 10% of the 120 spectra). Subsequently, each of the remaining spectra was normalized to its integral intensity, to compensate for the effect of the fluctuations of the ablated mass from shot to shot, due to the fluctuations of the laser energy and the changes in the laser-sample coupling.
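The preliminary treatment described above (outlier rejection followed by normalization) can be sketched as follows. This is a minimal illustration and not the Matlab code used for the paper: the function name and the toy spectra are invented, and the integral intensity is approximated by the plain sum of the spectral points, which assumes uniform spectral sampling.

```python
def preprocess(spectra):
    """Drop outlier spectra of ONE sample, then normalize the rest to unit integral.

    `spectra` is a list of equal-length lists of intensities.
    The integral is approximated by the sum of the points (assumption).
    """
    n_points = len(spectra[0])
    mean_spectrum = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_points)]
    mean_integral = sum(mean_spectrum)
    # Keep spectra whose integral is within a factor of two of the average one
    kept = [s for s in spectra if mean_integral / 2.0 <= sum(s) <= 2.0 * mean_integral]
    # Normalize each remaining spectrum to its own integral intensity
    return [[x / sum(s) for x in s] for s in kept]

# Toy data: three ordinary spectra and one anomalously intense one
raw = [[2.0, 6.0, 2.0], [3.0, 6.0, 3.0], [2.0, 5.0, 2.0], [10.0, 20.0, 10.0]]
cleaned = preprocess(raw)
print(len(cleaned), "spectra kept out of", len(raw))
```

Note that the rejection criterion is evaluated against the average spectrum computed before any spectrum is removed, which is one possible reading of the procedure described in the text.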
3. Results
3.1. Calculation of LOD from Univariate Calibration Curve
The currently accepted operative definition of the LOD identifies this quantity as the minimum concentration of the analyte that can be detected, “controlling the risks” [22] of false positives (fluctuation of the blank signal mistaken for analyte signal) and false negatives (signal of the analyte mistaken for a fluctuation of the blank) (see Figure 1).
Assuming a Gaussian distribution of the signal of the blank and of the sample, with the same standard deviation σ, the mean signal intensity at the LOD, $I_{LOD}$, corresponding to a given probability of false positives $P_{FP}$ and false negatives $P_{FN}$ can be written as:
$$I_{LOD} = \sqrt{2}\,\sigma\left[\mathrm{erfc}^{-1}(2P_{FP}) + \mathrm{erfc}^{-1}(2P_{FN})\right] \qquad (1)$$
where $\mathrm{erfc}^{-1}$ is the inverse of the complementary error function, $\mathrm{erfc}(x) = 1 - \mathrm{erf}(x)$. In most cases, for the calculation of the LOD, the two probabilities of false positives and false negatives are assumed to be equal to a reasonably low value P. In that case:
$$I_{LOD} = 2\sqrt{2}\,\sigma\,\mathrm{erfc}^{-1}(2P) \qquad (2)$$
In LIBS, the calculation of the limit of detection of an analyte is almost universally performed using the traditional (now obsolete) IUPAC definition [13]:
$$LOD = \frac{3\sigma}{b} \qquad (3)$$
where σ is the standard deviation of the LIBS signal of a sample with zero concentration of the analyte (blank) and b is the slope of the calibration curve (sensitivity). The LOD defined in Equation (3) is interpreted as the minimum concentration of the analyte that can be safely distinguished from the blank, being characterized by a signal ($I_{LOD} = b \times LOD$) equal to three times the standard deviation of the signal of the blank σ.
Comparing definition (3) with (1), we see that the probability of false positives (assumed equal to the probability of false negatives, as in (2)) is P ≈ 6.7%.
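Under the Gaussian, equal-variance hypotheses stated above, the relation between the multiplying factor of σ and the risks of false positives and false negatives can be checked numerically. The sketch below uses the quantiles of the standard normal distribution, which are equivalent to the complementary error function form through $z = \sqrt{2}\,\mathrm{erfc}^{-1}(2P)$; the function names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit standard deviation

def lod_factor(p_false_positive, p_false_negative):
    """Multiplier k such that I_LOD = k * sigma for the given risks."""
    z_fp = _STD_NORMAL.inv_cdf(1.0 - p_false_positive)
    z_fn = _STD_NORMAL.inv_cdf(1.0 - p_false_negative)
    return z_fp + z_fn

def implied_risk(k):
    """Common false-positive/false-negative probability implied by I_LOD = k * sigma."""
    return 1.0 - _STD_NORMAL.cdf(k / 2.0)

print(round(lod_factor(0.05, 0.05), 2))  # 3.29
print(round(implied_risk(3.0), 3))       # 0.067
```

A risk of 5% for both kinds of error therefore corresponds to a factor of about 3.3, while the traditional factor of 3 corresponds to a risk of about 6.7%.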
We have already anticipated that definition (3) has been modified by IUPAC. The new definition for the LOD of a given element in a univariate calibration is:
$$LOD = 3.3\,\frac{\sigma}{b}\sqrt{1 + h_0 + \frac{1}{N}} \qquad (4)$$
where:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}\left(I_i - \hat{I}_i\right)^2}{N-2}}, \qquad h_0 = \frac{\bar{c}^{\,2}}{\sum_{i=1}^{N}\left(c_i - \bar{c}\right)^2} \qquad (5)$$
and N represents the number of points in the calibration curve [22]. The term σ is an estimate of the deviation of the signals $I_i$ with respect to the value $\hat{I}_i$ predicted by the calibration curve at that concentration, under the hypothesis of homoscedasticity (equal variance) of the signals close to the LOD. $c_i$ are the concentrations of the standards and $\bar{c}$ is the average of these concentrations. The term $h_0$ represents the leverage for the blank sample. The higher the leverage, the larger the fluctuations of the intercept of the calibration curve, which reflect the uncertainty of the signals.
The new IUPAC definition differs from (3) in two aspects. Firstly, the probability of false positives/false negatives is set to 5%, instead of 6.7%. Therefore, the factor of 3 in Equation (3) becomes 3.3 in (4). Moreover, as already pointed out forty years ago by Long and Winefordner [
24], for the calculation of the LOD, the average LIBS signal of the blank is assumed to be zero. Consequently, the uncertainty on the intercept of the calibration curve cannot be neglected in the calculation of the σ of the signals. In fact, if $I = a + b\,c$, for calculating the standard deviation of the signal, the uncertainty on the calibration curve intercept $\sigma_a$ must be considered, too:
$$\sigma_I = \sqrt{\sigma^2 + \sigma_a^2} \qquad (6)$$
The new IUPAC definition considers this effect, taking into account the leverage of the calibration curve at zero concentration. The concentration of the analyte at the LOD is assumed to be close enough to zero to consider the standard deviation of its signal equal to the one of the blank (hypothesis used for deriving Equation (2) from Equation (1)).
It must be stressed that the (reasonable) hypotheses leading to the new definition of the LOD can be easily made less strict, considering for example a Student t-distribution instead of a Gaussian, if the number of measurements is low or considering the possibility of having different standard deviations for the blank and for the sample at the LOD.
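As an illustration of how the leverage of the blank enters the calculation, the sketch below computes the LOD from a straight-line calibration. It assumes the commonly quoted form LOD = 3.3 (s/b) sqrt(1 + h0 + 1/N), with s the residual standard deviation of the fit, b the slope, and h0 the squared mean concentration divided by the sum of squared deviations of the concentrations from their mean. The function name and the calibration points are invented for the example and are not the data of this paper.

```python
import math

def univariate_lod(conc, signal, k=3.3):
    """LOD from a least-squares straight line, including the blank leverage.

    Assumed form: LOD = k * (s / b) * sqrt(1 + h0 + 1/N).
    Returns (lod, slope, residual_std, h0).
    """
    n = len(conc)
    c_mean = sum(conc) / n
    i_mean = sum(signal) / n
    s_cc = sum((c - c_mean) ** 2 for c in conc)
    slope = sum((c - c_mean) * (i - i_mean) for c, i in zip(conc, signal)) / s_cc
    intercept = i_mean - slope * c_mean
    rss = sum((i - (intercept + slope * c)) ** 2 for c, i in zip(conc, signal))
    s_res = math.sqrt(rss / (n - 2))   # residual standard deviation, N - 2 degrees of freedom
    h0 = c_mean ** 2 / s_cc            # leverage at zero concentration
    lod = k * s_res / slope * math.sqrt(1.0 + h0 + 1.0 / n)
    return lod, slope, s_res, h0

# Invented calibration points (concentration in w%, signal in arbitrary units)
conc = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
signal = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
lod, slope, s_res, h0 = univariate_lod(conc, signal)
# '3-sigma over slope', with the residual s standing in for the blank sigma (assumption)
three_sigma = 3.0 * s_res / slope
print(f"LOD with leverage: {lod:.3f} w%, 3-sigma over slope: {three_sigma:.3f} w%")
```

For the same σ and slope, the square-root term is always larger than one, so the value obtained in this way is necessarily higher than the one given by the '3-sigma over slope' rule.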
3.2. LOD of Cu in Cast Iron from Univariate Calibration
For determining the LOD of copper in cast iron using a univariate approach, we first built the corresponding calibration curve. We focused on the narrow spectral interval between 324.4 nm and 328 nm, where the two most intense Cu I emission lines (peaks at 324.75 and 327.40 nm) are visible (see Figure 2).
For building a univariate calibration curve for Cu, we defined the LIBS signal as the sum of the areas of the two copper lines (red zones in Figure 3). The background (corresponding to zero analyte signal) was estimated from the gray area in Figure 3.
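The definition of the signal given above can be sketched as a sum over spectral windows. The window limits and the toy spectrum below are hypothetical (the actual red and gray zones are those shown in Figure 3), and the areas are approximated by sums of the spectral points.

```python
def window_area(wavelengths, intensities, window):
    """Sum of the intensities whose wavelength falls inside `window` = (low, high)."""
    low, high = window
    return sum(y for x, y in zip(wavelengths, intensities) if low <= x <= high)

def cu_signal_and_background(wavelengths, intensities, line_windows, background_window):
    """Signal = sum of the areas of the line windows; background = area of a line-free window."""
    signal = sum(window_area(wavelengths, intensities, w) for w in line_windows)
    background = window_area(wavelengths, intensities, background_window)
    return signal, background

# Toy spectrum on a 0.1 nm grid: flat baseline with two peaks near the Cu I lines
wavelengths = [324.5 + 0.1 * k for k in range(36)]
intensities = [1.0] * 36
intensities[2] = intensities[3] = 5.0   # around 324.75 nm
intensities[29] = 4.0                   # around 327.40 nm

line_windows = [(324.55, 324.95), (327.15, 327.65)]   # hypothetical limits
background_window = (325.55, 326.45)                  # hypothetical line-free zone
signal, background = cu_signal_and_background(
    wavelengths, intensities, line_windows, background_window
)
print(signal, background)
```

The window limits are deliberately placed between grid points, so that the selection does not depend on floating-point rounding of the wavelength axis.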
Plotting the LIBS signal versus the concentration of the samples, we obtain the calibration curve shown in Figure 4.
It should be noted that in LIBS, the signal should be ideally proportional to the number concentration of the analyte, which is different from the weight concentration [9]. However, in our case, the evident non-linearity of the calibration curve must be attributed to the effect of the self-absorption of the Cu line emission [27]. For the determination of the LOD of Cu in our experimental conditions, we will, thus, limit the range of the calibration curve to the linear region between zero (background signal) and 0.55 w%, thus eliminating the two samples at concentration higher than 1 w% (see Figure 5).
In this range, the relation between Cu signal and concentration in weight is substantially linear and it is, therefore, possible to define the LOD of the analyte. Note that the calibration curve does not reach zero at zero copper concentration. This is also evident from Figure 2, where the Cu line at 324.75 nm seems to be observable even in the spectra of the S3 and S5 samples, which have very low copper concentrations (0.02% and 0.03%, respectively). This is probably due to the interference of some (weak) iron line emitted at a close wavelength, but does not represent a problem for the calculation of the LOD, as long as the uncertainty on the value of the offset is taken into account.
By applying the formula reported in Equation (4), in our experimental conditions, we obtain an LOD for Cu in cast iron equal to about 0.35 w% (0.2 w% according to the old IUPAC definition, Equation (3)). However, some caution must be taken when interpreting this result. Directly observing the intensity distribution of the blank (estimated as the background signal of the sample with the lowest Cu concentration in the calibration set) and of sample S9, whose Cu concentration is about one half of the estimated LOD, we see that the Gaussian fits of the two intensity distributions overlap well above the 5% confidence limit, but the two measured distributions of the blank and of the signal do not overlap at all (and thus, Cu at this concentration can be discriminated from the blank with 100% probability, see Figure 6).
This is because the distribution of the LIBS signal of the sample is strongly asymmetric towards the higher intensities, and thus, the approximation of having a Gaussian distribution does not apply (the standard deviation of the signal is also larger than the one of the background/blank). This effect was studied a few years ago by Klus et al., who demonstrated that the statistical distribution of the LIBS line intensities in some cases is better described by a Generalized Extreme Value Distribution (GEVD) curve, rather than by a Gaussian curve [33]. The asymmetry of the LIBS intensity distribution can also be due to the spectral selection procedure used for eliminating the outliers in the measurement.
At lower Cu concentrations (S5 sample, for example, having a Cu concentration of 0.03 w%), we observe, instead, a substantial overlapping of the intensity distributions. In this case, the distribution of the signal is more symmetric and can be fitted, although not perfectly, with a Gaussian curve (see Figure 7).
3.3. LOD of Cu in Cast Iron from Multivariate Calibration
In ref. [31], studying the same samples here described, the authors have demonstrated the substantial improvement in the analytical performances of the LIBS technique that can be obtained using a multivariate approach, instead of limiting the analysis to a single predictor of the concentration, as is done using a univariate approach.
Adopting the same strategy, we trained a simple Artificial Neural Network [15], using as input/predictors all the spectral points in the range previously considered (32 points between 324.5 nm and 328.0 nm). We used a single neuron in the hidden layer, with a sigmoid transfer function.
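To make the structure of such a network concrete, the sketch below implements a model with a single sigmoid hidden neuron in plain Python. Several choices are assumptions made only for illustration, since the text does not specify them: the linear output neuron, the training by batch gradient descent on the mean squared error, the learning rate, and the synthetic data with 4 spectral points instead of 32. It is not the Matlab implementation used for the results of this paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(params, x):
    """One sigmoid hidden neuron followed by a linear output (assumed architecture)."""
    w, b1, v, b2 = params
    hidden = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b1)
    return v * hidden + b2

def mse(params, X, y):
    return sum((predict(params, x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def train(X, y, epochs=2000, lr=0.2, seed=0):
    """Batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in X[0]]
    b1, v, b2 = 0.0, 1.0, 0.0
    n = len(y)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb1 = gv = gb2 = 0.0
        for x, t in zip(X, y):
            h = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b1)
            err = (v * h + b2) - t
            dz = err * v * h * (1.0 - h)      # error propagated through the sigmoid
            for k, xk in enumerate(x):
                gw[k] += dz * xk
            gb1 += dz
            gv += err * h
            gb2 += err
        scale = 2.0 * lr / n                  # gradient of the mean of squared errors
        w = [wi - scale * g for wi, g in zip(w, gw)]
        b1 -= scale * gb1
        v -= scale * gv
        b2 -= scale * gb2
    return (w, b1, v, b2)

# Synthetic data (invented): 4 spectral points scaling with concentration, plus noise
rng = random.Random(1)
profile = [0.2, 1.0, 0.3, 0.8]
concs = [0.02 * k for k in range(25)]
X = [[c * p + rng.gauss(0.0, 0.01) for p in profile] for c in concs]
before = mse(train(X, concs, epochs=0), X, concs)
after = mse(train(X, concs), X, concs)
print(f"MSE before training: {before:.4f}, after: {after:.4f}")
```

With a single hidden neuron, the network is essentially a saturating function of one linear combination of the spectral points, which limits the risk of overfitting a small set of standards.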
In the framework of a multivariate analysis, the equivalent of a calibration curve is expressed by the regression graph, which represents the results of the ANN vs. the nominal concentrations. The curve obtained for the Cu concentration is shown in Figure 8.
As expected, the regression curve obtained using a multivariate approach is more precise than the corresponding univariate calibration curve (Figure 5). Pushing forward this analogy, the LOD corresponding to the regression curve can be calculated in the same way as for the calibration curve (Equation (4)). In this case (pseudo-univariate estimation), the LOD for Cu improves substantially, due to the reduction of σ obtained using a multivariate approach. Applying Equation (4), we obtain an LOD = 0.15 w% for the measurement of Cu in cast iron, which should be compared with the univariate LOD = 0.35 w%.
Also in this case, we can compare the distribution of the values predicted for the blank with the one obtained for a sample near the LOD, as shown in Figure 9.
Similar to the univariate case, the assumption of a Gaussian distribution is not fulfilled for the predicted Cu concentrations. Moreover, the standard deviation of the zero signal is not equal to the one of the signal at the calculated LOD.
Therefore, the actual limit of detection of Cu can be considered substantially lower than the one calculated using Equation (4).
An alternative empirical method for determining the LOD in multivariate analysis has been recently proposed by Oleneva et al. [23]. The authors considered the Mean Relative Error (MRE) of the estimated concentration of a given sample i, defined as:
$$MRE_i = \frac{1}{n_i}\sum_{j=1}^{n_i}\frac{\left|\hat{c}_{ij} - c_i\right|}{c_i} \qquad (7)$$
where $\hat{c}_{ij}$ is the concentration estimated by the multivariate algorithm for each repetition j of the measurement on the sample with certified concentration $c_i$, and $n_i$ is the number of repetitions. We assume that the samples are ordered by increasing concentration of the analyte, i.e., $c_{i+1} \geq c_i$. Intuitively, the MRE of the samples with analyte concentration lower than the LOD would be higher than the ones for samples with concentration above the LOD. A practical definition of the LOD can, thus, be, according to the authors, the value of $c_i$ after which the MRE stabilizes below a given threshold.
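The MRE criterion described above can be sketched in a few lines. The example assumes that the MRE of a sample is the average, over the repeated measurements, of the absolute relative deviation of the predicted concentration from the certified one, and that 'stabilization' means that the MRE of the sample and of all the samples at higher concentration stays below the chosen threshold. The function names, the threshold and the data are invented for the example.

```python
def mean_relative_error(predictions, certified):
    """Average absolute relative deviation from the certified concentration (assumed definition)."""
    return sum(abs(p - certified) / certified for p in predictions) / len(predictions)

def mre_lod(samples, threshold):
    """Lowest certified concentration from which the MRE stays below `threshold`.

    `samples` is a list of (certified_concentration, [predicted concentrations]).
    Returns None if the MRE never stabilizes below the threshold.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    mres = [mean_relative_error(preds, c) for c, preds in ordered]
    for k, (c, _) in enumerate(ordered):
        if all(m <= threshold for m in mres[k:]):
            return c
    return None

# Invented predictions for three standards (concentrations in w%)
samples = [
    (0.02, [0.05, 0.01]),
    (0.10, [0.12, 0.09]),
    (0.30, [0.31, 0.29]),
]
print(mre_lod(samples, threshold=0.2))
```

The result depends directly on the value chosen for the threshold and can only coincide with the concentration of one of the available standards.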
Applying this method to the ANN results for Cu, we obtain the results shown in Figure 10.
The estimated LOD (about 0.12 w%) is slightly lower than the value estimated through the pseudo-univariate method (Equation (4)); this is consistent with the failure of the approximation of Gaussian distribution and homoscedasticity of the results, which leads to an overestimation of the LOD in Equation (4).
A summary of the results obtained with the different approaches for the LOD of Cu is reported in Table 2.
4. Discussion
In the previous sections, we have discussed the reasons for using the new IUPAC formula for the calculation of the LOD of an analyte with a given experimental procedure, using a univariate or multivariate approach. The use of the old IUPAC formula should, thus, be deprecated, in LIBS as in other spectro-analytical applications. When the formulas for the LOD are applied to the case of LIBS, we noticed several critical points that would partially invalidate the application of the new IUPAC definition for the LOD (as well as the old one). The failure of the Gaussian approximation for the LIBS intensity distribution and the non-homoscedasticity of the LIBS signals contrast with the hypotheses at the basis of the derivation of Equation (4). The method proposed by Oleneva et al. [23], on the other hand, does not rely on the hypothesis of Gaussianity and homoscedasticity; however, it should be noted that the new IUPAC definition of LOD is based on the idea of quantifying the risk of false positives or false negatives for measurements performed near the LOD, and the Oleneva method (which can also be easily applied to a univariate calibration curve) does not provide information about these essential parameters. The definition of ‘stabilization’ of the MRE with increasing concentration of the standards is essentially arbitrary; moreover, at high concentrations of the analyte, other effects can produce errors and corresponding fluctuations of the MRE. On the other hand, a definition based on the analysis of the MRE is interesting from an operative perspective, although it seems more related to the problem of the quantification of the analyte concentration, rather than just to its detection.
The issue of determining the LOD of an analyte by LIBS is, thus, very complex, both in the univariate and in the multivariate case. The discussion and the examples shown in this paper should suggest some care in providing figures for the LOD of the analytes under study without checking the validity of the Gaussian distribution and homoscedasticity of the LIBS signals near the LOD. In particular, non-Gaussian profiles of the intensity distribution may arise from spectral selection treatments when, as in the example reported in this paper, the outlier spectra are removed. Since this kind of treatment has become customary in the analysis of large quantities of data, their effect on the calculation of the LOD should be checked.
As a general suggestion, in a multivariate as well as in a univariate quantitative approach, it is always appropriate to check the histogram of the relevant LIBS signals, comparing it with the histogram of the blank signal. To guarantee that false positives and false negatives have the same probability at the LOD, it is, on the other hand, necessary to use for the calculation only the data with a standard deviation very close to the one of the blank. Samples characterized by high concentrations of the analyte should be excluded from the calculation because of the possible non-linear effects, but also because the standard deviation of LIBS signals increases with the square root of the signal. Moreover, points at high concentration in the calibration curve contribute to the increase in the leverage, which must be considered in the calculation of the LOD.
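The selection of the standards suggested above can be sketched as a simple filter on the standard deviations. The relative tolerance used to decide when two standard deviations are 'very close' is an arbitrary assumption, as are the function name and the numbers of the example.

```python
from statistics import stdev

def homoscedastic_standards(blank_signals, standards, tolerance=0.2):
    """Concentrations of the standards whose signal spread is close to the one of the blank.

    `standards` is a list of (concentration, [repeated signals]).
    `tolerance` is the accepted relative difference between standard deviations (assumption).
    """
    blank_sd = stdev(blank_signals)
    kept = []
    for conc, signals in standards:
        if abs(stdev(signals) - blank_sd) <= tolerance * blank_sd:
            kept.append(conc)
    return kept

# Invented repeated signals (arbitrary units) for the blank and two standards
blank_signals = [1.0, 1.2, 0.8, 1.1, 0.9]
standards = [
    (0.05, [2.0, 2.2, 1.8, 2.1, 1.9]),       # same spread as the blank
    (1.00, [20.0, 24.0, 16.0, 22.0, 18.0]),  # much larger spread
]
usable = homoscedastic_standards(blank_signals, standards)
print(usable)
```

A formal test for the equality of variances could replace the fixed tolerance when the number of repeated measurements is large enough.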