1. Introduction
Plant transpiration rate (TR) is an important trait that reflects the plant water influx and outflux [
1,
2]. When water is not a limiting factor, the TR depends on abiotic factors such as soil salinity and water vapor in the atmosphere, as well as photosynthetically active radiation (PAR). On the other hand, when water is a limiting factor, there will be a decline in TR once the volumetric water content in the pot or soil reaches a critical level [
3]. A decline in TR, or plant water status, may affect plant growth, productivity, and crop quality [
4,
5]. The ability to measure plant TR through contact or proximal measures can aid in irrigation management and crop-yield optimization [
6]. This is because TR will decline with the soil water content, which may be used as a decision aid for irrigation before the plant water stress results in tissue damage and yield loss [
7]. Depending on the plant growth stage and phenological cycle, such measurements and appropriate decisions must be made rapidly and accurately. Leaf gas exchange, including TR, is usually measured manually in a leaf gas-exchange chamber [
8]. While such a measurement may be rapid, it is still biased because a single or a few leaves are taken to represent the entire plant, and sometimes an entire field. Measuring multiple leaves on multiple plants may take several hours, during which time changes in ambient conditions such as PAR and humidity may skew the results.
Accordingly, a rapid solution for estimating TR in multiple leaves and whole plants is desirable. One solution is the process of collecting a large amount of data focused on form and function, termed high-throughput phenotyping [
9]. The analysis of these data to understand variations between different plant genomes is termed phenomics [
10]. Phenomics has been recognized as a useful tool for sustaining agricultural development along with the advances in gene discovery and trait expression [
11]. Attempts to maintain food security and support food production are manifested in the rapid development of phenomics facilities. Such facilities are usually greenhouses with automated irrigation systems and specialized photography chambers to capture images; the latter are analyzed to support the breeding and cultivation chain or to preselect the best plants for transplantation in the field [
12]. The potential for rapid, early screening of “bad” plants may help breeding procedures in allowing the selection of lines with desirable traits to be used as parental lines in breeding programs. Another solution for estimating TR is satellite remote sensing, which can typically estimate large-scale evapotranspiration [
13]. Using a combination of broad-range bands in the visible and near-infrared (NIR) regions, different vegetation indices are calculated and correlated with physiological processes that depend on solar radiation absorbance and reflectance by the plant canopy [
14]. However, this technique is limited and cannot estimate the TR itself when it is affected by environmental factors [
15].
Greenhouse proximal hyperspectral imaging spectroscopy of vegetation is a relatively new field of study which addresses this difficulty. This imaging technology fuses the image domain with the high-resolution spectral domain, in a nondestructive method that has proven to be a reliable technique in agricultural and agronomic research, from the laboratory to field scale. Recently, it has been introduced into greenhouse scenarios for in-vivo analyses [
16,
17]. Hyperspectral imaging spectroscopy is usually interpreted and analyzed using one of two methods: reducing the measured signal to a few important bands or considering the entire shape of the spectral curve. The first is often performed using a narrow-spectral-band vegetation index, whereas the latter approach can involve reducing the dimensionality of the spectrum, by partial least squares algorithm [
18], or machine-learning and deep-learning approaches such as random forest [
19], support vector machine (SVM) [
5], or deep neural networks [
20].
The Extreme Gradient Boosting (XGBoost) algorithm [
21] has recently gained popularity as a winning solution in machine-learning competitions. However, the algorithm is relatively new and has been applied to only a few published remote-sensing studies. Georganos et al. [
22] used XGBoost to classify WorldView-3 multispectral urban terrain images. Zhong et al. [
20] compared XGBoost with other machine-learning algorithms and reported it to be superior in classifying multispectral remote-sensing time-series data of mixed crops. Those authors also compared XGBoost classification results with two types of deep-learning models and reported superior or nearly equal results. Abdi [
23] classified multitemporal multispectral land-cover and land-use satellite images of boreal areas using four machine-learning algorithms: XGBoost and SVM were ranked best. Sandino [
24] used a hyperspectral camera mounted on an unmanned aerial vehicle to distinguish trees that were deteriorating due to fungal infections from healthy trees and reported an overall accuracy of 97%. Loggenberg et al. [
25] used a hyperspectral camera mounted on a tripod in a vineyard to classify water-stressed Shiraz vines with good overall accuracy (80%). While all of these studies used XGBoost as a classifier, none tried to estimate continuous values using XGBoost regression trees.
Traditionally, spectral leaf measurements are carried out using a point spectrometer and a special apparatus in the laboratory or the field and then matched with laboriously collected physiological data to estimate certain vegetative traits such as pigment concentration, water content, or TR [
26,
27]. The use of an imaging chamber in a greenhouse is faster, although it creates a queue of plants, each being imaged at a different time and affected by changing ambient conditions. Different ambient conditions have already been proven to affect the usefulness of hyperspectral image analysis [
28]. When phenomics systems are used to study plant function through physiological trait measurements, they are termed functional phenomics systems [
7,
9]. Recently, a new high-throughput, whole-plant functional phenotyping system named PlantArray (PA) was developed, which can calculate the momentary and daily TR of an entire array of plants simultaneously, as well as other plant and soil attributes and interactions [
29]. The PA system’s dynamic responses to changes in ambient conditions [
3], nutrient stress, and biostimulants effectiveness have been recently explored [
1].
Potassium is considered an essential plant nutrient for crops. It is involved in many processes and metabolic pathways, such as osmoregulation, enzyme activation, stomatal opening, photosynthesis and stress resistance [
30]. Potassium fertilizing increases the plants’ tolerance to water shortage, while a deficit may affect stomatal closure and lead to a higher transpiration rate [
30]. The ability to monitor potassium levels in order to prevent yield loss is important and has been examined in several studies. Pandey et al. [
17] used a hyperspectral camera and a conveyor belt for the plants to quantitatively estimate chemical properties (such as potassium concentration) of maize and soybean plants. Zhang et al. [
31] studied oilseed rape leaves that were picked and brought to a hyperspectral scanning camera to estimate potassium content. Pimstein et al. [
18] explored the potential for potassium estimation in field-grown wheat plants using a field spectrometer. A similar investigation was carried out by Mahajan et al. [
16]. The latter tried to find the best narrow-band vegetation index to estimate the quantity or concentration of potassium. While the ability to estimate chemical properties was exhibited, destructive methods were required to produce the database used in those analyses.
In this manuscript, we hypothesized that plants treated with low levels of potassium (compared to medium and high levels) might be accurately classified using XGBoost and images from a hyperspectral moving platform in the greenhouse. The results of the XGBoost classification are compared with the traditional support vector machine classifier (SVM). Moreover, we aimed to estimate the momentary TR using the same ensemble learning XGBoost algorithm. The combined ability to classify the administered treatment and estimate TR was tested during the early growth stage of greenhouse-grown pepper (Capsicum annuum) plants treated with different potassium and salinity levels.
2. Materials and Methods
2.1. Sample Preparation and Experimental Setup
Pepper seedlings (n = 144), approximately four weeks post-germination, were planted in 3.9-L pots filled with sand. The plants were grown during April–May 2019 in a semi-commercial greenhouse located at the Robert H. Smith Faculty of Agriculture, Food and Environment in Rehovot, Israel. The temperature and relative humidity (RH) in the greenhouse were continuously monitored by the PA meteorological station (Plant-Ditech Ltd., Israel). The temperature was maintained under 35 °C by blowing air through a moist mattress; temperature ranged between 20 and 35 °C and RH between 26% and 90%, with low values during the night.
2.2. Phenotyping Platforms
Plant fertigation (quantity and schedule) was controlled using the PA 3.0 platform (www.plant-ditech.com). The high-throughput functional phenotyping platform was also used to measure the plant’s dynamic status through highly sensitive temperature-compensated load cells used as weighing lysimeters. Each plant was monitored and controlled separately, enabling high precision (Halperin et al., 2017).
The experiment lasted 13 days. The 144 plants were placed on two separate tables (72 plants per table). The plants on the first table were randomly administered three levels of potassium: low, medium, and high, corresponding to 30 PPM, 105 PPM, and 180 PPM, respectively. The plants on the second table were administered three levels of potassium (low, medium, and high) in combination with control, medium, and high levels of salinity (H2O, 0.03 M NaCl, and 0.05 M NaCl, respectively) for a total of nine treatments (
Figure 1). At the end of the growing period, the plants were extracted from the pots and examined; 45 plants from the first table and 34 plants from the second table were infected with varying root fungi levels. To minimize the effect of disease on the analysis, the infected samples, as judged by a strict criterion, were eliminated; even plants with a low infection level were removed from the study. The remaining number of samples was sufficient for model training and testing.
The pots were placed into a PA-specific drainage container, situated on a lysimeter (
Figure 2). A rubber ring was used to establish a perfect fit between the pot and the container to prevent evaporation. The pot was covered with a custom-designed cover with a hole in its center for the plant stem to emerge and four additional holes for dripper irrigation, which aided in the uniform spread of irrigation. Irrigation was administered during the night, and water drainage was collected at the bottom of the plastic vessel to maintain pot capacity during the day. Weight loss during the daytime was logged and used to calculate the momentary TR. The momentary TR was calculated following Halperin et al. (2017) and adjusted as in Weksler et al. (2020). The difference in plant weight between times $t_i$ and $t_{i-1}$ was measured every 3 min. The weight loss was attributed to water leaving the plant through the stomata, and the TR was defined as the first derivative of the logged weight readings throughout the day.
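The weight-difference calculation described above can be sketched as follows. This is a minimal illustration only: the function name `momentary_tr` and the units (grams per minute) are assumptions made for the example, and the adjustments of Halperin et al. (2017) and Weksler et al. (2020) are not reproduced here.

```python
def momentary_tr(weights_g, interval_min=3.0):
    """Approximate the momentary TR from consecutive lysimeter readings.

    weights_g    -- pot weights (g) logged at a fixed interval
    interval_min -- logging interval in minutes (3 min in the text)

    Returns one value per pair of consecutive readings, in g/min.
    A positive value means the system lost weight, which is attributed
    to water leaving the plant.
    """
    return [
        (previous - current) / interval_min
        for previous, current in zip(weights_g, weights_g[1:])
    ]


# Hypothetical readings taken 3 min apart.
rates = momentary_tr([1000.0, 998.5, 997.0, 996.4])
```

A finite difference of this kind amplifies load-cell noise, so in practice some smoothing of the weight series would likely be needed before differencing.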
2.3. Image Acquisition and Database Building
Hyperspectral images were acquired every hour during the day (07:00–17:00 h) using a custom-made semi-autonomous linear platform built on the greenhouse roof (Weksler et al., 2020). The greenhouse panels are made of a clear PVC material that diffuses the incoming solar radiation. The platform carried a push-broom hyperspectral radiometrically calibrated camera (FX10, Specim, Finland) with a 400–1000 nm spectral range divided into 448 spectral channels and 1024 pixels in a line, with a field of view of 38° (
Figure 2). A small laptop and a microcontroller controlled the camera. The camera’s height and focus were adjusted during the experiment to account for the plant growth. Every image scene included the entire array of plants on both tables.
Following image acquisition, the raw images were calibrated to radiance using a pre-calculated gain image calibrated by the manufacturer and an offset image acquired by a closed-shutter image before every image was taken (see Weksler et al., 2020 for the exact calibration equation). Then reflectance was calculated using a reflectance panel (99% reflectance, Spectralon, Labsphere Inc., North Sutton, USA) placed in the scene, which was also used to calibrate the exposure time before each overpass to prevent saturation across all the bands. To generate the plant spectral database, each plant’s mean reflectance signal was then extracted, and shaded pixels were excluded by selecting the brightest 80% of the pixels as a threshold. In addition, mixed pixels (leaf edges) were also excluded using an Otsu filter [
32]. Therefore, each plant’s mean reflectance was the mean of hundreds of pixels of leaves in different viewing angles. The edges of the sensor’s spectral range were trimmed, and the subsequent analysis was performed for 425–990 nm with 416 spectral bands. Each plant’s spectral signature was standardized to zero mean and unit variance by removing the mean and dividing by the standard deviation (standard normal variate, SNV, transformation), followed by a first-derivative transform and a Savitzky-Golay smoothing transformation with a window size of 7 and a second-degree polynomial (
Figure 3). After 13 days, when the plants’ canopies overlapped, the ability to separate individual plants was greatly reduced, and image acquisition was stopped. Several times during the experiment, technical malfunctions prevented camera operation, which reduced the size of the database relative to its full potential. The spectral database was merged with the TR loggings to create the database for the analyses. The final database consisted of 7617 spectral and TR measurements.
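The spectral preprocessing chain described above (SNV, first derivative, Savitzky-Golay smoothing with a window of 7 and a second-degree polynomial) can be sketched with NumPy as below. The function names are hypothetical, the derivative is taken per band index rather than per nanometer, and the three bands lost at each edge by the `valid` convolution are a simplification of this sketch, not a detail taken from the study.

```python
import numpy as np


def snv(spectrum):
    """Standard normal variate: remove the mean, divide by the std."""
    spectrum = np.asarray(spectrum, dtype=float)
    return (spectrum - spectrum.mean()) / spectrum.std()


def savgol_smoothing_kernel(window=7, degree=2):
    """Savitzky-Golay smoothing weights from a least-squares polynomial fit.

    Row 0 of the pseudo-inverse of the Vandermonde design matrix gives the
    weights that return the fitted polynomial's value at the window center.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, degree + 1, increasing=True)
    return np.linalg.pinv(design)[0]


def preprocess(spectrum, window=7, degree=2):
    """SNV -> first derivative -> Savitzky-Golay smoothing."""
    standardized = snv(spectrum)
    derivative = np.gradient(standardized)  # unit spacing between bands
    kernel = savgol_smoothing_kernel(window, degree)
    # The smoothing kernel is symmetric, so convolution equals correlation.
    return np.convolve(derivative, kernel, mode="valid")


# Hypothetical 416-band reflectance signature.
signature = np.linspace(0.1, 0.6, 416) ** 2
features = preprocess(signature)
```

For a window of 7 and a quadratic fit, the kernel reduces to the familiar weights (-2, 3, 6, 7, 6, 3, -2)/21, which is a convenient check on the construction.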
2.4. Classifying Potassium
The medium- and high-salinity treatments were not used in the classifier training (
Table 1). These treatments were omitted in order to focus on classifying potassium-level differences and not differences caused by interactions with the salinity treatments.
Three different models were trained to classify the samples into their given potassium treatment, with the aim of achieving good classification results for the low-potassium treatment. The different models were: low–high classification, low–medium classification, and low–medium–high classification. Dividing the data in this way ensured that each model received a balanced dataset and that the three-class model could describe a more heterogeneous scenario, such as a field scenario, where different field conditions (slope, aggregates, organic matter, and mineralogy) may cause variation in potassium across the field. All the models were trained and optimized with an emphasis on the ability to classify low levels of potassium. The database was partitioned into the desired classes (
Table 2); subsequently, the data were randomly divided into training (70%) and validation (30%) sets for every model. The classification models were trained using each training sample’s spectral reflectance signal as the independent variable (also termed features) and the treatment label as the dependent variable. For each model’s estimation, a normalized confusion matrix was plotted. The confusion matrix illustrates the model’s outcome as actual values and estimated values for each group, facilitating the interpretation of the model’s performance. The model’s performance in separating the plants according to the administered treatment was evaluated by calculating the estimation accuracy, which ranged between 0 and 100%, as

$$\text{Accuracy} = \frac{TP + TN}{n} \times 100 \tag{1}$$

where TP and TN are the numbers of true-positive and true-negative estimations, respectively, and n is the number of samples in the estimation. However, since the number of samples differed between the groups, accuracy may be a biased measure. Therefore, two additional metrics, precision and sensitivity, were calculated for each model. A model’s sensitivity is defined as the proportion of actual positives that were correctly identified (Equation (2)), i.e., a sensitivity value of 100% means that all of the low-potassium plants were correctly classified. A model’s precision is defined as the proportion of correctly identified samples out of all samples estimated to have that same label (Equation (3)), i.e., a precision value of 100% means that the model correctly labeled all plants that were classified as low-potassium plants. These metrics also ranged between 0 and 100% and were calculated as

$$\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \tag{2}$$

$$\text{Precision} = \frac{TP}{TP + FP} \times 100 \tag{3}$$

where FN and FP are the numbers of false-negative and false-positive estimations, respectively.
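As an illustration of the accuracy, sensitivity, and precision metrics described above, the following sketch computes them for one class of interest from paired lists of actual and estimated labels. The function name and the label strings are hypothetical, and treating "low" as the positive class is an assumption that mirrors the emphasis on low-potassium plants.

```python
def classification_metrics(actual, estimated, positive="low"):
    """Return accuracy, sensitivity and precision (all in percent)."""
    pairs = list(zip(actual, estimated))
    tp = sum(a == positive and e == positive for a, e in pairs)
    tn = sum(a != positive and e != positive for a, e in pairs)
    fp = sum(a != positive and e == positive for a, e in pairs)
    fn = sum(a == positive and e != positive for a, e in pairs)

    n = len(pairs)
    accuracy = 100.0 * (tp + tn) / n
    # Guard against empty denominators (no actual or no estimated positives).
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, sensitivity, precision


# Hypothetical labels for five plants.
actual = ["low", "low", "low", "high", "high"]
estimated = ["low", "low", "high", "high", "low"]
accuracy, sensitivity, precision = classification_metrics(actual, estimated)
```

In this toy case two of the three low-potassium plants are recovered and one high-potassium plant is mislabeled as low, so sensitivity and precision are both two thirds while accuracy is 60%.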
In estimation, feature importance is a measure of the importance of a feature (band) to the model’s estimation and its physical explanation (“spectral assignment”). A common feature importance metric is the gain, which is defined as the improvement in the score (sensitivity) after a feature is used to add a new split to a tree. The feature importance was calculated as the total gain that the feature contributed to the model. To investigate if the same results may be achieved by reducing the dimensionality of the spectral resolution, the 10 most contributing features (bands) from each classification model were selected. Additional classification analysis was calculated based on those features to compare the different models for further analysis.
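The selection of the most contributing bands can be sketched as a simple ranking by total gain. The helper below is hypothetical; it assumes a dictionary mapping each band to its total gain, which for a trained XGBoost model could be obtained from `Booster.get_score(importance_type="total_gain")`. The gain values in the example are invented for illustration.

```python
def top_bands(total_gain, k=10):
    """Return the k bands with the largest total gain, best first."""
    ranked = sorted(total_gain.items(), key=lambda item: item[1], reverse=True)
    return [band for band, _ in ranked[:k]]


# Hypothetical total-gain values keyed by band center (nm).
gains = {460: 12.5, 660: 30.1, 705: 22.4, 940: 3.2, 550: 8.8}
selected = top_bands(gains, k=3)
```

A reduced model would then be trained on the selected bands only and compared with the full-spectrum model.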
Furthermore, classification models were also trained using the support vector machine classifier (SVM) using the same features and training samples to investigate if XGBoost has an advantage over a traditional classifier.
2.5. Transpiration Rate Estimation
The database was used to train several models using XGBoost and evaluate their performance. Momentary TR was first estimated by training two different XGBoost models: one using the measured spectra and another with ambient conditions during image acquisition added to the spectra as additional independent variables (features). These parameters (RH, PAR, vapor-pressure deficit, and temperature) were logged using the PA greenhouse weather stations. TR was used as the dependent variable. Since the addition of ambient conditions features was found to produce a better model, the subsequent models were trained using the same features (
Figure 3).
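The second model's input can be pictured as the spectral features with the four ambient variables appended as extra columns. The sketch below uses NumPy; the function name, the column order, and the example values are assumptions made for illustration.

```python
import numpy as np


def add_ambient_features(spectra, rh, par, vpd, temperature):
    """Append RH, PAR, VPD and temperature as four extra feature columns.

    spectra -- array of shape (n_samples, n_bands)
    others  -- one value per sample, logged at image acquisition
    """
    ambient = np.column_stack([rh, par, vpd, temperature])
    return np.hstack([np.asarray(spectra, dtype=float), ambient])


# Hypothetical batch of five samples with 416 bands each.
spectra = np.zeros((5, 416))
features = add_ambient_features(
    spectra, [50] * 5, [800] * 5, [1.5] * 5, [28] * 5
)
```

Tree-based learners such as XGBoost do not require the ambient columns to be rescaled to the range of the spectral features, which keeps this step simple.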
To analyze the contribution of the different periods of the day to model training and TR estimation, the database was partitioned into three temporal ranges: morning (07:00–10:00 h), noon (11:00–13:00 h), and afternoon (14:00–17:00 h). To compare models between these temporal ranges, the data in each interval were randomly divided for training and testing as previously described, and a new model was trained and tested for each interval.
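The temporal partition can be sketched as a mapping from the acquisition hour to one of the three periods. The function names and the record layout (an hour paired with an arbitrary sample) are hypothetical; the hour ranges follow the text and rely on images being acquired on the hour.

```python
def period_of_day(hour):
    """Map an acquisition hour (24-h clock) to a period label."""
    if 7 <= hour <= 10:
        return "morning"
    if 11 <= hour <= 13:
        return "noon"
    if 14 <= hour <= 17:
        return "afternoon"
    return None  # outside the imaging window


def partition_by_period(records):
    """Group (hour, sample) records into the three temporal ranges."""
    groups = {"morning": [], "noon": [], "afternoon": []}
    for hour, sample in records:
        label = period_of_day(hour)
        if label is not None:
            groups[label].append(sample)
    return groups


# Hypothetical records: (hour, sample identifier).
groups = partition_by_period([(7, "a"), (12, "b"), (16, "c"), (10, "d")])
```

Each group would then be split into training and testing sets and used to fit its own model.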
In addition, to test the effects of the different salinity treatments (H2O, medium, and high salinity) on the ability to estimate TR, an independent and new model was trained using only the data from the experimental table that received salinity treatments (n = 4311). To validate this model, 300 samples were left out from each salinity treatment (H2O, medium, high salinity). The left-out data for each salinity treatment (n = 300) were composed of equal portions of potassium treatments—100 samples from each treatment: low, medium, high potassium treatments for each salinity treatment. A total of 900 samples were left out for the model validation. A general model was trained on the remaining training samples (n = 3411), and estimations were validated three times.
Furthermore, to test the effects of the different potassium treatments (low, control, and high potassium) on the ability to estimate TR, a new and independent model was trained using the different potassium treatments from the table that did not receive salinity treatments (n = 3199). To validate this model, 300 samples were left out from each treatment (low, control, and high potassium) for a total of 900 left-out samples. A general model was trained on the remaining training samples (n = 2299), and estimations were evaluated three times.
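The leave-out schemes described above amount to drawing a fixed number of samples from every treatment group and training on the remainder. The sketch below is one possible way to do this; the function name, the tuple layout of a sample, and the fixed random seed are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_holdout(samples, key, per_group, seed=0):
    """Hold out `per_group` randomly chosen samples from every group.

    samples   -- list of sample records
    key       -- function returning the group of a record
    per_group -- number of records to hold out from each group
    Returns (training_records, held_out_records).
    """
    members = defaultdict(list)
    for index, sample in enumerate(samples):
        members[key(sample)].append(index)

    rng = random.Random(seed)
    held_out = set()
    for group, indices in members.items():
        if len(indices) < per_group:
            raise ValueError(f"group {group!r} has fewer than {per_group} samples")
        held_out.update(rng.sample(indices, per_group))

    training = [s for i, s in enumerate(samples) if i not in held_out]
    validation = [s for i, s in enumerate(samples) if i in held_out]
    return training, validation


# Hypothetical records: (salinity, potassium, sample id), 150 per combination.
records = [
    (salinity, potassium, i)
    for salinity in ("H2O", "medium", "high")
    for potassium in ("low", "medium", "high")
    for i in range(150)
]
train, validation = stratified_holdout(
    records, key=lambda r: (r[0], r[1]), per_group=100
)
```

Grouping by the salinity and potassium pair with 100 samples per group yields the 900 left-out samples of the salinity experiment; grouping by potassium alone with 300 per group corresponds to the potassium experiment.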
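Two measures of model performance were calculated: the root mean squared error (RMSE) and Willmott’s index of agreement ($d_r$) [33]:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^2} \tag{4}$$

$$d_r = \begin{cases} 1 - \dfrac{\sum_{i=1}^{n}\left|P_i - O_i\right|}{2\sum_{i=1}^{n}\left|O_i - \bar{O}\right|}, & \text{if } \sum_{i=1}^{n}\left|P_i - O_i\right| \le 2\sum_{i=1}^{n}\left|O_i - \bar{O}\right| \\[2ex] \dfrac{2\sum_{i=1}^{n}\left|O_i - \bar{O}\right|}{\sum_{i=1}^{n}\left|P_i - O_i\right|} - 1, & \text{otherwise} \end{cases} \tag{5}$$

where $P_i$ and $O_i$ are the predicted and the observed TR, respectively, $\bar{O}$ is the mean observed TR, and n is the number of samples. A lower RMSE (closest to 0) and a higher $d_r$ (closest to 1) are desired.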
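The two performance measures can be computed as in the sketch below. The RMSE follows its standard definition; for the index of agreement, the sketch assumes the refined form of Willmott's index with a scaling constant of 2, which is an assumption about the variant used rather than something the text spells out. Function names are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between predicted and observed TR."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def willmott_dr(predicted, observed, c=2.0):
    """Refined index of agreement (assumed variant), bounded by -1 and 1."""
    mean_observed = sum(observed) / len(observed)
    abs_error = sum(abs(p - o) for p, o in zip(predicted, observed))
    abs_deviation = c * sum(abs(o - mean_observed) for o in observed)
    if abs_deviation == 0:
        raise ValueError("index is undefined when all observations are equal")
    if abs_error <= abs_deviation:
        return 1.0 - abs_error / abs_deviation
    return abs_deviation / abs_error - 1.0


# Hypothetical observed and predicted TR values.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
error = rmse(predicted, observed)
agreement = willmott_dr(predicted, observed)
```

A perfect prediction gives an RMSE of 0 and an index of 1, matching the direction of improvement stated in the text.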
2.6. Data Analysis and Algorithm
XGBoost is a supervised ensemble learning approach based on decision trees, utilizing a gradient-boosting technique [
21]. The algorithm can be used for either regression or classification and has tunable hyperparameters that significantly affect its accuracy. Weak classifier or regressor trees are added iteratively to improve the score and reduce the error. After each iteration, the algorithm predicts the class or attribute for each sample. Samples incorrectly classified receive a higher weight in the next iteration, which forces the algorithm to improve their score. This is principally carried out by defining the objective function (loss function). Unlike many other machine-learning algorithms, the objective function includes a regularization term with two regularization parameters (L1, L2), which is applied during training to reduce overfitting. The hyperparameters are typically tuned using a model performance-optimization scheme known as a grid search. Optimization of the model’s hyperparameters was carried out on the training samples to generate the best results from every model. A grid search with 5-fold cross-validation was used to evaluate the best hyperparameters for this experiment. These were the number of trees, the tree depth, the learning rate, the regularization parameters, and the percentage of features for each tree (
Table 2). As with XGBoost, the SVM hyperparameters were tuned, with linear log space intervals for each different classification model. SVM with a radial basis function kernel was shown to be superior for hyperspectral reflectance classification [
34,
35]; therefore, to produce the best possible accuracy, the parameters C (penalty) and
γ (kernel scale) were searched. For classification, the metric for optimization was the model’s sensitivity, whereas, for the estimation of TR by regression, these were the RMSE and $d_r$.
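The grid search with 5-fold cross-validation can be outlined generically as below. This is a schematic sketch, not the study's code: `grid_search`, `k_fold_indices`, and the `score_fn` callback are hypothetical names, and in practice `score_fn` would train an XGBoost or SVM model on the training indices and return the chosen metric (for example sensitivity) on the validation indices. The grid values in the example are invented.

```python
import itertools
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def grid_search(score_fn, param_grid, n_samples, k=5):
    """Exhaustively evaluate every parameter combination with k-fold CV.

    score_fn(params, train_idx, val_idx) must return a score where
    higher is better. Returns (best_params, best_mean_score).
    """
    folds = k_fold_indices(n_samples, k)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, val_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            scores.append(score_fn(params, train_idx, val_idx))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical grid over two of the hyperparameters named in the text.
grid = {"max_depth": [2, 4, 6], "learning_rate": [0.01, 0.1, 0.3]}
```

The cost grows with the product of the grid sizes times the number of folds, which is why a model with fewer hyperparameters is quicker to tune.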
The interquartile range is the difference between the 75th and 25th percentiles of the sorted data. TR values exceeding 1.5 times the interquartile range were considered outliers (n = 81) and removed, along with the corresponding spectra. These outliers are most likely the result of momentary interference by greenhouse personnel. The TR data were measured and logged using the PA 3.0 system. Statistical analysis, algorithms, and estimation were performed using Python 3.7 [
36].
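The outlier screening can be sketched with the Python standard library as below. The text does not spell out the exact fences, so this sketch assumes the common convention of discarding values more than 1.5 interquartile ranges below the first quartile or above the third quartile; the function name and the quantile method are likewise assumptions.

```python
import statistics


def remove_iqr_outliers(tr_values, spectra, factor=1.5):
    """Drop TR outliers together with their matching spectra."""
    q1, _, q3 = statistics.quantiles(tr_values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    kept = [
        (tr, spectrum)
        for tr, spectrum in zip(tr_values, spectra)
        if lower <= tr <= upper
    ]
    return [tr for tr, _ in kept], [spectrum for _, spectrum in kept]


# Hypothetical TR readings with one spurious spike and placeholder spectra.
tr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
spectra = [f"s{i}" for i in range(10)]
clean_tr, clean_spectra = remove_iqr_outliers(tr, spectra)
```

Filtering the paired records in one pass keeps the TR values and spectra aligned after removal.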
4. Discussion
Remote sensing is often used to analyze nitrogen and pigment content since nitrogen is found in proteins, free amino acids, and chlorophyll. Nitrogen bonds with other elements, resulting in a distinct absorption spectrum and spectral changes in the visible, NIR, and shortwave-infrared portions of the electromagnetic spectrum (usually termed chromophores). The same can be attributed to other elements such as phosphorus and sulfur. However, potassium is found in the plant as a positive ion (K+) that does not produce chemical bonds; therefore, it does not produce any absorption feature. While its absence can have distinct effects on plant tissues, in this experiment, the lowest amount of potassium was selected to inhibit biological pathways without causing chlorosis, which can easily be captured by an ordinary camera. Nonetheless, potassium can affect chromophore properties and, as a result, be indirectly monitored via spectroscopy.
Long-term potassium deficiency in leaves usually induces yellowing of the leaves’ margins, i.e., chlorosis [
37]. However, insufficient administration of potassium may inhibit or slow biological and physiological pathways without causing leaf chlorosis. Therefore, there are no visible effects on the leaves, and a typical visual inspection or RGB image may not pick up this nutrient deficiency. Nevertheless, the ability to develop models to classify plants that have low potassium levels from those that have received normal and high levels was achieved, owing to the high spectral, spatial and temporal resolution of the images captured during the experiment, along with the supporting attributes logged by the PA system. Unlike other studies in which a single image is analyzed, a large imagery dataset was created and analyzed in this study. As potassium causes no direct alteration to plant reflectance, creating a sizeable spectral dataset facilitated the data analysis and use of the XGBoost algorithm. Typically, spectral reflectance is captured a few times during growing experiments. In this experiment, the high temporal resolution of the hyperspectral images enabled creating a very big dataset. This dataset was best mined using advanced machine learning, which requires thousands of samples to produce accurate and reliable results. Moreover, the high temporal resolution added to the models’ robustness, as they were related to many consecutive days and thousands of spectral and TR measurements, rather than a single measuring event.
Although K+ has no direct chromophore, the spectral bands that most contributed to the XGBoost algorithm were, in fact, reflectance bands known to be related to chlorophyll and carotenoid absorption (460 nm and 660 nm) [
38,
39], light-use efficiency (480–500 and 570–600 nm) [
40], red edge (700–720 nm) [
41], light scattering, and the leaf-thickness NIR plateau [
39]. While only one water-related band at 930–990 nm [
42] was found to be among the most contributing bands, it did influence the model estimations (
Figure 5). This is not surprising because all of the plants received the same amount of water but different potassium amounts.
TR estimation using machine learning is not yet an established practice. However, the ability to create different models for the different parts of the day was used to develop different TR models that took advantage of the high temporal and spectral resolution database along with the corresponding PA attributes. The best performing model was for the afternoon, similar to our previous findings using a different approach [
28]. Whereas the previous results were obtained by comparing TR variance between parts of the day, and the correlation to spectral bands to capture those differences, this work used a more general approach to capture and model the differences. While both studies were conducted on a single plant species in one custom greenhouse setup, both found that the afternoon is more suitable for TR estimation. Nonetheless, this contradicts the common assumption that around noon, when solar radiation is strongest and TR is highest, is the best time for remote sensing of plant physiological attributes [
43,
44,
45,
46]. This raises a question as to the optimal time to conduct remote or direct spectral data acquisition for vegetation monitoring. Moreover, TR estimations of plants that received different salinity treatments showed that with a generic trained model, salinity affects the estimations. As salinity increased, the model’s performance decreased, and there was more variance in the results. The implication of these results for field conditions, and more specifically for remote sensing, is meaningful. Namely, high-salinity soils may introduce variance into the measurements, making field estimations of TR more prone to errors. In addition, TR estimation of plants that received different potassium treatments revealed that the prediction of a generic trained model is most affected by potassium deficit and performs best when potassium is at a normal level for the crop.
The most accurate model (82% accuracy) for classifying low-potassium-treated samples was confined to the low and high potassium treatments. The effect of medium levels of potassium in relation to spectral changes was less intense than surplus levels of potassium. The two models to classify low and medium and low and high potassium levels were more sensitive to potassium deficit. When all of the treatments were combined, the overall accuracy decreased. The spectral characteristics of high levels of potassium proved to be more significant. This is evident from the model’s greater sensitivity and precision for high levels of potassium. These different models provide more insight into the effect of potassium on the spectra and, in turn, on the models’ sensitivity to alterations in potassium levels. However, in a real greenhouse or even in field scenarios, practical considerations regarding yield result in surplus addition of potassium to crops. The ability to isolate plants with low potassium levels is important, especially at the beginning of the growing season, which is the growth period covered in this experiment. This ability is an important addition to the grower’s decision-making toolbox, facilitating the targeting of specific areas or specific plants that suffer from potassium shortage to optimize yield.
The advantages of feature extraction and band selection in reducing the dimensionality of the data and excluding non-contributing spectral channels have supported the increased interest in multispectral sensors in space, airborne and field domains [
47,
48]. However, it was shown that a small subset of bands is not sufficient to achieve the same classification accuracy as obtained in hyperspectral domains. Similarly, Pimstein et al. [
18] reported the necessity of many spectral bands to estimate potassium in wheat. While the high spectral resolution was key to identifying the most important spectral bands for this task, the high temporal resolution was more important in achieving high accuracy. In addition, the comparison with a traditional algorithm (SVM) showed a clear advantage for the XGBoost algorithm in terms of accuracy. Nonetheless, the SVM has only two tunable hyperparameters, which makes it much faster to optimize and train.
This research proved reliable for potassium estimation in pepper plants, and additional research using different plants under different growing conditions is warranted. In addition, the ability to create inclusive datasets such as the one suggested here and to estimate TR may help future breeding programs identify bad hybrids and focus on successful ones.