1. Introduction
Operating renewable energy systems have been constantly and rapidly growing in recent years, mainly in urban areas of developed countries [
1,
2,
3]. Photovoltaic panels, which are expected to offer a way to generate greener electrical power from non-polluting solar resources, represent an important component of these renewable energy systems. Their installations are part of urban energy strategic plans that have been highlighted by government agencies, electricity grid operators and decision makers to promote their installation and use through financial support and tax reduction. For that reason, and to avoid spiteful frauds with these substitute energies, several organizations are becoming ever more attracted to detailed information concerning these solar systems, as well as their localization and energy production [
1].
Field surveys constitute one of the ways of obtaining such precise information, at least with regard to localization, on photovoltaic panels. However, this approach is not always effective or appropriate, since it consumes a lot of time and can be very expensive. Therefore, using other less expensive and faster methods is necessary.
The use of remote sensing data is an interesting approach to automatically detecting photovoltaic installations. In this context, some remote sensing-based approaches have recently been proposed [
1,
4]. These approaches exploit very high-spatial-resolution airborne/spaceborne RGB images with a small number of spectral bands. However, this type of data does not permit the efficient detection of solar panels, principally owing to their material properties. In fact, specular reflections, in some irradiation configurations, can alter the visual properties of these panels, and therefore their detection becomes difficult.
High-spectral-resolution airborne/spaceborne hyperspectral data represent an interesting option for surmounting the above-mentioned restrictions. Indeed, these remote sensing data are acquired by hyperspectral imaging sensors that collect data in hundreds of narrow and contiguous spectral bands, which permits more accurate material detection [
5]. In particular, the basic idea when considering such remote sensing data for detecting and identifying a land surface material, including solar panels, consists of comparing, by using an appropriate criterion, each observed pixel-spectrum in an image with a reference one of the considered material [
6]. Nevertheless, and mainly owing to the low spatial resolution of these sensors, mixed pixels, which contain several pure materials (also called endmembers), and therefore are characterized by mixtures of these endmember spectra, may exist in the collected images and then prevent direct detection of pure materials without sophisticated techniques to unmix these observed pixels.
To date, no hyperspectral-unmixing-based works have been reported in the literature for automatically detecting and estimating the surfaces of photovoltaic panels, other than the recent one proposed in [
7]. In that work, a linear spectral unmixing (LSU) method was proposed to partially unmix a hyperspectral image, by means of a nonnegative matrix factorization (NMF) algorithm that exploits known solar panel spectra in order to automatically detect and estimate areas of this urban land surface material.
It should be noted that LSU methods, which can be viewed as typical linear blind source separation (BSS) approaches [
8,
9], are the most employed methods for processing hyperspectral remote sensing images. This process aims at linearly unmixing all observed spectra into a set of pure material spectra, and a collection of associated abundance fractions [
10]. More specifically, NMF represents one of the most used techniques for performing the LSU process of remote sensing images. It consists of linearly decomposing a nonnegative matrix into a product of two nonnegative matrices [
11]. Moreover, it must here be mentioned that LSU methods are generally considered when the landscape of the observed scene is flat and the irradiance is homogeneous. Otherwise, i.e., when the landscape is significantly non-flat and/or the irradiance is heterogeneous, especially in urban areas, LSU methods are not effective and must be replaced by nonlinear ones [
12], in particular, those based on linear quadratic or bilinear mixing models [
13,
14,
15]. In such cases, linear quadratic or bilinear matrix factorization methods, with nonnegativity constraints, may be considered [
15,
16,
17,
18].
The investigations reported in the present paper extend the above-mentioned approach of [
7], still in a linear context. Indeed, in this paper, and in addition to the method described in [
7], a new linear hyperspectral-unmixing method is designed for automatically detecting photovoltaic panels and estimating their surfaces. Thus, two linear methods are reported in this paper (and their experimental performance is analyzed in a much more detailed way than in [
7]). These approaches are based on original algorithms that have relationships with NMF and that exploit known solar panel spectra. Therefore, the designed linear methods can be considered to be Partial (or informed) NMF (Part-NMF) techniques [
19,
20]. The first designed approach, called Gradient-based Part-NMF (Grd-Part-NMF), employs a projected gradient descent algorithm by means of iterative update rules that include the nonnegativity constraint, whereas the second proposed one [
7], called Multiplicative Part-NMF (Multi-Part-NMF), uses multiplicative and iterative gradient-based update rules.
To evaluate the performance of the proposed approaches, experiments, based on realistic synthetic and real airborne hyperspectral data acquired over an urban region, are conducted. The obtained results are compared to those obtained by NMF-unmixing-based methods in the literature for synthetic and real data, and also to those obtained using a one-class-classification-based approach for the real data. Additionally, and still for the real data, obtained detection and area estimation results are confirmed by using ortho-images, of the same regions, with very high spatial resolution.
The remainder of this paper is structured in the following manner. The next section defines the mathematical data model used in the proposed approaches. The proposed algorithms are presented in
Section 3. In
Section 4, the realistic synthetic and real data employed are described.
Section 5 consists of test results with the considered data. In
Section 6, a discussion of the obtained results is given. Finally, conclusions are provided in the last section.
2. Data Model
As introduced above, the aim of the designed methods in these investigations is to automatically detect solar panels and to estimate their surfaces from an observed high-spectral-resolution hyperspectral remote sensing image . N and K correspond, respectively, to the numbers of spectral bands and pixels of the considered hyperspectral image.
As mentioned in the previous section, the designed methods are related to LSU techniques, in which each observed nonnegative reflectance spectrum related to a pixel in an image is considered to be a linear mixture of the nonnegative reflectance endmember spectra inside the same pixel [
10]. Thus, the whole observed nonnegative reflectance image
X can be modeled, in matrix form, as [
21,
22,
23].
Each row of the above matrix
X is one spectral band of the considered image. The
K pixels of this image are reorganized as a one-dimensional array. Each column of the matrix
is one nonnegative endmember spectrum and each row of the matrix
corresponds to all nonnegative abundance fractions, in all pixels, of one endmember. These coefficients should respect the well-known abundance sum-to-one constraint [
10].
L corresponds to the number of endmembers.
Again, and as introduced in the previous section, the NMF-LSU-based proposed methods, in which some photovoltaic panel spectra are assumed to be known, are designed to automatically detect these solar systems in urban hyperspectral data and to estimate their surfaces. Consequently, the matrix A can be expressed as
The first L1 columns of the matrix correspond to the known photovoltaic panel spectra. The other columns of this first matrix are composed of zeros. The last L2 columns of the second matrix correspond to the unknown endmember spectra. The other columns of this second matrix are composed of zeros. Evidently, L = L1 + L2. Therefore, Equation (1) becomes
In this equation, the unknown matrices, which will be estimated by the designed algorithms, described in the next section, are A2 and, especially, S, which contains photovoltaic panel abundance fractions, allowing the detection of these solar systems and the estimation of their areas.
3. Proposed Approaches
The designed iterative Part-NMF algorithms are similar to the literature NMF approaches that employ iterative multiplicative [
24,
25] or projected gradient [
26] update rules to perform the NMF-unmixing process. In these methods, two nonnegative matrix variables are involved. These matrices are
and
, which respectively aim at estimating
A2 and
S, such that
Accordingly, the designed approaches can be seen as partial/informed NMF ones [
19,
20]. These designed iterative algorithms consist in minimizing the following criterion
by means of iterative gradient-based update rules. ‖.‖
F corresponds to the Frobenius norm.
Since the proposed methods are gradient-based ones, the criterion
J is rewritten as follows to easily obtain the gradient expressions
where Tr(.) and (.)
T, respectively, correspond to the matrix trace and the matrix transpose. With this expression and using the properties reported in [
27], the gradient expressions of
J, with respect to the two variables, are in matrix form
For the designed Grd-Part-NMF method, and using a generalized gradient descent algorithm, the following update rules are obtained:
where
represents the element-wise multiplication.
and
correspond to small positive and appropriately fixed or adaptive learning rates in matrix form (selected by means of the “Armijo rule along the projection arc” as reported in [
26], and therefore this implies a higher computation time). These update rules do not guarantee nonnegativity, and therefore they are not sufficient. Thus, and to meet this constraint, the maximum, between each element of the updated matrix and a very small positive value
ε (usually set to the default MATLAB
epsilon value), is kept [
11]. Accordingly, the final gradient-based iterative update rules for the first proposed method read
For the proposed Multi-Part-NMF method, multiplicative and iterative update rules are derived from the gradient-based update rules (9) and (10). These rules are obtained by formulating the learning rate matrices
φ. as functions of the used matrices so as to obtain a multiplicative update and to preserve the nonnegativity constraint. Indeed, each gradient expression of
J with respect to
(which corresponds to one of the variable matrices
or
) can be written as the difference of two nonnegative functions such that
The function
is obtained by keeping the terms of (7) or (8) having a plus sign, whereas
is obtained by keeping the terms having a minus sign. The multiplicativity and nonnegativity constraints can be satisfied by initializing
with a nonnegative value and choosing the value of the learning rate matrix
according to
where
denotes the element-wise division. Thus, the following learning rate matrices are obtained
Accordingly, and considering each update rule (9) and (10) takes the following form:
the final designed multiplicative and iterative update rules for the
and
matrix variables are
where
is a very small and positive value, again usually set to the default MATLAB epsilon value, which is added to the denominator of each multiplicative update rule to prevent possible division by zero.
The above-designed update rules (11) and (12) for the Grd-Part-NMF method or (18) and (19) for the Multi-Part-NMF method, in addition to guaranteeing the nonnegativity constraint (when the matrix variables are initialized with nonnegative values for the Multi-Part-NMF method), experimentally give a descent algorithm, so that J decreases during iterations.
The proposed Part-NMF algorithms, similar to the literature NMF methods, have limitations: they are not guaranteed to offer a unique solution and their convergence point may depend on their initialization. In fact, the initialization stage of these algorithms is their key problem. To solve this issue, and to prevent random initialization from the point of view of the designed Part-NMF algorithms, and as the initialization step, initial estimates of unknown endmember spectra, stacked in the last
L2 columns of the matrix
may, e.g., be computed by the Vertex Component Analysis (VCA) [
28] or the Simplex Identification via Split Augmented Lagrangian (SISAL) [
29] methods, which are among the most famous approaches used in LSU techniques. The VCA and SISAL methods require the number
L of known and unknown pure materials to be known. It may be automatically determined by using the Hyperspectral Signal Subspace Identification by Minimum Error (HySime) method [
30]. It should be noted that in the proposed methods, among the
L spectra estimated by the VCA or SISAL methods, the
L1 estimated spectra which are the most similar to the known ones, which correspond to photovoltaic panels, are replaced by those known for these solar systems in the set of
L estimated spectra. In addition, each initial value of the matrix
may be set to 1/
L or estimated, from the observed image
X and the known and initial estimated unknown endmember spectra, by using the Fully Constrained Least Squares (FCLS) method [
31], separately applied to each pixel of the observed image.
The abundance sum-to-one property is an additional constraint to be considered during optimization. For this purpose, the approach mentioned in [
31] is used throughout the optimization stage mentioned hereafter, in addition to the initialization mentioned above.
After initializing the two considered matrices as mentioned above, the optimization phase consists of repeatedly updating these matrices, until convergence (reported hereafter), according to (11) and (12) for the Grd-Part-NMF method or (18) and (19) for the Mult-Part-NMF method, and by considering the technique described in [
31] to meet the abundance sum-to-one property.
Checking if the relative modification of the criterion
J takes a value below a predefined threshold
thresh constitutes the first stopping criterion of the proposed Part-NMF algorithms, which reads
where
t corresponds to an iteration. The predefined threshold can be empirically chosen, after running the proposed methods with different threshold values, to obtain the most convincing results, but not necessarily to improve results with respect to most used performance evaluation criteria. Indeed, the choice of other values of the above
thresh parameter may not influence the values of the performance criteria in some domains. In addition, and to prevent a high number of iterations, an additional stopping criterion is considered by checking if this number of iterations exceeds a predefined maximum.
The solar panels area is obtained, in each pixel, by multiplying the estimated abundance thereof by the pixel area of the considered image. The total solar panels area is obtained by aggregating all areas obtained in each pixel.
The complete algorithms of the proposed Part-NMF methods are described below.
Algorithm 1: Grd-Part-NMF unmixing for automatically detecting photovoltaic panels and estimating their areas in urban hyperspectral remote sensing data. |
Input: hyperspectral image X and known photovoltaic panel spectra stacked in the first L1 columns of the matrix A1. |
|
Output: Photovoltaic panel abundance fractions, which allow their detection and their area estimation, and unknown spectra and their abundance fractions. |
Algorithm 2: Multi-Part-NMF unmixing for automatically detecting photovoltaic panels and estimating their areas in urban hyperspectral remote sensing data. |
Input: hyperspectral image X and known photovoltaic panel spectra stacked in the first L1 columns of the matrix A1. |
|
Output: Photovoltaic panel abundance fractions, which allow their detection and their area estimation, and unknown spectra and their abundance fractions. |
Finally, it is worth emphasizing that the proposed approaches can be preceded by a preliminary and optional step that consists of selecting, from input data, zones that potentially contain photovoltaic panels. Thus, this step allows working only on small zones, and consequently it will lead to an acceleration of the detection and area estimation of the considered urban material. This step can be achieved by using a similarity parameter calculated between the reference spectrum of solar panels and each measured spectrum in the data. This will result in a similarity map from which it will be possible to select only the potential zones, around pixels with high similarity values, which contain the considered urban material. One of the most used similarity parameters is based on calculating the absolute value (|.|) of the correlation coefficient (CC) between the reference spectrum of solar panels and each measured spectrum. This parameter is defined as
where
rspv is the reference spectrum of photovoltaic panels and
ms corresponds to each measured spectrum. <., .> and ߠ
ߠ, respectively, stand for the inner product and vector norm. Unlike the core of the proposed approaches, it is clear that this preliminary step is not sufficient in itself, since it does not allow calculating surfaces of the considered urban material with a sub-pixel accuracy.
5. Experimental Results
5.1. Performance Evaluation Criteria
When considering the realistic synthetic data, the Normalized Mean Square Error (NMSE) and the absolute value of the correlation coefficient (|CC|), calculated between the original and estimated photovoltaic panel abundance fractions, are considered to evaluate the performance of all the tested methods. These criteria are defined as
where
is the estimate of the photovoltaic panel abundance fractions map
spv (in one dimensional vector form).
A smaller value of the NMSE criterion corresponds to a better detection / area estimation for solar panels. Also, and when considering the |CC| criterion, a greater value corresponds to a better detection / area estimation for these solar systems.
When considering the real data, and in addition to a visual inspection of the provided results, estimated surfaces of solar panels are compared with those manually digitized on the very high-spatial-resolution ortho-images. As mentioned above, these estimated areas are calculated from the abundance fractions obtained by using tested NMF-unmixing-based methods, after thresholding abundance fractions to prevent several false detections and to obtain a high precision (the used threshold is empirically set to 0.3, which gives the best results after performing tests with different threshold values between 0.2 and 0.4 with a step of 0.05), or from the solar panel pixels classified by using the tested one-class-classification-based approach described hereafter.
5.2. Description of Tests
The proposed Grd-Part-NMF (with fixed learning rate matrices set to 10
-3 for each element of these matrices) and Mult-Part-NMF approaches are tested on the considered data sets. The standard gradient-based (with the same fixed learning rate matrices) and multiplicative NMF methods (Grd-NMF and Multi-NMF) [
11,
24,
25] are also considered in the experiments performed for comparison purposes. The maximum number of iterations considered in these NMF-unmixing-based methods is set to 1000. The threshold value of the convergence criterion (20) is set to the default MATLAB epsilon value (i.e., 10
−6) for all these used methods after running them with different threshold values (between 10
−3 and 10
−9 with a geometric step of 10
−1, thus including the chosen value), and after noticing that the obtained performances are almost the same. Also, the VCA method is applied to estimate, as explained above, the initial unknown endmember spectra that are used in the optimization stage of these tested methods. Again, and as mentioned previously, it should be noted that the mean spectrum of the ground-measured solar panel spectra is considered to be the only known spectrum (i.e.,
L1 = 1) in all of the conducted tests. Moreover, each initial value of the matrix
is set to 1/
L.
For the realistic synthetic data, and since each pixel of these data is created by using one randomly selected spectrum from each the four considered sets, and in order to minimize the random effect of the results, one hundred runs (for the whole process: from the data creation to the results evaluation) are performed and the minimum (min), maximum (max), mean and standard deviation (std) values of the considered performance evaluation criteria are reported. It is clear here that for these synthetic data, L = 4 and therefore L2 = 3 (since L1 = 1).
For the considered real data, and for the three sub-images, the estimated number of endmembers L is 12, and thus L2 = 11 (again since L1 = 1). Also, and in addition to the above tested NMF-unmixing-based methods (Grd-Part-NMF, Multi-Part-NMF, Grd-NMF, and Multi-NMF), a one-class-classification-based method is applied to these real data. This approach contains two stages. In the first one, the correlation coefficient is calculated between each observed pixel-spectrum and the known mean of ground-measured photovoltaic panel spectra. In the second stage, only pixels with correlation coefficients higher than a fixed threshold (empirically set to 0.9 that gives the best performances after performing tests with three threshold values: 0.85, 0.9 and 0.95) are considered (again to avoid false detections and to obtain a high precision) to extract photovoltaic panels.
5.3. Results
The following tables (
Table 1 and
Table 2) show the values of the performance criteria obtained for the realistic synthetic data.
In the following figures (
Figure 8 and
Figure 9), histograms of the considered performance criteria (for the realistic synthetic data) are given.
The following figures (
Figure 10,
Figure 11 and
Figure 12) show the estimated (by the tested NMF-unmixing-based algorithms) photovoltaic panel abundance fraction maps for the three considered real airborne hyperspectral sub-images. These figures also show the obtained one-class (solar panels) classification maps.
Table 3 below shows the photovoltaic panel areas obtained from the three real hyperspectral sub-images.
5.4. Computational Complexity and Run Time
On the one hand, the proposed (and also the tested literature NMF-unmixing-based) approaches are very easy to implement, and their core (i.e., their update rules) contains only simple and direct matrix operations: standard matrix addition and product, element-wise matrix product and division, and matrix transposition. These matrix operations make the algorithmic complexity of the proposed (and the tested literature NMF-based) approaches very limited.
On the other hand, the running time of the core of tested NMF-unmixing-based methods depends on the input image size, the number of materials in the observed scene, the maximum number of iterations and the stopping criterion (20): based on the above description of tests, the running time of these methods (by using an Intel(R) Core(TM) i5 processor running at 2.40 GHz and a memory capacity of 8 GB) is about one second for each run when considering realistic synthetic data, and is less than 20 s for each considered real sub-image.
6. Discussion
For the realistic synthetic data,
Table 1 and
Table 2, and
Figure 8 and
Figure 9, globally show that the designed approaches achieve much better detection of solar panels than the tested literature NMF-unmixing-based algorithms. Indeed, these tables reveal that the designed Multi-Part-NMF (respectively Grd-Part-NMF) algorithm is able to reach 23.73% (respectively 84.80%) mean NMSE for solar panels, whereas the literature Multi-NMF (respectively Grd-NMF) technique attains a mean of 98.64% (respectively 103.03%). In the same way, and in terms of the |CC| criterion, the proposed Multi-Part-NMF (respectively Grd-Part-NMF) algorithm reaches a mean of 0.98 (respectively 0.74), whereas the literature Multi-NMF (respectively Grd-NMF) method attains a mean of 0.49 (respectively 0.47). Also, the above tables show that the other calculated NMSE and |CC| values (minimum, maximum and standard deviation) follow, globally, the same trends as the mean values.
For the considered real data,
Figure 10,
Figure 11 and
Figure 12 confirm, by visual checking with ortho-images, that the designed algorithms are able to better detect all solar panels than the tested standard techniques. Also,
Table 3 shows that the designed algorithms are able to automatically estimate solar panel surfaces much more precisely than the tested standard methods. Indeed, the areas obtained with the proposed methods, for the three considered sub-images, are much closer to manually digitized areas than those obtained by the other tested methods from the literature. For sub-image 1, with a manually obtained area of 15 m² of photovoltaic panels, the proposed Multi-Part-NMF (respectively Grd-Part-NMF) method gives an area of 15.27 m² (respectively 14.57 m²), whereas the standard Multi-NMF (respectively Grd-NMF) method estimates an area of 21.26 m² (respectively 22.27 m²). Moreover, 5 pixels (corresponding to an area of 12.80 m²) are extracted by the considered one-class-classification-based approach. For the second sub-image, which contains an area, again manually digitized, of 12 m² of solar panels, the two proposed methods give areas of around 13 m², whereas the literature NMF-unmixing-based approaches estimate areas of around 1 m². Also, an area of 20.48 m² (corresponding to 8 pixels) is estimated by the tested one-class-classification-based method. For sub-image 3, with an area of 16 m² of panels, manually calculated from the corresponding ortho-image, the proposed Grd-Part-NMF (respectively Multi-Part-NMF) method estimates an area of 16.20 m² (respectively 15.65 m²). The two tested standard NMF-unmixing-based methods give areas of about 90 m² and the one-class-classification-based method extracts 6 pixels (corresponding to an area of 15.36 m²) of photovoltaic panels.
In general, and from all obtained results, it is clear that the proposed methods achieve the desired goal of automatically detecting photovoltaic panels and estimating their areas correctly, and more precisely than the tested standard methods. For the tested literature NMF-unmixing-based methods, this result is expected, since, and unlike the proposed methods, they update all variables, including the panel spectrum that is supposed to be known. This update of the panel spectrum may have a negative influence on the abundance fraction estimation of this material, which may explain the obtained results. Also, and for the considered one-class-classification-based method, which gives acceptable but poorer results than those obtained by the proposed methods, the observation is the same: this result is expected since this literature method, and unlike the proposed methods, considers the total occupied area per pixel containing photovoltaic panels.
7. Conclusions
In these investigations, two approaches, called Grd-Part-NMF and Mult-Part-NMF, were designed to automatically detect photovoltaic panels in urban hyperspectral remote sensing data and to estimate their surfaces. These original approaches, related to LSU techniques, are based on NMF. The designed methods, which use known photovoltaic panel spectra, optimize a criterion with new designed iterative update rules. The first approach uses an iterative projected gradient descent algorithm, whereas in the second one, an iterative multiplicative gradient-based algorithm was designed.
The proposed methods, which are very easy to implement, were tested on realistic synthetic and real airborne hyperspectral data, and their efficiency was evaluated with commonly used performance criteria when considering the synthetic data. For the real data, the performance of the designed methods was evaluated by comparing obtained photovoltaic panel areas with those calculated by a manual digitization on ortho-images of the same regions with very high spatial resolution.
The obtained results show that the proposed methods yield very satisfactory results and prove to be very attractive in the framework of the photovoltaic panel detection in urban hyperspectral remote sensing data, and can be of great help to electricity grid operators and decision makers. Also, according to the obtained results, the proposed methods significantly outperform, when considering the synthetic data, the tested NMF-unmixing-based approaches from the literature. In addition, still for the synthetic data, the proposed Multi-Part-NMF method gives the best results in comparison with the proposed Grd-Part-NMF method. When considering the real data, and in addition to their superiority in comparison with these literature methods, they also surpass the tested standard one-class-classification-based approach. Moreover, and still considering the real data, the two proposed methods give comparable results with, globally, a slight advantage for the Multi-Part-NMF method.
An interesting extension of this work may consist in developing new measures or adapting the standard F-score measure to assess in more detail the performance of the proposed approaches, by evaluating false positive/negative in the detection process. Indeed, the standard F-score measure is used when the detection is binary-valued, which is not the case for the proposed methods.
Other potential extensions of these investigations will mainly consist in applying the proposed approaches to automatically detect and estimate surfaces of other materials in urban hyperspectral remote sensing data, such as solar water heaters and roof windows (and in analyzing if it is possible to distinguish them from the solar panels), and also surfaces of some impervious surfaces such as asphalt, which contributes to the increase of the heat islands phenomenon in urban regions. Moreover, it would be interesting to apply the proposed methods to detect and estimate areas of non-urban materials in hyperspectral data, such as ground surfaces potentially indicating the presence of rocks containing native or complex metallic minerals.