1. Introduction
Polarization imaging is a way to analyze the direction of oscillation of the electric field of light. In contrast to conventional color or multispectral imaging, which samples spectral information, polarization imaging considers the electric field as a vector. This vector field lies in a plane perpendicular to the direction of propagation. As the wave travels, the field can oscillate in one particular direction (linear polarization) or describe an ellipse (elliptical or circular polarization). The values of polarization images depend on the polarization properties of both the light source and the objects that compose the observed scene. Light can also be partially polarized or unpolarized, resulting from either a rapidly changing state of polarization or an interference effect of polarization.
Several polarization imaging systems, called polarimeters, have been developed in the past few decades to recover the polarization state of a lighted scene from a few acquisitions. Such systems combine a standard panchromatic imaging device with polarizing optics, e.g., a polarization filter, liquid crystal modulator, or prism. Reviews of recent polarimeters are available in the literature [
1,
2]. The simplest optical setup consists of rotating a linear polarization filter to several polarization angles in front of a camera. After a preliminary calibration step (radiometric and polarimetric), the polarization state of the incoming irradiance that reaches the sensor can be estimated. However, this setup is sequential and slow, since several image acquisitions at different filter orientations are required to recover the polarization state of a single scene. To overcome this limitation, Polarization Filter Array (PFA) imaging provides a way to perform snapshot acquisition, which could be useful for many imaging applications. It is an extension of the so-called Color Filter Array (CFA) and Spectral Filter Array (SFA) technologies that previously came on the market. We briefly review the CFA and SFA technologies and concepts before introducing the specificities of PFA.
The CFA technology [
3] has quickly become the standard for one-shot color imaging. The technology is lightweight, cheap, robust, and small enough to be embedded in imaging systems. It consists of a single silicon sensor fitted with a CFA, so that each sensor site senses only one spectral band according to the CFA. A demosaicking procedure is therefore required to recover the color samples that are missing at each site. Such a procedure exploits reflectance properties of the acquired images in order to recover the missing color components at each pixel position. These properties are a high spatial correlation within the homogeneous areas that constitute an object, and a spectral correlation between different channels. The widely used Bayer CFA, for instance, samples the green band at half of the sites, which makes the green channel a prominent candidate to begin the demosaicking process using spatial correlation. Spectral correlation is then generally assumed in order to estimate the red and blue channels from the well-estimated green channel. The demosaicking algorithm has to be carefully selected, since color reconstruction quality is highly affected by its artifacts, such as blur and zipper effect.
The past few decades have seen the emergence of an extension of CFA with more than three channels: the SFA technology [
4,
5,
6]. Supplementary channels are generally required for applications that need good color reproduction [
7], illuminant estimation and spectral adaptation [
8], reflectance reconstruction [
9], etc. SFA design involves a trade-off between spatial resolution, for spatial reconstruction in the demosaicking process, and spectral resolution, for spectrum reconstruction. Thus, some SFA demosaicking algorithms favor spatial resolution by sampling a dominant channel that represents half of the pixels [
10] (as for the Bayer CFA), while others favor spectrum reconstruction by maximizing the number of channels [
11].
Polarization Filter Array (PFA) technology was patented in 1995 [
12], but most of the practical implementations and technological advances have been made since 2009. Manufacturing processes vary and rely on metal wire grid micro-structures [
13,
14,
15], liquid crystals [
16,
17], waveplate array of silica glass [
18], or intrinsically polarization-sensitive detectors [
19,
20]. Some of the most advanced PFA structures are presented as being bio-inspired, and implement additional features, e.g., combined spectral and polarization feature recovery [
21], high dynamic range [
22], or motion detection [
23].
The PFA is composed of pixel-size linear polarizers oriented at four different angles (0°, 45°, 90°, and 135° are the polarization orientations employed in most PFA cameras), superimposed on a camera sensor chip, as shown in
Figure 1. In front of the sensor, the PFA samples the polarization direction by filtering the incoming irradiance according to the polarizer angles. Therefore, each pixel measures the intensity coming from only one of the four polarizers. Some PFA cameras have appeared on the market, such as the Polarcam device from 4D Technology [
24], and more recently, the IMX250MZR polarization-dedicated sensor from SONY (Tokyo, Japan). Both PFAs use the same filter arrangement, which is described in
Figure 1. However, the SONY sensor, released in 2018, is particularly cheap and holds the polarization matrix below the lens array, which limits the cross-talk effect between adjacent pixels [
6]. Moreover, as was previously done for other computational imaging [
25,
26] and computer vision [
27] algorithms, Lapray et al. [
2] have recently proposed an implementation of a real-time polarization imaging pipeline using an FPGA.
Demosaicking PFA images aims to retrieve the full-resolution images that represent the four polarization channels. Stokes imaging is a tool that uses these channels to efficiently represent the linear and circular states of polarization of the incoming light. Thus, the final goal of demosaicking is to minimize errors and artifacts in the reconstructed Stokes parameters and the derived descriptors. The Degree Of Linear Polarization (DOLP) and the Angle Of Linear Polarization (AOLP) descriptors are computed from the first three Stokes parameters of the Stokes vector $S$. In this work, we limit ourselves to the linear case, as most existing PFAs are based only on linear polarizers (although some existing attempts add plasmonic quarter-wave retarders to retrieve the circular polarization component [
28]). Let us consider the intensities of light $I_0$, $I_{45}$, $I_{90}$, and $I_{135}$ measured after the light is filtered by linear polarization filters (oriented at 0°, 45°, 90°, and 135°). In the literature, the choice of these four angles forms an optimal system for polarization imaging in the presence of noise, as described in [
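29]. The mathematical formulations for Stokes parameters and descriptors are as follows:
$$S = \begin{bmatrix} S_0 \\ S_1 \\ S_2 \end{bmatrix} = \begin{bmatrix} I_0 + I_{90} \\ I_0 - I_{90} \\ I_{45} - I_{135} \end{bmatrix}, \tag{1}$$
$$DOLP = \frac{\sqrt{S_1^2 + S_2^2}}{S_0}, \tag{2}$$
$$AOLP = \frac{1}{2} \arctan\left(\frac{S_2}{S_1}\right). \tag{3}$$
The total incoming irradiance is represented by $S_0$, $S_1$ is the horizontal/vertical polarization difference, whereas $S_2$ is the +45/−45° polarization difference. If we consider that channels $I_\theta$, $\theta \in \{0, 45, 90, 135\}$, are normalized intensity values comprised between 0 and 1, $S_1$ and $S_2$ have values between $-1$ and $+1$. AOLP values are scaled in the range [0°, 180°], whereas DOLP values are scaled in the range [0, 1], and are often expressed in percentage of polarized light.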
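The computation of the Stokes parameters and descriptors from the four measured intensities can be sketched as follows. This is a minimal per-pixel illustration in Python, assuming the common definitions $S_0 = I_0 + I_{90}$, $S_1 = I_0 - I_{90}$, and $S_2 = I_{45} - I_{135}$; the function name `stokes_descriptors` is hypothetical, and the use of `atan2` followed by a modulo to map the angle into [0°, 180°) is one possible convention, not necessarily the one used in this work.

```python
import math


def stokes_descriptors(i0, i45, i90, i135):
    """Return (S0, S1, S2, DOLP, AOLP in degrees) for one pixel.

    Assumed definitions: S0 = I0 + I90, S1 = I0 - I90, S2 = I45 - I135.
    """
    s0 = i0 + i90    # total incoming irradiance
    s1 = i0 - i90    # horizontal/vertical polarization difference
    s2 = i45 - i135  # +45/-45 degree polarization difference
    # Degree of linear polarization; defined as 0 for a dark pixel.
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    # atan2 resolves the quadrant; the modulo maps the angle to [0, 180).
    aolp = (0.5 * math.degrees(math.atan2(s2, s1))) % 180.0
    return s0, s1, s2, dolp, aolp
```

For instance, light that is fully linearly polarized at 45° (normalized intensities 0.5, 1, 0.5, 0) gives a DOLP of 1 and an AOLP of 45°, whereas four equal intensities give a DOLP of 0.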
It is useful to note that radiometric calibration is very important for polarimetric imaging, even more so than for color imaging, because the errors of the different channels are coupled and can therefore greatly degrade the parameter estimation [
1]. An example of a complete 2D Stokes processing starting from a PFA image is given in
Figure 2.
The purpose of this paper is first to study the correlation properties of polarization channels and their similarities with those of spectral channels, then to review some existing interpolation strategies dedicated to filter array imaging, i.e., CFA, SFA, and PFA. Finally, we objectively evaluate, in the specific context of PFA imaging, both the existing methods and those that we have adapted to the PFA case. A diagram of the proposed analysis is shown in
Figure 3. We organize the paper as follows. First, a data correlation study across the polarization channels is presented in
Section 2. Next, different CFA, SFA, and PFA interpolation techniques are presented in
Section 3. Results and discussion of the surveyed methods are presented in
Section 4. The paper ends with several conclusions in
Section 5.
2. Polarimetric Channel Correlation Study
All demosaicking methods estimate missing values using spatial (intra-channel) (i) and/or inter-channel (ii) correlation assumptions. (i) The spatial correlation assumes that, if a pixel
p and its neighborhood belong to the same homogeneous area, the value of
p is strongly correlated with the values in its neighborhood. Thus, assuming that a channel is composed of homogeneous areas separated by edges, the value of a pixel can be estimated by using its neighbors within the same homogeneous area. Spatial gradients are often used as indicators to determine whether two pixels belong to the same homogeneous area. Indeed, a gradient is the difference between the values of two spatially close pixels. We can therefore assume that these pixels belong to the same homogeneous area if the gradient is low, and that they belong to different homogeneous areas otherwise. (ii) The inter-channel correlation (also called spectral correlation in CFA and SFA imaging) assumes that the high frequencies (textures or edges) of the different channels are strongly correlated. If the filter array contains a spatially dominant band, demosaicking generally first estimates the associated channel, whose high frequencies can be faithfully reconstructed, then uses it as a guide to estimate the high frequencies of the other channels [
30].
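The role of spatial gradients in assumption (i) can be illustrated with a generic edge-directed estimate of a missing value from its four nearest known neighbors. This is a minimal sketch of the general idea only, not one of the surveyed methods; the function name `edge_directed_estimate` and its arguments are hypothetical.

```python
def edge_directed_estimate(left, right, up, down):
    """Estimate a missing pixel from its four nearest known neighbors.

    The direction with the lower gradient is assumed to stay within the
    same homogeneous area, so only that direction is averaged.
    """
    grad_h = abs(left - right)  # horizontal gradient
    grad_v = abs(up - down)     # vertical gradient
    if grad_h < grad_v:
        return (left + right) / 2.0
    if grad_v < grad_h:
        return (up + down) / 2.0
    # No preferred direction: fall back to the plain four-neighbor mean.
    return (left + right + up + down) / 4.0
```

With a vertical edge passing through the missing pixel (left = 0, right = 1, up = down = 0.5), the vertical gradient is the lower one, so the estimate is interpolated along the edge rather than across it.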
Usual PFA demosaicking methods assume only spatial correlation, thus disregarding correlation among polarization channels. In order to extend CFA and SFA demosaicking methods that also use the inter-channel correlation to PFA images, we propose to compare the spatial and inter-channel correlations in multispectral images with those of polarization images. For this purpose, we use the database proposed in [
31]. Images were acquired by the single-band near-infrared sensor of the JAI AD080 GE camera, coupled with a linear polarizer from Thorlabs (LPNIRE100-B). A precision motorized rotation stage (Agilis™ Piezo Motor Driven Rotation Stages) was used to take the four images at four orientation angles (0°, 45°, 90°, and 135°). A registration procedure aligned the images [
32] pixel-to-pixel. The images were also calibrated with respect to the spatial deviation of the illuminant and the non-linearities. There are ten multispectral images, each one being provided with four different polarization angles $\theta \in \{0°, 45°, 90°, 135°\}$. Scenes contain different objects made of materials such as fabrics, plastics, food, color checkers, glass, and metal. Conditions of acquisition are constant for all scenes, i.e., constant illuminant source (tungsten halogen source) and location, constant field of view, and constant lens aperture. The recovered multispectral images are composed of six spectral channels: five channels are associated with the visible domain, whereas one channel is associated with the Near-InfraRed (NIR) domain. The six spectral channels $C^k$ are arranged so that their associated spectral band wavelengths increase with respect to $k \in \{1, \ldots, 6\}$.
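Let us first study the properties of multispectral images with respect to the polarization angle of analysis. For this purpose, we assess the spatial correlation within a given channel $C^k$ using the Pearson correlation coefficient (PCC) between the value $C^k_p$ of each pixel $p$ and that of its right neighbor $q$ at spatial distance 2. This coefficient is defined as [
33]
$$PCC[C^k] = \frac{\sum_p (C^k_p - \mu^k)(C^k_q - \mu^k)}{\sqrt{\sum_p (C^k_p - \mu^k)^2}\,\sqrt{\sum_p (C^k_q - \mu^k)^2}}, \tag{4}$$
where $\mu^k$ is the mean value of channel $C^k$. We also assess the inter-channel correlation using the PCC between any pair of spectral channels $C^k$ and $C^l$, $(k, l) \in \{1, \ldots, 6\}^2$, as
$$PCC[C^k, C^l] = \frac{\sum_p (C^k_p - \mu^k)(C^l_p - \mu^l)}{\sqrt{\sum_p (C^k_p - \mu^k)^2}\,\sqrt{\sum_p (C^l_p - \mu^l)^2}}. \tag{5}$$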
Note that in Equations (
4) and (
5), we select a centered area excluding the 16 pixels on the image borders to avoid border effects, which are induced by the registration step used on raw images (described in [
31]). Moreover, the choice of 16 border pixels is made to match the experimental assessment (see
Section 4) of demosaicking methods presented in
Section 3.
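The two correlation measures described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: images are lists of rows, the helper names (`pcc`, `crop`, `spatial_pcc`, `inter_channel_pcc`) are hypothetical, and the textbook Pearson estimator is used with one mean per pixel set, which may differ marginally from Equation (4) if that equation uses a single channel mean for both a pixel and its neighbor.

```python
import math


def pcc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = (math.sqrt(sum((x - mean_x) ** 2 for x in xs))
           * math.sqrt(sum((y - mean_y) ** 2 for y in ys)))
    return num / den


def crop(img, border):
    """Drop `border` pixels on each of the four sides of a 2D list."""
    return [row[border:len(row) - border]
            for row in img[border:len(img) - border]]


def spatial_pcc(channel, border=16, shift=2):
    """PCC between each pixel and its right neighbor at distance `shift`."""
    area = crop(channel, border)
    left = [v for row in area for v in row[:-shift]]
    right = [v for row in area for v in row[shift:]]
    return pcc(left, right)


def inter_channel_pcc(chan_a, chan_b, border=16):
    """PCC between two co-registered channels over the centered area."""
    flat_a = [v for row in crop(chan_a, border) for v in row]
    flat_b = [v for row in crop(chan_b, border) for v in row]
    return pcc(flat_a, flat_b)
```

The default `border=16` mirrors the 16 excluded border pixels mentioned in the text; a constant channel would make the denominator zero and is not handled in this sketch.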
Table 1 reports the spatial correlation within each spectral channel and the inter-channel correlation between the six spectral channels for each of the four polarization angles.
Table 1 shows that the spatial correlation is relatively high (0.9504 on average over all channels and polarization angles), which validates the use of the spatial correlation assumption for both SFA and PFA demosaicking. According to
Table 1a,d, the spatial correlation has the same behavior for the four polarization angles. It also highlights that the channel
has low spatial correlation. We believe that it is due to the database acquisition setup, which uses the dual-RGB method leading to unbalanced channel sensitivities. In this configuration, the spectral sensitivity function associated with the channel
is lower than that of the other channels over the spectrum. Thus, not all channels share the same noise level, and poor information recovery (especially for
) could lead to low correlation values.
Regarding the spectral inter-channel correlation, the usual behavior is that channels that are close in terms of wavelength band are more correlated than distant ones, and that channels in the visible domain are weakly correlated with the near-infrared channel [
11]. Except for the channel
that exhibits low correlation values, this behavior is observed in
Table 1. Indeed,
for instance. Moreover, the correlation between the NIR channel
and the other channels is low (ranging on average between 0.7953 and 0.8787), while the correlation between channels in the visible domain reaches up to 0.9596 (correlation between
and
).
Table 1a,d show that the inter-channel correlation depends on the polarization angle. Indeed,
Table 1a has values close to
Table 1d, whereas
Table 1b has values close to
Table 1c. We can therefore expect that the polarization channels at 0° are more correlated with those at 135° than those at 45° or 90°.
Now, let us consider the polarization images composed of four polarization angles for a given spectral band. The spatial and inter-channel correlations are assessed using the
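PCC applied respectively to channels $I_\theta$, $\theta \in \{0°, 45°, 90°, 135°\}$ (see Equation (4)), and to any pair of polarization channels $I_{\theta_1}$ and $I_{\theta_2}$, $(\theta_1, \theta_2) \in \{0°, 45°, 90°, 135°\}^2$ (see Equation (5)).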
Table 2 reports the average correlation between the four channels of polarization images, for each of the six spectral bands. Results highlight that the spatial correlation is high and does not depend on the considered spectral band (except for channel
). Results also confirm that channel $I_0$ is highly correlated with channel $I_{135}$, and that channel $I_{45}$ is highly correlated with channel $I_{90}$. In general terms, the inter-channel correlation between polarization channels is higher than the inter-channel correlation between spectral channels (see
Table 1). Indeed, if the incoming irradiance is not polarized, each polarization channel only measures the total intensity divided by two, which is the same from one channel to another.
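The remark about unpolarized light can be made explicit with the standard response of an ideal linear polarizer oriented at angle $\theta$ (a textbook relation that is not taken from this paper; $S_0$, $S_1$, and $S_2$ denote the first three Stokes parameters, with $S_0$ the total irradiance):

```latex
I_\theta = \frac{1}{2}\left(S_0 + S_1 \cos 2\theta + S_2 \sin 2\theta\right),
\qquad \theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}.
% Unpolarized light: S_1 = S_2 = 0, hence
I_0 = I_{45} = I_{90} = I_{135} = \frac{S_0}{2}.
```

Under this idealized model, the four channels of an unpolarized area are identical, and for partially polarized light they still share the common term $S_0/2$, which is consistent with the high inter-channel correlation observed between polarization channels.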
Since the inter-channel correlation is high in polarization images, we propose to apply SFA demosaicking schemes based on the inter-channel correlation assumption to PFA images. For this purpose, we can choose the four polarization channels associated with any spectral band except the one associated with
. Since the dual-RGB method is not applied to the channel
, we selected it for the experimental assessment in
Section 4.
4. Performance Evaluation of Demosaicking Algorithms
4.1. Experimental Setup
PFA image simulation is employed to assess the interpolation strategies. As for the correlation study in
Section 2, the polarimetric images from the database of Lapray et al. [
31] were used as references.
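The simulation of a PFA image from four full-resolution reference channels can be sketched as follows. This is a minimal illustration in plain Python; the function name `simulate_pfa` and the 2 × 2 layout stored in `PATTERN` are assumptions made for the example, and the actual filter arrangement is the one described in Figure 1.

```python
# Hypothetical 2x2 super-pixel layout (row-major); the real arrangement
# is the one of Figure 1 and may differ from this choice.
PATTERN = ((0, 45), (135, 90))


def simulate_pfa(channels, pattern=PATTERN):
    """Sample a mosaic from full-resolution channels.

    channels: dict mapping a polarizer angle to a 2D list; all channels
    must have the same size. Each output pixel keeps only the value of
    the channel selected by the periodic pattern at that position.
    """
    first = next(iter(channels.values()))
    height, width = len(first), len(first[0])
    rows, cols = len(pattern), len(pattern[0])
    return [[channels[pattern[y % rows][x % cols]][y][x]
             for x in range(width)]
            for y in range(height)]
```

Since three of the four values are discarded at each pixel, the reference channels remain available as ground truth for the demosaicked estimates.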
All methods of
Table 3 were either re-coded (R), adapted to PFA (A), or provided by the authors as Matlab/ImageJ software (P). They are then integrated into the framework presented in
Figure 4 in order to assess and compare the performances of demosaicking. Stokes descriptors are then computed for both reference and estimated images, according to Equations (
1)–(
3). To avoid precision errors during image conversions, all images and processing use a 32-bit floating-point data representation.
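As an illustration of the kind of method plugged into such a framework, a simple bilinear-style baseline can be sketched as follows. This is a minimal sketch, not one of the specific bilinear kernels evaluated in this paper: each channel value is estimated as the mean of the samples of that angle available in the 3 × 3 neighborhood, which coincides with classical bilinear interpolation away from the borders. The function name `bilinear_demosaick` and the default 2 × 2 layout are assumptions made for the example.

```python
def bilinear_demosaick(mosaic, pattern=((0, 45), (135, 90))):
    """Estimate the four full-resolution channels of a PFA mosaic.

    mosaic: 2D list of at least 2x2 pixels. pattern: assumed 2x2 layout
    of polarizer angles. Returns a dict mapping each angle to a 2D list.
    """
    height, width = len(mosaic), len(mosaic[0])
    angles = sorted({a for row in pattern for a in row})
    out = {a: [[0.0] * width for _ in range(height)] for a in angles}
    for y in range(height):
        for x in range(width):
            sums = dict.fromkeys(angles, 0.0)
            counts = dict.fromkeys(angles, 0)
            # Visit the 3x3 neighborhood, clipped at the image borders.
            for yy in range(max(0, y - 1), min(height, y + 2)):
                for xx in range(max(0, x - 1), min(width, x + 2)):
                    angle = pattern[yy % 2][xx % 2]
                    sums[angle] += mosaic[yy][xx]
                    counts[angle] += 1
            for a in angles:
                out[a][y][x] = sums[a] / counts[a]
    return out
```

Measured samples are left unchanged by this scheme, because the only sample of a pixel's own angle within its 3 × 3 neighborhood is the pixel itself.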
We consider the Peak Signal-to-Noise Ratio (PSNR) as the quality metric. Between each pair of reference ($R$) and estimated ($E$) channels/descriptors, the PSNR is computed as follows:
$$PSNR(R, E) = 10 \log_{10}\left(\frac{\left(\max_p R_p\right)^2}{MSE(R, E)}\right), \tag{11}$$
where $MSE(R, E)$ denotes the mean squared error between $R$ and $E$. Because $\max_p R_p$ can differ from one channel (or descriptor) to another, Equation (11) takes into account this actual maximal level rather than the theoretical one to avoid misleading PSNR values. In the PSNR computation, as for the previous correlation study, we exclude the 16 pixels in each of the four borders of the image to avoid inherent border effects related to either registration or demosaicking processing.
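The PSNR computation described above can be sketched as follows. This is a minimal illustration in plain Python, assuming that the peak value is the actual maximum of the reference taken over the same centered area as the mean squared error (the text does not state over which area the maximum is taken); the function name `psnr` is hypothetical.

```python
import math


def psnr(ref, est, border=16):
    """PSNR in dB between a reference and an estimated 2D image.

    Both images are 2D lists of the same size. The `border` outermost
    pixels on each side are excluded, and the peak is the actual maximum
    of the reference over the remaining area (an assumption).
    """
    r = [v for row in ref[border:len(ref) - border]
         for v in row[border:len(row) - border]]
    e = [v for row in est[border:len(est) - border]
         for v in row[border:len(row) - border]]
    mse = sum((a - b) ** 2 for a, b in zip(r, e)) / len(r)
    if mse == 0:
        return math.inf  # identical images
    peak = max(r)
    return 10.0 * math.log10(peak ** 2 / mse)
```

For example, a reference equal to 1 everywhere and an estimate equal to 0.9 everywhere give an MSE of 0.01 and therefore a PSNR of about 20 dB.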
4.2. Results and Discussion
Table 4 displays the PSNR values provided by the demosaicking methods on average over the ten database scenes. Results show that among bilinear filters,
provides the best results for
,
,
,
,
,
, and
, while
slightly exceeds it for
and
. Among PFA-oriented methods, CB and CBSP generally provide the best results. Our proposal to adapt the RI and ARI CFA demosaicking methods to the PFA case (with
as guide) provides better results than classical PFA-oriented methods. RI and ARI yield very close PSNR results. RI also provides the best results among all tested methods regarding
parameter and
descriptor.
For PFA methods, it is important to note that the output interpolated pixel is shifted by half a pixel when using bilinear kernels , , and , compared to other bilinear kernels , . The output pixel is either aligned with the center of the original pixel position, or located on the pixel boundaries. We did not correct for this misalignment, because translating an image by half a pixel requires an additional cubic or linear interpolation. Such a registration process therefore cannot be used as pre-processing for a fair comparison of the bilinear demosaicking methods. Thus, the results for , , and should be interpreted with care.
For the tested SFA-oriented methods, the use of spectral correlation generally provides better performance than simple bilinear interpolation. Moreover, methods based on gradient computation (BTES, MLDI, and PPID) exhibit the best demosaicking performance. By considering the PPI as a guide for demosaicking, PPID shows the best demosaicking performance among all methods for all polarization channels, also for , parameters and descriptors.
To visually compare the results provided by demosaicking methods on
,
, and
descriptors, we select a zoomed area from the “macbeth_enhancement” scene of the database. Among demosaicking methods, we show the results of the most intuitive method (bilinear interpolation using
filter), and the pseudo-panchromatic image difference (PPID) method, which overall provides the best PSNR results.
Figure 7 shows that there is no significant difference regarding the
parameter, except that the two highlight dots are more apparent in the PPID-demosaicked image. Computing
and
parameters from a bilinearly interpolated image generates many artifacts, which are noticeably reduced by the PPID demosaicking method.
Generally speaking, we found that demosaicking methods dedicated to PFA do not necessarily give better PSNR results. It was not obvious beforehand that applying color and spectral demosaicking techniques to the PFA arrangement could be beneficial. The results highlight that this can benefit the pre-processing of PFA images.
However, we can express some reservations about the results obtained. First, we limited our study to a relatively small database. Other polarization databases in the literature [
59] provide only the Stokes parameters and polarization descriptors, but no fully defined reference images
are available. Natural scene samples could also be beneficial for a complementary algorithm classification. Secondly, the database used in this work was acquired under the same experimental conditions, i.e., constant angle/plane of incidence and a unique non-polarized illuminant. We think that supplementary tests of the best identified algorithms on an extended database containing a better statistical distribution of data could be valuable. Thirdly, the noise associated with the reference images is not quantified, and is slightly different from that of a PFA acquisition system. We thus disregarded recent denoising-demosaicking algorithms that estimate sensor noise to improve the accuracy of demosaicking [
60,
61,
62,
63].
The arrangement of the filter array investigated consists of a $2 \times 2$ square periodic basic pattern with no dominant band. Our goal was to stay general and to apply the evaluation on a widely used pattern. However, some other attempts at designing extended patterns have been proposed in the literature [
64], for a better spatial distribution of linear polarization filters. An extensive evaluation of the best demosaicking algorithms on different arrangements will be considered in future work.
We found that the acquisition setup may induce correlation between some polarized channels that could be exploited for demosaicking. Since these properties are data-dependent, we have chosen not to use them in our study, although they are used in a few SFA demosaicking methods.
We noticed that some algorithms need more computation time than others, without necessarily giving better results. No computational complexity analysis has been reported in this work. We think that there is a lack of information about these aspects in the original articles. Moreover, Matlab or ImageJ cannot provide a consistent evaluation of the complexity of the selected algorithms, e.g., of their potential to be parallelized on hardware accelerators for real-time computing.
5. Conclusions
By considering the inter-channel correlation, CFA and SFA schemes aim to improve the spatial reconstruction of channels from the information of other channels. Experiments on the only available polarization image database have shown that such methods provide better results in terms of PSNR than PFA-oriented methods. More particularly, we proposed to adapt two CFA demosaicking algorithms based on residual interpolation to the PFA case, and showed that they provide better results than classical PFA-oriented methods. Moreover, the SFA PPID method provides the best overall results, and largely reduces visual artifacts in the reconstructed polarization descriptors in comparison with the bilinear method.
The correlation study has shown that the spectral band considered in the acquisition of polarization channels has no influence on the correlation between polarization channels. The correlation results from this study could serve as input and provide assumptions for the design of new demosaicking algorithms applied to cameras that mix spectral and polarization filters.
All algorithms were tested on a small database of images. We hope that an extensive database of polarization and spectral images will soon be available in the research community, so that further evaluations on more varied materials and image statistics can validate our conclusions more thoroughly. Furthermore, we believe that a real-time pipelined implementation of the PPID method using a GPU or FPGA should be investigated, as it would be a valuable tool for machine vision applications.