Spatial-Temporal Sub-Pixel Mapping Based on Swarm Intelligence Theory

He, Da; Zhong, Yanfei; Feng, Ruyi; Zhang, Liangpei

doi:10.3390/rs8110894


by Da He ^1,2, Yanfei Zhong ^1,2,*, Ruyi Feng ^3 and Liangpei Zhang ^1,2

^1 State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China

^2 Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China

^3 School of Computer Science, China University of Geosciences, Wuhan 430074, China

^* Author to whom correspondence should be addressed.

Remote Sens. 2016, 8(11), 894; https://doi.org/10.3390/rs8110894

Submission received: 8 April 2016 / Revised: 23 September 2016 / Accepted: 25 October 2016 / Published: 29 October 2016


Abstract:

In the past decades, sub-pixel mapping algorithms have been extensively developed because of their many applications. However, most sub-pixel mapping algorithms are based on single-temporal images, and without auxiliary information the results are usually compromised by the ill-posed nature of sub-pixel mapping. In this paper, a novel spatial-temporal sub-pixel mapping algorithm based on swarm intelligence theory is proposed for multitemporal remote sensing imagery. Swarm intelligence theory involves clonal selection sub-pixel mapping (CSSM), which evolves the solution by emulating the biological advantage of the human immune system, and differential evolution sub-pixel mapping (DESM), which optimizes the solution by intelligent operations and heuristic searching in the solution pool. In addition, considering the under-determined problem of sub-pixel mapping, the spatial-temporal sub-pixel mapping method is used to obtain the distribution information at a fine spatial resolution from the bitemporal image pair, which regularizes the ill-posed problem. Furthermore, the short-interval temporal information and the fine spatial distribution information within the bitemporal image pair can be integrated for further use, such as timely and detailed land-cover change detection (LCCD). To validate the swarm intelligence theory based spatial-temporal sub-pixel mapping algorithm, it was compared with several traditional sub-pixel mapping algorithms in both synthetic and real image experiments. The experimental results confirm that the proposed algorithm outperforms the traditional approaches, achieving a better sub-pixel mapping result both qualitatively and quantitatively, as well as improving the subsequent LCCD performance.

Keywords:

spatial-temporal sub-pixel mapping (SSM); swarm intelligence theory; clonal selection sub-pixel mapping (CSSM); differential evolution sub-pixel mapping (DESM); land-cover change detection (LCCD)

1. Introduction

Because the instantaneous field of view (FOV) of a sensor is usually larger than the land-cover objects it observes [1], mixed pixels are common in remote sensing imagery acquired by moderate/low-resolution sensors. Such pixels contain more than one land-cover class and cannot simply be attributed to a single class by hard classification. Therefore, the technique of spectral unmixing has been developed to handle the mixed pixel problem by decomposing the mixed spectrum into a combination of several pure spectra (endmembers) [2,3] and estimating the fraction that each spectrum contributes to the mixed model [4,5,6,7]. In spite of this, the outputs of spectral unmixing still retain a coarse spatial resolution, which cannot meet the requirement for detailed information.

Thus, the sub-pixel mapping method has been proposed to further improve the unmixing result by not only determining the proportion of each land-cover class within the mixed pixel, but also the specific distribution of each land-cover class, with the unmixing output as the input, generating a fine spatial resolution land-cover map. Sub-pixel mapping is based on the spatial dependence assumption, i.e., pixels which are nearer are more likely to be the same land-cover class than pixels which are further apart [8]. This idea was derived from the First Law of Geography, proposing the idea that everything is related to everything else, but near things are more related than distant things. It has also been proved to be an effective assumption for sub-pixel mapping techniques [9,10]. The goal is to maximize the spatial dependence and find the most plausible sub-pixel distribution, subsequently improving the spatial resolution relative to the input coarse image. In the past decades, sub-pixel mapping algorithms have been widely developed. An early sub-pixel mapping method was direct neighboring sub-pixel mapping (DNSM) [11], which is a simple and effective way of reconstructing a more accurate land-cover map than the traditional hard classification. Mertens et al. [9] proposed the spatial attraction sub-pixel mapping (SASM) method considering the spatial dependence between pixels and sub-pixels, and substantially enhanced the accuracy of the reconstructed map. Atkinson et al. [10,12,13,14,15] proposed the pixel swapping algorithm (PSSM) to maximize the spatial dependence by swapping the sub-pixels within a coarse pixel. However, only the spatial dependence between sub-pixels is taken into consideration, and the blind interactive swap just increases the computational complexity. 
More recently, Hopfield neural networks (HNNs) [16,17,18,19], particle swarm optimization [20], Markov random fields [21,22,23,24,25], the back-propagation neural network (BPSM) [26,27,28,29,30], multiagent system [31], and maximum a posteriori estimation [32,33,34] have been proposed to solve the problem, and have improved the reconstruction result compared to the traditional methods. Nevertheless, most of the current sub-pixel mapping algorithms are based on single-temporal images and only rely on the spatial dependence assumption, which cannot provide sufficient spatial distribution information for the reconstruction of mixed pixels.

Aiming at solving this ill-posed problem, spatial-temporal sub-pixel mapping algorithms have been proposed to obtain the spatial distribution information from the fine spatial resolution image with the same FOV to constrain the reconstruction of the land-cover map’s spatial distribution pattern, regularizing the under-determined problem and thus improving the accuracy of the sub-pixel mapping result. Some research has also been done to incorporate a fine spatial resolution image from a closer acquisition time [35,36,37,38,39,40,41,42,43,44,45,46,47,48] to solve the sub-pixel mapping problem. Ling et al. [35] proposed a method integrating the fine image in the process of sub-pixel mapping to constrain the solution, and further improved it in [36,37,38,39]. Wu et al. [40] also proposed an algorithm using images of different resolutions to provide a constraint; nevertheless, the spatial attraction model used in this framework only assigns class attributes pixel by pixel, which can easily fall into local optima. Gao et al. [41] proposed the spatial-temporal adaptive reflectance fusion model (STARFM), considering both spatial information from Landsat images and temporal information from Moderate Resolution Imaging Spectroradiometer (MODIS) images; however, the result is a reflectance product instead of a land-cover map. Huan et al. [42], Hilker et al. [43], and Hansen et al. [44] further improved the performance of STARFM by considering sensor observation differences. Wang et al. [45,46,47] also used the integration strategy to improve the sub-pixel mapping accuracy by the traditional fast sub-pixel mapping algorithm, and they further improved the performance by adding an HNN to the framework. However, they were only concerned with the method of integration of the fine image within the bitemporal image pair, and the inner sub-pixel mapping algorithm, which only considers the local spatial distribution information, was out of date, leading to a locally optimal solution.

Swarm intelligence theory is one of the most up-to-date population-based algorithms, and it has been successfully applied to the sub-pixel mapping problem [49]. It searches for the most plausible solution in the global population stochastically but heuristically by emulating the biological advantage of the human system, and the typical swarm intelligence theory method is the genetic algorithm (GA). The GA, which is based on natural selection and the natural genetic principle [49], is implemented with several genetic operators, such as crossover, mutation, and selection, on a set of solutions (population). Each solution is scored by a fitness value calculated based on the spatial dependence, and the solution with the highest score is selected to be the parent of the next generation. After a fixed number of iterations, the solution with the highest fitness value is selected as the optimal configuration of the coarse pixel. Nevertheless, traditional swarm intelligence theory based sub-pixel mapping algorithms are mainly implemented on single-temporal images, which cannot provide enough distribution pattern information for the reconstruction. In this paper, in order to obtain more auxiliary information for the constraint of the reconstruction, and based on the principle of the GA, spatial-temporal sub-pixel mapping based clonal selection (SSMCS) and spatial-temporal sub-pixel mapping based differential evolution (SSMDE) are proposed based on the framework of swarm intelligence theory. The clonal selection algorithm (CSA) [50,51], which is inspired by the artificial immune system (AIS), has a powerful information processing capability. It evolves the solution population by a number of genetic operators, as in the GA, but is improved by some advanced operators of initiation, selection, cloning, mutation, reselection, and population replacement according to the AIS. It is thus able to deal with a more complex search space than the GA. 
The differential evolution algorithm (DEA), which takes advantage of the differentiation information among the population to find the global optimum in the continuous search space, is an effective optimization technique because of its fast convergence, robustness, and simplicity [52]. It uses genetic operators to evolve from a randomly generated initial population to the final solution [52], during which new candidate solutions are generated and a greedy scheme is applied to decide whether the new candidate or its parent will survive in the next generation. Furthermore, a fusion strategy is carried out for the integration of the bitemporal image pair to provide enough distribution information, giving a constraint for the reconstruction. The proposed methods not only transform the sub-pixel mapping problem into a global optimization problem, which overcomes the drawback of falling into local optima, but they also consider the integration of the distribution pattern of the fine spatial resolution image. Meanwhile, the spatial dependence between sub-pixels and pixels is maximized, which yields more continuous land-cover boundaries, thereby enhancing the sub-pixel mapping accuracy.

LCCD is a way to analyze temporal changes in Earth surface properties from multitemporal datasets [53,54]. As a result of the diverse demands of ecosystem monitoring, disaster monitoring, urban expansion monitoring, and land management using remote sensing imagery, LCCD has become one of the most prevalent and effective methods. The rapid changes on the Earth’s surface mean that high-frequency temporal LCCD is becoming increasingly necessary for applications such as crop-growth monitoring and the detection of intraseasonal ecosystem disturbance, while maintaining a fine spatial resolution to provide adequate detail and boundary information [41]. Although sensors such as the Advanced Very High Resolution Radiometer (AVHRR) and MODIS can provide daily image sequences of the same district, the coarse spatial resolution of these sensors means that they are less useful in monitoring sufficient detailed changes, and the mixed pixels, which contain more than one land-cover class, seriously compromise the accuracy of LCCD. On the other hand, the Landsat Thematic Mapper (TM) can obtain a relatively fine spatial resolution, but the 16-day revisit period limits its application in detecting rapid surface changes and ephemeral events. Therefore, it is useful to be able to fulfill both a fine temporal resolution and a fine spatial resolution for LCCD through algorithm computation [35]. Meanwhile, in the framework of spatial-temporal sub-pixel mapping, bitemporal image pairs are used, one of which is the coarse image to be reconstructed, and the other is a fine spatial resolution land-cover map with the same FOV but a different acquisition time. 
Therefore, the proposed SSMDE and SSMCS can be applied to reconstruct the coarse image at a fine spatial resolution, which is then overlaid with the other fine spatial resolution image within the image pair to obtain the land-cover change information at the sub-pixel scale, achieving a differential map with both a fine spatial resolution (through the reconstruction process of sub-pixel mapping) and a fine temporal resolution, exactly meeting the demands of timely and detailed LCCD.

According to the above introduction, the main achievements of this paper can be summarized as follows:

(1) A spatial-temporal sub-pixel mapping model is built to obtain the distribution pattern information from the fine spatial resolution image with the same FOV according to the differential of each land-cover type between the fine and the coarse images, which helps to provide the sub-pixel mapping problem with a corroborative constraint and exactly regularizes this under-determined problem.

(2) A promising swarm intelligence algorithm, which includes a clonal selection algorithm (SSMCS) and a differential evolution algorithm (SSMDE), is successfully incorporated into the framework of the spatial-temporal sub-pixel mapping method, which transforms the sub-pixel mapping problem into an optimization problem and searches for an optimal solution by maximizing the spatial dependence index (SDI). The SDI is designed to quantify the spatial dependence and measure the spatial dependence of the spatial configuration according to the spatial dependence assumption. It is specifically introduced and formulated in Section 2.2.

(3) The genetic parameters in swarm intelligence theory, such as the mutation rate or crossover rate, are adaptively determined in the spatial-temporal framework by the proposed adaptive strategy, in which a higher fitness value deserves a lower mutation rate or crossover rate to avoid destroying the good structure of the antibodies in the population, while a lower fitness value deserves a higher mutation rate or crossover rate to remove the bad genetic information from the population.

The rest of this paper is organized as follows. Section 2 provides the background needed to understand the sub-pixel mapping formulation. Section 3 systematically introduces the specific operation of the two swarm intelligence algorithms incorporated within the spatial-temporal model (SSMDE and SSMCS). The qualitative and quantitative assessments implemented to verify the performance of the proposed algorithm are described in Section 4. Finally, the conclusion and future work are summarized in Section 5.

2. Sub-Pixel Mapping Problem

2.1. Background

The spectral unmixing technique is a kind of soft classification method used to determine the number of endmembers (land-cover classes) within a pixel and the fraction of each endmember when the acquired imagery has a low spatial resolution and contains many mixed pixels, which involve several land-cover classes and cannot be simply assigned to a specific class. Nevertheless, the spectral unmixing technique provides no information about the distribution pattern of these endmembers within the mixed pixel. Thus, the sub-pixel mapping method was developed to specifically determine the location of each land-cover class inside the mixed pixel, and the proportion each endmember occupies is determined by the fraction map yielded by the spectral unmixing. According to the spatial dependence assumption that analogous geographic elements have a tendency to be closer than disparate ones, sub-pixel mapping can reconstruct a reasonable solution compared to the corresponding real scene by quantifying the spatial dependence and maximizing the SDI.

To describe the operation necessary for sub-pixel mapping, we should ensure the corresponding relationship between the fine and coarse images by dividing every coarse pixel into $SP$ sub-pixels according to the scale factor $s$ (that is, $SP = s^{2}$, where $SP$ represents the total number of sub-pixels inside one coarse pixel). Let us assume that $X$ represents the coarse image with a size of $M \times N$, and $Y$ represents the reconstructed map, and thus has a size of $(s \times M) \times (s \times N)$. Figure 1 shows a simple example of the sub-pixel problem with two classes (class 1 and class 2), and the fraction image of class 1 is displayed in Figure 1a in a grid with the corresponding fraction value inside each coarse pixel. The scale factor is equal to 4, which means that one coarse pixel is divided into 16 ($4 \times 4$) sub-pixels, and the following step is to assign class labels to these sub-pixels based on the fraction of every class. For instance, the fraction value of class 1 in the left side of the upper coarse pixel is equal to 25%, which means that four (25% × 4 × 4) sub-pixels are assigned to class 1. After the operation of assigning classes to sub-pixels, Figure 1b and Figure 1c are two possible results. Since the neighboring pixels are considered to have a significant impact on the center pixel according to the spatial dependence assumption, sub-pixels within the same class are inclined to have a closer location, and therefore the distribution pattern of Figure 1b is more likely to conform with the real situation.

Although the process of sub-pixel mapping seems like a kind of per-pixel classification in a coarse pixel from the microscopic view, it does focus on a single divided pixel from a macro perspective. The sub-pixel mapping process is based on the spatial dependence principle that closer objects are more likely to belong to the same class. This is consistent with the First Law of Geography, i.e., the idea that everything is related to everything else, but near things are more related than distant things. Therefore, the neighboring pixels are considered to have an influence, attracting the sub-pixels of the same class in the center pixel. Furthermore, the sub-pixel mapping process also follows the fraction constraint generated by the spectral unmixing techniques, indicating the number of classes in each mixed pixel as well as the proportion each class occupies.

In order to clearly illustrate the sub-pixel mapping problem, a simple example is shown in Figure 2. Figure 2a is a coarse image, for which the spatial resolution is so poor that we cannot distinguish anything. However, based on the spatial dependence principle and the fraction constraint, we can look into the coarse pixels, exploring the intricate patterns and boundaries of each land-cover class, where blue represents river, green represents tree, brown represents house, and yellow represents soil, as shown in Figure 2b. From this example, we can see that the coarse pixels are related to each other, which is not just per-pixel classification. Thus, we do not just focus on dividing a coarse pixel into s × s sub-pixels and per-pixel classification, and the dividing process (s × s) is just a bridge to further discuss the specific distribution of the coarse image. Furthermore, with regard to the sub-pixel mapping result, the spatial resolution has improved, and, therefore, when compared to the original coarse pixel, the terminology “sub-pixel mapping” is naturally validated.

2.2. Formulation

According to the spatial dependence assumption, real geographic scenes usually have great spatial dependence both between and within land-cover classes, and we can therefore quantify the spatial dependence by transforming the sub-pixel problem into an optimization problem and searching for an optimal solution by maximizing the SDI. Supposing that the fraction map of the $c$th land-cover class has been generated, then the number of sub-pixels within the center coarse pixel assigned to the $c$th class can be formulated as follows:

$$NC_{c} = \operatorname{round}(SP \cdot Fraction_{c}) \tag{1}$$

where $SP$ is the total number of sub-pixels within the center coarse pixel, $Fraction_{c}$ is the fraction occupied by the $c$th class, and $\operatorname{round}()$ rounds its argument to the nearest integer.
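As a minimal sketch of Equation (1), the snippet below converts class fractions into sub-pixel counts for one coarse pixel. The function name `subpixel_counts` is hypothetical and chosen only for illustration. Note that Python's built-in `round` resolves exact halves to the nearest even integer, and that rounded counts are not guaranteed to sum to $SP$ for arbitrary fractions; how such cases are handled is not specified in the text above.

```python
def subpixel_counts(fractions, scale):
    """Sub-pixel count per class in one coarse pixel, following Equation (1).

    fractions: class fractions of the coarse pixel (expected to sum to 1).
    scale: scale factor s, so that SP = s * s.
    """
    sp = scale * scale  # SP = s^2
    # Built-in round() uses round-half-to-even on exact ties.
    return [round(sp * f) for f in fractions]


# Two-class example from Figure 1: 25% of class 1 with s = 4.
print(subpixel_counts([0.25, 0.75], scale=4))
# Three-class example with s = 5 (fractions 52%, 20%, 28%).
print(subpixel_counts([0.52, 0.20, 0.28], scale=5))
```

With the 25% fraction and a scale factor of 4, the first class receives four sub-pixels, matching the worked example in Section 2.1.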

In order to measure the spatial dependence of the $s$th sub-pixel with respect to the $c$th class, the neighboring coarse pixels are taken into consideration, and the formula can be expressed as:

$$SDI_{cs} = \sum_{n=1}^{N} \omega_{n} \cdot Fraction_{c,n} \tag{2}$$

where $N$ generally represents the eight neighboring pixels, and $\omega_{n}$ is the inverse distance weighting coefficient between the $s$th sub-pixel and the $n$th neighboring coarse pixel, which is calculated by Equations (3) and (4). $Fraction_{c,n}$ represents the fraction value of the $c$th class inside the $n$th neighboring coarse pixel.

$$\omega_{n} = 1 / d \tag{3}$$

$$d = \sqrt{(l - j)^{2} + (m - k)^{2}} \tag{4}$$

where $(l, m)$ are the coordinates of the sub-pixel inside the center coarse pixel, and $(j, k)$ are the coordinates of the neighboring coarse pixel. The process of distance calculation between the sub-pixel and pixel in the sub-pixel coordinate system is illustrated in Figure 3 by a simple example with the scale factor of 4. The coordinates of the sub-pixel are (4.5, 6.5), and the coordinates of the neighboring coarse pixel are (6, 2). According to Equations (3) and (4), the weight of the selected sub-pixel to the left middle neighboring pixel is approximately 0.21.
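The weight and SDI calculations can be sketched as follows. The function names `idw_weight` and `sdi`, the layout of the eight neighbor centers, and the neighbor fractions are illustrative assumptions, not values taken from the paper. For the coordinates quoted in the text, (4.5, 6.5) and (6, 2), the weight evaluates to about 0.21.

```python
import math


def idw_weight(subpixel, neighbor):
    """Inverse distance weight between a sub-pixel and a coarse-pixel center."""
    (l, m), (j, k) = subpixel, neighbor
    d = math.hypot(l - j, m - k)  # Equation (4)
    return 1.0 / d                # Equation (3)


def sdi(subpixel, neighbors, fractions):
    """Spatial dependence index of one sub-pixel for one class, Equation (2).

    neighbors: centers of the neighboring coarse pixels (sub-pixel coordinates).
    fractions: fraction of the class in each neighboring coarse pixel.
    """
    return sum(idw_weight(subpixel, nb) * f for nb, f in zip(neighbors, fractions))


weight = idw_weight((4.5, 6.5), (6.0, 2.0))
print(round(weight, 2))

# Hypothetical neighborhood for s = 4: center coarse pixel at (6, 6),
# eight neighbors one coarse pixel (4 sub-pixels) away.
neighbors = [(2, 2), (2, 6), (2, 10), (6, 2), (6, 10), (10, 2), (10, 6), (10, 10)]
fractions = [0.0, 0.25, 0.5, 0.0, 1.0, 0.0, 0.5, 1.0]  # made-up class fractions
print(sdi((4.5, 6.5), neighbors, fractions))
```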

To better solve the optimization problem, we can mathematically formulate the total spatial dependence index (TSDI) of the sub-pixels within one coarse pixel as the binary sum of the sub-pixels inside the center coarse pixel using:

$$TSDI = \sum_{c=1}^{C} \sum_{s=1}^{SP} y_{cs} \times SDI_{cs} \tag{5}$$

where $y_{cs}$ is a binary function to decide whether the $s$th sub-pixel of the $c$th class should be added to the total spatial dependence, and is illustrated as follows:

$$y_{cs} = \begin{cases} 1, & \text{if sub-pixel } s \text{ is assigned to class } c \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

$$\text{s.t.} \quad \sum_{c=1}^{C} y_{cs} = 1, \quad s = 1, 2, \ldots, SP \tag{7}$$

$$\sum_{s=1}^{SP} y_{cs} = NC_{c}, \quad c = 1, 2, \ldots, C \tag{8}$$

where Equation (7) indicates that each sub-pixel within the coarse pixel is assigned to exactly one class, and Equation (8) expresses the fraction constraint that the number of sub-pixels assigned to the $c$th class should be equal to $NC_{c}$. In this way, the sub-pixel mapping problem of simply assigning classes is transformed into an optimization problem of searching for the optimal solution in a continuous solution space through maximizing the TSDI. In this paper, we use swarm intelligence computation to solve the optimization problem.
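The objective in Equation (5) and the constraints in Equations (7) and (8) can be sketched for a single coarse pixel as below. The representation is an assumption made for illustration: a configuration is stored as a list `labels` holding one class index per sub-pixel (so Equation (7) holds by construction), and `sdi_table[c][s]` holds precomputed $SDI_{cs}$ values. The function name `tsdi` and the numbers in the example are hypothetical.

```python
def tsdi(labels, sdi_table, counts):
    """Total spatial dependence index of one configuration, Equation (5).

    labels: class index of each sub-pixel (length SP); one label per
            sub-pixel means Equation (7) is satisfied by construction.
    sdi_table: sdi_table[c][s] is SDI_cs for class c and sub-pixel s.
    counts: counts[c] is NC_c from Equation (1).
    """
    n_classes = len(sdi_table)
    if any(c < 0 or c >= n_classes for c in labels):
        raise ValueError("label outside the class range")
    # Fraction constraint, Equation (8).
    for c in range(n_classes):
        if labels.count(c) != counts[c]:
            raise ValueError("fraction constraint violated for class %d" % c)
    # y_cs is 1 only for the class assigned to sub-pixel s, Equation (6).
    return sum(sdi_table[c][s] for s, c in enumerate(labels))


# Made-up SDI values for 2 classes and SP = 4 sub-pixels.
sdi_table = [[0.9, 0.8, 0.2, 0.1],
             [0.1, 0.2, 0.8, 0.9]]
print(tsdi([0, 0, 1, 1], sdi_table, counts=[2, 2]))  # clustered labels
print(tsdi([1, 1, 0, 0], sdi_table, counts=[2, 2]))  # opposite arrangement
```

In this toy table the first configuration places each class where its SDI is high, so it scores higher than the second.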

2.3. Swarm Intelligence Theory for Sub-Pixel Mapping Based on a Single-Temporal Image

Swarm intelligence theory is one of the latest algorithms to imitate the human system, and it generally consists of the intelligent processes of coding, population initialization, evolution, and population updating. These processes have the ability to eliminate the inferior individuals and select superior individuals for the next generation. The main processes are as follows:

(1) Coding and population initialization

Firstly, one coarse pixel is to be divided into $s \times s$ sub-pixels according to the principle of the sub-pixel mapping problem, and the class attributes are randomly assigned to these sub-pixels, but constrained by the fraction each class occupies in this coarse pixel. The process of coding is then implemented by connecting the first sub-pixel in each row to the last sub-pixel of the previous row, row by row (as shown in Figure 4), to generate a linear structure, which we generally call the individual. Finally, since the evolutionary algorithm is a population-based method which operates on a set of solutions to select the fittest one, the $NP$ individuals are randomly generated with a fraction constraint, and each individual $a_{i}, i = 1, 2, \ldots, NP$ represents one configuration of the reconstruction of the coarse pixel. $NP$ is determined empirically by the complexity of the spatial distribution and scale factor $s$. If the spatial distribution is complex and the scale factor $s$ is large, which means that a diverse scene is to be reconstructed, we should set a large solution searching space to obtain the best solution. In this paper, $NP$ is set to 100.
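The coding and initialization step might be sketched as follows, assuming the row-by-row linear structure corresponds to a row-major flattening of the $s \times s$ block. The function names `init_population` and `decode` are hypothetical, and shuffling a fixed multiset of labels is one simple way (an assumption here) to draw random individuals that respect the fraction constraint.

```python
import random


def init_population(counts, n_individuals, rng):
    """Random individuals that all satisfy the fraction constraint.

    counts: counts[c] is the number of sub-pixels of class c (NC_c).
    Each individual is a flat, row-major list of class labels of length SP.
    """
    base = [c for c, n in enumerate(counts) for _ in range(n)]
    population = []
    for _ in range(n_individuals):
        individual = base[:]
        rng.shuffle(individual)  # random positions, fixed class counts
        population.append(individual)
    return population


def decode(individual, scale):
    """Turn the linear structure back into s rows of s sub-pixels."""
    return [individual[r * scale:(r + 1) * scale] for r in range(scale)]


rng = random.Random(0)
population = init_population(counts=[4, 12], n_individuals=100, rng=rng)
for row in decode(population[0], scale=4):
    print(row)
```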

(2) Evolution

After initialization of the population, a series of evolution operators are carried out on the solution pool to evolve the solutions. The typical evolution operators involve crossover and mutation. The crossover operator is essential to take advantage of the differential information. It randomly selects two individuals and a crossover point, and exchanges the part of the linear structure after the crossover point to yield two offspring. The mutation operator is mainly for individuals to escape from the local optima and maintain the diversity of the population. In the sub-pixel mapping problem, mutation is implemented by exchanging the class attributes of two randomly selected positions of the individual, thereby achieving the effect of mutation.
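The two operators described above can be sketched on the linear structure as follows. The function names are hypothetical. One caveat, stated as an observation about this sketch rather than about the authors' implementation: a plain one-point crossover can change the number of sub-pixels per class in each offspring, so a repair or rejection step (not described in this section) would be needed to keep the fraction constraint, whereas the swap mutation preserves the class counts automatically.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Exchange the tails of two individuals after a random crossover point."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def swap_mutation(individual, rng):
    """Exchange the class labels at two randomly selected positions."""
    i, j = rng.sample(range(len(individual)), 2)
    child = individual[:]
    child[i], child[j] = child[j], child[i]
    return child


rng = random.Random(0)
a = [0, 0, 1, 1, 1, 1, 0, 0]
b = [1, 1, 0, 0, 0, 0, 1, 1]
print(one_point_crossover(a, b, rng))
print(swap_mutation(a, rng))
```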

(3) Updating the population

After the evolution operation, the TSDI values of each individual are calculated to select the parent individuals of the next generation, and the individual with the highest TSDI value will survive. The stopping condition is set as a fixed number of generations, and the individual with the maximum TSDI value is selected as the optimal solution for the reconstruction of the coarse pixel.
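Putting the three processes together, a compact generational loop might look like the sketch below. Several choices are assumptions made to keep the example short: only swap mutation is used (so the fraction constraint is never violated), survivors are the top `n_individuals` of parents plus offspring, and the `sdi_table` values are made up. The names `fitness` and `evolve` are hypothetical.

```python
import random


def fitness(individual, sdi_table):
    """TSDI of an individual: sum of SDI_cs over the assigned classes."""
    return sum(sdi_table[c][s] for s, c in enumerate(individual))


def evolve(counts, sdi_table, n_individuals=20, n_generations=50, seed=0):
    rng = random.Random(seed)
    base = [c for c, n in enumerate(counts) for _ in range(n)]
    # (1) Coding and population initialization under the fraction constraint.
    population = []
    for _ in range(n_individuals):
        individual = base[:]
        rng.shuffle(individual)
        population.append(individual)
    for _ in range(n_generations):
        # (2) Evolution: one swap mutation per parent.
        offspring = []
        for parent in population:
            child = parent[:]
            i, j = rng.sample(range(len(child)), 2)
            child[i], child[j] = child[j], child[i]
            offspring.append(child)
        # (3) Updating: keep the individuals with the highest TSDI.
        merged = population + offspring
        merged.sort(key=lambda ind: fitness(ind, sdi_table), reverse=True)
        population = merged[:n_individuals]
    # Stopping condition: fixed number of generations, return the best.
    return max(population, key=lambda ind: fitness(ind, sdi_table))


sdi_table = [[0.9, 0.8, 0.2, 0.1],
             [0.1, 0.2, 0.8, 0.9]]
best = evolve(counts=[2, 2], sdi_table=sdi_table)
print(best, fitness(best, sdi_table))
```

Because the best individual is never discarded, the best TSDI in the population cannot decrease from one generation to the next.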

3. Swarm Intelligence Theory Based Spatial-Temporal Sub-Pixel Mapping

Sub-pixel mapping is an ill-posed problem, since not enough distribution information exists in a single-temporal image to properly constrain the solution of the reconstruction. Meanwhile, in the framework of spatial-temporal sub-pixel mapping, a bitemporal image pair can be considered to have similar geographic distributions, thus the distribution of the fine spatial resolution image within the bitemporal image pair can be obtained and incorporated into the process of sub-pixel mapping. This provides the sub-pixel mapping problem with a corroborative constraint and regularizes this under-determined problem, improving the outcome of the sub-pixel mapping.

A number of studies have addressed incorporating the fine spatial resolution image into the sub-pixel mapping process. Nevertheless, these traditional algorithms and neural network algorithms, which search for the solution stochastically, are out of date. In this paper, we propose two spatial-temporal sub-pixel mapping algorithms based on swarm intelligence theory, to incorporate the distribution constraint from the fine spatial resolution image into the intelligent processes of the evolutionary algorithm, such as the classical events of coding, initialization of the population, crossover, mutation, etc. The framework of spatial-temporal sub-pixel mapping based on swarm intelligence theory is shown in Figure 5. In order to reconstruct the spatial distribution of the coarse image at $T_{2}$, the fine image at $T_{1}$ with the same FOV is borrowed by the spatial-temporal sub-pixel mapping to yield an integrated fine image at $T_{2}$. However, there are still many pixels that cannot be uniquely determined, which are represented by the black pixels in Figure 5. Therefore, swarm intelligence theory is used to generate the optimal solution of the undetermined black pixels, outputting the final reconstructed fine image at $T_{2}$.

3.1. Spatial-Temporal Sub-Pixel Mapping

We suppose that the fine image at acquisition time $T_{1}$ has been classified into three land-cover classes (light gray represents class 1, dark gray represents class 2, and black represents class 3), and spectral unmixing has been carried out on the coarse image (assuming that the spatial resolution of the fine image is five times that of the coarse image) at acquisition time $T_{2}$ (assuming that time $T_{1}$ and $T_{2}$ are close) to obtain the abundance map for the three land-cover classes (52%, 20% and 28%) (one coarse pixel is shown in Figure 6 as an example). A mean filter is then implemented to degrade the fine image (with the scale factor of five) at $T_{1}$ to ensure that the proportion of the three land-cover classes (40%, 32% and 28%) is comparable to the unmixing result. During the time interval of $T_{1}$ to $T_{2}$, it can be seen that class 1 has increased by 12%, which is equivalent to three (12% × 5 × 5) sub-pixels when transformed to the integer number of sub-pixels, and thus one can suppose that when one land-cover class has increased, some other land-cover class has changed to the increased class, but no increased class has changed to another class. Therefore, the unchanged part of the increased class can be copied to the distribution reconstruction of the coarse image, and the three extra sub-pixels of class 1 can only be allocated in the red circle. The upper situation in Figure 6 is a possible distribution of class 1 in the coarse image at $T_{2}$. Meanwhile, class 2 has decreased by 12% during the time interval, and thus one may assume that when the land-cover class has decreased, no other class has changed to the decreased class, but the decreased class has changed to some other class. Therefore, the three sub-pixels of class 2 change to class 1, which is shown in the middle sub-figure, with the dotted area meaning that class 2 is about to change to class 1. For the black class, however, the proportion remains the same, and thus one can make the assumption that when the land-cover class remains unchanged, the spatial distribution of the class is also unchanged, which can be copied to the reconstruction of the coarse image.

Thus, both the fraction and real distribution of each class at the previous time are used, and the fraction is used as the determinant to decide whether or not each class increased or decreased, or was even unchanged, between the two temporal images. In addition, according to the fraction change tendency, we can decide which part of the real distribution at the previous time can be borrowed for the sub-pixel mapping of the imagery of the current time.

Integrating the land-cover map of the three classes, one can see that many sub-pixels can be uniquely determined for a certain class, except for the area within the red circle, which contains three sub-pixels of the light gray class and five sub-pixels of the dark gray class. In this paper, we use swarm intelligence theory as the optimization method to uniquely determine the position of the sub-pixels while maximizing the TSDI.
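The integration rules for one coarse pixel can be sketched as follows. This is an illustrative reading of the three assumptions above, not the authors' code: sub-pixels of an increased or unchanged class keep their $T_{1}$ label, sub-pixels of a decreased class become undetermined, and the labels still to be placed are the gains of the increased classes plus the remaining $T_{2}$ counts of the decreased classes. The function name `integrate_fine_map` and the spatial arrangement of the example are hypothetical; only the class counts (10, 8, 7 at $T_{1}$ and 13, 5, 7 at $T_{2}$) follow the example in the text.

```python
from collections import Counter


def integrate_fine_map(fine_t1, counts_t2):
    """Split one coarse pixel into determined and undetermined sub-pixels.

    fine_t1: flat list of class labels of the s*s sub-pixels at T1.
    counts_t2: dict mapping class -> sub-pixel count at T2 (from unmixing).
    Returns (fixed, undetermined, budget):
      fixed: dict position -> label copied from T1,
      undetermined: positions left for the optimizer,
      budget: dict class -> number of labels to place in those positions.
    """
    counts_t1 = Counter(fine_t1)
    classes = set(counts_t1) | set(counts_t2)
    change = {c: counts_t2.get(c, 0) - counts_t1.get(c, 0) for c in classes}

    fixed, undetermined = {}, []
    for pos, c in enumerate(fine_t1):
        if change[c] >= 0:
            fixed[pos] = c            # increased or unchanged class: copy
        else:
            undetermined.append(pos)  # decreased class: may change label

    budget = {}
    for c in classes:
        if change[c] > 0:
            budget[c] = change[c]             # extra sub-pixels gained
        elif change[c] < 0:
            budget[c] = counts_t2.get(c, 0)   # sub-pixels that remain
    return fixed, undetermined, budget


# Hypothetical 5 x 5 arrangement with 10, 8, and 7 sub-pixels at T1.
fine_t1 = [1] * 10 + [2] * 8 + [3] * 7
fixed, undetermined, budget = integrate_fine_map(fine_t1, {1: 13, 2: 5, 3: 7})
print(len(fixed), len(undetermined), budget)
```

For these counts, eight sub-pixels remain undetermined and must receive three labels of class 1 and five of class 2, which is the sub-problem handed to the swarm intelligence optimizer.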

3.2. Swarm Intelligence Theory Based Spatial-Temporal Sub-Pixel Mapping

Swarm intelligence theory based sub-pixel mapping of a single-temporal image was introduced in the previous section. In this section, differing from the traditional GA, swarm intelligence theory is applied to a bitemporal image pair. To further improve the genetic process of swarm intelligence theory, two specific evolutionary algorithms are introduced in this section.

3.2.1. Spatial-Temporal Sub-Pixel Mapping Based on the Clonal Selection Algorithm (CSA)

The CSA is inspired by the human immune system, which can resist the infection of a virus by recognizing the extrinsic antigens. This algorithm generates a pool of candidate antibodies (solutions), selecting the most suitable antibodies corresponding to the antigens, and eliminates the antigens by the combination of the antigens and antibodies, thus solving the optimization problem [55]. When detecting the assault of antigens, the immune system will generate a pool of candidate antibodies, and the antibody with the most affinity to the antigen will successfully combine with the antigen and eliminate it. Thus, the main mechanism of the immune system is the updating of the antibody pool to obtain the most suitable antibodies corresponding to the antigens. The updating mechanism consists of a series of maturation processes, such as cloning the antibodies with a higher fitness to pass on to the next generation for the reason of maintaining a suitable structure, replacing the antibodies with a lower fitness, and mutating to retain diversity of the antibody pool. Through this process of selecting superior antibodies to pass on and eliminating inferior antibodies generation by generation, the best antibody with the most suitable structure will be generated.

The CSA is derived from this mechanism and imitates the intelligent operations of cloning, selection, mutation, and elimination [56]. The antigens are analogous to the coarse pixel to be reconstructed, and the antibody is the solution of the reconstruction map of that coarse pixel. The suitability degree of the antibody to the antigen is measured by the TSDI. In the CSA, the solutions with a higher TSDI are selected to reproduce clonal solutions as the next generation for the sake of maintaining good genetic information, and the solutions with a lower TSDI are eliminated to remove the bad genetic information from the population. In order to facilitate the maturation of the solutions while retaining the diversity of the solution pool, hypermutation is used, giving a chance for solutions to escape from local regions. More importantly, the operations of cloning and selection corroboratively confirm the convergence of the optimization problem [57]. The specific operation of the CSA can be described in the following steps.

(1) Coding and population initialization

In the first step, the integrated land-cover map at $T_{2}$ is coded into the linear structure shown in Figure 7, according to the clonal selection principle, by connecting the first sub-pixel in each row to the last sub-pixel of the previous row, row by row. Population initialization involves randomly assigning the three light gray sub-pixels and five dark gray sub-pixels to the empty positions within the red circle, generating a number of antibodies equal to the population size $NP$. Each antibody $an_{i}, i = 1, 2, \ldots, NP$ represents a possible spatial distribution pattern inside the coarse pixel. After initialization of the population, the TSDI of each antibody is calculated to obtain the highest TSDI value $\max(TSDI(an_{i}))$ and the lowest TSDI value $\min(TSDI(an_{i}))$ for further use.
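As a rough sketch of this step, the coding and random initialization might look as follows. The names `flatten_row_major` and `init_population`, and the way the determined and undetermined positions are passed in, are illustrative assumptions rather than the authors' code.

```python
import random

def flatten_row_major(grid):
    """Code a square block of sub-pixels as a linear structure, row by row."""
    return [label for row in grid for label in row]

def init_population(fixed, free_positions, labels_to_place, size, np_size, rng):
    """Generate np_size antibodies that all satisfy the fraction constraint.

    fixed:           {position: label} for sub-pixels that are already determined.
    free_positions:  positions still to be filled (e.g., inside the red circle).
    labels_to_place: multiset of labels to distribute over free_positions.
    """
    population = []
    for _ in range(np_size):
        shuffled = labels_to_place[:]
        rng.shuffle(shuffled)              # random spatial arrangement
        antibody = [None] * size
        for pos, label in fixed.items():
            antibody[pos] = label
        for pos, label in zip(free_positions, shuffled):
            antibody[pos] = label
        population.append(antibody)
    return population
```

Because every antibody is a permutation of the same labels over the undetermined positions, the class fractions are identical across the whole population.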

(2) Evolution

After initialization of the population, the clonal operation is implemented to ensure that the better genetic information is passed on to the next generation. Each antibody $an_{i}$ produces $n_{cl}$ clonal antibodies according to the formula:

$$n_{cl} = \mathrm{round}(\alpha \cdot NP) \tag{9}$$

where $\alpha$ is the clonal ratio parameter, $\mathrm{round}()$ rounds its argument to the nearest integer, and $n_{cl}$ represents the number of clones each antibody generates. After the process of cloning, each antibody has its clonal set $an_{i}^{t}, i = 1, 2, \ldots, NP, t = 1, 2, \ldots, n_{cl}$, and to ensure there is at least one antibody in the clonal set, $n_{cl}$ is set as no less than 1. Thus, the total number of antibodies in the population is equal to $NP \times n_{cl}$.

The hypermutation operation is undertaken to prevent the solution from becoming trapped in local optima, while promoting the convergence of these intelligent processes to obtain the best solution with the highest TSDI. In this paper, in order to improve the intelligence of the evolutionary processes, we adopt an adaptive mutation operation, in which the mutation rate $pm_{i}$ is adaptively determined according to the fitness degree of the antibody $an_{i}$, as formulated below:

$$pm_{i} = \exp(-\beta \cdot R(an_{i})), \quad i = 1, 2, \ldots, NP \tag{10}$$

$$R(an_{i}) = \frac{TSDI(an_{i}) - \min(TSDI(an_{i}))}{\max(TSDI(an_{i})) - \min(TSDI(an_{i}))}, \quad i = 1, 2, \ldots, NP \tag{11}$$

which is based on the principle that a higher fitness value deserves a lower mutation rate to avoid destroying the good structure of the antibodies in the population, while a lower fitness value deserves a higher mutation rate to remove the bad genetic information from the population. Parameter $\beta$ is the decay factor defined by the user, and is usually set to 2. Because the fraction constraint calculated earlier fixes the number of sub-pixels allocated to each land-cover class, simply changing the attribute of a single sub-pixel would break the constraint. Mutation is therefore achieved by exchanging the attributes of two randomly selected sub-pixel positions.

(3) Updating the antibody population

After the hypermutation operation, the TSDI values of each antibody, as well as the mutated clones, are calculated to decide which antibodies can pass structural information on to the next generation, and the $NP$ antibodies with the highest TSDI values are selected as parents for the next generation. To retain diversity of the population and explore the unknown solution space, the $ND$ antibodies ranking at the end of the $NP$ parents are replaced by newly produced antibodies. $ND$ represents the number of displaced individuals, and is predefined by the user. The stopping condition is set as a fixed number of generations, and the antibody with the maximum TSDI is selected as the optimal solution for the reconstruction of the coarse pixel and is output as the result.
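A compact sketch of this updating step is given below. The `fitness` callable stands in for the TSDI, whose definition is given earlier in the paper, and `make_random` is a hypothetical helper that returns a newly produced antibody; both are assumptions for illustration.

```python
def update_population(candidates, fitness, np_size, nd, make_random):
    """Keep the np_size fittest candidates, then replace the nd worst of them.

    candidates:  parents plus mutated clones.
    fitness:     callable returning the fitness (here, the TSDI) of a candidate.
    make_random: callable returning a newly generated antibody.
    """
    ranked = sorted(candidates, key=fitness, reverse=True)
    parents = ranked[:np_size]
    survivors = parents[:np_size - nd]            # drop the nd lowest-ranked parents
    newcomers = [make_random() for _ in range(nd)]
    return survivors + newcomers
```

The population size stays at $NP$ after every update, since the number of newcomers equals the number of displaced antibodies.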

3.2.2. Spatial-Temporal Sub-Pixel Mapping Based on the Differential Evolution Algorithm (DEA)

The DEA is one of the latest global optimization methods, which searches for the solution heuristically in the global space [58], and is incorporated into the framework of spatial-temporal sub-pixel mapping in this paper. A genetic operator such as crossover is adopted to take advantage of the differential information for facilitating the convergence of the solution, and the genetic parameters of mutation rate, crossover rate, and scale factor are adaptively determined by the adaptive scheme.

(1) Coding and population initialization

At the outset, differing from the coding mechanism in the CSA, the DEA codes the serial number of the sub-pixel instead of the class label, as shown in Figure 8. The linear structure filled with only 1, 2, and 3 represents the class attribute position, which is constructed according to the fraction constraint of the coarse pixel at $T_{2}$. The individual is therefore generated by filling the attribute block with serial numbers according to the class block of the class attribute position. For example, in Figure 8, the position with a serial number of 3 has the attribute number of 1, and thus the coding operation is implemented by filling the attribute block of class 1 with the serial number of 3, and the position with a serial number of 11 has the attribute number of 3, so the attribute block of class 3 is filled with the serial number of 11.

Each individual $IDV_{i}, i = 1, 2, \ldots, NP$ represents a possible spatial distribution pattern inside the coarse pixel, and the TSDI of each individual is calculated for further use.
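The serial-number coding can be illustrated with a small sketch. The helper names `encode_serial` and `decode_serial`, and the convention that serial numbers start at 1 and follow the row-by-row order, are assumptions made for this example.

```python
def encode_serial(labels, class_order):
    """Code a label sequence as blocks of serial numbers, one block per class.

    labels[k] is the class of the sub-pixel with serial number k + 1.
    """
    individual, block_sizes = [], []
    for cls in class_order:
        serials = [k + 1 for k, lab in enumerate(labels) if lab == cls]
        individual += serials
        block_sizes.append(len(serials))   # fixed by the fraction constraint
    return individual, block_sizes

def decode_serial(individual, block_sizes, class_order):
    """Recover the class label of every sub-pixel from the serial-number coding."""
    labels = [None] * len(individual)
    start = 0
    for cls, size in zip(class_order, block_sizes):
        for serial in individual[start:start + size]:
            labels[serial - 1] = cls
        start += size
    return labels
```

For instance, the labels `[2, 3, 1, 1, 2, 3]` are coded as `[3, 4, 1, 5, 2, 6]` with three blocks of size 2: the class 1 block holds serial numbers 3 and 4.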

(2) Evolution

Differing from the traditional mutation operation of simply exchanging the attributes of two sub-pixel positions, the individual undergoing the mutation is regenerated by adding the weighted difference of two randomly selected individuals to a third randomly selected one, which is formulated as follows:

$$IDV_{i,G}^{'} = IDV_{r1,G} + F_{i,G} \cdot (IDV_{r2,G} - IDV_{r3,G}) \tag{12}$$

where the indices $r1$, $r2$, and $r3$ are randomly generated within the range [1, NP] and are mutually different, as well as different from $i$, which means $r1 \neq r2 \neq r3 \neq i$. $F_{i,G}$ is the scale factor parameter of the $i$th individual in the $G$th generation, which balances the scale of the difference. $IDV_{r1,G}$ represents the $r1$th individual of the $G$th generation, and $IDV_{i,G}^{'}$ indicates the newly generated $i$th individual of the $G$th generation as an intermediate quantity.

The crossover operator is then implemented and can be defined as follows:

$${IDV_{i,G}^{j}}^{''} = \begin{cases} {IDV_{i,G}^{j}}^{'}, & \text{if } rand_{1} < pc_{i,G} \\ IDV_{i,G}^{j}, & \text{otherwise} \end{cases} \tag{13}$$

where $IDV_{i,G}^{j}$ represents the serial number of the $j$th position in the $i$th individual of the $G$th generation, and ${IDV_{i,G}^{j}}^{'}$ represents the corresponding serial number after the mutation operation. $rand_{1}$ is a randomly generated number ranging from 0 to 1. $pc_{i,G}$ is the genetic parameter of the crossover rate, which determines the probability of the crossover operation.
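Equations (12) and (13) can be sketched as follows. Since the individuals hold integer serial numbers, the sketch rounds the mutated values and clips them to the valid range; this mapping back to integers is an assumption for illustration and is not specified in the text. The function names are likewise hypothetical.

```python
import random

def de_mutate(population, i, f_scale, rng):
    """Eq. (12): add the weighted difference of two individuals to a third one."""
    others = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(others, 3)       # mutually different and different from i
    n = len(population[i])
    mutant = []
    for a, b, c in zip(population[r1], population[r2], population[r3]):
        value = round(a + f_scale * (b - c))
        mutant.append(min(n, max(1, value)))  # assumption: clip to valid serial numbers 1..n
    return mutant

def de_crossover(parent, mutant, pc, rng):
    """Eq. (13): take the mutated serial number with probability pc, position by position."""
    return [m if rng.random() < pc else p for p, m in zip(parent, mutant)]
```

The result may contain repeated serial numbers, which is why a repair step is needed afterwards.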

In order to determine the genetic parameters adaptively, rather than manually, which is time-consuming, an adaptive strategy is adopted based on the principle that the better individuals are more likely to survive and pass on better genetic information (better structural information) to the next generation. It is formulated as follows:

$$pc_{i,G+1} = \begin{cases} rand_{2}, & \text{if } rand_{3} < pm_{i,G} \\ pc_{i,G}, & \text{otherwise} \end{cases} \tag{14}$$

$$F_{i,G+1} = \begin{cases} 1 - rand_{4}^{(1 - G/IG)^{d}}, & \text{if } rand_{5} < pm_{i,G} \\ F_{i,G}, & \text{otherwise} \end{cases} \tag{15}$$

where $pc_{i,G}$ and $F_{i,G}$ represent the crossover rate and scale factor of the current generation, while $pc_{i,G+1}$ and $F_{i,G+1}$ indicate the genetic parameters of the next generation, respectively. $rand_{3}$, $rand_{4}$, and $rand_{5}$ are randomly generated numbers within the range [0, 1]. $IG$ represents the total number of generations of the differential evolution process defined by the user. $d$ is the nonconforming degree parameter, and is empirically set to 3 [59]. $pm_{i,G}$ is the mutation rate, which acts as a threshold of the adaptive scheme, depending on the best and worst TSDI values of the current generation, and is formulated as follows:

$$pm_{i,G} = \frac{T(IDV_{i,G}) - T(IDV_{worst,G})}{T(IDV_{best,G}) - T(IDV_{worst,G})} \tag{16}$$

where $T(IDV_{i,G})$ means the TSDI value of the $i$th individual in the $G$th generation, and $T(IDV_{best,G})$ and $T(IDV_{worst,G})$ represent the best and the worst TSDI values of the $G$th generation, respectively.
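The adaptive scheme of Equations (14)–(16) can be written as a short sketch. The function names are hypothetical, $rand_{2}$ is taken to be uniform in [0, 1) like the other random numbers, and the value returned when the best and worst TSDI coincide is an assumption.

```python
import random

def adaptive_threshold(t_i, t_best, t_worst):
    """Eq. (16): normalized TSDI of individual i within the current generation."""
    span = t_best - t_worst
    return (t_i - t_worst) / span if span > 0 else 0.0  # assumption for the degenerate case

def update_parameters(pc, f_scale, pm, g, ig, rng, d=3):
    """Eqs. (14)-(15): regenerate the crossover rate and scale factor with probability pm.

    g is the current generation, ig the total number of generations.
    """
    if rng.random() < pm:            # rand_3
        pc = rng.random()            # rand_2 (assumed uniform in [0, 1))
    if rng.random() < pm:            # rand_5
        f_scale = 1.0 - rng.random() ** ((1.0 - g / ig) ** d)  # rand_4
    return pc, f_scale
```

As $G$ approaches $IG$, the exponent $(1 - G/IG)^{d}$ tends to 0, so a regenerated scale factor tends to 0 and the differential step shrinks in late generations.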

After the adaptive mutation and crossover operations, further processing is needed to improve the individual. Because repeated serial numbers might emerge inside the individual as a result of mutation, the repeated positions must be repaired, as shown in Figure 9. In that example, serial numbers 2, 6, and 18 are missing, and 9, 22, and 5 are repeated, so the repeated entries are replaced with the missing ones to restore the integrity of the serial numbers of the individual. The exchange operator is then implemented to improve the individual by randomly selecting two positions and exchanging the serial numbers inside the positions. Finally, an insertion operation is undertaken to stochastically disturb the solution and facilitate the convergence by randomly selecting two positions and moving the serial number at the front position to the position immediately behind the other serial number, as shown in Figure 6.
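The repair and insertion operators can be sketched as below. Which occurrence of a repeated serial number is replaced, and in what order the missing numbers are filled in, is not stated in the text; the sketch keeps the first occurrence and fills in missing numbers in ascending order, purely as an assumption. It also assumes every entry already lies in the range 1 to $n$.

```python
def repair(individual):
    """Replace repeated serial numbers so that each of 1..n appears exactly once."""
    present = set(individual)
    missing = iter(s for s in range(1, len(individual) + 1) if s not in present)
    seen, repaired = set(), []
    for serial in individual:
        if serial in seen:           # a repeat: substitute the next missing number
            serial = next(missing)
        seen.add(serial)
        repaired.append(serial)
    return repaired

def insertion(individual, a, b):
    """Move the serial number at the front position to just behind the other position."""
    a, b = sorted((a, b))
    return individual[:a] + individual[a + 1:b + 1] + [individual[a]] + individual[b + 1:]
```

Both operators return a valid permutation of the serial numbers, so the class block sizes, and hence the fraction constraint, are left untouched.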

(3) Updating the population

After the process of evolution, the TSDI of each individual is calculated to decide the parents of the next generation, as follows:

$$IDV_{i,G+1} = \begin{cases} {IDV_{i,G}}^{''}, & \text{if } T({IDV_{i,G}}^{''}) > T(IDV_{i,G}) \\ IDV_{i,G}, & \text{otherwise} \end{cases} \tag{17}$$

where $IDV_{i,G}$ and ${IDV_{i,G}}^{''}$ represent the $i$th individual in the $G$th generation before and after the process of evolution, respectively. Only if the TSDI of the $i$th individual improves during the process of evolution can this individual pass the structural information on to the next generation. The best TSDI and the worst TSDI in the current generation are then calculated for the updating of the mutation rate, crossover rate, and scale factor.
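The one-to-one selection of Equation (17) amounts to a single comparison per individual, as in the following sketch. The `fitness` callable is a stand-in for the TSDI and the function name is hypothetical.

```python
def select_next_generation(parents, trials, fitness):
    """Eq. (17): an evolved individual replaces its parent only if its fitness is strictly higher."""
    return [t if fitness(t) > fitness(p) else p for p, t in zip(parents, trials)]
```

Because each individual only competes with its own evolved version, the TSDI of every slot in the population never decreases from one generation to the next.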

The stopping condition in the differential evolution also adopts a fixed number of generations, and the individual with the maximum TSDI is selected as the optimal solution for the reconstruction of the coarse pixel, and to output the result.

Although the two proposed algorithms have similar main frameworks, both of which are based on natural selection and the natural genetic principle, searching for the most plausible solution in the global population stochastically but heuristically by emulating the biological advantage of the human system, they are different in the coding method and evolution process. Both algorithms are implemented with several evolution operators on a set of solutions (population) and an adopted adaptive strategy to determine the genetic parameters, and each solution is scored by a fitness value calculated based on the spatial dependence. The solution with the highest score is then selected to be the parent of the next generation. However, the coding method of SSMCS is based on the class attributes, and it simply transforms the square structure to a linear structure by connecting the first sub-pixel in each row to the last sub-pixel of the previous row. Meanwhile, the coding method of SSMDE is based on the serial number of each sub-pixel in the individual, and the linear structure is divided into several class attribute blocks according to the fraction constraint. We then fill these blocks with the serial numbers of each sub-pixel according to their class attributes. Furthermore, the evolution processes of SSMCS and SSMDE are also different. SSMCS evolves the solution population by a cloning operator and a mutation operator. On the other hand, SSMDE evolves the solution population by an advanced mutation operator which adds each individual to the differential of another two individuals, as well as the crossover operator, which is not included in SSMCS.

4. Experiments and Analysis

We designed three experiments to demonstrate the effectiveness of the proposed algorithms (SSMDE, SSMCS) visually and quantitatively, and a number of previous sub-pixel mapping algorithms were used for comparison: pixel swapping sub-pixel mapping (PSSM) [11], sub-pixel mapping based on a genetic algorithm (GASM) [49], sub-pixel mapping based on clonal selection (CSSM) [56], and sub-pixel mapping based on differential evolution (DESM) [58], as well as the spatial-temporal sub-pixel mapping versions of these algorithms, namely spatial-temporal sub-pixel mapping based on a pixel swapping algorithm (SSMPS) [35] and spatial-temporal sub-pixel mapping based on a genetic algorithm (SSMGA), to examine the validity of the spatial-temporal sub-pixel mapping principle. SSMGA has not been proposed so far; we simply added the spatial-temporal sub-pixel mapping principle to GASM to generate it. The population size, crossover rate, and mutation rate in GASM and SSMGA were set to 100, 0.5, and 0.5, respectively. In CSSM and DESM, the crossover rate and mutation rate employ an adaptive strategy during the generation, so we just needed to determine the population size, which was set to 100 (the clonal rate $\alpha$ was set to 0.02 and $ND$ was set to 10 in CSSM). The overall accuracy (OA) and Kappa coefficient were calculated for the quantitative assessment of the accuracy of the proposed sub-pixel mapping algorithm. Furthermore, since the experimental datasets were bitemporal image pairs, LCCD was conducted between the sub-pixel mapping result and the fine spatial resolution image during each experiment (the LCCD was simply implemented by differencing the sub-pixel mapping result with the fine spatial resolution image and then classifying the difference map into two classes: changed areas and unchanged areas). OA and Kappa consider all the sub-pixels of the image, including those whose parent coarse pixels are pure. Such sub-pixels only raise the values of OA and Kappa, without providing information about the algorithm's predictive abilities. To improve the reliability, the adjusted OA (OA*) and adjusted Kappa coefficient (Kappa*) were also calculated, which only take the mixed pixels into consideration and ignore the pure pixels, because the sub-pixel mapping method only reconstructs the mixed pixels. These criteria were shown to be useful by Mertens et al. [41].
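The relationship between the normal and adjusted accuracy measures can be made concrete with a small sketch. It uses the standard definitions of overall accuracy and Cohen's Kappa on flattened label lists; the function names and the boolean `mixed_mask` (true for sub-pixels whose parent coarse pixel is mixed) are assumptions for illustration.

```python
from collections import Counter

def oa_kappa(pred, ref):
    """Overall accuracy and Kappa coefficient for two equally long label lists."""
    n = len(ref)
    po = sum(p == r for p, r in zip(pred, ref)) / n              # observed agreement
    pred_counts, ref_counts = Counter(pred), Counter(ref)
    pe = sum(pred_counts[c] * ref_counts[c] for c in ref_counts) / (n * n)  # chance agreement
    kappa = (po - pe) / (1 - pe) if pe < 1 else 1.0
    return po, kappa

def adjusted_oa_kappa(pred, ref, mixed_mask):
    """OA* and Kappa*: the same measures restricted to sub-pixels of mixed coarse pixels."""
    kept = [(p, r) for p, r, m in zip(pred, ref, mixed_mask) if m]
    return oa_kappa([p for p, _ in kept], [r for _, r in kept])
```

When every coarse pixel is mixed, the mask keeps all sub-pixels and the adjusted values coincide with the normal ones.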

4.1. Experiment 1: Landsat 8 Image

In Experiment 1, we obtained bitemporal Landsat 8 images from Cili County, Hunan province, in which the farmland areas are abundant, as shown in Figure 10a,c. The acquisition times of the bitemporal images were 13 July 2013, and 7 August 2013, respectively. The time interval is 25 days, and it is worth noting that 7 August is the beginning of the autumn in the 24 solar terms in China, by which time many farmland crops are ripe and being harvested, and thus more bare soil would appear compared to the image of 13 July. Together, these images exactly meet the condition of frequent change over a small time interval. We split the test area into subsets of 240 × 240, with a resolution of 30 m. The coordinates of the selected test area are (29°4′45.66″–29°8′38.60″N) and (111°0′19.21″–111°4′45.27″E), with the north arrow toward the upside. A supervised hard classification method—support vector machine (SVM)—was applied to the bitemporal Landsat 8 images to generate the land-cover maps (Figure 10b,d) to be used for the degrading and accuracy assessment. The land-cover maps were classified into five land-cover classes: water, farmland, vegetation, urban, and bare soil.

Synthetic images were used as the input fraction images, and were obtained by degrading the hard classification image from 7 August to a coarser scale using an averaging filter of four. We chose synthetic imagery to verify the validation of the proposed algorithm so as to avoid coregistration errors between the lower- and higher-resolution images, the sensor point spread function, the atmospheric effect, and the spectral unmixing error, and thus the sub-pixel mapping results solely illustrate the performance of the different algorithms. Figure 10e–i shows the degraded fraction image, and the sub-pixel mapping results of PSSM, GASM, DESM, CSSM, SSMPS, SSMGA, SSMDE, and SSMCS are displayed in Figure 10j–q. To further examine the performance of the reconstruction of detailed land cover, the small area of S1 is zoomed in on in Figure 11a–i.
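The degradation step can be sketched as a block-wise averaging of the hard classification map. The function name and the nested-list representation are assumptions for illustration, and the sketch assumes that the map dimensions are exact multiples of the scale factor.

```python
def degrade_to_fractions(label_map, scale, classes):
    """Average a hard classification map over scale x scale blocks.

    Returns {class: 2-D list of fractions}, one fraction image per class.
    """
    rows, cols = len(label_map), len(label_map[0])
    fractions = {cls: [] for cls in classes}
    for r0 in range(0, rows, scale):
        row_out = {cls: [] for cls in classes}
        for c0 in range(0, cols, scale):
            block = [label_map[r][c]
                     for r in range(r0, r0 + scale)
                     for c in range(c0, c0 + scale)]
            for cls in classes:
                row_out[cls].append(block.count(cls) / (scale * scale))
        for cls in classes:
            fractions[cls].append(row_out[cls])
    return fractions
```

With a scale factor of four, a 240 × 240 land-cover map yields 60 × 60 fraction images, and the fractions of all classes sum to 1 in every coarse pixel.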

According to the visual assessment, as can be clearly seen in the zoomed version in Figure 11, GASM is weak at preserving boundary features such as the coastline. Linear features, such as bridges and roads, are almost entirely missing in the results of the traditional methods. SSMGA, SSMDE, and SSMCS outperform the other methods, giving a better linear reconstruction and smoother boundaries, and are more consistent with the reference. As for the quantitative evaluation in Table 1, the OA and Kappa values of SSMDE and SSMCS are almost the same, and are the best among all the algorithms, reaching 81.62%, 0.7584 and 81.61%, 0.7584, respectively, while the accuracies of SSMGA are slightly lower than those of SSMDE and SSMCS. The advantage of the proposed algorithms is also shown for LCCD in Table 2, in which the SSMDE and SSMCS algorithms obtain higher OA values of 84.15% and 84.13%, respectively. To solely examine the reconstruction of the mixed pixels and ignore the influence of the pure pixels, the OA* and Kappa* were also calculated, where SSMDE and SSMCS again obtain the best results. This can be attributed to the genetic operators such as crossover, mutation, and selection, which can exchange information adequately to obtain the optimal solution, and to the spatial-temporal sub-pixel mapping principle, which can incorporate the distribution information of the fine spatial resolution image to regularize the ill-posed sub-pixel mapping problem.

In this way, the synthetic coarse image of 7 August 2013 can be seen as the daily obtained MODIS image, with frequent temporal information but coarse spatial resolution. Meanwhile, the original Landsat 8 image of 13 July 2013 can provide the high spatial resolution, but the 16-day revisit period missed the growth cycle at the beginning of autumn. Therefore, the proposed algorithm can fuse the temporal information of MODIS and the spatial information of Landsat to obtain a high-resolution image at the missing growth cycle time. The experimental results confirm that the proposed algorithm outperforms the traditional approaches, achieving a better sub-pixel mapping result, both qualitatively and quantitatively. Furthermore, it also achieves a change map with both a fine spatial resolution and a fine temporal resolution, exactly meeting the demands of timely and detailed LCCD.

4.2. Experiment 2: QuickBird Image

In Experiment 2, bitemporal QuickBird images of 240 × 240 (at a 2.5-m spatial resolution) of the urban area located at Shiyan City, Hubei province, which were acquired in 2002 (Figure 12a) and 2004 (Figure 12c), respectively, were taken into consideration to further examine the performance of the proposed algorithm. The coordinates of the QuickBird images are (413,616.00–414,211.20 m E) and (3712,485.60–3713,080.80 m N), with the north arrow toward the upside. A supervised hard classification method—support vector machine (SVM)—was applied to the bitemporal QuickBird images to generate the land-cover maps (Figure 12b,d) to be used for the degrading and accuracy assessment. The land-cover maps were classified into five land-cover classes: impervious surface, vegetation, pond, building, and road. Experiment 2 also adopted a synthetic strategy, as in Experiment 1, to avoid the errors introduced by coregistration and spectral unmixing. The SVM hard classification of the QuickBird image from 2004 was degraded by a mean filter with a scale factor of four to yield the fraction maps shown in Figure 12e–i. Figure 12j–q shows the sub-pixel mapping results of PSSM, GASM, CSSM, DESM, SSMPS, SSMGA, SSMDE, and SSMCS, respectively, and the zoomed area of S1 showing the detailed reconstruction of each algorithm is shown in Figure 13a–i.

According to the visual assessment, when compared with Figure 13a, one can see that CSSM is better able to reconstruct the scattered impervious surfaces (shown in red) than PSSM, which reconstructs the discrete scattered land cover into a continuous blocky area, thus losing much of the detailed information. Meanwhile, the results of GASM and DESM are less pleasing due to the discontinuous land-cover boundaries and noise. The performance of the reconstruction of the linear road (shown in blue) is disappointing in all the sub-pixel mapping algorithms, except for SSMCS, as shown in Figure 13, where it effectively reconstructs the linear road, as well as the other land covers, and the distribution pattern is the most similar to Figure 13a. On the other hand, in Table 3, the quantitative accuracy result is consistent with the visual assessment, where for the OA, SSMCS achieves the best result of 80.98%, and the performance of SSMDE resembles SSMGA. In Table 4, the SSMDE and SSMCS also outperform the others in terms of LCCD with higher OA of 88.08% and 88.20%, respectively. A possible reason for this can be ascribed to the transformation of the sub-pixel mapping problem into the swarm intelligence based optimization problem, enabling us to search for a solution in a continuous solution space, thereby guaranteeing the continuity of the land cover and yielding a smoother result. The strategy of the spatial-temporal model also contributes to the more accurate result, due to the detailed spatial information it obtains from the fine spatial resolution image to regularize the under-determined sub-pixel mapping problem.

4.3. Experiment 3: Real Image

Although the synthetic images utilized in Experiments 1 and 2 can avoid the errors introduced by coregistration or spectral unmixing, allowing us to solely evaluate the performance of each sub-pixel mapping algorithm, this does not meet the practical requirement of applying the algorithm directly to a real low spatial resolution image, instead of a synthetic image generated by degrading the hard classification image. Therefore, sub-pixel mapping of a real low spatial resolution image (MODIS) was implemented in Experiment 3, and a Landsat TM image with almost the same acquisition area was chosen to evaluate the performance of each algorithm. Given that coregistration and spectral unmixing are preprocessing steps shared by all the sub-pixel mapping algorithms, the performances of the algorithms can be considered comparable, because the same errors are introduced by these preprocessing operations. It is worth noting that the synthetic process avoids the coregistration errors between the lower- and higher-resolution images, the sensor point spread function, the atmospheric effect, and the spectral unmixing error, so that the sub-pixel mapping results solely illustrate the performance of the different algorithms. The real experiment, on the other hand, considers the practical requirement of applying the algorithm directly to a real low spatial resolution image, and can test the adaptability and expansibility of the proposed algorithm. Both the synthetic and real processes are feasible; they simply focus on different aspects and serve different purposes.

The MODIS surface reflectance product (24 × 24) acquired at Shenzhen City, Guangdong province, on 12 October 2009, with a spatial resolution of 500 m, was chosen as the low spatial resolution image to be reconstructed, which is shown in Figure 14e, and the Landsat TM image (408 × 408) acquired in the same area with a 30-m spatial resolution on 12 August 2001, as shown in Figure 14a, was selected as the fine spatial resolution image to provide a distribution pattern for the reconstruction process. The Landsat Enhanced Thematic Mapper Plus (ETM+) image acquired in October 2009, at the same scene, was classified by the k-means method to obtain a fine spatial resolution classification result as the reference to evaluate the accuracy of each algorithm, as shown in Figure 14c,d, respectively. The coordinates of the Landsat ETM+ image and MODIS image are (113°1′25.85″–113°8′32.45″E) and (22°7′21.00″–22°3′42.20″N), and the north arrow is toward the upside.

Three steps were taken to complete the preprocessing. Firstly, the TM image from 2001 was classified into three land-cover types (water, vegetation, and urban) using the k-means method, as shown in Figure 14b. The coregistration process was then implemented between the MODIS image from 2009 and the TM image from 2001, so that each fine pixel in the TM image had a corresponding coarse pixel in the MODIS image, using the ground control point function provided in ENVI software. During the coregistration, the MODIS image was resampled to 510 m, so that one MODIS pixel corresponds to exactly 17 × 17 TM pixels (30 m). Finally, fully constrained least squares (FCLS) linear spectral unmixing was carried out to generate fraction images of the MODIS image, as shown in Figure 14f–h. After the preprocessing, we could use the standard data to implement the sub-pixel mapping algorithms, and the results are displayed in Figure 14i–p. As for the quantitative accuracy assessment, we compared the classification result of the 2009 TM image (Figure 14d) with each sub-pixel mapping result to obtain the statistics shown in Table 5. In order to evaluate the performance of the LCCD generated by each sub-pixel mapping method, we overlaid the classification result of the 2001 TM image on the classification result of the 2009 TM image to produce a difference map as the reference LCCD result, and each sub-pixel mapping result was overlaid on the classification result of the 2001 TM image to generate the LCCD result, which was then compared to the reference LCCD result to obtain the statistics of the LCCD accuracy, as shown in Table 6.

In the visual assessment, one can see that PSSM loses almost all the structural information and boundary information between the different classes, showing a fuzzy classification for the whole map. Although the result of GASM contains some structural information when compared to PSSM, there is a lot of noise, which compromises the visual effect to a great extent. On the other hand, the results of DESM and CSSM are better than the results of PSSM, providing relatively clear boundaries between classes, but the inner structure of each class is still fuzzy. All the sub-pixel mapping results are more or less fuzzy and disappointing, except for the spatial-temporal fusion based sub-pixel mapping methods (SSMPS, SSMGA, SSMDE, and SSMCS), which better reconstruct the structural information between and within classes, and are less affected by noise, when compared to the results of the aforementioned traditional methods. Although the results (SSMPS, SSMGA, SSMDE, and SSMCS) appear similar, the quantitative assessment can be used to further distinguish the algorithms.

As for the quantitative assessment, we can see in Table 5 that the accuracies of all the sub-pixel mapping algorithms are lower than those in the synthetic experiments, which may be due to the errors induced by the coregistration and spectral unmixing operations and the classification errors of the TM image. The scale factor $s$ is another reason for the compromised statistical accuracy. According to the principle of the sub-pixel mapping problem, one coarse pixel is reconstructed into $s \times s$ fine pixels, so a larger scale factor means a more complex reconstructed scene and hence a lower accuracy. Nevertheless, SSMCS still acquires a reasonable accuracy when compared to the other algorithms with regard to OA and Kappa, reaching 64.50% and 0.4533, respectively. Meanwhile, SSMDE and SSMGA obtain a relatively high OA of 64.18% and 64.24%, respectively. Note that all the adjusted OA and Kappa values are the same as the normal OA and Kappa values in Experiment 3. The adjusted OA and Kappa only take the mixed pixels into consideration and, in this case, the spatial resolution of the MODIS image is 500 m, which is large enough that one pixel could contain several land-cover types, so all the pixels were determined as mixed pixels during the spectral unmixing process. As for the subsequent LCCD results in Table 6, the results are consistent with the sub-pixel mapping results, and SSMCS obtains the best LCCD accuracy of 70.93%, confirming the superiority of the proposed algorithm despite the low overall accuracy.

The main limitation of this technique is the full constraint of the fractions derived from the spectral unmixing technique. As we know, the abundance map generated by the spectral unmixing process is the input of the sub-pixel mapping method; however, the error of the spectral unmixing is introduced into the following sub-pixel mapping process and seriously compromises the sub-pixel mapping result. This is why the proposed method in real Experiment 3 has a limited advantage, while, in synthetic Experiments 1 and 2, which use the degraded image as input, it has a noticeable advantage over the other algorithms.

4.4. Computational Complexity

The computational cost can be considered in terms of computational complexity, which reflects the computation time from a macro perspective. Computational complexity is also the more convincing measure, since the measured computation time depends on many uncertain factors, such as the actual coding strategy in the program, the programming platform, and the configuration of the user’s PC.

The complexity of SSMCS can be split into four parts: (1) the integration of the previous high-resolution image; (2) the step of population initialization and coding; (3) the step of cloning and mutation according to the clonal rate α and mutation rate pm; and (4) the step of TSDI calculation and population updating. We assume that the number of land-cover classes is C, the size of the input abundance map is M × N pixels, the scale factor of the sub-pixel mapping is s, the population size is NP, and the generation number is G. Since only one coarse pixel is processed at a time, the computational complexity should be multiplied by M × N, so the computational complexity of part 1 is O (M × N × C), which is independent of the following parts. For part 2, the complexity is O (M × N × s²), since the initialization and coding operate at the sub-pixel scale. In part 3, the computational complexity is O (M × N × s² × α × NP × G), due to the mutation processing for every antibody in every generation, and the clonal processing only functions on the total number of antibodies. As the parts of initialization, coding, cloning, and TSDI calculation are dependent parts, they have the total computational complexity of O (M × N × s² × α × NP × G). Considering all the aforementioned parts, the total computational complexity of SSMCS is O (M × N × C + M × N × s² × NP × G).

The complexity of SSMDE can also be split into the same parts as SSMCS, except for part 3, which contains the mutation operator and crossover operator. The total computational complexity of the generation part is therefore O (M × N × s² × 2 × NP × G). Taking all four parts into consideration, the total computational complexity of SSMDE is O (M × N × C + M × N × s² × 2 × NP × G).
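As a rough illustration of how these terms scale, the sketch below evaluates the dominant operation counts stated above for given problem sizes. The function names are hypothetical, and the values of NP, G, and α in the usage lines are placeholders rather than the settings used in the experiments; the map size, class count, and scale factor are taken from the Experiment 3 figure caption (24 × 24 fraction images, three classes, 408 × 408 fine maps).

```python
def ssmcs_cost_terms(M, N, C, s, NP, G, alpha):
    """Dominant operation counts for SSMCS, following the parts described in the text."""
    return {
        "integration": M * N * C,                        # part 1: previous fine-resolution map
        "init_coding": M * N * s**2,                     # part 2: initialization and coding
        "clone_mutate": M * N * s**2 * alpha * NP * G,   # part 3: cloning and mutation
    }

def ssmde_generation_cost(M, N, s, NP, G):
    """Generation-part count for SSMDE: mutation plus crossover gives the factor 2."""
    return M * N * s**2 * 2 * NP * G

# Sizes from the Experiment 3 caption; NP, G, and alpha are placeholder values.
M, N, C, s = 24, 24, 3, 408 // 24
terms = ssmcs_cost_terms(M, N, C, s, NP=20, G=100, alpha=2)
print(terms)
print(ssmde_generation_cost(M, N, s, NP=20, G=100))
```

In both algorithms the generation term dominates, and it grows with s², so doubling the scale factor roughly quadruples the work per coarse pixel.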

5. Sensitivity Analysis

The proposed algorithms have several important parameters, including the scale factor s determining the complexity of the reconstructed scene, and the genetic parameters controlling the genetic process of evolution, which have a significant impact on the result of the reconstructed map. Since the sub-pixel mapping results were only analyzed at the scale factor of 4 in the synthetic experiments, multiple scale factors were also tested on the synthetic images used in Experiments 1 and 2 to examine the performance of the proposed algorithm.

The main genetic parameters are as follows:

(1) $p_m$ is the mutation rate parameter in SSMCS.
(2) $p_c$ is the crossover rate parameter in SSMDE.

Besides, since the genetic parameters are adaptively determined by the adaptive strategy proposed in this paper, the two synthetic images used in Experiments 1 and 2 were again chosen to verify the adaptive strategy, and the results obtained with the adaptive parameters were compared with the non-adaptive results obtained with different fixed values of the genetic parameters.

5.1. Sensitivity in Relation to the Scale Factor

To explore the sensitivity of the proposed algorithm in relation to scale factor s, all the other parameters were kept the same as in Experiments 1 and 2. The scale factor was then set as s = {4, 5, 10}, and the corresponding OA values of the different algorithms served as the benchmark.

As shown in Figure 15, the OA values are negatively correlated with the scale factor, i.e., the higher the value of scale factor s, the lower the OA value. This is because the distribution of the land-cover classes becomes more complex as the scale factor s increases, leading to an increase in the difficulty of the sub-pixel mapping. Although the accuracy for the synthetic images decreases as the scale factor increases, the proposed algorithms are still superior to the others. From the line chart, one can see that SSMDE and SSMCS are relatively stable as the scale factor increases, as their slopes are gentler. Moreover, in Experiment 1, SSMDE obtains the highest OA when the scale factor equals 4, and SSMCS performs the best at the scale factors of 5 and 10. As for Experiment 2, SSMCS performs better than the other methods at all the scale factors.
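One way to see why the problem becomes harder as s grows is to count how many distinct sub-pixel labelings a single coarse pixel admits once its class fractions are fixed. The sketch below does this with a multinomial coefficient. The fractions (0.5, 0.3, 0.2) and the largest-remainder rounding are illustrative assumptions, not values or steps taken from the experiments.

```python
import math

def subpixel_counts(fractions, s):
    """Turn class fractions into integer sub-pixel counts that sum to s * s."""
    total = s * s
    raw = [f * total for f in fractions]
    counts = [math.floor(r) for r in raw]
    # Hand the leftover sub-pixels to the classes with the largest remainders.
    leftover = total - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

def num_arrangements(counts):
    """Multinomial coefficient: distinct labelings of one coarse pixel."""
    result, remaining = 1, sum(counts)
    for n in counts:
        result *= math.comb(remaining, n)
        remaining -= n
    return result

# Same hypothetical fractions at the three scale factors used in this section.
for s in (4, 5, 10):
    counts = subpixel_counts((0.5, 0.3, 0.2), s)
    print(s, counts, num_arrangements(counts))
```

The count grows very quickly with s, so the optimizer has to search a much larger space for each coarse pixel at s = 10 than at s = 4, which is consistent with the downward OA trend in Figure 15.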

5.2. Sensitivity in Relation to the Mutation Rate Parameter in SSMCS

In SSMCS, the mutation rate parameter $p_m$ plays an important role in the evolution process of the generation. It controls the balance between the convergence of the solution and the preservation of the suitable structural information of the antibodies. If the mutation rate is too low, there is not enough disturbance to escape from the local optima. On the other hand, the suitable structural information will be destroyed if the mutation rate is too high. Therefore, the value of $p_m$ has a significant impact on the accuracy of the output result. In this paper, we develop an adaptive strategy to automatically find the most suitable value, which is based on the principle that a better individual deserves a lower mutation rate to avoid destroying the good structure, whereas a worse individual deserves a higher mutation rate to remove the bad genetic information from the population, and thus the value adaptively changes in every generation. In order to verify the effect of the adaptive strategy, several non-adaptive parameter values were selected within the range [0.1, 1] with a fixed step size of 0.1, to obtain the best non-adaptive parameter setting, which was then compared to the best adaptive parameter setting. It is worth noting that the selected value of the non-adaptive parameter did not change during the iteration process.
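The principle behind the adaptive strategy (a lower rate for better individuals, a higher rate for worse ones) can be sketched as follows. The linear mapping from fitness to rate is an assumption made purely for illustration and is not the update rule defined in this paper; likewise, "higher fitness is better" is an assumed convention, and the function name is hypothetical.

```python
import numpy as np

def adaptive_mutation_rates(fitness, pm_min=0.1, pm_max=1.0):
    """Assign each individual a mutation rate (hypothetical linear rule).

    Higher fitness is assumed to be better, so the best individual gets
    pm_min and the worst individual gets pm_max.
    """
    f = np.asarray(fitness, dtype=float)
    span = f.max() - f.min()
    if span == 0.0:
        # All individuals are equally good: use the midpoint rate for everyone.
        return np.full_like(f, 0.5 * (pm_min + pm_max))
    quality = (f - f.min()) / span  # 0 = worst individual, 1 = best individual
    return pm_max - quality * (pm_max - pm_min)

# Non-adaptive baseline: ten fixed candidates in [0.1, 1] with a step of 0.1.
fixed_grid = np.linspace(0.1, 1.0, 10)

print(adaptive_mutation_rates([1.0, 2.0, 3.0]))
print(fixed_grid)
```

In the non-adaptive comparison, each value in `fixed_grid` would be held constant for a whole run, whereas the adaptive rates are recomputed from the current population in every generation.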

Figure 16a,b shows the OA curves of the results generated by the series of non-adaptive $p_m$ values for the synthetic images of Experiments 1 and 2, respectively. Since the adaptive values were updated in every iteration, we only show the final estimated adaptive values, together with the corresponding OA values, which are represented by the stars. As displayed in Figure 16a,b, the OAs of the results obtained by the adaptively estimated parameters are almost the same as the best OAs obtained by the non-adaptive parameters. As the values of the adaptively estimated parameter $p_m$ are close to the best non-adaptive ones, this confirms the feasibility of the adaptive strategy. It is worth noting that, although the $p_m$ values are randomly initialized in the adaptive strategy, they finally converge around the best manually selected $p_m$ value, which corresponds to the better OA values. Therefore, the adaptive strategy can be considered stable and robust, and it moreover removes the need for time-consuming manual tuning.

5.3. Sensitivity in Relation to the Crossover Rate Parameter in SSMDE

The crossover rate $p_c$ is one of the most important parameters in the operation of SSMDE. It controls the rate of the exchange of differential information between an individual of the current generation and the next generation. A larger parameter value means an inclination to exchange superfluous differential information and consequently destroy the structure of the individual, whereas a smaller parameter value means less disturbance of the structure of the individual, which slows down the mutation process. Hence, an appropriate value of $p_c$ will result in a more rational reconstruction. In the framework of SSMDE, adaptive strategies were adopted for both the crossover rate parameter $p_c$ and the differential scale factor F. In order to test the validity of the adaptive strategy for $p_c$, parameter F was set to a fixed number during the iteration process. It should be noted that the adaptive strategy employed by parameter F could also be examined in a similar way. The same strategy that was used in SSMCS was used to obtain the optimum non-adaptive parameter value within the range [0.1, 1] with a step size of 0.1.
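The role of the crossover rate can be illustrated with the textbook binomial crossover of differential evolution, in which each gene of the trial individual is taken from the mutant with probability equal to the crossover rate. This is a generic sketch and not necessarily the discrete operator used in SSMDE; the function name and the toy label vectors are assumptions.

```python
import numpy as np

def binomial_crossover(target, mutant, pc, rng):
    """Textbook DE binomial crossover: take each gene from the mutant with probability pc."""
    target = np.asarray(target)
    mutant = np.asarray(mutant)
    take_mutant = rng.random(target.shape) < pc
    # Force at least one gene from the mutant so the trial is not a plain copy of the target.
    take_mutant.flat[rng.integers(target.size)] = True
    return np.where(take_mutant, mutant, target)

rng = np.random.default_rng(0)
target = np.zeros(16, dtype=int)  # toy individual: 16 sub-pixel labels of one coarse pixel
mutant = np.ones(16, dtype=int)   # toy mutant individual
for pc in (0.1, 0.5, 0.9):
    trial = binomial_crossover(target, mutant, pc, rng)
    print(pc, int(trial.sum()), "of 16 genes taken from the mutant")
```

A high rate copies most of the mutant and therefore disturbs the structure of the target individual, while a low rate changes only a few genes per generation. Note that a plain crossover like this can alter the number of sub-pixels per class, which is presumably the kind of inconsistency that a repair operation, such as the one illustrated in Figure 9, would need to correct.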

Figure 16c,d displays the OA results obtained by the adaptive parameter $p_c$ and the non-adaptive parameters selected manually. As the adaptive parameter changes in every generation, only the final value of the parameter is shown in the figures, in the shape of a star. The optimum non-adaptive parameter value appears as the peak of the curve. It can be seen that the OA results acquired by the adaptive parameter $p_c$ are nearly the same as those acquired by the optimum non-adaptive parameter, which supports the effectiveness of the adaptive strategy.

6. Conclusions

In this paper, we have proposed a novel sub-pixel mapping algorithm based on swarm intelligence theory for bitemporal remote sensing imagery, to regularize the ill-posed sub-pixel mapping problem and further improve the performance of sub-pixel mapping. The swarm intelligence theory involves clonal selection sub-pixel mapping (CSSM) and differential evolution sub-pixel mapping (DESM). Furthermore, a spatial-temporal sub-pixel mapping method is used to obtain the distribution information at a fine spatial resolution from the bitemporal image pair, which exactly regularizes the ill-posed sub-pixel mapping problem. To verify the effectiveness of the swarm intelligence theory based spatial-temporal sub-pixel mapping algorithm, we conducted two synthetic experiments and one real image experiment, and compared the proposed algorithm to several of the traditional sub-pixel mapping algorithms. The experimental results confirmed that the proposed algorithm surpasses the performance of the traditional approaches, both qualitatively and quantitatively, and can be successfully used for the application of timely and detailed LCCD. The proposed sub-pixel mapping algorithm was proved to be effective in Experiments 1 and 2, and the insignificant advantage in Experiment 3 was mainly due to the spectral unmixing error. As a result of the low spatial resolution of the MODIS image, which contains many mixed pixels, the blending pattern is very complicated, leading to a deficiency in the boundary and structure information, making it difficult for the spectral unmixing techniques to clearly distinguish each class. As these preprocessing errors were introduced into the sub-pixel mapping process, they compromised and masked the advantage of the proposed algorithm. Therefore, further work will be undertaken to ensure that the proposed method is more compatible with the spectral unmixing error, and to ensure that the proposed method can be better applied to real remote sensing images.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant Nos. 41622107 and 41371344, the Natural Science Foundation of Hubei Province under Grant No. 2016-29, and the State Key Laboratory of Earth Surface Processes and Resource Ecology under Grant No. 2015-KF-02. The authors would like to thank the editor, associate editor, and anonymous reviewers for their helpful comments and suggestions, and Ke Wu of China University of Geosciences, China, for providing part of the experiment dataset.

Author Contributions

All the authors made significant contributions to the work. Da He and Yanfei Zhong designed the research and analyzed the results. Ruyi Feng assisted in the preparatory work and validation work. Liangpei Zhang provided advice for the preparation and revision of the paper.

Conflicts of Interest

The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; and in the decision to publish the results.


Figure 1. Example of the sub-pixel mapping problem: (a) A 3 × 3 grid coarse pixel divided into 16 sub-pixels (4 × 4), with the scale factor of 4; (b) The optimal sub-pixel distribution of one class; and (c) The possible sub-pixel distribution of one class.

Figure 2. Specific example of sub-pixel mapping: (a) Coarse image; and (b) Fine image.

Figure 3. The sub-pixel coordinate system and distance calculation between pixel and sub-pixel.

Figure 4. Process of the coding technique.

Figure 5. The framework of spatial-temporal sub-pixel mapping based on swarm intelligence theory.

Figure 6. The framework of spatial-temporal sub-pixel mapping.

Figure 7. The framework of spatial-temporal sub-pixel mapping based on clonal selection.

Figure 8. The framework of spatial-temporal sub-pixel mapping based on differential evolution.

Figure 9. Specific illustration of the intelligent operations of repair, exchange, and insertion.

Figure 10. Sub-pixel mapping results for the Landsat 8 image in Experiment 1: (a) Original Landsat 8 image from 13 July 2013 (bands 4, 3, 2) (240 × 240); (b) Original SVM classification result of the 13 July 2013, Landsat 8 image (250 × 250); (c) Original Landsat 8 image from 7 August 2013 (bands 4, 3, 2) (240 × 240); (d) Original SVM classification result of the 7 August 2013, Landsat 8 image (250 × 250); (e) Fraction image of water (60 × 60); (f) Fraction image of farmland (60 × 60); (g) Fraction image of vegetation (60 × 60); (h) Fraction image of urban (60 × 60); (i) Fraction image of bare soil (60 × 60); (j) pixel swapping sub-pixel mapping (PSSM); (k) Sub-pixel mapping based on a genetic algorithm (GASM); (l) Sub-pixel mapping based on differential evolution (DESM); (m) Sub-pixel mapping based on clonal selection (CSSM); (n) Spatial-temporal sub-pixel mapping based on a pixel swapping algorithm (SSMPS); (o) Spatial-temporal sub-pixel mapping based on a genetic algorithm (SSMGA); (p) Spatial-temporal sub-pixel mapping based on differential evolution (SSMDE); and (q) Spatial-temporal sub-pixel mapping based on clonal selection (SSMCS).

Figure 11. Zoomed areas of each sub-pixel mapping result: (a) Zoom of the original classification result of the 2005 Landsat TM image; (b) Zoom of the PSSM result; (c) Zoom of the GASM result; (d) Zoom of the DESM result; (e) Zoom of the CSSM result; (f) Zoom of the SSMPS result; (g) Zoom of the SSMGA result; (h) Zoom of the SSMDE result; and (i) Zoom of the SSMCS result.

Figure 12. Sub-pixel mapping results for the Hubei QuickBird image in Experiment 2: (a) Original QuickBird image from 2002 (bands 3, 2, 1) (250 × 250); (b) Original SVM classification result of the 2002 QuickBird image (250 × 250); (c) Original QuickBird image from 2004 (bands 3, 2, 1) (250 × 250); (d) Original SVM classification result of the 2004 QuickBird image (250 × 250); (e) Fraction image of impervious surface (50 × 50); (f) Fraction image of vegetation (50 × 50); (g) Fraction image of pond (50 × 50); (h) Fraction image of building (50 × 50); (i) Fraction image of road (50 × 50); (j) PSSM; (k) GASM; (l) DESM; (m) CSSM; (n) SSMPS; (o) SSMGA; (p) SSMDE; and (q) SSMCS.

Figure 13. Zoomed areas of each sub-pixel mapping result: (a) Zoom of the original classification result of the 2004 QuickBird Image; (b) Zoom of the PSSM result; (c) Zoom of the GASM result; (d) Zoom of the DESM result; (e) Zoom of the CSSM result; (f) Zoom of the SSMPS result; (g) Zoom of the SSMGA result; (h) Zoom of the SSMDE result; and (i) Zoom of the SSMCS result.

Figure 14. Sub-pixel mapping results for the 2009 MODIS image in Experiment 3: (a) Original TM image from 2001 (bands 4, 3, 2) (408 × 408); (b) K-Means classification result of the 2001 TM image (408 × 408); (c) Original TM image from 2009 (bands 4, 3, 2) (408 × 408); (d) K-Means classification result of the 2009 TM image (408 × 408); (e) Original MODIS image from 2009 (bands 4, 1, 2) (24 × 24); (f) Fraction image of water (24 × 24); (g) Fraction image of vegetation (24 × 24); (h) Fraction image of urban (24 × 24); (i) PSSM; (j) GASM; (k) DESM; (l) CSSM; (m) SSMPS; (n) SSMGA; (o) SSMDE; and (p) SSMCS.

Figure 15. Sensitivity analysis of SSMCS and SSMDE in relation to scale factor s: (a) Sub-Pixel mapping in relation to scale factor s for the Landsat 8 image in Experiment 1; and (b) Sub-Pixel mapping in relation to scale factor s for the QuickBird image in Experiment 2.

Figure 16. Sensitivity analysis of SSMCS and SSMDE in relation to parameters pm and pc, respectively: (a) SSMCS in relation to parameter pm for the Landsat 8 image; (b) SSMCS in relation to parameter pm for the QuickBird image; (c) SSMDE in relation to parameter pc for the Landsat 8 image; and (d) SSMDE in relation to parameter pc for the QuickBird image.