1. Introduction
The performance of built-up object recognition methodologies is often dependent on the seasonal environment. The urban lifestyle and transformations of the city of Yakutsk are seasonally dependent, too. For example, most of the work in the growing and building construction industries takes place between April and October, while the population of the city during summertime doubled from 1989 to 2016 and tripled from 1970 to 2017, rising from 108,000 inhabitants in 1970 to approximately 311,000 inhabitants in 2017 [1]. This population growth generates a fast increase in urban structures and built-up areas. The first goal of this paper is to study inter-annual and intra-annual urban fabric change through remote sensing imagery. In this context, several studies have explored land use changes in urban environments by using different remote sensing sensors [2,3,4,5,6]. For this task, summer season imagery (June–September) seems to be the best choice in terms of scene illumination, minimal snow cover, and vegetated surfaces filtering (e.g., [7,8]). Inter-annual and intra-annual image processing and analysis over sub-Arctic regions must take into consideration the seasonal component, which conditions the efficiency of built-up areas detection and land use change estimation.
Built-up areas extraction from multi-band imagery is a challenging task, since initial scenes are mosaics of multiple heterogeneous land use classes, and the extraction of built-up areas requires subtracting the other existing classes. In this context, the existing studies of built-up areas extraction can be divided into two categories according to the methodology of extraction: (1) the supervised classification group and (2) the spectral index-based group. The first group requires accurate validation and training points and has a high computational cost. The second group focuses on dedicated built-up indices (e.g., the normalized difference built-up index (NDBI) [7] and the enhanced built-up and bareness index (EBBI) [9]). Most of the proposed built-up indices are based on short-wave infrared (SWIR) bands for calculation. Nevertheless, some exceptions, such as the combinational build-up index (CBI) [10], can be reported. This latter index uses only visible and near infrared (Vis-NIR) bands along with a first Principal Component Analysis (PCA) band and a correction coefficient. In this study, we experimented with the use of a non-dedicated built-up index, which is an alternative version of the brightness index (BI) [11] and is called the second brightness index (BI2) [11]. It was developed initially for bare soil identification based on the calculation of an average brightness value of the multiband image. The other important aspect regarding the choice of the BI2 index is that it does not require SWIR bands for calculation. The BI2 index only requires Vis-NIR bands, which is adequate for many existing satellite imagery solutions. In addition to the BI2 index, two environmental indices were used to subtract the green frame and waterbodies from the image, namely the normalized difference vegetation index (NDVI) [12] and the second normalized water index (NDWI2) [13]. As for any spectral index, a threshold value is required to separate the classes of interest within the calculated index maps. In this context, thresholding can be carried out using an automatic or semi-automatic (e.g., intersection of the classes’ histograms, rules definition) method. The second solution was sufficient for the BI2 index. Concerning NDVI and NDWI2, state-of-the-art thresholds were used and slightly adjusted for the current study. Once built-up areas had been extracted, a change detection was conducted over each pair of maps. Change detection can be applied either over original images [14,15] or over classified maps [16,17]. The second solution was preferred due to the specificity of the detection. The first one is better suited for more general detection of land use elements.
The second aspect treated in this study concerns the morphometric recognition and characterization of the settlements previously extracted by spectral indices. This additional and important step of the proposed methodology is part of the geographic object-based image analysis (GEOBIA) field [18,19], which considers the image as an ensemble of objects rather than pixels. This recent paradigm is increasingly embedded in the fields of image processing, geographic information science, and remote sensing [20,21,22]. It brings additional elements of analysis related to urban structures of the city. Post-Soviet urban areas like Yakutsk are a mixture of Soviet blocks, post-modern buildings, wooden buildings, and individual houses mostly constructed from wood, where the roofing panels are mostly made of metal, fiber cement, and bitumen. An object-oriented morphometric approach, combining the calculation of geometric attributes (e.g., area, elongation, convexity, circularity) for each urban object detected with morphometric rules, enables us to extract urban sprawl, built-up areas categories, and their associated socioeconomic uses [23,24]. There is a growing interest in morphological image processing due to the increasing availability of high-resolution imagery. In this context, many methods integrating the shape and geometry of the objects composing a natural scene were developed. The use of such operators can be considered following two types of implementation: (1) attributes are integrated in the segmentation procedures using a tree hierarchical representation of the image (e.g., morphological trees), and (2) attributes are calculated over segmented objects. The first implementation requires complementary filtering operations during the tree construction procedure (e.g., using morphological filters). An additional classification step is applied over the filtered map using supervised classifiers [25,26,27]. The first implementation could be interesting for enhancing conventional land use classification by adding spatial information. Nevertheless, it does not permit a precise characterization of objects and has a higher computational cost. The second implementation is based on three steps: segmentation of the input image, geometric attribute calculation over the generated objects, and rule construction. It presents a better trade-off for our case study and offers more flexibility and a more precise characterization of the objects by computing geometric attributes over segmented objects [28,29,30].
The proposed methodology merges image analysis (i.e., segmentation, classification, morphology) and urban knowledge (i.e., architecture, urbanism, and geography) for a better understanding of the urban change dynamics and for socioeconomic modeling of the urban fabric. The season and remote sensing sensors help in built-up areas estimation and urban sprawl modeling according to temporal interval availability, while urban knowledge helps in creating the morphometric rules modeling the socioeconomic functions of the detected urban objects.
2. Study Zone and Satellite Data
The study zone is the city of Yakutsk, the capital of the Sakha Republic, Russia, which is considered one of the coldest cities, with temperatures in January often below −40 °C and a peak of cold around −63 °C. On the other hand, relatively hot summer temperatures are noticeable, with a mean temperature between +15 °C and +20 °C from June to August and a peak of heat exceeding +35 °C in the same interval [31]. Given this summer/winter heat gap, the city holds the record for annual thermal amplitude. The Sakha Republic is characterized by low landscape diversity, with mainly mountains, highlands, and plateaus. Several rivers run through this vast territory, with a south–north flow. The most important is the Lena River, with a length of over 4000 km and a basin covering 2.5 million km², which makes it the third longest river in Russia and one of the 10 longest rivers in the world.
Yakutsk is located on the western bank of the Lena River and is the main industrial center of the region, with construction material production, a food industry, wood working, and coal mining. The Lena River is an important economic axis for the city and is used as an ice road in the winter and as a shipping route in the summer. Thus, it is the principal way of commercial transit between Yakutsk and other areas [32,33]. The city is also characterized by a growing metropolization and important demographic and structural changes [34].
For this study, two datasets were used: a Sentinel-2A time series with a 10 m ground sample distance (GSD) and four Vis-NIR bands (Figure 1) (i.e., 490 nm, 560 nm, 665 nm, and 842 nm), and a mono-date SPOT 6 image acquired in July 2016 with a 1.6 m GSD and four Vis-NIR bands (i.e., 485 nm, 560 nm, 660 nm, and 825 nm). Despite the larger original spectral interval of Sentinel-2A images (i.e., Vis-NIR + SWIR over 13 bands), we preferred working at the highest available spatial resolution of 10 m GSD, which includes only four Vis-NIR bands. No pan-sharpening with the 20 m and 60 m bands was done. Resolutions coarser than 10 m are not well suited for precise characterization of dense urban areas, and such a fusion operation could negatively affect the spectral information and create misclassifications. Concerning the SPOT 6 image, an additional pan-sharpening pre-processing step was carried out between the panchromatic band of 1.6 m GSD and the four multispectral bands of 6 m GSD. The spectral integrity was less affected by the fusion procedure than it would have been in the Sentinel-2A case. For morphological processing, a comparison in the same year and month between SPOT 6 and Sentinel-2A was planned. Nevertheless, due to the lower temporal resolution of SPOT 6, only the date of July 2016 presented a good tradeoff between a cloud-free scene and comparable environmental conditions at the time of acquisition.
3. Method
The method was based on the simultaneous use of Sentinel-2 high temporal resolution imagery for built-up areas change detection, and SPOT 6 pan-sharpened high spatial resolution imagery for built-up areas socioeconomic function modeling. The first step included the processing of two images at different dates, using spectral indices (SI) (e.g., [7,9,10,11,12,13,35,36]) and spectral classification with a spectral angle mapper (SAM) [37] (Figure 2). Three environmental indices were combined using specific thresholds to estimate two categories of built-up areas (i.e., dark built-up and clear built-up areas) over the city of Yakutsk. The spectral indices were combined to estimate manmade surfaces, and more precisely built-up areas, using the second normalized water index (NDWI2) [13] for waterbody detection, the second brightness index (BI2) [20] for soil surface detection, and the normalized difference vegetation index (NDVI) for vegetation density detection.
The NDVI and NDWI2 indices were used to create non-built-up areas masks (i.e., green frame and waterbodies). Concerning NDVI, many studies suggested a limit value of 0.2 to extract the green frame from the scene [38,39,40]. For our study, two thresholds around the 0.2 value were used depending on the date of acquisition, with [0.15–1] and [0.25–1] being a good tradeoff for the June and September scenes, respectively. June in Yakutsk is considered the beginning of the period of vegetation growth, and a decrease in the NDVI threshold was necessary to capture the weakly vegetated surfaces. Concerning NDWI2, a threshold of 0 was considered in Reference [13] to separate waterbodies from the background. Nevertheless, this hypothesis is only valid for non-urban or slightly urbanized scenes. Otherwise, the generated map will contain shadows and dark built-up areas [41]. We slightly increased the NDWI2 limit value with reference to some state-of-the-art studies [42,43,44], and a value of 0.25 was a good tradeoff to extract a maximum of waterbodies without conflicting with the built-up areas.
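The masking logic described above can be sketched as follows. This is only an illustration under stated assumptions, not the processing chain used in the study: it assumes the commonly used definitions NDVI = (NIR − R)/(NIR + R), NDWI2 = (G − NIR)/(G + NIR), and BI2 = sqrt((R² + G² + NIR²)/3), reflectances scaled to [0, 1], and hypothetical function names. Plain Python is used for self-containment; an array library would normally be used on full scenes.

```python
import math

def ndvi(red, nir):
    """NDVI, assumed standard form (NIR - R) / (NIR + R)."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0

def ndwi2(green, nir):
    """NDWI2, assumed form (G - NIR) / (G + NIR)."""
    total = green + nir
    return (green - nir) / total if total != 0 else 0.0

def bi2(red, green, nir):
    """BI2, assumed form sqrt((R^2 + G^2 + NIR^2) / 3)."""
    return math.sqrt((red ** 2 + green ** 2 + nir ** 2) / 3.0)

def built_up_brightness(green, red, nir, ndvi_min=0.25, ndwi2_min=0.25):
    """Return BI2 for a pixel that is neither vegetation nor water, else None.

    ndvi_min is the lower bound of the vegetation interval ([0.15-1] for the
    June scene, [0.25-1] for the September scenes); ndwi2_min is the lower
    bound of the water interval ([0.25-1]).
    """
    if ndvi(red, nir) >= ndvi_min:
        return None  # masked as green frame
    if ndwi2(green, nir) >= ndwi2_min:
        return None  # masked as waterbody
    return bi2(red, green, nir)

# Hypothetical reflectances (green, red, nir) for three surface types.
vegetation = (0.08, 0.05, 0.45)
water = (0.10, 0.06, 0.03)
roof = (0.30, 0.32, 0.35)
```

Lowering `ndvi_min` to 0.15 for the June scene masks weakly vegetated pixels that the 0.25 limit would leave in the brightness map.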
Concerning built-up areas and bare soil detection, many indices have been developed, such as NDBI, Urban Index (UI), BI, and EBBI. Most of the proposed built-up indices require SWIR bands, which is not compatible with our datasets, since the available Sentinel-2A SWIR bands are of a coarse resolution. For this study, instead of a dedicated built-up index, we used a brightness index called the second brightness index (BI2) to extract built-up areas using three levels of thresholding over the generated brightness map. The BI2 index is based on the quantification of albedo over a visible spectral interval [34] and is, therefore, a good candidate for a built-up areas extraction attempt. After eliminating the vegetation and waterbody pixels from the BI2 map, we considered that the maximum value of the histogram defines the limit between dark and clear pixels. To eliminate bare soils, we added and removed 4% from both sides of the max value. Let us consider a reference peak of 15% associated with a BI2 map. Then a clear interval will be defined at [19%–max_value], a moderately clear interval will be defined at [11–19%], and a dark interval will be defined at [>0–10%]. We hypothesized that major built-up areas could be divided into two big classes of clear and dark roofs. Bare soil was apparently moderate in terms of albedo and corresponded to a small interval around the peak value of BI2. Clear and dark intervals were used to extract the classes of interest. Lastly, built-up areas maps were generated for each available date following the proposed BI2 thresholding. The accuracy evaluation was made first for all built-up pixels and then, in a second step, for two classes of built-up areas.
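The thresholding rule around the histogram peak can be illustrated with a short sketch. It assumes BI2 values expressed as integer percentages, a masked value of 0 for vegetation and water pixels, and a particular handling of the interval boundaries (dark below peak − 4, clear from peak + 4 upward), which reproduces the worked example with a 15% peak; the function names are hypothetical.

```python
from collections import Counter

def peak_intervals(bi2_percent, margin=4):
    """Return (low, high) = (peak - margin, peak + margin).

    bi2_percent: iterable of integer BI2 percentages; values <= 0 are
    treated as masked (vegetation or water) and ignored.
    """
    counts = Counter(v for v in bi2_percent if v > 0)
    # Most frequent value; ties are resolved toward the smallest value.
    peak = max(counts, key=lambda v: (counts[v], -v))
    return peak - margin, peak + margin

def brightness_class(value, low, high):
    """Assign one BI2 percentage to a class (boundary handling is assumed)."""
    if value <= 0:
        return "masked"
    if value < low:
        return "dark"      # dark built-up candidate
    if value >= high:
        return "clear"     # clear built-up candidate
    return "moderate"      # interval around the peak, treated as bare soil

# Hypothetical sample whose histogram peaks at 15%.
sample = [15] * 5 + [10] * 2 + [25] * 3 + [12]
low, high = peak_intervals(sample)
```

With a peak at 15%, `low` is 11 and `high` is 19, so integer values up to 10% fall in the dark class and values from 19% upward fall in the clear class.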
A SAM classifier was trained with five categories representing the major roofing materials of the city (i.e., clear metal, clear bitumen, dark_metal_1, dark_metal_2, and dark bitumen) and three environmental classes (i.e., water, vegetation, and bare soil), with 50 to 300 samples for each class. The spectral angle was set to 0.12 for all the classes, which is slightly above the default value of 0.10 and helped avoid potential underclassification. The resulting maps were first evaluated for a single built-up class grouping all the identified classes of interest. In the second step, a more precise identification was carried out for two classes of built-up areas, namely the clear and dark types. Non-built-up classes were used to avoid misclassifications with built-up classes. They were masked after the classification procedure. Once built-up areas had been extracted for both dates, different statistics were generated to estimate the built-up areas change. Change detection accuracy was evaluated for each mapping method using precise ground truth polygons.
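The principle of the spectral angle mapper can be sketched as below. The sketch assumes that the angle limit of 0.12 is expressed in radians and that each class is represented by a single mean reference spectrum; the reference values and names are hypothetical and only serve the illustration.

```python
import math

def spectral_angle(pixel, reference):
    """Angle (radians) between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(r * r for r in reference))
    if norm == 0:
        return math.pi / 2  # undefined direction, treated as maximally distant
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.acos(cosine)

def sam_classify(pixel, references, max_angle=0.12):
    """Return the label with the smallest angle, or None if all exceed max_angle."""
    best_label, best_angle = None, max_angle
    for label, reference in references.items():
        angle = spectral_angle(pixel, reference)
        if angle <= best_angle:
            best_label, best_angle = label, angle
    return best_label

# Hypothetical mean spectra (blue, green, red, NIR), not taken from the study.
references = {
    "clear_metal": (0.30, 0.32, 0.33, 0.35),
    "vegetation": (0.04, 0.08, 0.05, 0.45),
    "water": (0.06, 0.10, 0.06, 0.03),
}
```

Because the angle depends only on the direction of the spectrum, a pixel that is a scaled copy of a reference (e.g., the same material under different illumination) receives an angle close to zero.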
The second step was based on the use of morphological processing over high spatial resolution imagery for built-up areas socioeconomic characterization. First, an automatic segmentation was applied over the built-up areas masks generated by the previous method. The segmentation process generated the different urban objects composing the initial urban scene. For each object, several geometric attributes were calculated (e.g., area, elongation, compactness, circularity). Lastly, geometric attributes were used separately or jointly (i.e., morphometric rules) to characterize specific shapes, sizes, and functions related to the objects of interest. The idea was to take advantage of the spatial dimension given by high-resolution imagery to characterize specific built-up areas and to extract their socioeconomic categories. The approach was tested over a limited zone of the city, and three socioeconomic categories were defined over this zone of interest. The morphological recognition was tested over a Sentinel-2A time series and SPOT 6 mono-date images.
4. Results
4.1. Intra-Annual Study
For the intra-annual study, two Sentinel-2A images of 2017 were acquired. The goal was to evaluate built-up areas estimation efficiency using SI and to what degree non-manmade materials (bare soil, vegetation, waterbodies, sandbanks, etc.) could affect detection efficiency. The first image was acquired in early June and the second one was acquired in mid-September, so the temporal interval was over 3 months. A threshold of [0.25–1] was used for NDWI2 for both dates. Two thresholds were used for NDVI, [0.15–1] for June 2017 and [0.25–1] for September 2017. Two thresholds were used for BI2, BI2_C1 = [20–100%] for June 2017 and [17–100%] for September 2017 to detect clear built-up areas, and BI2_C2 = [>0–11%] for June 2017 and [>0–10%] for September 2017 to detect dark built-up areas.
The comparison between the two 2017 images of June and September showed a decrease in waterbodies due to the appearance of aquatic vegetated surfaces and sandbanks during the summertime (Figure 1b,c). The clear built-up areas detection decreased in September due to a decrease in global sun illumination. The dark built-up areas increased by about 12% (Table 1). Built-up areas recognition using SI over Sentinel-2A data could be an interesting alternative to land use detection with a classification algorithm. Manmade materials were detected using the BI2 index with different threshold intervals. Nevertheless, the index was sensitive to the date of acquisition (i.e., illumination condition) and some confusion with other land use classes was noticeable (e.g., vegetation, waterbodies). To compensate for this confusion, vegetation and waterbodies masks were calculated via the NDVI and NDWI2 and used to increase the built-up areas detection accuracy.
To estimate the degree of change between two dates, we intersected the two built-up areas maps and extracted the degree of decrease (i.e., pixels that had disappeared at date 2 = (pixel_count_date1 − pixels_intersect)/(pixel_count_date2)) and increase (i.e., new pixels at date 2 = (pixel_count_date2 − pixels_intersect)/(pixel_count_date1)), where, for a class C, pixel_count_date1 is the pixel count of the map of C at date n, pixel_count_date2 is the pixel count of the map of C at date n + 1, and pixels_intersect is the pixel count of the intersection of the two maps. We also calculated a relative change rate (i.e., the signed difference between the decrease and increase rates) and an absolute change rate (i.e., the signed ratio (pixel_count_date2 − pixel_count_date1)/(pixel_count_date1)). The intersection between the two maps reported an increase of 40% and 17%, respectively, by SI and SAM (Table 2), but these values are less precise than the relative and absolute change rates and cannot provide a reliable change estimate. Relative and absolute change rates were negative (i.e., built-up areas decrease) using both SI and SAM. When analyzing the results by classes (clear and dark built-up types), an increase of dark built-up areas (i.e., C2 class) was noticed using SI classification (i.e., 3% of relative change and 12% of absolute change) (Table 2), whereas no increase in terms of built-up areas was given by the SAM classification.
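The change rates defined above can be computed from two binary maps of the same class, as in the following sketch. It follows the denominators given in the text and assumes that the relative change rate is the increase rate minus the decrease rate, so that a positive value indicates growth; the function name and the toy pixel sets are hypothetical.

```python
def change_rates(map_date1, map_date2):
    """Change rates of a class C between two dates.

    map_date1, map_date2: sets of (row, col) pixels labeled C at each date.
    """
    count1, count2 = len(map_date1), len(map_date2)
    intersect = len(map_date1 & map_date2)
    decrease = (count1 - intersect) / count2   # pixels that disappeared at date 2
    increase = (count2 - intersect) / count1   # new pixels at date 2
    return {
        "decrease": decrease,
        "increase": increase,
        "relative": increase - decrease,       # sign convention is an assumption
        "absolute": (count2 - count1) / count1,
    }

# Toy example: 4 pixels at date 1, 5 pixels at date 2, 2 in common.
date1 = {(0, 0), (0, 1), (0, 2), (0, 3)}
date2 = {(0, 2), (0, 3), (0, 4), (0, 5), (0, 6)}
rates = change_rates(date1, date2)
```

In this toy case the intersection alone suggests 75% of new pixels, whereas the absolute change is only 25%, which illustrates why the raw increase rate overstates the net change.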
Concerning the classification accuracies of built-up areas, the SAM classification showed better statistical accuracies than the SI classification for detecting all built-up areas over the city for both dates (i.e., overall accuracy (O.A.) of 94%) (Table 3). In a second step, we calculated the classification accuracies for two classes of built-up areas, namely the clear and dark types. O.A. decreased for both SAM and SI classifications, with accuracies between 57% and 68% (Table 3). The SAM classifier was superior to the index classification in terms of statistical accuracy. Nevertheless, the SI classification showed better statistical accuracy (producer accuracies >80%) for detecting clear built-up areas (C1 class) than the SAM classifier, which had a lower performance (producer accuracies <75%). Dark built-up areas were difficult to identify using the SI classification, with poor producer accuracies (≤30%) (Table 3).
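For reference, the two accuracy measures used here can be derived from a confusion matrix as sketched below. The sketch assumes that rows hold the ground truth classes and columns hold the predicted classes; the counts are hypothetical and unrelated to Table 3.

```python
def overall_accuracy(matrix):
    """Share of correctly classified validation pixels (diagonal / total)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def producer_accuracies(matrix):
    """Per class: correctly classified pixels / ground truth pixels of that class."""
    result = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        result.append(matrix[i][i] / row_total if row_total else float("nan"))
    return result

# Hypothetical two-class matrix: rows = ground truth (clear, dark),
# columns = predicted (clear, dark).
confusion = [
    [40, 10],
    [5, 45],
]
oa = overall_accuracy(confusion)
pa = producer_accuracies(confusion)
```

A high overall accuracy can coexist with a low producer accuracy for one class when that class has few validation pixels, which is why both measures are reported.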
4.2. Inter-Annual Study
For the inter-annual study, two Sentinel-2A images from September 2015 and September 2017 were used. The goal was to evaluate built-up areas estimation efficiency using a two-year temporal interval with almost the same illumination condition (i.e., same month), and to what degree non-manmade materials (bare soil, vegetation, waterbodies, sandbanks, etc.) could affect detection efficiency. The first image was acquired in early September 2015 and the second one in mid-September 2017, so the temporal interval was slightly over two years. A threshold of [0.25–1] was used for NDWI2. The same NDVI threshold of [0.25–1] was used for 2015 and 2017, while two thresholds were used for BI2, BI2_C1 = [20–100%] for 2015 and [17–100%] for 2017 to detect clear built-up areas, and BI2_C2 = [>0–12%] for 2015 and [>0–10%] for 2017 to detect dark built-up areas.
The inter-annual study highlighted an important increase in terms of built-up areas between 2015 and 2017. Clear and dark built-up areas increased by about 10% and 13%, respectively. An important increase of waterbodies was also identified, which was due to the presence of many sandbanks spreading all over the Lena River in September 2015. Vegetation remained fairly constant (i.e., a slight decrease of about 3%) (Table 4). In terms of built-up areas recognition, the SAM classification was more accurate than the SI one, with less accuracy variation between the two dates. The intersection between the two maps reported an increase of 52% and 37%, respectively, for SI and SAM (Table 5). The relative and absolute change ratios reported a decrease for SAM (i.e., 9% of relative change and 6% of absolute change), contrary to the SI classification, which reported an increase of 15% of relative change and 11% of absolute change (Table 5). When analyzing the results by classes (i.e., clear and dark built-up areas), it seemed that the SAM classification underestimated clear built-up areas on the 2017 date, which affected the global estimation of the built-up areas increase. However, SI offered a more coherent estimation.
The reliability of change ratios is still dependent on the accuracy of the classification maps. For our case, an overestimation of clear built-up areas was noticeable both by the SI (Figure 3) and the SAM classifications. The non-built-up land use categories detected in both dates also affect the reliability of the built-up areas recognition map. A decrease in vegetation will increase bare soils and will provide overestimation at the (d+1) date. The important built-up areas regression observed between 2015 and 2017 for the SI classification was due to the presence of sandbanks over the Lena River in 2015, which were detected as built-up areas and then disappeared in 2017. The SAM classification showed less confusion with sandbanks. If these land use change phenomena between the two dates are not taken into account, one can expect a biased change rate. When combining the two classification results and subtracting a bias related to land use changes of 5% to 10%, we can reasonably approximate a 12% increase in terms of built-up areas between 2015 and 2017.
Figure 4 shows the localization of the main changes, which are colored in red (i.e., dark built-up areas) and yellow (i.e., clear built-up areas). The increase in terms of road infrastructure is not taken into consideration. These changes correspond to the appearance of new individual residences and new buildings.
As in the intra-annual study, the SAM classification of built-up areas offered higher statistical accuracies (O.A. >90%) (Table 6) than the SI classification (O.A. ≤75%) for both dates. Accuracy decreased when evaluating two classes of built-up areas (O.A. <70%). The SAM kept its statistical superiority over SI. Nevertheless, the SI classification offered a better identification of the clear built-up class (C1 class) than SAM. The SI classification remained poor in terms of dark built-up areas identification (producer accuracies ≤30%) (Table 6).
To better characterize urban expansion related to clear and dark built-up areas using the SI and SAM classifications, we created the following classes: unchanged pixels, clear built-up areas expansion, and dark built-up areas expansion. Precise ground truth points were extracted from the initial scenes and two groups of validation points were created. The first one was related to change between 4 June 2017 and 12 September 2017, with 54, 96, and 2503 ground truth pixels of clear built-up areas, dark built-up areas, and unchanged areas, respectively. The second group was related to change between 3 September 2015 and 12 September 2017, with 411, 476, and 2800 ground truth pixels of clear built-up areas, dark built-up areas, and unchanged areas, respectively. O.A. went from moderate to fair (from 58% to 72%). The unchanged areas were fairly well detected by both SI and SAM (producer accuracies from 65% to 74%), clear built-up areas expansion was moderately detected (producer accuracies from 36% to 55%), and dark built-up areas expansion was poorly detected (25% to 35%). The SAM classification showed better statistical accuracy than the SI classification. Nevertheless, the difference was slight (up to 9% better). The SI classification even permitted a slightly better detection of the dark built-up class (i.e., detection between 4 June 2017 and 12 September 2017). When comparing accuracies of change detection between the intra-annual and inter-annual studies, O.A. was driven by the unchanged areas class, which had a much larger number of validation pixels than the other classes. Nevertheless, when analyzing producer accuracies, the inter-annual study was more efficient in terms of built-up areas extension detection (Table 7).
4.3. Mono-Date Morphological Processing
In the previous sections, we detected built-up areas using different spectral indices available in the literature and applied a spectral classification using “ground truth training pixels” acquired over Sentinel-2A images. These spectral-based methods allow for an identification of the urban fabric. Nevertheless, they do not provide any information about the socioeconomic structure of built-up areas, which is related to object form and morphology rather than to spectral or radiometric characteristics.
In this section, we used two datasets, namely the Sentinel-2A image from September 2017 and a SPOT 6 image from July 2016 acquired over the same region of interest. The goal was to take advantage of the high spatial resolution offered by the SPOT 6 sensor, which gives better capabilities in terms of morphological object detection and recognition compared to the Sentinel-2A sensor. As the first step, an SI-based classification was applied to extract the global urban fabric using the methodology described in Section 3. Then, an automatic segmentation based on the Otsu thresholding algorithm [45] was applied to extract the objects existing in the scene. We took into consideration only the clear built-up areas in the morphological study (Section 4.1 and Section 4.2). Once the urban objects had been extracted, we used five geometric attributes to detect specific socioeconomic classes.
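The principle of Otsu thresholding, which selects the gray level maximizing the between-class variance of the histogram, can be sketched as follows. The sketch assumes integer pixel values in [0, levels − 1] and a hypothetical function name; grouping the thresholded pixels into individual objects (e.g., by connected-component labeling) is not shown.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form the lower class, the others the upper class.
    """
    histogram = [0] * levels
    for v in values:
        histogram[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    weight_low, sum_low = 0, 0
    best_t, best_variance = 0, -1.0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue            # lower class still empty
        weight_high = total - weight_low
        if weight_high == 0:
            break               # upper class empty from here on
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t

# Hypothetical bimodal sample: dark background and bright objects.
pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(pixels)
```

For a clearly bimodal sample such as this one, any threshold between the two modes separates them; the sketch returns the first level reaching the maximum variance.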
The “Area” attribute corresponds to the object pixel count. The “Elongation” attribute is equal to the difference between the inertias of the major and minor axes, divided by the sum of these inertias (i.e., null for a circle and equal to 1 for a long and narrow ellipse).
The “Convexity” attribute is equal to the area of an object divided by the area of its convex hull (i.e., the smallest convex shape completely containing the object). It permits us to discriminate and recognize objects with complex shapes.
The “BRFillRatio” is equal to the ratio between the area of an object and the area of its bounding rectangle. It gives a certain compromise between compactness and convexity (i.e., a bounding rectangle is the smallest rectangle containing the object).
The “Compactness” attribute is equal to the area of an object divided by its squared perimeter, and it is suited for cubic object detection.
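The five attributes defined above can be computed for one segmented object as in the following sketch. Several choices are assumptions made for the illustration and are not specified in the text: the object is a set of (row, col) pixels, the bounding rectangle is axis-aligned, the perimeter is the number of exposed pixel edges, the convex hull is taken over pixel corners, and the inertias are the eigenvalues of the matrix of second central moments.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain algorithm; returns the hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def shape_attributes(pixels):
    """Area, elongation, convexity, BRFillRatio, and compactness of one object."""
    px = set(pixels)
    area = len(px)
    rows = [r for r, _ in px]
    cols = [c for _, c in px]
    # Axis-aligned bounding rectangle (assumption).
    rect_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    # Perimeter as the number of pixel edges not shared with the object.
    perimeter = sum(
        (r + dr, c + dc) not in px
        for r, c in px
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    # Convex hull over pixel corners so that a filled rectangle has convexity 1.
    corners = [(r + dr, c + dc) for r, c in px for dr in (0, 1) for dc in (0, 1)]
    hull_area = polygon_area(convex_hull(corners))
    # Second central moments; (l1 - l2) / (l1 + l2) in closed form.
    mean_r, mean_c = sum(rows) / area, sum(cols) / area
    m_rr = sum((r - mean_r) ** 2 for r in rows) / area
    m_cc = sum((c - mean_c) ** 2 for c in cols) / area
    m_rc = sum((r - mean_r) * (c - mean_c) for r, c in px) / area
    trace = m_rr + m_cc
    elongation = math.sqrt((m_rr - m_cc) ** 2 + 4 * m_rc ** 2) / trace if trace > 0 else 0.0
    return {
        "area": area,
        "elongation": elongation,
        "convexity": area / hull_area,
        "br_fill_ratio": area / rect_area,
        "compactness": area / perimeter ** 2,
    }

square = shape_attributes([(r, c) for r in range(4) for c in range(4)])
line = shape_attributes([(0, c) for c in range(8)])
l_shape = shape_attributes([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)])
```

Under these assumptions, a filled square has an elongation of 0 and a convexity of 1, a one-pixel-wide line has an elongation of 1, and an L-shaped object has a convexity below 1.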
Three socioeconomic classes were defined for the study zone: category 1, individual houses, which are mostly spread along the borders of Yakutsk and inside the city center; category 2, mean elongated structures, which correspond to social housing, commercial buildings, or warehouses; and category 3, mean convex/cubic structures, which correspond to cultural, industrial, or governmental buildings. Concerning SPOT 6, for category 1, we used [area & BRFillRatio]. The use of compactness instead of the BRFillRatio is not suited here, due to the increase of spatial resolution and the appearance of more complex forms corresponding to individual houses. For category 2, we used [convexity & elongation], and for category 3, [area & elongation & BRFillRatio]. Concerning Sentinel-2A data, for category 1, we used the rule [area & compactness], for category 2, [convexity & elongation], and for category 3, [area & elongation & BRFillRatio].
Morphological detection over the SPOT 6 image gives a precise recognition of individual and elongated structures. Mean/cubic structures were also extracted, with some confusion between bare soil delineated parcels and built-up areas (Figure 5). Morphological recognition over Sentinel-2A was less precise due to the lower spatial resolution, but the distinction between the socioeconomic categories was clear enough. The elongated structures that were perfectly distinguished by SPOT 6 and associated with category 2 are often associated with category 3 in the Sentinel-2A classification, because they are detected as an ensemble. The same confusion between bare soil parcels and category 3 is noticeable (Figure 6). The morphological aspects of remote sensing imagery could be a very interesting alternative and complementary tool to classical image processing techniques based on the radiometric/spectral characteristics of multiband imagery. One can characterize urban objects and classify them following a socioeconomic scheme, which will help with understanding the specific urban fabric functions of the city.
5. Discussion
Many indices related to built-up areas estimation (NDBI, index-based built-up index IBI, EBBI, morphological building index MBI, etc.), surface bareness (e.g., NDbal, NBLI), and impervious surfaces (e.g., impervious surface index NDISI) were developed and proposed in many state-of-the-art studies as mentioned in previous sections. The idea behind developing such indices was to provide end-users with efficient tools for urban structure monitoring, urban planning, and change detection. Most of these indices require SWIR bands for calculations and are often used over coarse resolution imagery (e.g., Landsat) for built-up frame identification at a large scale. The efficiency of such indices was largely reviewed and studied [
46,
47,
48]. In Reference [
48], the authors reported the sensibility of such indices to the image resolution, seasonality, and location of the scene. The authors stated that the normalized built-up area (NBAI) [
49], biophysical composition (BCI) [
50], CBI, and band ratio for built-up area (BRBA) [
49] indices were the most robust from the tested ones. The tests were done over Sentinel-2A imagery, but O.A. was not calculated. The validation of the results was done by visual interpretation. From the tested indices, only the CBI and BRBA did not require SWIR bands for the mathematical calculation. The authors also reported the difficulty of separating the bare soil from built-up areas. The O.A. of different state-of-the-art studies using the above indices was reported (i.e., Landsat for the majority), with values from 85% to 97%, and best accuracies obtained for the CBI index. In Reference [
46], built-up areas were identified using Landsat 8, Sentinel-1, and Sentinel-2A data. The authors included some SI as additional bands in the classification process (i.e., NDVI, NDBI, NDWI, urban index UI [
51], and enhanced vegetation index EVI [
52]). The authors used a random forest classifier to identify built-up areas in Vietnam. In the second step, they applied a more precise classification of residential and nonresidential areas and used two strategies of classification training. With the best strategy, they obtained O.A. for built-up areas from 73% to 96%, and from 60% to 76% for the two-class scenario. In Reference [
9], the authors compared several built-up SIs over Landsat ETM+ images, and they calculated the degree of correlation between the generated SI maps distribution and an IKONOS classification map (i.e., O.A. of 96%). The best correlations were obtained for the EBBI and IBI indices, with values of about 70% with the reference built-up classification map.
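Since the comparison above relies on the overall accuracy (O.A.), a minimal sketch of how this measure is commonly derived from a confusion matrix may be useful. The function name `overall_accuracy` and the sample counts are hypothetical and serve only as an illustration; they are not taken from any of the cited studies.

```python
def overall_accuracy(confusion):
    """Return the overall accuracy (O.A.) of a square confusion matrix.

    confusion[i][j] is the number of validation samples whose reference
    class is i and whose mapped class is j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Correctly mapped samples lie on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical two-class example (built-up vs. non-built-up).
example = [
    [80, 20],  # reference built-up: 80 mapped as built-up, 20 missed
    [10, 90],  # reference non-built-up: 10 false alarms, 90 correct
]
print(f"O.A. = {overall_accuracy(example):.2%}")  # O.A. = 85.00%
```

The O.A. alone does not show how errors are distributed between classes, which is one reason why values obtained with different class definitions are difficult to compare directly.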
Comparing the accuracy of our results with state-of-the-art studies, we can see that most existing studies applied SI to coarse resolution imagery. The O.A. varied from moderate to good depending on the input imagery and on the number of classes. Generally, supervised classifiers showed better statistical accuracy than SI. Nevertheless, expert identification and visual interpretation of such maps were more than encouraging given the proposed trade-off in terms of processing time and implementation. For this study, our accuracy results followed state-of-the-art tendencies, even if a direct comparison is difficult to make (i.e., different classes of interest, study zone, date of acquisition, etc.). The primary goal was to propose a reliable SI-based method for built-up areas identification using Vis-NIR high-resolution imagery. The method is less demanding in terms of training data and computational processing than conventional machine learning, statistical, or deep learning classifiers. In terms of statistical accuracy, a decrease was expected compared to a well-trained classifier with thousands of points. The decrease was up to 25% for the identification of built-up areas as a whole and up to 14% for the identification of two classes of built-up areas. For this latter case, the SI-based method even showed a higher performance for clear built-up areas identification than the SAM classifier.
As mentioned by many state-of-the-art studies, the main issue of using SI for built-up areas identification is their sensitivity to many environmental and contextual factors. First, a pre-processing step that masks conflicting classes such as waterbodies, vegetation, shadows, and clouds is necessary to ensure optimal built-up areas identification [
48]. Second, the separation of bare soil from built-up areas is an important issue that needs further investigation. An accurate built-up areas map was important at two levels: (1) change detection of built-up areas and (2) morphological and socioeconomic characterization of built-up areas. The SI showed an encouraging performance for detecting built-up areas expansion of two classes of roofs. The loss in accuracy was less than 9% compared to supervised classification. The class definitions for the change detection procedure need to be set in accordance with the nature of change (e.g., expansion, regression, and modification of roofing types) [
53]. Change detection is a sensitive procedure requiring highly accurate land use maps. Otherwise, the change estimation will be corrupted. For our study, the main issue concerned false alarms caused by environmental changes (e.g., bare soil, shadows, clouds). Further experiments need to address whether an accurate masking of non-built-up pixels or a post-processing of these elements is possible. The last section of the paper concerned the use of the previously generated built-up areas maps to extract their corresponding functions and socioeconomic characterization using morphological operators. This final step is even more sensitive insofar as it depends on both built-up areas identification accuracy and segmentation accuracy. The obtained results also depend on the input image resolution. SPOT 6 morphological object characterization was encouraging, with accurate detection of three socioeconomic classes. Sentinel-2A was less accurate due to its coarser resolution, and the developed rules need to be adapted accordingly. The combination of spatial and spectral features has been widely addressed [
27,
28,
29]. Nevertheless, few studies have addressed socioeconomic identification of the detected objects.
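The pre-processing step discussed above, in which conflicting classes are masked before built-up areas identification, can be sketched as follows. This is a minimal illustration, assuming the standard definitions NDVI = (NIR - Red)/(NIR + Red) and NDWI = (Green - NIR)/(Green + NIR). The function names and the thresholds `ndvi_max` and `ndwi_max` are hypothetical and would need to be tuned for each scene. Shadows and clouds are not handled here.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def candidate_mask(green, red, nir, ndvi_max=0.3, ndwi_max=0.0):
    """Flag pixels that are neither vegetation nor open water.

    green, red, nir: 2-D lists of reflectance values with identical shapes.
    ndvi_max, ndwi_max: hypothetical thresholds (assumptions, scene dependent).
    Returns a 2-D list of booleans (True = keep for built-up index analysis).
    """
    mask = []
    for g_row, r_row, n_row in zip(green, red, nir):
        row = []
        for g, r, n in zip(g_row, r_row, n_row):
            ndvi = normalized_difference(n, r)  # high over vegetation
            ndwi = normalized_difference(g, n)  # high over open water
            row.append(ndvi < ndvi_max and ndwi < ndwi_max)
        mask.append(row)
    return mask


# One-row toy scene: a vegetation pixel, a water pixel, and a roof-like pixel.
green = [[0.08, 0.06, 0.25]]
red = [[0.05, 0.04, 0.28]]
nir = [[0.45, 0.02, 0.30]]
print(candidate_mask(green, red, nir))  # [[False, False, True]]
```

Bare soil usually passes both tests, so such a mask does not by itself resolve the confusion between bare soil and built-up areas noted above.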
6. Conclusions
In this paper, we presented an original study on the simultaneous use of high temporal resolution Sentinel-2A imagery for urban change detection and high spatial resolution SPOT 6 imagery for socioeconomic modeling. The sub-Arctic East Siberian city of Yakutsk was a good example to test the method. Significant urban sprawl has characterized the city’s urban dynamics in recent decades. The city is composed of different structures following specific districts/habitation types around it.
The first step of the method highlighted an important increase in urban built-up areas between 2015 and 2017 (approximately 12%). The intra-annual change detection of class expansion was less efficient due to: (1) a short time interval (i.e., approximately 3 months) and (2) different illumination conditions between June and September, which led to an under-estimation of the clear built-up areas in June compared to September because shadow effects biased the change quantification. Therefore, the relative and absolute change rates were not able to detect the increase in built-up areas. For a reference value of 12% for the built-up areas increase between 2015 and 2017, we can theoretically estimate an increase of around 2% to 3% between June 2017 and September 2017 if we take into consideration that construction works take place between June and October. For reliable built-up areas change detection, it is recommended to choose two dates in the same month at least one year apart and, if possible, in the same week. In addition to the change quantification by pixel count and class intersection, a precise experiment was done using accurate ground truth points, which made it possible to extract the expansion of both clear and dark built-up areas for the two temporal intervals. The experiment showed comparable global performance for the intra-annual and inter-annual scenarios. On the other hand, the intra-annual study could be useful for estimating changes in non-built-up areas (e.g., vegetation, bare soil, sandbanks). An important decrease of built-up areas between June 2017 and September 2017 was observed, which is partly due to the increase of vegetated surfaces in September. This process resulted in a decrease of the total surface of urban fabric, and the other land use changes masked the fine changes of built-up areas during this period.
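The change quantification by pixel count and class intersection mentioned above can be illustrated with a minimal sketch. It assumes two co-registered binary built-up masks and uses common definitions of the absolute change (difference in pixel counts) and the relative change (that difference divided by the initial count). These definitions, the function name `change_statistics`, and the toy masks are assumptions made for illustration and may differ from the formulas used in the method.

```python
def change_statistics(mask_t1, mask_t2):
    """Compare two co-registered binary built-up masks (2-D lists of 0/1)."""
    n1 = n2 = stable = expansion = regression = 0
    for row1, row2 in zip(mask_t1, mask_t2):
        for p1, p2 in zip(row1, row2):
            n1 += p1
            n2 += p2
            if p1 and p2:
                stable += 1      # built-up at both dates (class intersection)
            elif p2:
                expansion += 1   # built-up only at the second date
            elif p1:
                regression += 1  # built-up only at the first date
    absolute = n2 - n1
    relative = absolute / n1 if n1 else float("nan")
    return {
        "absolute_change": absolute,
        "relative_change": relative,
        "stable": stable,
        "expansion": expansion,
        "regression": regression,
    }


# Toy masks: two new built-up pixels, one pixel lost (e.g., hidden by vegetation).
t1 = [[1, 1, 0, 0],
      [1, 0, 0, 0]]
t2 = [[1, 1, 1, 0],
      [0, 0, 1, 0]]
stats = change_statistics(t1, t2)
print(stats["absolute_change"], stats["expansion"], stats["regression"])  # 1 2 1
```

In this toy case the net pixel count rises by one although two pixels were newly built, which shows how losses caused by other land use changes can mask real expansion in a pixel-count comparison.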
The use of spectral indices (SI) proved to be a practical and fast solution compared to conventional classification algorithms, which require precise ground truth data and are time-consuming. Nevertheless, these indices are correlated with each other and require accurate threshold definition to maximize the detection of each land use element and, in turn, the built-up areas extraction when the different indices are combined. This methodology can be improved by enhancing the thresholding process and by introducing other SI, such as Vis-NIR SI, since the CBI index requires Vis-NIR bands and the first PCA band for calculation.
In addition to change detection, we experimented with morphological recognition over built-up areas masks. The results were interesting, and three socioeconomic categories were recognized over the test zone based on geometric attributes. The road portions were efficiently eliminated due to their specific morphologies. The main confusion concerned parcels delineated by bare soil, which were often considered as built-up areas. SPOT 6 imagery gave a precise morphological recognition of built-up area forms due to its high spatial resolution. Nevertheless, Sentinel-2A imagery can also be used in the morphological approach, with a reasonable detection of the different socioeconomic categories.
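As an illustration of how geometric attributes can separate elongated road portions from compact buildings, the following minimal sketch computes a few simple shape descriptors for one segmented object. The attributes (area, bounding-box elongation, and extent), the function name `shape_attributes`, and the pixel size are assumptions chosen for the example; they are not the rules developed in this study.

```python
def shape_attributes(pixels, pixel_size):
    """Compute simple geometric attributes of one segmented object.

    pixels: list of (row, col) tuples belonging to the object.
    pixel_size: ground size of one pixel in metres (hypothetical value).
    """
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    # Sides of the axis-aligned bounding box, in metres.
    height = (max(rows) - min(rows) + 1) * pixel_size
    width = (max(cols) - min(cols) + 1) * pixel_size
    area = len(pixels) * pixel_size ** 2
    return {
        "area_m2": area,
        "elongation": max(height, width) / min(height, width),
        "extent": area / (height * width),  # share of the box that is filled
    }


# Toy objects: a thin road segment and a compact building footprint.
road = [(0, c) for c in range(12)]
building = [(r, c) for r in range(4) for c in range(5)]
print(shape_attributes(road, 1.5)["elongation"])      # 12.0
print(shape_attributes(building, 1.5)["elongation"])  # 1.25
```

An axis-aligned bounding box underestimates the elongation of diagonal roads, so an oriented bounding box or a perimeter-based compactness measure would be more robust in practice.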
Many perspectives could be considered for this study. In terms of the method, the combination of classification algorithms with spectral indices could refine the final change ratio estimation (i.e., overestimation with SAM, underestimation with SI). The fusion procedure could be considered at the decision level (i.e., maps combination) or by integrating the calculated indices as additional bands. In terms of data, the pan-sharpening of the 10 m and 20 m bands could be considered, so that SWIR-based indices could be calculated and evaluated. The fusion of Sentinel-1 and Sentinel-2 data could also be tested. The change detection procedure could be extended to other purposes in addition to built-up areas expansion. The validation is an essential step that provides a qualitative assessment of our mapping results. In this context, our validation data could be enhanced with more accurate ones (e.g., Lidar map, official built-up areas outlines). Concerning the socioeconomic characterization step, its reliability will mainly depend on three parameters: (1) the initial image spatial resolution, (2) the built-up areas map generated by classification, and (3) the segmentation algorithm used to extract the objects. Improving one of these parameters could lead to a better characterization of the urban fabric and to a better understanding of the city’s organizational logic.
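One possible form of the decision-level fusion mentioned above is sketched below. It is only an assumption about how such a maps combination could work, not a tested part of the method: two co-registered binary built-up maps, one from the SAM classifier and one from the SI approach, are summed into a three-level agreement map. The function name `fuse_maps` and the toy maps are hypothetical.

```python
def fuse_maps(sam_map, si_map):
    """Combine two co-registered binary built-up maps (2-D lists of 0/1).

    Returns a map where 2 = both methods detect built-up,
    1 = only one method detects it, and 0 = neither does.
    """
    return [[a + b for a, b in zip(row_sam, row_si)]
            for row_sam, row_si in zip(sam_map, si_map)]


def count_at_least(fused, level):
    """Count pixels whose agreement level is at least `level`."""
    return sum(1 for row in fused for value in row if value >= level)


# Toy maps (illustrative values only).
sam = [[1, 1, 0],
       [1, 0, 0]]
si = [[1, 0, 0],
      [1, 0, 1]]
fused = fuse_maps(sam, si)
print(fused)                     # [[2, 1, 0], [2, 0, 1]]
print(count_at_least(fused, 2))  # 2 (intersection of the two maps)
print(count_at_least(fused, 1))  # 4 (union of the two maps)
```

The intersection and the union of the two maps give conservative and permissive estimates of the built-up surface, which could bracket the change ratio between the two methods.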