One of the main goals of this research was to present a technique that will become a convenient, relatively easy, and fast tool to improve topoclimatic classification that can be applied by a wide group of users. Therefore, we developed the method to use free datasets and free software, if available (including CORINE Land Cover and Landsat images, and free software: SAGA GIS, LST program, and Map Comparison Kit).
3.1. Selection of Landsat Images
Landsat Enhanced Thematic Mapper Plus (ETM+) images were used for the LST map calculations (
Figure 4). The data were acquired from the United States Geological Survey (USGS) Resource Center (
http://glovis.usgs.gov, accessed on 26 June 2021). ETM+ is the main instrument aboard Landsat 7 and it records thermal infrared images with resolutions of 15 (panchromatic), 30 (multispectral), or 60 m per pixel.
Some images are not suitable for the analyses, including those with (1) too high a percentage of cloud cover or (2) that are outdated with no reference to the relatively current map of LULC used in the research (in terms of the distribution of settlements or road networks on the island, for example). Moreover, only images taken before May 2003 were used because of the failure of the Scan Line Corrector (SLC) instrument on 31 May 2003 (
http://landsathandbook.gsfc.nasa.gov/payload/prog_sect3_3.html, accessed on 26 June 2021). Images taken at different times of the year (from March to September,
Figure 5) were selected to increase the universality of the typology created. Altogether, seven Landsat ETM+ images taken between 9:50 and 9:55 GMT were selected (
Table 1).
Table 1.
Circulation types during the Landsat ETM+ images acquisition times. Circulation types are according to [
41].
Table 1.
Circulation types during the Landsat ETM+ images acquisition times. Circulation types are according to [
41].
Date | Acquisition Time (GMT) | Circulation Type | Group | Duration (Days) |
---|
11.07.1999 * | 09:55 | Fennoscandian high, cyclonic | meridional | 3 |
13.09.1999 | 09:55 | Central European high | mixed | 6 |
08.04.2000 | 09:54 | British Islands high | meridional | 2 |
14.08.2000 | 09:53 | Central European ridge | mixed | 9 |
13.05.2001 | 09:52 | Norwegian Sea–Fennoscandian high, anticyclonic | meridional | 6 |
20.08.2002 | 09:50 | Fennoscandian high, anticyclonic | meridional | 6 |
16.03.2003 | 09:51 | British Islands high | meridional | 8 |
Figure 5.
Seven LST (land surface temperature) images of Wolin Island calculated using the LST program [
19] with the algorithm of Qin et al. [
42], making use of the Landsat ETM+ thermal bands. All the images are presented using the same temperature scale (−3–55 °C) so that the spatial variability of the LST values are easier to compare.
Figure 5.
Seven LST (land surface temperature) images of Wolin Island calculated using the LST program [
19] with the algorithm of Qin et al. [
42], making use of the Landsat ETM+ thermal bands. All the images are presented using the same temperature scale (−3–55 °C) so that the spatial variability of the LST values are easier to compare.
3.2. Retrieval Algorithm
LST from Landsat thermal band 6 data is commonly retrieved through the algorithm proposed by Qin et al. [
42]. The algorithm is based on the linearization of Planck’s radiance function and is expressed as follows:
where
Ts is the land surface temperature in K,
Tsensor is the brightness temperature in K computed from Landsat ETM+ band 6,
Ta is the effective mean atmospheric temperature (K), and
a6 and
b6 are constants with values of −67.355351 for
a6 and +0.458606 for
b6 when the LST is between 273.5 and 343.5 K.
C6 and
D6 are calculated using the following equations:
where
ε is the ground surface emissivity and
τ6 is the atmospheric transmittance (
Table 2). Following Qin et al. (2001),
τ6 is estimated from the atmospheric water content (
Table 3).
Ta is calculated from one of linear Equations (4)–(7) referring to the four standard atmospheres:
where
T0 is the near-surface air temperature. Temperatures measured at meteorological stations can be used. In this study, Equations (6) and (7) were used for mid-latitudes.
The total atmospheric water vapor content (
w, g cm
−2) is calculated as follows:
where
w(
z) is the water vapor content at altitude
z.
Rw(
z) is the ratio of water vapor content at a given altitude to the average for a standard atmospheric column and is given in
Table 4. Water vapor at altitude
z (absolute humidity) was calculated from the formula
where
g is the absolute air humidity (g m
−3),
e is the actual water vapor pressure (hPa),
T is obtained from meteorological data, and
e is calculated by converting the formula for relative humidity [
14]:
Therefore,
where
f is the relative humidity (%) and
emax is the maximum water vapor pressure (%), which is read from the psychometric boards.
Ground surface emissivity (ε) was calculated using the LST software [
19], which has a built-in emissivity calculation module. Since plants and trees control their own temperature through evapotranspiration, the normalized difference vegetation index (NDVI) was introduced to the applied module. The LST calculation based on the normalized difference vegetation index (NDVI) from Landsat ETM+ is expressed as follows:
The relation between the emissivity and the NDVI (for NDVI values from 0.157 to 0.727) can be expressed by the following equation:
For other NDVI values, fixed emissivity values were assigned. For most of the island area, the NDVI values were in the range 0.157–0.727. Only areas of water exceeded this range, and the value of 0.985 was assigned to water in accordance with the work of Snyder et al. [
43].
The LST maps are presented in
Figure 5. Forested areas are characterized by lower LST values in the mornings, since vegetation isolates the ground from incoming radiation. Water is characterized by the lowest LST values (
Figure 1 and
Figure 5), except during August and September, when the water temperature is close to the temperature of forested areas due to warm water in the summer. Urban areas are characterized by the highest LST, especially in July. Agricultural areas also have higher LST values, especially in August and September; the bare soil heats quickly after crops are harvested and grasslands are grazed or mowed.
All images were taken during sunny and stable weather conditions lasting for at least several days. According to the Grosswetterlangen classification of atmospheric circulation patterns [
41], the meridional circulation group occurred five times (in five Landsat ETM+ images) and the mixed circulation group occurred two times. However, the seven Landsat images represent six different circulation types (
Table 1) within these two groups. This is important for the universality of the approach: despite the limited number of images used in the analysis, they represent different seasons of the year, different weather conditions, and are statistically different (
Table 4)—a linear correlation between each pair of images (datasets) is weak. This makes our conclusions more generalizable.
3.3. Unsupervised Classification of LST Images
The classification of thermal images was performed in order to find areas with similar LST values. This enabled the indirect delimitation of the topoclimates. The unsupervised classification was performed on standardized data and relied on mathematical differences between LST values—the concept was to find LST distribution patterns based on computer learning. It was not possible to arbitrarily establish the thresholds for the topoclimatic classes, as the differences between them were difficult to establish on the basis of visual analysis only. The classification obtained was compared to previously published topoclimate distribution based on the other approach [
28,
44]—see
Section 3.5.
The classification was performed based on seven separate thermal images (not averaged). The standardization function of the STATISTICA software was used to standardize the data. Unsupervised classification was initially carried out with the use of the isoclust, isodata, and cluster [
45] procedures.
After the unsupervised classification process, the significance of the LST differences within designated classes was tested with a variance analysis, post hoc tests, and discriminant analysis, as described below. First, a one-way analysis of variance (ANOVA) was used. A low value of testing probability (
p-value) indicates the significant dependency of the dependent variables (LST images) tested on the independent variable (classification map). The analysis of classification maps showed that the
p-value was close to 0 (
p = 1 × 10
−6), which means that classes differ significantly in terms of thermal properties (
Table 5).
The second tool, post hoc tests, informs on which LST classes differ significantly. The post hoc Scheffe test in the STATISTICA program [
46] was used. With few exceptions, this test showed that the classes differed significantly. For example, on the ISOCLUST8 map with seven classes, only 1 combination from 343 (i.e., 7 images out of 49 combinations) was not significantly different (
Table 6).
The third tool, discriminant functions, was applied to determine whether the variables discriminated between two or more naturally occurring groups. The Wilks’ lambda discriminant function for all seven images was analyzed. Wilks’ lambda is a standard statistic used to denote the statistical significance of a model’s discriminatory power. Wilks’ lambda assumes values from 0 (perfect discrimination) to 1 (no discrimination). The isodata classification method (
Figure 6) gave the most coherent and unambiguous results, with a low Wilks’ lambda value and a high percent of clearly classified pixels compared to other methods (
Table 7). Therefore, it was chosen for use in further analyses.
3.4. Factors Influencing the LST Spatial Variability
Thermal classes were described in terms of (1) LULC, (2) lithology/moisture, (3) relief, and (4) average standardized LST.
The LULC form definitions were based on the CORINE Land Cover 2006 (CLC2006) database (
http://www.eea.europa.eu/publications/COR0-part1, accessed on 26 June 2021;
Figure 2). The description of classes is generalized, and only LULC forms that occupy > 10% of the class were included in the description. Therefore, the percentages of various LULC forms do not necessarily add to 100%.
Thermal type II, which is 60% agricultural terrain, was named “built-up areas”: small villages were classified in the CLC2006 database as agricultural areas/pastures even when they lie next to agricultural areas/pastures; this is due to the limited accuracy of the CLC2006 database. The authors separated villages based on the visual interpretation of the aerial and motor-glider low-altitude visible images [
49], and topographic maps that show the spatial distribution of urban fabric on Wolin Island.
The lithology/moisture map (
Figure 7) is the combination of a lithology map and the moisture map, which consists of 11 classes (
Table 8). The lithology was adapted in a generalized form by reducing the number of classes from the detailed geological map of Poland at a scale of 1:50,000, prepared as single map sheets [
50,
51,
52,
53,
54,
55]. Only the main types of land cover that differed significantly in terms of thermal properties [
56,
57,
58], such as specific heat, heat capacity, thermal conductivity, and thermal diffusivity, were kept for further analyses (
Figure 7).
Since a moisture map of Wolin Island is not available, a 1 m hydroisobath was digitized from the hydrographic map of Poland at a scale of 1:50,000 [
59]. Moisture information is critical, as the water content determines the thermal properties of the ground due to the specific thermal properties of the water. It was assumed that the areas with shallow groundwater tables are generally wet, while the areas with a deep groundwater table are generally dry.
Relief characteristics were calculated using Digital Terrain Elevation Data Level 2 (DTED2) [
60]. Three classes of terrain slopes were defined in accordance with [
61]: (1) flat areas (<2°), (2) low slopes (2–5°), and (3) high slopes (>5°). Aspects were categorized into four classes: (1) south (135–225°), (2) east and west (45–135° and 225–315°), (3) north (0–45° and 315–360°), and (4) flat areas without aspect. The slope and aspect classes were then merged and recoded into seven combined classes (
Table 9 and
Figure 3).
3.5. Digital Elaboration of Paszyński’s Classification
The method of determining topoclimatic maps developed by [
44] is based on the characteristics of energy exchange between the atmosphere and the ground. This is a fundamental process that shapes local climates. The Paszyński classes were originally separated on the ground as a result of direct heat flux measurements, and then described based on relief (slope and aspect) and land cover characteristics (
Figure 8 and
Table 10). Energy exchange at the interface between the atmosphere and the ground can be presented quantitatively in the form of the energy balance equation, also known as the heat balance equation. The simplified equation is
where Q * is the radiation balance, H is the turbulent perceived heat flux to or from the atmosphere, E is the latent heat flux associated with the processes of evaporation or condensation, and G is the heat flux in the ground (mainly conducted). Component signs (+ or −) depend on the direction of the flux from the atmosphere to the active surface or vice versa. The final topoclimate classification, taking into account day and night, includes 16 major topoclimatic types [
44] corresponding to the relative values of the major components of heat balance. This classification refers essentially to the growing season and stable, sunny weather. Each of the 16 topoclimate types were also described by Paszyński in terms of relief and land cover/land use characteristics.
Paszynski’s division was elaborated and digitalized in this report to verify the LST-based topoclimate categorization method. Relief characteristics (slope and aspect) describing topoclimates types were calculated from DTED 2 and then reclassified in accordance with Paszyński’s description. Concave forms were defined based on the “r.geomorphon” extension [
62] implemented in the GRASS GIS software [
63].
LULC classes were designated based on the CLC2006 database, which distinguishes classes within Wolin Island. However, we simplified Paszyński’s topoclimatic classification [
28,
44] by eliminating or merging classes with minor relevance or ambiguous definition—e.g., class 1.4 (areas varied in terms of topography) was not included in the digital elaboration. In total, 13 topoclimate classes were thus distinguished for Wolin Island (see
Figure 8,
Table 10).
3.6. Map Comparison Tools
Cross tabulation was used to compare the two derived typologies, LST map, and the modified Paszyński’s map described in
Section 3.5. Cross tabulation allows for the evaluation of similarities between two maps. The kappa index of agreement (KIA) [
64] was calculated. It indicates the degree of agreement between two maps. KIA itself is usually not sufficient for a classification scheme that attempts to accurately specify both quantity and location. The VALIDATE module (IDRISI), which was also used, offers one comprehensive statistical analysis that describes how well a pair of maps agree in terms of (1) the quantity of cells in each category and (2) the location of cells in each category. To make this comparison, it offers three options: kappa for no information (Kno), kappa for grid-cell level location (Klocation), and kappa for stratum-level location (KlocationStrata). Moreover, VALIDATE divides agreement and disagreement into the more sophisticated components as agreement due to chance, agreement due to quantity, agreement due to location at the stratified level, agreement due to location at the grid cell level, disagreement due to location at the grid cell level, disagreement due to location at the stratified level, and disagreement due to quantity [
31].
For a better understanding of agreement and disagreement of quantity and location, assume that there are two maps containing two categories: light and dark. We compute the proportion of cells that agree between the two maps and find that the agreement is 12/16 and the disagreement is 4/16. However, at a more sophisticated level the disagreement in terms of two components can be computed as (1) disagreement due to quantity, and (2) disagreement due to location. A disagreement due to quantity is defined as a disagreement between the maps in terms of the quantity of a category. For example, the proportion of cells in the dark category in the comparison map is 10/16 and in the reference map is 12/16; therefore, there is a disagreement due to quantity of 2/16. The example of disagreement of location could be when one replaces two cells from the comparison map with cells from the reference map to make them more consistent. Therefore, the difference in location is 2/16 [
65].
In addition, fuzzy kappa was computed using the Map Comparison Kit (MCK) software [
66]. The fuzzy kappa statistic is often used for the categorical maps with unequal legends (a different number of classes). The fuzzy kappa approach uses cells in the neighborhood to compute the equivalent of the traditional kappa statistic, but is instead based on fuzzy set theory [
58,
67,
68]. One of three functions, which are exponential, linear, or constant, can be selected to determine the extent to which the neighboring cells influence the fuzzy representation [
66]. Exponential decay (default setting) was chosen in the present research.
Finally, fraction correct and Cramer’s V statistics were calculated. Fraction correct [
68] is the number of equal cells divided by the total amount of cells. As an overall measure of similarity, the fraction correct method is considered a flawed estimate because it tends to consider maps with few or unevenly distributed categories as more similar than those with many, more equally distributed categories. Therefore, Cramer’s V statistics were chosen to determine the strength of association between variables. Cramer’s V statistics calculates the correlations in tables that have more than two rows and two columns. Values close to 0 denote a weak association between variables, while values close to 1 indicate a strong association.