1. Introduction
The transition to digital agriculture can be achieved by addressing the challenges associated with the automated recognition of crops growing in a given area [
1]. In recent years, research at both the national and global scale has focused on creating regional cropland masks of different regions, as well as masks of individual crops [
2,
3]. Machine learning (ML) methods are actively used to solve these problems. Typically, input data for these ML models are seasonal time series of various indices (especially optical) that are obtained after processing images. This is because time series usually provide higher accuracy in mapping than an index calculated from a single image [
4,
5,
6].
Typically, to identify agricultural crops, time series of optical vegetation indices (VI) and traditional ML classifiers, such as decision trees, Support Vector Machines (SVM), and Random Forest (RF) algorithms are used. An example of this is a study conducted in the Tashkent province, Uzbekistan, in which researchers classified several crops (cotton, wheat, rice, other crops) using a series of monthly Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), and Normalized Difference Water Index (NDWI1 and NDWI2) composites, based on Landsat-8 and Sentinel-2 imagery [
7]. The highest accuracy was 90% (Kappa = 0.88) when using EVI composites derived from Landsat-8 data. Hu et al. performed retrospective restoration of crop rotation using the RF algorithm [
8]. They trained the classifier on percentile monthly Sentinel-2 composites in 2020 to map cropland in 2019. The possibility of successful classification 2 months before the end of the growing season was shown. The overall accuracy of retrospective forecasting was 92%. In this study, monthly bands and VI time series were used. The use of a large number of correlating parameters (in the model [
8], 90 parameters were included) in the classification model requires a large number of calculations and may be excessive.
A serious disadvantage of optical data used in the discussed works is the strong dependence of satellite data quality on weather conditions. Clouds, shadows from clouds, and the presence of aerosols in the atmosphere may significantly reduce the potential of optical sensors. A large number of cloudy images leads to substantial gaps in VI time series built using optical imagery.
Some researchers have used the approximation of the seasonal course of VI to bridge gaps in remote sensing data. Polynomials, Gauss function, and the double logistic function [
9,
10,
11] are most often used as approximate functions. In Uruguay, scientists developed a soybean growth model and predicted the NDVI maximum using polynomials and the double logistic function for fields with an area of more than 250 hectares [
12]. In the Samara region of Russia, a group of researchers [
13] approximated the seasonal course of NDVI using a piecewise function, Fourier series, and cubic splines.
SAR (Synthetic Aperture Radar) data, independent of lighting and weather conditions, is a reliable alternative for long-term monitoring of crop rotation. Sentinel-1A/B satellites sensors allow the observation of a significant part of the Earth and receive double polarization data every 12 days (every 6 days when using both satellites). This is sufficient to build continuous time series with a sufficiently large number of observations for the season. Therefore, radar data obtained from Sentinel-1 has recently been actively used for crop mapping and the identification of abandoned cropland. For instance, Qiu et al. [
14] used time series of VV polarization to detect woody abandoned cropland in China. However, to determine grass fallows, single polarization was not enough, and the authors additionally used optical data. In the majority of studies, scientists have used either the VH or VV polarizations, or even their ratio (VH/VV). For example, Song et al. [
15] determined the proportion of pixels of soybean and corn (three classes per crop: small, middle, and large) in experimental polygons throughout the United States. Sentinel-1 VV, VH, and VV/VH time series for one agricultural season were used. The use of three predictors in a decision tree classifier achieved an accuracy greater than 94% for both crops. Arias et al. [
16] attempted to identify 14 crops based on Sentinel-1 radar data (three sets of polarizations) at the level of the province of Navarra in Spain, as well as that of individual municipalities. The accuracy of the classification using Sentinel-1 did not exceed 70%. Attempts have also been made to merge radar and optical data for crop mapping [
17,
18], where Landsat-8 (or Sentinel-2) and Sentinel-1 imagery were used as initial data and the RF algorithm was used as a classification method. In both of these studies, significant sparsity in the data was observed, and linear interpolation greatly decreased the accuracy of the classification.
Time series for several years make it possible to observe crop rotation. For example, a group of scientists [
19] used the Dynamic Time Warping method and linear interpolation to equalize data for two years when classifying nine crops in China based on the combination of polarizations.
Despite the frequent and successful use of individual Sentinel-1 polarizations for crop mapping, the use of radar VIs (by analogy with well-known optical VIs) appears to be more promising because they are constructed for monitoring vegetation. The Radar Vegetation Index (RVI) is the most common radar index [
20]. Evidence suggests that this index is highly sensitive to the dynamics of plant growth; therefore, it can likely be used to assess crop phenology and the characteristics of vegetation and perform classification [
17,
21]. Bickensdorfer et al. [
22] classified croplands in Germany using RVI, VV, VH and the ratio of VV/VH monthly composites. Classification accuracy while using four time series (12 values) was 65%. To increase the accuracy to 80%, the authors had to additionally calculate the time series of several vegetation indices based on Sentinel-1 and Sentinel-2 data, resolve inconsistencies in the spatial resolution, restore data, and add temperature, precipitation, and humidity variables to the model. Gella et al. [
23] used Dual Polarization SAR Vegetation Index (DPSVI) and Modified Radar Vegetation Index (MRVI) time series to identify eight agricultural crops in the Netherlands. The use of such VIs has been found to improve overall accuracy.
The Dual-polarization Radar Vegetation Index (DpRVI) is based on the transformations of complex polarimetric radar Level-1 Single Look Complex (SLC) data [
24]. Processing is carried out by applying one of the polarimetric decomposition methods [
25,
26,
27]. At the same time, a comprehensive polarimetric response of the signal from the object to the components, which characterizes the contribution of a particular radar signal (single, double, or volumetric), is obtained. Information about scattering is characterized by the degree of polarization and the measure of the dominant scattering mechanism. Based on these calculated indicators, as shown by [
24], DpRVI becomes more sensitive to crop growth and is used as a relatively simple and physically interpretable vegetation descriptor.
Croplands in different regions have significant differences in their spectral characteristics, which leads to inaccurate forecasts even if the classifier is well-trained [
28]. In particular, a number of researchers, including those from Russia, have faced the problem of incorrect results from the model when it was transferred to regions where crops had a different phenology and the temporal spectral profile of the crops was changing [
29,
30].
In recent years, the challenges associated with creating a cropland mask have been addressed at the Space Research Institute in Russia. Land classification is carried out using NDVI composites on cropland (spring and winter crops separately), forest, grassy vegetation, swamps, and tundra. Miklashevich et al. [
30] achieved high accuracy in land mapping for the European regions of Russia. The disadvantage of large-scale maps is the significantly uneven quality of the classification between regions (in particular, Miklashevich et al. [
30] noted an increase in error in the Russian Far East). It can also be noted that although most of these maps are aimed at determining the land types and the creation of a cropland mask, they do not provide any information about crop rotation on the allocated lands. Moreover, since different regions are characterized by different sets of agricultural crops and the variety of climatic conditions leads to variation in phenology, the task of separating crops based on VI can be extremely difficult even if the classifier was well performed in one region. Therefore, the most productive approach is the development of accurate local maps indicating the growing crops within the region or in a few regions characterized by similar climatic conditions and agricultural practices.
The main crop in the Russian Far East is soybean. Soybean occupies 60–90% of the cropland area in various municipalities [
31]. It should be noted that the increase in soybean acreage is not as much due to the commissioning of new lands to crop rotation but due to repeated sowing. This is unacceptable in terms of the recommended crop rotations, where its specific share should not exceed 50%. The placement of soybean in crop rotation patterns contributes to an increase in yield by an average of 36.4% compared to that in the case of permanent cultivation. However, repeated sowing of this crop in crop rotations reduces its yield by 27.2% [
32]. To ensure the optimal use of the cropland, soybean is rotated with grain crops, including oat [
33,
34].
However, there are a lot of abandoned croplands in the Russian Far East. Currently, the total fallow area in Russia is up to 40 million ha. According to the data obtained from the agricultural census conducted in 2016, nearly 17.7 million ha of fallow land are owned by agricultural enterprises yet remain unclaimed by them. According to the Ministry of Agriculture of the Russian Federation, this area was approximately 15.3 million ha in 2019 [
35]. Such lands include both bushy and forested land, as well as swampy fallow land. Defining fallow lands and introducing them into the crop rotation system are among the most important tasks for the agricultural sector in the country.
Thus, the main objective of this study is to identify the main crop of the region (soybean) used in crop rotation, as well as oat and fallow lands, by developing cropland classifiers for the Khabarovsk District based on different DpRVI time series and ML methods. Further, it aims to assess the possibility of using best classifiers for other municipalities in the Russian Far East, and compare the differences in the accuracy of determining these classes across these regions.
2. Materials and Methods
2.1. Study Areas
The municipal districts of the southern part of the Russian Far East in the Amur River basin were chosen as the study areas. Study area 1 (SA1) is the cropland of the Khabarovskiy District, situated on the right bank of the Amur River in the vicinity of the city of Khabarovsk. It is located between 48.31° N and 48.64° N and between 134.81° E and 135.57° E of the Greenwich meridian (
Figure 1). It is characterized by favorable climate and soil conditions, which makes it possible to grow grain and leguminous crops.
Three districts (Arkharinskiy, Ivanovskiy, and Oktyabrskiy) are located on the territory of the Zeya-Bureinskaya Plain (Amur Region). About 75% of the territory of the Zeya-Bureya Plain of the Amur Region has been converted into agricultural landscapes. The total area of cropland in the territory of the Zeya-Bureya plain is 1.32 million ha. Cropland is the most important type of agricultural land and the main wealth of the region. Study area 2 (SA2) is located between 49.28° N and 50.61° N, 127.76° E and 130.05° E of the Greenwich meridian (
Figure 2). Relatively high temperatures and the length of the vegetation season make it possible to grow good and stable yields of cereals, industrial, and other crops [
36].
The study areas are characterized by a monsoonal climate with cold snowy winters (the average January temperature varies from −25 °C in study area [SA] 2 to −20 °C in SA1) and humid warm summers (the average July temperature in the study areas exceeds 21 °C). The average annual rainfall is at the level of 600–700 mm per year.
The main agricultural crop in the southern part of the Russian Far East is soybean; in 2021, soybean occupied 55% (9381 ha) in the Khabarovskiy District, 79% (28,596 hectares) in the Arkharinskiy District, 76% (82,425 ha) in the Ivanovskiy District, and 79% in the Oktyabrsky district (78,899 ha) of the total area of arable land. The land occupied by oats is 7% (1232 ha), 5% (1851 ha), less than 1% (865 ha), and just over 1% (1380 ha) in the Khabarovskiy District, Arkharinsky District, Ivanovsky District, and Oktyabrskiy District, respectively.
2.2. Data
2.2.1. DpRVI Calculation
To calculate DpRVI, Sentinel-1 Level-1 SLC images were acquired for SA1 from one track (one scene with relative orbit number 90), but for SA2 from two tracks (two scenes with relative orbit number 105 and one scene with relative orbit number 134), as shown in
Table 1. The data were obtained from the Alaska Satellite Facility Distributed Active Archive Center (
https://search.asf.alaska.edu/ (accessed on 14 January 2023), which contains modified Copernicus Sentinel data from 2015, processed by the European Space Agency. The
Geopandas package was used for raster data handling. Two scenes with track number 105 between 2 May and 29 October 2021 were selected. The scene with track number 134 was acquired between 4 May and 31 October 2021, and the scene with track number 90 was acquired between 1 May and 28 October 2021. These data were used in the processing Terrain Observation with Progressive Scans SAR (TOPSAR) mode Interferometric Wide (IW) swath products. The swath length is 250 km and the spatial resolution is 5 × 20 m in this configuration. Swath in the IW mode is divided into three sub-swaths (IW1, IW2, and IW3). Each sub-swath consisted of 9 bursts in the azimuth direction.
SA1 is within one sub-swath of scene 90. SA2 is located in three zones, marked with red rectangles with a dash-dotted line in
Figure 3. Fields that are in the western part are included in the zone of intersection of two scenes from the Sentinel-1B satellite, while in the eastern part they are concentrated in one scene. Thus, pre-processing of the radar image time series data was carried out using two different workflows, with standard correction steps as follows [
37]: (a) assembly of two slices (relative orbit number 105 of IW1); (b) single sub-swath processing (relative orbit number 134 of IW3).
All workflows were processed using the Graph Processing Framework (GPF) on the ESA Sentinel Application Platform (SNAP) v.8.0 (
http://step.esa.int/main/ (accessed on 14 January 2023)). This allows the user to create graphs and set up chains for batch processing, as shown in
Figure 4. Pairs of SAR images for the same date were fed as input.
In Module 1, the operator S1 TOPS Split was used to select sub-swaths and bursts depending on the location of the study area. Slice Assembly combined parts of different scenes into a single product. Further, the precise orbit state vector files, which contained accurate information about the satellite position and speed, were downloaded and applied. Then, radiometric calibration was performed, during which it was important to keep the data in a complex-valued format because complex values were needed to calculate the covariance matrix for dual-polarization C2 in the subsequent steps. Next, in Module 2, the S-1 TOPS Deburst operator was applied to produce a single SLC image. These image subsets were multi-looked with factors of 3 and 1 in the range and azimuth directions, respectively, to generate a square pixel. The spatial resolution of the output was 10 m. Then, for each date for the entire data stack, a dual-polarization 2 × 2 covariance matrix C2 was formed, as shown in Equation (1):
where * indicates complex conjugation and 〈〉 indicates the spatial average over a moving window.
All elements of the matrix
C2 were additionally subjected to the noise mitigation procedure using the refined Lee adaptive filter [
38]. The elements of the matrix were complex values containing all of the information about the polarimetric scattering properties. The polarimetric decomposition technique was then used to obtain the parameters of the state of polarization of an electromagnetic wave, which was characterized by the degree of polarization
m [
24,
39] and the measure of the dominant scattering mechanism
β of the reflecting target as follows:
where
Tr represents the sum of the diagonal elements of the matrix, || represents the determinant of a matrix,
m is the degree of polarization (0 ≤
m ≤ 1), which is defined as the ratio of the average intensity of the polarized portion of the wave to the average total intensity of the wave, and
β refers to the measure of the dominant scattering mechanism, which is determined based on the decomposition of the matrix
C2 into two non-negative eigenvalues (
λ1 ≥
λ2 ≥ 0), as follows:
Further, the DpRVI index was calculated for each date using Equation (4):
In Module 3, all Sentinel-1 images were co-registered as a stack using the Sentinel-1 Back Geocoding operator and subsetted based on the area of interest. Following this, SAR stack data were geocoded to a WGS84/UTM zone 52N projected coordinate system using the Range Doppler Terrain correction operator to yield a DpRVI time-series product in GeoTiff format.
For single sub-swath processing (e.g., IW3), Module 1 was modified as shown in
Figure 5. Since all studied fields were in one IW3 for track 134, Modules 2 and 3 remained the same as for the first workflow.
It must be noted that when several sub-swaths (e.g., IW1 and IW2) are processed, Module 2 can be modified, as shown in
Figure 6. The S-1 Back Geocoding and S-1 TOPS Merge operators can be added to merge separated products of different sub-swaths into a single image.
2.2.2. Ground Reference Data
Labeled DpRVI data seasonal series for pixels of agricultural fields of the Khabarovskiy District were divided into training data to build a model and test dataset to evaluate the quality of the model. Information on crop rotation, obtained from ground-based observations, was additionally verified by expert visual interpretation of satellite images. Overall, pixels from 80 fields were divided into three classes: soybean, oat, and fallow. Pixels of 40 fields were used to build a model, pixels of another 40 sites were used to evaluate model accuracy.
Table 2 provides information on the number of fields and areas occupied by the different crops. It can be seen that half of the fields (20) from the training sample in 2021 were occupied by oat; however, soybean was planted in more large fields, and thus the total area covered by soybean exceeded the area covered by oat. The share of fallow land in the studied fields of the Khabarovskiy District exceeded 35%.
Labeled DpRVI time series for pixels of agricultural fields in the districts of the Amur Region were used to evaluate the quality of the model when transferred to a neighboring region. To form a test sample, we selected abandoned fields, as well as fields with soybean and oat, in each of the three districts of the Amur Region: Arkharinskiy, Ivanovskiy, and Oktyabrskiy. To mark up the test data, we used information on crop rotation for the fields from the Unified Federal Information System of Agricultural Lands (
https://efis.mcx.ru/landing (accessed on 26 December 2022)). During the preliminary visual interpretation of satellite images, we discovered that some of the fields were in the flood zone, while for other fields, information about the growing crop was deemed unreliable.
Overall, 29 fields were used to form an Amur region sample.
Table 3 provides information on the number of fields and areas of land included in this sample.
2.3. Datasets
As shown in
Table 1, there was no Sentinel-1 image for track 105 on 18 August 2021 and for track 134 on 3 July 2021. DpRVI time series were approximated using a cubic polynomial to restore gaps. The dates of observations converted to DOY (day of the year) format were considered an independent variable. DpRVI data values were used as the dependent variable. The polynomial coefficients were determined for each time series and then the function values were calculated at the points of interest: 230 (18 August) for track 105 and 184 (3 July) for track 134. Thus, after data recovery, the time series included 16 observations.
Despite the fact that DpRVI data are practically independent of atmospheric phenomena, anomalous values of this index can be observed in some parts of the fields. To minimize the influence of such noises, “2σ-filtering” was performed. For each field and for each date, the mean value and standard deviation (σ) were calculated for DpRVI. Pixels with more than half of the values of the time series that did not lie in the 2σ-interval were considered anomalous and were not included in the training set.
After removing anomalous pixels from the sample, a training dataset was formed, which consisted of 71,153 DpRVI time series. The test dataset for Khabarovsk District was not filtered, and its size was 30,550 time series for DpRVI. Then, DpRVI time series for Amur region fields were calculated and the Amur region dataset was formed. The Amur region set was also not filtered and included 203,120 time series. The distribution of time series between classes in both samples is presented in
Table 4.
2.4. ML Methods
Classification was performed using the Python scikit-learn package. This package contains different machine learning approaches for supervised classification, such as tree algorithms, discriminant analysis, Bayes classifiers, nearest-neighbor techniques, ensemble classifiers, and neural networks. This package contains out-of-the-box implementations and does not require thorough tuning. We selected three machine learning algorithms for cropland classification: SVM, RF, and Quadratic Discriminant Analysis (QDA). The GridSearchCV method (scikit-learn) was applied to optimize SVM and RF hyperparameters during cross-validation. Random state controlled the pseudo random number generation for shuffling the data and reproducible output across multiple function calls.
2.4.1. Support Vector Machines
SVM is a set of supervised learning methods used for classification, regression, and outlier detection. The advantages of SVMs include effectiveness in high dimensional spaces, memory efficiency (they use a subset of training points in the decision function, which are called support vectors), and versatility (different kernel functions can be specified for the decision function). The objective of the SVM is to find a hyperplane with the maximum margin (the maximum distance between data points of classes) in an N-dimensional space (N represents the number of features). LinearSVC is a fast implementation of Support Vector Classification in
scikit-learn for the case of a linear kernel. This classification method is widely used in remote sensing for crop type mapping [
40,
41,
42].
2.4.2. Random Forest
RF [
43] is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and makes final decisions using majority votes to improve the predictive accuracy and control over-fitting. The features are always randomly permuted at each split. Therefore, the best split may vary even with the same training data. To obtain deterministic behavior during fitting, a random state has been fixed. Evidence suggests that RF is effective for crop mapping [
18,
22,
44]. Moreover, it is known for performing particularly well and efficiently on large input datasets with many different features. Another advantage of RF is its high accuracy and robustness to outliers and noise [
40].
2.4.3. Quadratic Discriminant Analysis
QDA is the classic classifier, with a quadratic decision surface. It is attractive because it has closed-form solutions that can be computed easily, is inherently multiclass, has been proven to work well in practice [
45], and has no hyperparameters to tune. QDA can be derived from simple probabilistic models that model the conditional distribution of the data for each class [
46]. More specifically, QDA models data distribution as a multivariate Gaussian distribution. To train a classifier, the fitting function estimates the parameters of a Gaussian distribution for each class.
2.5. Classification Performance Evaluation
The accuracy metrics for each class, namely, overall accuracy (
OA) and F1 metrics, were calculated based on a confusion matrix. A confusion matrix is a square matrix of dimensions
n × n (where
n is the number of classes). In the confusion matrix, each row represents an actual class, and each column represents a predicted class. Each element of the matrix
Xij denotes the number of time series of class
i that fall into class
j during classification. The diagonal elements of the matrix reflect the number of correctly classified time series. The
OA is the ratio of the sum of diagonal elements to the sum of all elements of the error matrix
N (expressed as a percentage):
The confusion matrix presents both the number of time series in class
i (
i ∈ [1,
n]) erroneously assigned to other classes (false negative time series, FN) and the number of pixels erroneously assigned to class
i (false positive time series, FP) by the classifier. The F-measure for class
i is a harmonic estimate of the FN and FP quantities:
To assess the model’s performance, OA and F1 metrics were calculated for different ML algorithms (RF, SVM, QDA). The accuracy metrics , , , were evaluated. The best-performing method was selected based on the analysis of the accuracy metrics. This method was then used in the classification of land in the Amur Region. To assess the performance of the classifier in the neighboring region, the metrics , , , were calculated.
To assess the efficiency of the classifier, it is also important to determine the number and proportion of fields in which crops were identified correctly. The number of time series assigned by the classifier to a particular class was counted to identify the crops in each field. Each field was assigned the label of the class with the greatest number of time series. We calculated the number and proportion of fields with correctly identified crops to assess classification accuracy at the field level.