The Introduction discusses the importance of Sentinel-1 time series data for crop mapping and relates our work to the state of the art in this context. It is divided into two sub-sections. The first discusses the background and motivation and states the objectives of the study. The second reviews examples of earlier studies that used synthetic aperture radar (SAR) data in crop classification.
1.1. Background and Motivation
The Land Parcel Identification System (LPIS) of the European Commission is used for registration of agricultural reference parcels considered eligible for annual payments of European Common Agricultural Policy (CAP) subsidies to farmers [1]. LPIS and the subsidies presume a system for controlling the quality of the management of each parcel. This control has been carried out through field visits and aerial photographs, and both methods are quite expensive. The European Union is therefore interested in more cost-efficient ways of controlling the management of crop fields, e.g., the utilization of space-borne remote sensing data. Reliable image acquisition is important in operative applications. Clouds and haze often prevent the acquisition of usable optical images, to the extent that in most European countries not even a single cloud-free Sentinel-2 image can be acquired for every region during the growing season. Space-borne SAR data are thus the only remote sensing data source suitable for rapid and near real-time assessment of crop fields during the growing season in European countries. The Sentinel-1 SAR satellites of the Copernicus programme, which are capable of providing repeated acquisitions every six days (with two satellites), have established the potential for continuous monitoring of crops.
One key prerequisite for an operational method is the ability to recognize the management operations and species early enough during the growing season. This requirement arises from the need to make decisions and provide the subsidy payments early enough. It is also important to know the uncertainty of the species prediction at the level of an individual parcel. An operational system may still need several auxiliary information sources, e.g., methods using data acquired by space-borne or airborne sensors, and even field visits. The uncertainty of the species prediction for an individual field parcel is therefore needed to decide whether other information is required in addition to the predictions based on satellite observations.
To date, several approaches to using space-borne SAR data to monitor agricultural regimes, e.g., for detecting ploughing, mowing, or sowing activities, have been reported in the literature (see Section 1.2). Studies in crop classification with fully polarimetric C-band SAR and optical data have also been reported (Section 1.2). Many studies concentrated on tree orchards, grapes, or sugarcane. A few studies exist with detailed corn species recognition, particularly using Sentinel-1 data. Section 1.2 reviews the aforementioned literature.
The objectives of this research are (1) to develop a method for recognition of individual crop species, or species groups, and management regimes; (2) to investigate the earliest time point in the growing season at which the species predictions become satisfactory; and (3) to present a method to assess the uncertainty of the crop species prediction at the level of an individual parcel. The improved k-nearest neighbour method (ik-NN), with feature optimization using a genetic algorithm, was modified for estimation [2,3]. The uncertainty assessment of the parcel-level predictions was based on the crop species probabilities that are outputs of the ik-NN method. A parametric method to estimate the confidence intervals of the largest probabilities was developed. Multinomial logistic regression was employed as an optional method for comparison with the ik-NN method. The methods and results were demonstrated on a test site located in the municipality of Eura in Southern Finland. The number of crop field parcels was 10,287, with a total study area of 24,503.5 hectares.
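To illustrate how a nearest-neighbour classifier can yield per-parcel class probabilities, the following is a minimal sketch of a plain distance-weighted k-NN. It is not the ik-NN method used in this study: the inverse-distance weighting, the function name `knn_class_probabilities`, the optional per-feature `weights` (a stand-in for weights that an optimizer such as a genetic algorithm might supply), and all data values are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def knn_class_probabilities(query, train_features, train_labels, k=3, weights=None):
    """Return {class: probability} for one parcel from its k nearest neighbours.

    Assumption: neighbours vote with weight 1 / distance, and the votes are
    normalized to sum to one. `weights` are hypothetical per-feature weights.
    """
    if weights is None:
        weights = [1.0] * len(query)

    # Weighted Euclidean distance from the query parcel to every training parcel.
    distances = []
    for features, label in zip(train_features, train_labels):
        d = math.sqrt(sum(w * (q - f) ** 2 for w, q, f in zip(weights, query, features)))
        distances.append((d, label))

    # Keep the k closest training parcels.
    distances.sort(key=lambda pair: pair[0])
    neighbours = distances[:k]

    # Inverse-distance votes; the small constant avoids division by zero.
    votes = defaultdict(float)
    for d, label in neighbours:
        votes[label] += 1.0 / (d + 1e-9)

    total = sum(votes.values())
    return {label: vote / total for label, vote in votes.items()}


# Hypothetical parcel features (e.g., mean VV and VH backscatter in dB).
train_features = [[-12.0, -18.0], [-11.5, -17.5], [-8.0, -14.0], [-8.5, -14.5]]
train_labels = ["barley", "barley", "oats", "oats"]

probabilities = knn_class_probabilities([-11.8, -17.9], train_features, train_labels, k=3)
most_likely = max(probabilities, key=probabilities.get)
```

The largest probability in such an output is the quantity whose uncertainty one would want to characterize at parcel level; the sketch does not attempt the confidence interval estimation itself.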
1.2. SAR Data in Crop Classification
SAR technology offers many advantages for crop monitoring, given that radio waves are generally unaffected by the presence of clouds and haze. The availability of multiple SAR instruments increases the opportunity to build temporally rich data sets in the early weeks of crop growth. Due to the size of the crops, C-band is considered to be the wavelength most suitable for crop monitoring applications with various SAR techniques [4,5,6,7].
Multitemporal SAR data sets can be used to exploit changes in crop structure as crops transition from one growth stage to another, and thus to separate one crop type from another. In particular, SAR backscatter at C-band is very sensitive to changes in structure during seed and fruit development, stages which occur later in the growing season [8]. This means that high-quality crop inventories can be readily delivered close to the end of the growing season. For mono-temporal SAR-based classification of crops, it is recommended that SAR image acquisition be planned during the seed and reproductive phenology phases in order to maximize classification accuracy [8,9]. With multitemporal SAR approaches, it remains to be seen how early in the growing season crop inventories can be produced, and what accuracies can be achieved as the growing season develops.
To date, the majority of research on cropland classification has been done using multiparametric SAR data [10]. This includes mostly polarimetric and multitemporal SAR data [8,11,12,13,14,15,16,17,18,19,20,21,22], as well as multi-frequency SAR and fusion of satellite optical and SAR data [7,23]. In addition, the potential of interferometric SAR approaches has been evaluated along with SAR backscatter data in crop monitoring [6,24,25]. While SAR data acquired at X-band and, to a lesser extent, L-band [26] can often still be used in crop monitoring and change detection, satellite C-band SAR data are assumed to be the preferred choice for crop mapping, particularly for agricultural areas with low vegetation [8]. There is a trade-off between polarimetric information and multitemporal information; however, results obtained using multitemporal information tend to be better [5]. On the other hand, when only a few acquisitions are available, the polarimetric mode may perform better than the single- and dual-polarization modes.
In Loosvelt et al. [15], the Random Forests approach was used for the probabilistic mapping of vegetation using fully polarimetric L- and C-band EMISAR data, in order to assess and analyze classification uncertainty based on the local probabilities of class membership. The results showed that using multiple configurations in the dataset decreases the classification uncertainty for the different agricultural crops when compared to the single-configuration alternative. Furthermore, the uncertainty assessment revealed lower confidence for the classification of (mixed) pixels at the field edges.
Currently, the Sentinel-1 satellites of the European Space Agency (ESA) are the key resource for supplying freely available SAR data. The data are not fully polarimetric, but scenes are acquired every 6–12 days (depending on the geographical region) and can continuously cover the whole growing season. This enables monitoring at different times and gives access to the whole phenological cycle of the crops. It also brings forward the temporal variation of backscatter for different types of crops, which can be useful in differentiating between crops.
Most recently, owing to the freely available ESA Sentinel-1 C-band SAR data, many research efforts have concentrated on crop classification with advanced methods using Sentinel-1 time series [21,23,26,27,28,29]. Likewise, we use SAR data, and we review several of these classification approaches in more detail.
In [26], repeat-pass Sentinel-1 data were used over North Dakota to classify individual agricultural land-cover types. In this approach, the time series forms the basis of a classification algorithm in which individual pixels are compared against a model of average crop backscatter response and classified as the crop with the least difference from the model.
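The idea of classifying a pixel by its least difference from an average temporal backscatter model can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of [26]: the difference measure (sum of squared differences), the function names, the crop names, and all backscatter values are hypothetical.

```python
def mean_profile(series_list):
    """Average several equally long backscatter time series date by date."""
    n = len(series_list)
    return [sum(values) / n for values in zip(*series_list)]


def build_crop_models(training_series):
    """Build one average temporal profile per crop from labelled training series."""
    return {crop: mean_profile(series) for crop, series in training_series.items()}


def classify_pixel(pixel_series, crop_models):
    """Assign the crop whose average profile differs least from the pixel series.

    Assumption: "difference" is measured as the sum of squared differences
    over all acquisition dates.
    """
    def squared_difference(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(crop_models, key=lambda crop: squared_difference(pixel_series, crop_models[crop]))


# Hypothetical backscatter values in dB at four acquisition dates.
training_series = {
    "wheat": [[-20.0, -17.0, -15.0, -18.0], [-19.0, -16.0, -14.0, -17.0]],
    "soybean": [[-21.0, -20.0, -16.0, -13.0], [-22.0, -19.0, -15.0, -12.0]],
}

crop_models = build_crop_models(training_series)
predicted = classify_pixel([-19.5, -16.8, -14.2, -17.6], crop_models)
```

With more acquisition dates, the profiles simply become longer lists; the comparison itself is unchanged.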
Similarly, in [21], temporal intensity models were used in a K-means clustering approach for crop classification (4 to 5 types of crops including corn, soybean, rice, peanut, lotus, and grass), reaching accuracies on the order of 75 to 90% for different crops. Xu et al. [21] constructed the temporal intensity models from Sentinel-1 time-series data. The introduced spectral similarity value (SSV) measure seemed to work better than the decision tree and the Bayesian classifier methods. The training data were produced through visual interpretation. The numbers of species groups were five and four in the two study areas. Overall accuracies were as high as 90–92%. The average sizes of the field parcels were not given.
Analysis of Sentinel-1 time series along with optical Sentinel-2 data in [23] has shown that SAR backscatter and NDVI (normalized difference vegetation index) may be complementary for agricultural applications. In particular, the VH-to-VV ratio at C-band was shown to be a good discriminator and notably suitable for crop applications. Veloso et al. [23] analyzed and interpreted the temporal trajectory of remote sensing data for a variety of winter and summer crops that are widely cultivated in the world (wheat, rapeseed, maize, soybean and sunflower). Sentinel-1 data were used, and their temporal variation was compared with that of NDVI derived from Sentinel-2-type optical data. The performance of different features in assessing the phenological stages of the crops was analyzed. A further purpose was to investigate the possibilities of estimating physical parameters, such as fresh biomass and green area index (GAI). The authors concluded that the dense time series made it possible to capture short phenological stages and to describe various crop developments. They also concluded that a better understanding of SAR backscatter and NDVI temporal behaviors under contrasting agricultural practices and environmental conditions would help upcoming studies related to crop monitoring based on Sentinel-1 and Sentinel-2, such as dynamic crop mapping and biophysical parameter estimation.
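For readers less familiar with the two features mentioned above, the following short sketch shows how the VH-to-VV ratio and NDVI are commonly computed. The formulas are the standard definitions; the function names and input values are hypothetical and are not taken from [23].

```python
def db_to_linear(value_db):
    """Convert a backscatter value from decibels to linear power units."""
    return 10.0 ** (value_db / 10.0)


def vh_vv_ratio_db(vh_db, vv_db):
    """VH-to-VV ratio expressed in dB.

    A ratio of linear powers becomes a difference once both are in decibels.
    """
    return vh_db - vv_db


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    if nir + red == 0:
        raise ValueError("NIR and red reflectance must not sum to zero")
    return (nir - red) / (nir + red)


# Hypothetical values: VH and VV backscatter in dB, reflectances as fractions.
ratio_db = vh_vv_ratio_db(-18.0, -11.0)
ratio_linear = db_to_linear(-18.0) / db_to_linear(-11.0)
vegetation_index = ndvi(0.5, 0.1)
```

Computed per acquisition date, both quantities give a time series for each parcel, which is the form in which their temporal behaviors can be compared.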
In earlier work [27], a set of five Sentinel-1 scenes was used along with one multispectral Sentinel-2 image to classify six crop types (beans, beetroot, grass, maize, potato, and winter wheat). To assess the potential of crop classification with common off-the-shelf supervised learning models, a benchmark of four different approaches (kernel-based extreme learning machine, multilayer feedforward neural networks, random forests, and support vector machine) was implemented. The first approach performed best, with a very high overall accuracy of 96.8%. Evaluation of the sensitivity of the classification models and of the relative importance of data types using data-based sensitivity analysis showed that the two most important explanatory variables were the VV-polarization channel of one of the Sentinel-1 scenes and band 4 of the Sentinel-2 scene, indicating the complementarity of optical and SAR data in crop classification.
In yet another methodological study on crop-type mapping using a sequence of Sentinel-1 images [28], dynamic conditional random fields were shown to effectively capture spatio-temporal phenological information for various crops. This kind of variation appears inherent in the images and can be used for crop classification purposes. Not surprisingly, the final classification performance was higher for the multitemporal stack than for any of the separate scenes. The suggested approach was also shown to perform better than the conventional maximum likelihood approach with multitemporal images aggregated as composite bands.
Several conventional machine learning approaches were compared with deep learning approaches in terms of crop classification performance in [30]. Ndikumana et al. [30] demonstrated the performance of the traditional k-NN, random forest (RF) and support vector machine (SVM) methods in a test site in France with eleven species categories and 25 Sentinel-1 scenes, ranging from May 2017 to September 2017. The two most dominant categories, rice and wheat, comprised more than 50% of the area of the test site. The average parcel area of rice was 3.3 hectares and that of wheat 1.7 hectares. The performance of the traditional methods was compared to that of two deep learning techniques, both deep recurrent neural network (RNN)-based classifiers. Interestingly, while the deep learning approaches proved superior, the difference in overall accuracy was modest: 86–87% with the more traditional methods and 89–90% with the deep learning models. The k-NN method and its optimized version ik-NN have some advantages over the other methods, which is why we selected ik-NN as the main method here [31]. These advantages are discussed in Section 2.3.
While there is a considerable amount of research on the use of Sentinel-1 time series in crop mapping, considerable gaps seem to remain. Most notably, the prediction uncertainty assessment is typically missing. Furthermore, the connection of classification performance to the specific stage of the phenological cycle of crops is barely established, although examples of earlier studies and references can be found, e.g., in Veloso et al. [23] and Song and Wang [32]; the latter deals with the phenology of a single species. In addition, the reported classification accuracy levels were not always satisfactory, particularly for a large number of crops, and an in-depth sensitivity analysis of the classification performance with respect to the number of crops, the size of the parcels, and the number of Sentinel-1 scenes employed in the experiment was lacking. Moreover, in the majority of the studies, no selection of the most suitable Sentinel-1 scenes was performed, and the study area was often very small. In this study, we address several of these issues.