1. Introduction
Even though the primary function of agriculture is and will be to produce food for the increasing human population with increasing standards of living, multifaceted targets are set for agriculture alongside food production and security. Agriculture needs to meet the “additional” requirements of sharing land for production of renewable materials for industries, and to tackle various environmental challenges that it is linked to: biodiversity loss, eutrophication, soil degradation and greenhouse gas emissions [
1]. Increasing food production by reducing yield gaps in a changing climate, as well as reducing markedly the environmental footprint of agriculture [
2] and also otherwise contributing to the Sustainable Development Goals of the United Nations [
3,
4] means that large-scale transformations and improvements are needed everywhere. Agriculture needs to be redesigned to become more productive, resource- and climate-smart, and environmentally, economically and socially sustainable [
5]. Multifunctional agriculture is an apt term for the multifaceted role that agriculture must now play to meet all societal requirements [
6]. It aims to produce food, provide benefits for the environment and reciprocally benefit from ecosystem services. Due to many apparent trade-offs in the target setting characterized as “increasing while decreasing”, holistic approaches are needed [
7,
8] as well as indicators of the current status and progress [
4,
9].
A yield gap that differs depending on the region [
10] provides an easy and comparable estimate of the regional yield potential. However, the variation in yield gaps is high within a country, region, farm and even field parcel depending on the management, soil and weather conditions. Northern European growing conditions are not only highly variable, but farms and fields are very heterogeneous. Ignoring differences in performance by making generalizations may cause stagnation in the yield due to the non-optimal use of resources (as in Finland) [
11,
12] and the excess use of resources, reminding us of the drawbacks of the Green Revolution [
7]. For this reason, the redesign of production systems needs to progress from the bottom up by acknowledging the variation existing at the field parcel scale. Concrete policy roadmaps are needed to support the national redesign of agricultural systems given the heterogeneity of conditions and their environmental impacts [
3].
With ever-expanding farm sizes, achieving joint multifaceted targets calls for precision agriculture [
13], but also many other types of decision support systems and tools [
14,
15,
16]. Land use optimization is a means of allocating land in a rational way to improve resource use efficiency and to link the reduction of the environmental footprint to economic and social sustainability. The Natural Resources Institute Finland (Luke) has developed a land use optimization tool, which is available to all Finnish farmers at Luke’s EconomyDoctor portal [
17]. Using the tool, field parcels with a high production capacity and valuable field parcel characteristics can be allocated for sustainable intensification, while underperforming parcels with poor physical characteristics can be allocated for extensification, or in the extreme case for afforestation if they have no future role in food security. Thereby, farmers have access to a land use action plan to sustain climate- and resource-smart production [
17] and to also follow other practices and principles of sustainable intensification [
5,
18].
The production capacity of each field parcel is critical for land allocation. However, unlike the physical field parcel characteristics, yield data are scattered and, when available, are reported at the farm rather than the field parcel scale and only for a limited number of field crops. Hence, comprehensive data on production capacity are needed with good spatial and temporal coverage. Open satellite data have enabled many new solutions for agriculture, and many of them support the large-scale transformation of agriculture: e.g., in the estimation of yields [
19,
20,
21] as well as the identification of crops [
16], crop conditions [
22,
23], pre-crop values for crop sequence [
24] and providing sets of sustainable intensification indicators [
25].
To support land use optimization for farmers, a method for the estimation of productivity gaps was developed on the field parcel scale, based on Sentinel-2 derived Normalized Difference Vegetation Index (NDVI) values [
17]. This study aimed to identify how farming system characteristics and farm and field parcel properties contribute to the risk of high NDVI-gaps. Farmers’ allocation of field parcels for different crops, rotations and cultivars was analyzed in order to understand the means used by farmers to cope with poor growth performance. The results may reveal a need for additional support (e.g., knowledge sharing) to assist farmers’ decision making, and the new understanding could be used to update or reprioritize policy incentives to achieve sustainability goals for high-latitude agriculture.
2. Materials and Methods
In this study, six original data files or databases were used as inputs to determine the drivers of low NDVI values, which were the outputs of the processes (
Figure 1). In the first phase, Sentinel-2 images with a cloud mask (less than 99% cloud cover) were used to derive Normalized Difference Vegetation Index (NDVI) time series for field parcels from May to August for the years 2016 and 2017. The Sentinel-2 data were processed automatically by utilizing the Earth Observation processing toolkit developed at the Finnish Geospatial Institute (FGI) (for more details see [
26]). Cloud masks and NDVI-images were calculated by FGI by using ESA’s Sentinel Application Platform python interface snappy3 [
17,
24]. The NDVI values were calculated as:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$$
where NIR is the near infrared wavelength (842 nm) and Red is the red wavelength (665 nm). NDVI values for clouds or for shadows of clouds are near zero. For typical fields, NDVI increases from 0.1–0.3 to 0.4–0.8 during the growing season, being highest at the end of July or at the beginning of August. Additional information on data processed by the FGI was published earlier [
24]. The field parcel scale data (shp-file) from the Finnish Food Agency were available for all fields in Finland, but only data from the southwest part of Finland were used (in total 181,108 field parcels). This is because the length of the growing season does not restrict crop choices in this part of Finland and because productivity data in the form of NDVI values were available for this region. Because the position of agricultural parcels within the field parcels was unknown, a field parcel was included if the largest agricultural parcel covered at least 70% of the area of the field parcel. Thereby, the combined data comprised a total of 120,174 field parcels in 2016 and 118,116 in 2017.
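As a minimal illustration of the NDVI definition, the following sketch computes the index from a pair of reflectance values. The function name and the example reflectances are hypothetical and are not taken from the FGI processing chain; in Sentinel-2 data the 842 nm and 665 nm wavelengths correspond to bands 8 and 4.

```python
from typing import Optional


def ndvi(nir: float, red: float) -> Optional[float]:
    """Return NDVI = (NIR - Red) / (NIR + Red), or None when the sum is zero."""
    total = nir + red
    if total == 0:
        return None  # index is undefined for a pixel with no signal
    return (nir - red) / total


# Hypothetical reflectances (assumed values, for illustration only).
dense_canopy = ndvi(nir=0.45, red=0.05)  # high NIR, low red: vigorous vegetation
sparse_cover = ndvi(nir=0.25, red=0.18)  # similar NIR and red: little vegetation
```

With these assumed inputs the dense canopy falls at the upper end of the 0.4–0.8 range mentioned above, while the sparse cover stays within 0.1–0.3.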
In the second phase (
Figure 1), the National Land Survey of Finland database, containing waterway width and classification for surface and irrigation water resources (lakes, rivers and ditches) in Finland [
27] was combined with the field parcel scale data from the Finnish Food Agency (shp-file). ArcGIS (v.10.2) software was used to calculate the distance to the nearest waterway (with 50, 100 and 300 m buffer zones). Results were categorized as follows: next to any waterway (lake, river or main ditch), and <50 m, 50–99 m, 100–299 m and ≥ 300 m apart. Additional information on data processed was published earlier [
28].
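The distance classes above can be sketched as a simple binning step. This is only an illustration: the distances in the study came from ArcGIS buffers, the function and label names are hypothetical, and treating "next to any waterway" as a separate flag is an assumption about how the categories were coded.

```python
from bisect import bisect_right

# Class boundaries in metres, taken from the categories listed in the text.
BOUNDS_M = [50, 100, 300]
LABELS = ["<50 m", "50-99 m", "100-299 m", ">=300 m"]


def waterway_class(distance_m: float, adjacent: bool = False) -> str:
    """Return the waterway-distance class for one field parcel."""
    if adjacent:
        # Assumption: parcels bordering a waterway form their own class.
        return "next to waterway"
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # bisect_right puts a value equal to a boundary into the higher class.
    return LABELS[bisect_right(BOUNDS_M, distance_m)]
```

The same pattern would apply to the other categorized variables (slope, field size, distance from the farm center) by swapping in their boundaries.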
In the third phase, a five-year crop rotation was defined for each field parcel using the crop cultivation database from the Finnish Food Agency. Six pre-defined rotations were identified: (1) cereal species monoculture, (2) cereal monoculture, (3) rotation with break-crop, (4) diverse crop rotation, (5) grassland rotation and (6) green-fallow rotation (
Figure 1). Additional information on data processed was published earlier [
29].
In the fourth phase, a risk of a low NDVI value was defined to occur when the NDVI value for the field was in the first tertile of the NDVI value distribution of the same crop within the same sub-area (
Figure 1). The study area was divided into four sub-areas, and the NDVI value of a crop in a field parcel was compared to the distribution of NDVI values for the same crop in field parcels within the same sub-area. These comparisons were made on three pre-selected dates between 1st July and 10th August. The dates were selected separately for each sub-area so that cloudiness disturbed the satellite signal as little as possible. For grasslands, three dates were selected between 10th May and 10th June. In Finland, the 1st cut is typically done between 15th and 25th June, and the NDVI values for grass are mutually comparable only before that.
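The tertile rule can be sketched as follows for a single observation date. The record keys, the sample values and the use of an inclusive quantile method are assumptions made for illustration; the study applied the comparison on three dates per sub-area, which this sketch does not reproduce.

```python
from collections import defaultdict
from statistics import quantiles


def flag_low_ndvi(records):
    """Map parcel id -> True if its NDVI lies in the lowest third
    of the values for the same crop within the same sub-area.

    records: iterable of dicts with hypothetical keys
    'parcel', 'crop', 'sub_area' and 'ndvi' (one date only).
    """
    records = list(records)
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["crop"], rec["sub_area"])].append(rec["ndvi"])

    # Boundary of the first tertile per (crop, sub-area) group;
    # groups with fewer than two values are skipped.
    cutoffs = {
        key: quantiles(values, n=3, method="inclusive")[0]
        for key, values in groups.items()
        if len(values) >= 2
    }

    flags = {}
    for rec in records:
        key = (rec["crop"], rec["sub_area"])
        if key in cutoffs:
            flags[rec["parcel"]] = rec["ndvi"] <= cutoffs[key]
    return flags


# Hypothetical parcels: the same NDVI of 0.60 is judged against different
# distributions in sub-areas 1 and 2.
records = [
    {"parcel": f"A{i}", "crop": "oats", "sub_area": 1, "ndvi": v}
    for i, v in enumerate([0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90])
] + [
    {"parcel": f"B{i}", "crop": "oats", "sub_area": 2, "ndvi": v}
    for i, v in enumerate([0.60, 0.70, 0.80])
]
flags = flag_low_ndvi(records)
```

Because the threshold is relative to the crop and sub-area, an NDVI of 0.60 is flagged as low in the hypothetical sub-area 2 but not in sub-area 1.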
In the fifth phase, data from all previous phases and three additional data sets were combined (
Figure 1). The Finnish soil database included the dominating soil type (coarse mineral soils such as
Haplic Podzol 1 and
2, clay soils such as
Vertic Cambisols, other clay soils such as
Eutric Cambisol, Gleyic Cambisol and
Gleysols, and organic soils such as
Fibric/Terric Histosol 1 and
2 and
Dystric Gleysol). Another dataset included the field slope (<1.3%, 1.3–2.89%, 2.9–6.99% and ≥ 7.0%) [
30]. The crop cultivation dataset from the Finnish Food Agency included seven additional variables (and their categorization): (1) the total field area of the farm (<30 ha, 30–59 ha, 60–99 ha and ≥ 100 ha); (2) the field size (<0.5 ha, 0.5–0.99 ha, 1.0–2.99 ha, 3.0–4.99 ha and ≥ 5.0 ha); (3) the distance from the farm center (<300 m, 300–599 m, 600–1199 m, 1200–2499 m, 2500–4999 m and ≥ 5000 m); (4) the field shape (<0.3, 0.3–0.49, 0.5–0.69 and ≥ 0.7); (5) the farm type (cattle, pig, poultry, sheep and horse, cereal, special crops and others); (6) field ownership (owned by the farmer vs. leased land); and (7) farming system (organic and conventional farming). Additional information on used data has been published [
29]. Finally, the breeding country of each cultivar (Finland or not) was obtained from the Value for Cultivation and Use (VCU) database (i.e., the Finnish official variety trial database).
In the sixth phase (
Figure 1), the allocation of crops after spring wheat (
Triticum aestivum L.), barley (
Hordeum vulgare L.) and oats (
Avena sativa L.) in crop sequence was analyzed using the Cochran-Mantel-Haenszel test (CMH) with SAS/FREQ software [
31]. Field parcels with spring wheat, barley or oats were separately divided into four equal sized groups according to their NDVI values in 2016. After that, the association between the allocation of crops in 2017 and the NDVI value in the preceding year (2016) was tested for each crop in 2017 against other available crops (a CMH-test for a 4 × 2 contingency table, with one degree of freedom). Crops that were grown in fewer than 30 field parcels were excluded.
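A test of this kind can be sketched for one 4 × 2 table as shown below. The sketch assumes a single stratum and integer scores 1 to 4 for the NDVI groups, in which case the one-degree-of-freedom statistic reduces to the Mantel-Haenszel chi-square (n − 1)r². It is not the SAS/FREQ implementation, and the counts are invented for illustration.

```python
import math


def mantel_haenszel_chi2(table, row_scores=None):
    """Mantel-Haenszel chi-square (1 df) for an R x 2 table of counts.

    table: rows are ordered NDVI groups; columns are
    [parcels with other crops, parcels with the focal crop].
    Returns (statistic, p_value).
    """
    if row_scores is None:
        row_scores = list(range(1, len(table) + 1))
    n = sum(sum(row) for row in table)
    mean_x = sum(s * sum(row) for s, row in zip(row_scores, table)) / n
    mean_y = sum(row[1] for row in table) / n

    sxx = sxy = syy = 0.0
    for s, row in zip(row_scores, table):
        for y, count in enumerate(row):  # y = 0 (other) or 1 (focal crop)
            dx, dy = s - mean_x, y - mean_y
            sxx += count * dx * dx
            syy += count * dy * dy
            sxy += count * dx * dy
    if sxx == 0 or syy == 0:
        raise ValueError("scores or outcomes do not vary")

    statistic = (n - 1) * sxy ** 2 / (sxx * syy)  # (n - 1) * r^2
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-square tail, 1 df
    return statistic, p_value


# Invented counts: the focal crop becomes more common as the
# preceding-year NDVI group increases.
trend_table = [[40, 10], [35, 15], [25, 25], [15, 35]]
statistic, p_value = mantel_haenszel_chi2(trend_table)
```

A table with identical rows gives a statistic of zero, whereas the invented trend table gives a large statistic and a very small p-value.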
In the seventh phase (
Figure 1), statistical modeling was carried out for the whole dataset. Most statistical modeling was based on logistic regression. The focus was on the events of interest for this study, and dichotomous outcome variables were used: either an event occurred or it did not occur. The following events were tested: the risk of low NDVI values, whether a Finnish cultivar or certain crop was cultivated or not, and whether a certain crop rotation was used or not. Independent variables were those generated in the second and third phases or combined in the fifth phase. Most of the variables were originally continuous. Relationships between the log odds of the probability for success and independent variables were not, however, linear. Therefore, all continuous variables were categorized for final analyses as defined in the second, third and fifth phases. When analyzing crop rotation data, the farming systems were analyzed separately if the difference between organic and conventional farming was obvious without statistical testing. Otherwise, the farming system was used as a two-level independent variable.
The results of the seventh phase (
Figure 1) were given as odds ratios with 95% confidence limits (CL). If the confidence limits cross 1.00 (e.g., in the case of 0.90–1.20), there is no statistically significant difference between the risks of the two tested groups at a 5% significance level. Confidence limits were used instead of p-values because some tests involve a very large number of fields, so that a practically unimportant difference can appear statistically significant. The logistic regression analyses were performed using SAS/LOGISTIC software [
31].
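The interpretation of odds ratios and their confidence limits can be illustrated with a single two-level predictor, for which the logistic regression odds ratio equals the cross-product ratio of a 2 × 2 table and the Wald limits follow from the cell counts. This is a simplified sketch with invented counts and hypothetical function names; the study itself fitted models with several categorical predictors in SAS/LOGISTIC.

```python
import math
from statistics import NormalDist


def odds_ratio_with_cl(events, non_events, ref_events, ref_non_events, level=0.95):
    """Odds ratio of a group against a reference group with Wald confidence limits."""
    odds_ratio = (events * ref_non_events) / (non_events * ref_events)
    se_log_or = math.sqrt(
        1 / events + 1 / non_events + 1 / ref_events + 1 / ref_non_events
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95% limits
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def crosses_one(lower, upper):
    """True when the confidence limits include 1.00 (no significant difference)."""
    return lower <= 1.0 <= upper


# Invented counts of low-NDVI parcels: same proportions, tenfold sample size.
small = odds_ratio_with_cl(30, 70, 20, 80)
large = odds_ratio_with_cl(300, 700, 200, 800)
```

Both invented tables give the same odds ratio, but only the larger one has confidence limits that exclude 1.00, which shows how the width of the limits reflects the number of fields behind an estimate.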
In the eighth phase, results of variety trial data (VCU data) were compared to results of the first phase (
Figure 1). The field parcel scale data on crops from the Finnish Food Agency included the name of the variety. The official variety trial data provided by Luke were used to estimate the average yield (kg ha⁻¹) for all varieties using a linear mixed model, in which the cultivar was used as a fixed effect (the set of cultivars varied from year to year), while an experimental site (>20 sites, the set of sites varied from year to year), the year (1970–2018) and their interaction were used as random effects. This model resulted in mutually comparable yield estimates for all cultivars in spite of the fact that their yields varied widely between trials and each cultivar was tested only in a limited set of trials. A mixed model analysis was performed using SAS/MIXED software [
31].
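One way to write down a model with this structure is given below. The notation is an assumption introduced here for illustration and does not come from the original analysis: y_{ijk} is the yield of cultivar i at site j in year k, \alpha_i is the fixed cultivar effect, and s_j, t_k and (st)_{jk} are the random site, year and site-by-year effects.

$$y_{ijk} = \mu + \alpha_i + s_j + t_k + (st)_{jk} + \varepsilon_{ijk}$$

$$s_j \sim N(0, \sigma_s^2), \quad t_k \sim N(0, \sigma_t^2), \quad (st)_{jk} \sim N(0, \sigma_{st}^2), \quad \varepsilon_{ijk} \sim N(0, \sigma_\varepsilon^2)$$

Under this formulation, the estimate of \mu + \alpha_i serves as the comparable average yield of cultivar i, because the site, year and interaction terms absorb the differences between the trials in which each cultivar happened to be tested.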