Agricultural Machinery Movement Trajectory Recognition Method Based on Two-Stage Joint Clustering

Zhang, Shuya; Liu, Hui; Cao, Xiangchen; Meng, Zhijun

doi:10.3390/agriculture14122294


¹ Information Engineering College, Capital Normal University, Beijing 100048, China

² National Research Center of Intelligent Equipment for Agriculture, Beijing 100097, China

* Authors to whom correspondence should be addressed.

Agriculture 2024, 14(12), 2294; https://doi.org/10.3390/agriculture14122294

Submission received: 13 November 2024 / Revised: 9 December 2024 / Accepted: 13 December 2024 / Published: 14 December 2024

(This article belongs to the Special Issue Intelligent Agricultural Machinery Design for Smart Farming)


Abstract:

To address the challenges posed by the large scale of agricultural machinery trajectory data and the complexity of actual movement trajectories, this paper proposes a two-stage joint clustering method for agricultural machinery trajectory recognition to enhance accuracy and robustness. The first stage involves trajectory clustering, where the spatial distribution characteristics of agricultural machinery trajectories are analyzed, and the position coordinates and the number of neighboring points of trajectory points are extracted as features. The silhouette coefficient method is used to determine the optimal number of clusters k for the K-Means algorithm, thus reducing the data scale. The second stage focuses on trajectory recognition, where a list of Eps and Minpts parameters is generated based on the statistical properties of the trajectory dataset. The Genetic Algorithm is employed for parameter optimization to determine the optimal DBSCAN parameters, enabling precise identification of field operation trajectories and road travel trajectories. Experimental results show that this method achieves mean values of 91.55% for Accuracy, 95.41% for Precision, 89.86% for Recall, and 92.41% for F1-score on a sample dataset of 337 trajectories, representing improvements of 12.8%, 5.13%, 7.79%, and 6.84%, respectively, over the traditional DBSCAN algorithm. Additionally, the Runtime of the two-stage joint clustering method is approximately 30% shorter than that of single-stage clustering. Compared with mainstream deep learning models such as LSTM and Transformer, this method delivers comparable recognition accuracy without the need for labeled data training, significantly reducing recognition costs. The proposed method achieves accurate and robust recognition of agricultural machinery trajectories and holds broad application potential in practical scenarios.

Keywords:

trajectory recognition; two-stage joint clustering; K-Means; Genetic Algorithm; DBSCAN; deep learning

1. Introduction

Intelligent network technology for agricultural machinery is based on communication networks and the Internet. It utilizes satellite positioning devices, sensors for machinery operating conditions, and image sensors to perceive the real-time operational status of agricultural machinery. This technology facilitates the intelligent identification, localization, monitoring, and management of clusters of agricultural machinery equipment. With the advancement and application of intelligent network technology for agricultural machinery, not only have remote supervision [1], autonomous operation [2], and task scheduling been achieved [3], but the rich data of machinery movement trajectories can also be utilized for production management applications such as assessments of operational efficiency and quality [4]. The precise identification of these continuous movement trajectories, which include both field operations and road travel, is a prerequisite and essential condition for data analysis and mining [5].

Currently, scholars mainly use spatio-temporal analysis, clustering algorithms, and deep learning methods for agricultural machinery trajectory recognition. Pei Wang et al. [6] proposed a clustering algorithm based on spatial indexing and grid density to automatically recognize the operational states of agricultural machinery, including active operation, idle movement, and halted states. Jing Xiao et al. [7] introduced a segmentation method for agricultural machinery movement trajectories based on spatio-temporal cubes, achieving the segmentation of field operation and road travel trajectories. Lili Y et al. [4] developed a two-step K-Means clustering and a three-step correction method to identify machinery turn-around and anomalous operational trajectories. Yashuo Li et al. [8] proposed a method combining DBSCAN, a density-based clustering method, with the BP_Adaboost classifier ensemble algorithm, to recognize the states of trajectory points and to delineate operational plots. Jernej P et al. [9] utilized trajectory velocity, direction, and derived parameters such as acceleration, curvature radius, and angular velocity to construct a decision tree model for recognizing road and field operation trajectories. Chen Y et al. [10] introduced a field-road classification algorithm based on Graph Convolutional Networks (GCN), which achieves the recognition of in-field operations and road trajectories of agricultural machinery. Huang J et al. [11] proposed the Field&Road-CENet semantic segmentation algorithm, which, building on the CE-Net model, incorporates standard convolutional residual blocks and a global max pooling attention mechanism to fuse original feature information and enhance the semantic expression of low-level feature maps for segmenting agricultural machinery trajectories. Ying C et al. [12] presented a trajectory point representation method that combines statistical and visual features, utilizing a BiLSTM network and a linear classifier to classify field-road trajectories.

Spatio-temporal analysis methods are fast but typically rely on predefined rules or thresholds to segment movement trajectories, which limits their ability to deeply explore and represent complex spatio-temporal relationships, making it challenging to improve classification accuracy. Currently, deep learning methods have gained popularity due to their powerful feature extraction capabilities, showing distinct advantages in recognizing the interrelationships of multidimensional features within agricultural machinery trajectories. However, the extensive need for labeled sample training, trajectory data transformation, and high computational costs represent bottlenecks to the technology’s application. Compared with the first two methods, traditional clustering algorithms have strong interpretability and do not require pre-labeled training data, performing well in the initial recognition and processing of trajectory data. Yet, many clustering algorithms are sensitive to parameters, making parameter determination challenging.

This paper proposes a two-stage joint clustering method for the recognition of agricultural machinery movement trajectories. Initially, an improved K-Means algorithm is utilized to perform preliminary clustering to reduce the scale of trajectory data. Subsequently, an adaptive DBSCAN algorithm, which determines parameters automatically, is employed for the precise recognition of field operation trajectories and road travel trajectories within the data. This study aims to enhance the accuracy of trajectory recognition progressively by integrating the advantages of different clustering algorithms. It addresses the issues of parameter sensitivity and low clustering efficiency inherent in traditional clustering methods, while also avoiding the challenges associated with the extensive labeling of trajectory data required by deep learning approaches.

2. Methods

2.1. Overview

The classification of field and road trajectories in agricultural machinery is a critical task in the agricultural domain regarding spatio-temporal trajectory data processing. It aims to analyze the behavior patterns of agricultural machinery based on the spatio-temporal information embedded in massive trajectory data, distinguishing trajectories between in-field and on-road. Agricultural machinery movement trajectory points typically include attributes such as operation time, latitude, longitude, speed, and heading, and the spatial distribution characteristics of the trajectories can objectively reflect their movement states. Figure 1 shows the spatial distribution of original agricultural machinery movement trajectories on a remote sensing map. As illustrated, field operation trajectories usually exhibit a densely clustered spatial distribution, reflecting the typical low-speed, repetitive movements within plot areas during field operations. In contrast, road travel trajectories are sparser and usually have a more regular linear spatial distribution, indicating that machinery tends to travel unidirectionally at higher speeds when moving between plots along roads. Effectively distinguishing between field operation and road travel trajectories using spatial characteristics of trajectory data can assist in estimating farmland area [13], and is crucial for assessing machinery operational efficiency [14], predicting yields [15], and determining agricultural subsidies [16].

This study proposes a two-stage joint clustering method for classifying agricultural machinery movement trajectories, named K-SGA-DBSCAN. Here, K represents the improved K-Means algorithm [17], used for preliminary clustering to determine the initial regions for trajectory clustering; SGA represents an adaptive determination method based on the statistical characteristics (S) of trajectories and Genetic Algorithm (GA) [18]; and DBSCAN represents the Density-Based Spatial Clustering of Applications with Noise algorithm [19]. Figure 2 is the technical roadmap of the method proposed in this paper, which is divided into two main parts: data preparation and two-stage joint clustering. Data preparation encompasses data preprocessing and feature extraction. In the two-stage joint clustering part, the first stage, S1, involves trajectory clustering where an improved K-Means algorithm is used for preliminary clustering. This process divides large-scale agricultural machinery trajectories into multiple small-scale trajectory clusters, reducing the data scale and lowering the time complexity of trajectory clustering. The second stage, S2, is the trajectory recognition stage, where this paper introduces an adaptive DBSCAN parameter determination method, SGA. This method generates a list of candidate Eps and Minpts parameters based on the statistical characteristics of the agricultural machinery trajectory data and utilizes the Genetic Algorithm to optimize and determine the best parameter combination for the DBSCAN algorithm. Clustering is then performed sequentially for each small-scale cluster to precisely identify field operation trajectories and road travel trajectories.

2.2. Data Preparation

2.2.1. Data Preprocessing

The preprocessing of the original agricultural machinery trajectory data primarily involves the following three steps:

(1): Data Format Conversion

The coordinates of the original trajectory points are expressed in latitude and longitude within the WGS84 (World Geodetic System 1984) geodetic coordinate system. This study employs the Universal Transverse Mercator (UTM) projection to convert these geodetic coordinates $P(lon, lat)$ into planar coordinates $P(x, y)$.

(2): Duplicated Trajectory Processing

When two adjacent trajectory points have identical latitude and longitude coordinates, they are considered duplicated trajectories. In such cases, only the record with the earliest timestamp is retained, and other duplicate data are removed.

$$DT = \{P_{i}, P_{i+1} \mid lon_{i+1} = lon_{i} \text{ and } lat_{i+1} = lat_{i},\ i \in [1, N-1]\} \tag{1}$$

In Equation (1), $DT$ represents the duplicated trajectory, $P_{i}$ denotes the $i$-th trajectory point, $lon_{i}$ and $lat_{i}$ represent the longitude and latitude coordinates, respectively, and $N$ is the number of trajectory points.

(3): Stop Trajectory Processing

When the agricultural machinery is in a halted state, the onboard terminal devices continue to report data, with the operational speed of the machinery consistently recorded as zero. Continuous trajectory points with a speed of zero are identified as stop trajectory points; only the first trajectory point is retained, and the others are removed.

$$ST = \{P_{i}, P_{i+1} \mid v_{i} = v_{i+1} = 0,\ i \in [1, N-1]\} \tag{2}$$

In Equation (2), $ST$ represents the stop trajectory, and $v$ denotes the instantaneous speed of the agricultural machinery trajectory point.
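The two filtering rules in Equations (1) and (2) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the record layout (keys `t`, `lon`, `lat`, `v`) and the function name `drop_duplicates_and_stops` are hypothetical and do not come from the paper, and the UTM conversion step is left out.

```python
from typing import Dict, List

# Hypothetical record layout: "t" (timestamp), "lon", "lat", "v" (speed in m/s).
Point = Dict[str, float]


def drop_duplicates_and_stops(points: List[Point]) -> List[Point]:
    """Keep only the first point of each run of repeated positions or zero speeds."""
    kept: List[Point] = []
    prev = None
    for p in sorted(points, key=lambda q: q["t"]):
        if prev is not None:
            # Eq. (1): adjacent points with identical longitude and latitude.
            same_position = p["lon"] == prev["lon"] and p["lat"] == prev["lat"]
            # Eq. (2): adjacent points that both report zero speed.
            still_stopped = p["v"] == 0 and prev["v"] == 0
            if same_position or still_stopped:
                prev = p
                continue
        kept.append(p)
        prev = p
    return kept


track = [
    {"t": 0, "lon": 116.0000, "lat": 40.0, "v": 1.2},
    {"t": 1, "lon": 116.0000, "lat": 40.0, "v": 1.2},  # duplicated position
    {"t": 2, "lon": 116.0010, "lat": 40.0, "v": 0.0},  # machine halts here
    {"t": 3, "lon": 116.0011, "lat": 40.0, "v": 0.0},  # still halted
    {"t": 4, "lon": 116.0020, "lat": 40.0, "v": 2.0},
]
cleaned = drop_duplicates_and_stops(track)
print([p["t"] for p in cleaned])  # [0, 2, 4]
```

Comparing each point with its immediate predecessor in the raw sequence, rather than with the last retained point, keeps the rule a direct reading of "adjacent trajectory points" in both equations.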

2.2.2. Feature Extraction

To accurately distinguish between field operation trajectories and road travel trajectories of agricultural machinery, selecting effective feature information is a prerequisite for achieving desired results. For a given trajectory point, other trajectory points are distributed within its neighboring area. By analyzing the spatial distribution characteristics of trajectory points, it is evident that within a region centered at the trajectory point $P(x_{i}, y_{i})$ with a radius $R$, the number of neighboring trajectory points is greater for field operation trajectories than for road travel trajectories. The average distance between trajectory points within the sample dataset is calculated, and $m$ times the average distance of trajectory points is used as the radius for the neighboring area. The number of neighboring points for each trajectory point is then counted. The calculation formulas are shown in Equations (3) to (5).

$$Dist(P_{i}, P_{j}) = \sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2}} \tag{3}$$

$$AvgDist = \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} Dist(P_{i}, P_{j}) \tag{4}$$

$$R = m \times AvgDist \tag{5}$$

In Equations (3) to (5), $Dist(P_{i}, P_{j})$ represents the distance between trajectory points, $AvgDist$ denotes the average distance of all trajectory points, $R$ is the radius of the neighboring area, and $m$ is the multiplicative factor.
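As a rough illustration of Equations (3) to (5), the following sketch counts, for every point, how many other points lie within the radius $R = m \times AvgDist$. The function names and the toy coordinates are assumptions made for this example. The brute-force pairwise loop is quadratic in the number of points, so a real dataset of this size would presumably need a spatial index instead.

```python
import math
from typing import List, Tuple

XY = Tuple[float, float]  # planar (x, y) coordinates in metres


def dist(p: XY, q: XY) -> float:
    """Euclidean distance, Eq. (3)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def average_pairwise_distance(points: List[XY]) -> float:
    """Mean distance over all unordered point pairs, Eq. (4)."""
    n = len(points)
    total = sum(dist(points[i], points[j])
                for i in range(n - 1) for j in range(i + 1, n))
    return 2.0 * total / (n * (n - 1))


def neighbor_counts(points: List[XY], m: float = 0.1) -> List[int]:
    """Number of other points within R = m * AvgDist of each point, Eq. (5)."""
    radius = m * average_pairwise_distance(points)
    return [sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius)
            for i, p in enumerate(points)]


# Toy data: a dense 3 x 3 "field" grid (1 m spacing) and a sparse "road" (50 m spacing).
field = [(float(x), float(y)) for x in range(3) for y in range(3)]
road = [(100.0, 0.0), (150.0, 0.0), (200.0, 0.0), (250.0, 0.0)]
counts = neighbor_counts(field + road, m=0.1)
print(counts[:9], counts[9:])
```

On this toy input every field point sees its eight grid neighbors while the road points see none, which mirrors the density contrast the feature is meant to capture.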

2.3. Two-Stage Joint Clustering

2.3.1. Trajectory Clustering Stage (S1)

In the S1 trajectory clustering stage, the K-Means clustering algorithm is utilized to reduce the scale of trajectories for each clustering session. This algorithm is simple, stable, and converges quickly, making it suitable for handling massive datasets. It effectively divides large-scale trajectory datasets into multiple smaller clusters and is widely used in vehicle trajectory recognition research.

The core concept of the S1 clustering stage is to divide trajectory points into k clusters, where the number of clusters, k, is a crucial parameter of the K-Means algorithm. In this study, the silhouette coefficient method [20] is applied by calculating the silhouette coefficients of the clustering results for each k within a set range. The k value corresponding to the highest silhouette coefficient is selected as the optimal parameter, thereby optimizing the clustering effect. The silhouette coefficient $S_{i}$ for trajectory point $P_{i}$ is defined as follows:

$$S_{i} = \frac{b_{i} - a_{i}}{\max(a_{i}, b_{i})} \tag{6}$$

where $a_{i}$ is the average distance from the trajectory point $P_{i}$ to the other samples in the same cluster (the intra-cluster distance), reflecting the compactness of the clusters, and $b_{i}$ is the average distance from the trajectory point $P_{i}$ to all the samples in other clusters (the inter-cluster distance), reflecting the separation of the clusters.
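The selection of k by the silhouette coefficient can be sketched as follows. This is a minimal pure-Python illustration, not the paper's implementation. It makes three assumptions: K-Means is a basic Lloyd iteration with a deterministic farthest-point initialisation, $b_i$ follows the common convention of the mean distance to the nearest other cluster, and the denominator is $\max(a_i, b_i)$. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

XY = Tuple[float, float]


def dist(p: XY, q: XY) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def kmeans(points: List[XY], k: int, iters: int = 50) -> List[int]:
    """Lloyd's algorithm with farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave the centre of an empty cluster where it is
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels


def mean_silhouette(points: List[XY], labels: List[int]) -> float:
    """Average of S_i = (b_i - a_i) / max(a_i, b_i) over all points."""
    clusters: Dict[int, List[XY]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def best_k(points: List[XY], k_values: Sequence[int]) -> int:
    """Return the k whose K-Means partition has the highest mean silhouette."""
    return max(k_values, key=lambda k: mean_silhouette(points, kmeans(points, k)))


# Three well-separated toy blobs of four points each.
offsets = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
blobs = [(cx + dx, cy + dy)
         for cx, cy in [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)] for dx, dy in offsets]
print(best_k(blobs, range(2, 6)))
```

For the three toy blobs the search settles on k = 3, because merging two blobs (k = 2) or splitting one (k = 4, 5) both lower the mean silhouette.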

2.3.2. Trajectory Recognition Stage (S2)

The trajectory points of agricultural machinery are characterized by clusters of densely connected data points, where field operation trajectory points are tightly packed, and road travel trajectory points are relatively sparse. The DBSCAN algorithm, which can detect clusters of arbitrary shapes in noisy data, is well-suited for distinguishing these two types of trajectories by leveraging the spatial distribution characteristics of agricultural machinery trajectories. Thus, DBSCAN was utilized as the fundamental algorithm for trajectory recognition in this paper. However, the two DBSCAN parameters, neighborhood radius (Eps) and minimum sample count (Minpts), are highly sensitive, and fixed values may not suit diverse agricultural machinery movement trajectories, leading to reduced identification accuracy. To address this issue, this study improves the method for determining the Eps and Minpts parameters of the DBSCAN algorithm, proposing an adaptive determination method, SGA. Initially, a list of DBSCAN parameter combinations $[L_{Eps}, L_{Minpts}]$ is generated based on the statistical characteristics of the agricultural machinery trajectories. Then, the Genetic Algorithm is applied to optimize the DBSCAN parameter combinations, enhancing recognition accuracy.
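To make the role of Eps and Minpts concrete, here is a compact pure-Python DBSCAN run on toy data. It is a sketch only. The neighborhood count includes the point itself, and the mapping of clustered points to "field" and noise points to "road" is an assumption made for this illustration rather than something the text spells out.

```python
import math
from typing import List, Optional, Tuple

XY = Tuple[float, float]
NOISE = -1


def region_query(points: List[XY], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of point i (point i itself included)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points: List[XY], eps: float, min_pts: int) -> List[int]:
    """Return a cluster id per point, or NOISE for points in no cluster."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return [lab if lab is not None else NOISE for lab in labels]


# Dense 3 x 3 "field" grid (1 m spacing) followed by a sparse "road" (50 m spacing).
field = [(float(x), float(y)) for x in range(3) for y in range(3)]
road = [(100.0, 0.0), (150.0, 0.0), (200.0, 0.0), (250.0, 0.0)]
labels = dbscan(field + road, eps=1.5, min_pts=4)
is_field = [lab != NOISE for lab in labels]  # assumed mapping: clustered = field
print(labels)
```

With Eps = 1.5 and Minpts = 4 the grid forms a single cluster and the road points are left as noise. Choosing a much larger Eps or a smaller Minpts on the same data would start absorbing road points, which is the sensitivity the adaptive method is meant to address.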

The flow chart of the SGA-DBSCAN algorithm for the S2 trajectory recognition stage is shown in Figure 3, and the specific implementation steps are described as follows:

Step 1: Calculate the distance distribution matrix between each trajectory point and its $K$-th nearest neighbor trajectory point within the trajectory dataset, which is denoted as follows:

$$D_{n \times n} = \{Dist(i, j) \mid 1 \leq i \leq n,\ 1 \leq j \leq n\} \tag{7}$$

where $n$ is the total number of trajectory points in the sample dataset, and $D_{n \times n}$ is a symmetric matrix of real numbers.

Step 2: Sort each row of the distance matrix in ascending order to obtain the elements of the $K$-th column, which constitute the $K$-nearest neighbor distances $D_{K}$ for all trajectory points. Calculate the average of all $K$-nearest neighbor distances to obtain the average $K$-nearest neighbor distance $\bar{D}_{K}$ of the sample dataset. This average distance serves as the candidate Eps parameter, expressed as follows:

$$\bar{D}_{K} = \frac{1}{n} \sum_{i=1}^{n} D_{K}[i] \tag{8}$$

Step 3: Calculate the candidate Eps parameter for all values of $K$, generating a list of Eps parameters, $L_{Eps}$, represented as follows:

$$L_{Eps} = \{\bar{D}_{K} \mid 1 \leq K \leq n\} \tag{9}$$

Step 4: For each $K$-th Eps parameter in $L_{Eps}$, calculate the number of objects in the Eps neighborhood across all trajectory points in the sample dataset, and compute its average value $\overline{NUM}_{K}$, which serves as the Minpts parameter for this Eps, as follows:

$$\overline{NUM}_{K} = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} NUM_{K}[i][j] \tag{10}$$

Step 5: For each Eps in $L_{Eps}$, calculate the corresponding average object count in the Eps neighborhood to obtain the Minpts parameter list $L_{Minpts}$, expressed as follows:

$$L_{Minpts} = \{\overline{NUM}_{K} \mid 1 \leq K \leq n\} \tag{11}$$

Step 6: Use the generated Eps and Minpts parameter pairs as the initial population in the Genetic Algorithm. By simulating the mechanism of natural selection, this approach enhances the probability of finding the global optimal solution. The optimal parameter combination is determined through multiple iterations.

Step 7: Use the identified optimal Eps and Minpts parameter pair as input for DBSCAN clustering, obtain trajectory classification results, and finally merge the individual clusters to generate the output trajectories.
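Steps 1 to 5 can be sketched as a small routine that builds the candidate (Eps, Minpts) pairs. This is one possible reading, stated with several assumptions. The zero self-distance is skipped, so K runs from 1 to n − 1. The neighborhood count includes the point itself. Minpts is taken as the mean neighborhood count per point, following the prose of Step 4 rather than attempting to reproduce the exact normalisation of Equation (10). The function name is hypothetical.

```python
import math
from typing import List, Tuple

XY = Tuple[float, float]


def candidate_parameters(points: List[XY]) -> List[Tuple[float, float]]:
    """Return one (Eps, Minpts) candidate per neighbour rank K = 1 .. n-1."""
    n = len(points)
    # Step 1: full pairwise distance matrix.
    dmat = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]
    # Step 2: sort every row; index 0 is the zero self-distance.
    sorted_rows = [sorted(row) for row in dmat]
    candidates = []
    for k in range(1, n):
        # Steps 2-3: candidate Eps = mean distance to the K-th nearest neighbour.
        eps = sum(row[k] for row in sorted_rows) / n
        # Steps 4-5: candidate Minpts = mean number of points inside that Eps
        # neighbourhood (the point itself is counted, an assumption).
        counts = [sum(1 for d in row if d <= eps) for row in dmat]
        candidates.append((eps, sum(counts) / n))
    return candidates


# Four equally spaced points on a line, 1 m apart.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
for eps, minpts in candidate_parameters(line):
    print(eps, minpts)
```

For the four collinear points the three candidates are (1.0, 2.5), (1.5, 2.5) and (2.5, 3.5). In practice Minpts would have to be rounded to an integer before being passed to DBSCAN, and the text does not say how that is done.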

The two-stage joint clustering approach shortens the runtime for trajectory recognition and improves the accuracy of distinguishing between field operation trajectories and road travel trajectories.

3. Experiments

3.1. Datasets

The data used in this study were sourced from the intelligent monitoring system database of the National Research Center of Intelligent Equipment for Agriculture. The monitoring terminal AMT-3011, developed by the center, is equipped with an embedded GNSS module, a network module, and supports external sensors through CAN and RS232 interfaces, as shown in Figure 4. The AMT-3011 can be installed on agricultural machinery from different manufacturers. Its GNSS module provides meter-level positioning accuracy, as well as speed and heading data. This study selected operational data from 77 agricultural machines in the major grain-producing plains in China in 2022. The scale of individual plots generally ranged from 0.333 hectares to 2.0 hectares. A total of 1036 continuous motion trajectories were selected as the dataset.

Each trajectory record includes six attributes: positioning time (YYYY:MM:DD hh:mm:ss), longitude (°), latitude (°), speed (m/s), heading (°), and operation status (TRUE or FALSE), with a total of 4.02 × 10⁶ trajectory points. After data preprocessing and feature extraction, the total number of trajectory points was reduced to 2.80 × 10⁶, and the number of attributes per trajectory point increased to nine. The newly added attributes are x, y, and the number of neighboring points. To ensure the representativeness of the experiment and the reliability of the results, a statistical random sampling method [21] was employed to sample 337 movement trajectories from the dataset, with a total of 4.64 × 10⁵ trajectory points. The sample size calculation formula is as follows:

$$n = \frac{N Z^{2} p(1 - p)}{E^{2}(N - 1) + Z^{2} p(1 - p)} \tag{12}$$

where $n$ represents the sample size, $N$ represents the population size, $p$ represents the population proportion, $Z$ is the critical value of the standard normal distribution, and $E$ is the allowable sampling error. In this paper, the values of these parameters are as follows: $N = 1036$, $p = 0.5$, $E = 0.04$, and, with a 95% confidence level, $Z = 1.96$.

3.2. Experimental Setup

This study designed multiple experiments to evaluate the accuracy of K-SGA-DBSCAN in recognizing agricultural machinery movement trajectories. In Section 3.4.1, the impact of selecting different features on clustering performance was compared to verify whether the extracted number of neighboring points can effectively distinguish agricultural machinery movement trajectories. In Section 3.4.2, a set of ablation experiments was conducted to assess the effectiveness of the proposed SGA adaptive parameter determination method and the multi-stage joint approach in optimizing trajectory recognition. In Section 3.4.3, a comparative analysis was performed between the proposed method and other classic clustering algorithms, with the visualized trajectories identified by different algorithms presented to intuitively compare their recognition outcomes. In Section 3.4.4, K-SGA-DBSCAN was compared with deep learning models such as LSTM and Transformer to validate the accuracy and practicality of the proposed method in contrast to deep learning approaches.

The experimental platform used in this study featured an Intel® Xeon® Platinum 8255C @ 2.50 GHz CPU, 43 GB of RAM, and 230 GB of disk space. The software environment included Python 3.8 on the Windows 10 operating system, and PyCharm 2023 was used for development.

The K-SGA-DBSCAN algorithm employed K-Means and DBSCAN as its base algorithms and introduced improvements to parameter determination. The silhouette coefficient method was used to determine the parameters for K-Means, while genetic algorithms and statistical properties of trajectory points were used to adaptively determine the parameters for DBSCAN. Based on the spatial distribution characteristics of agricultural machinery trajectory points, S1 extracted position coordinates $(x, y)$ and the number of neighboring points as clustering features. The range of $k$ values was set from 2 to 21, and the $k$ value that maximizes the silhouette coefficient was chosen as the optimal number of clusters. The random seed was set to 42 to ensure the reproducibility of the experimental results. In S2, the SGA adaptive method for determining DBSCAN parameters set the population size to 50, the number of generations to 50, the crossover rate to 0.5, and the mutation rate to 0.2, with tournament selection as the selection method. The fitness function was the silhouette coefficient. By dynamically adjusting the genetic algorithm parameters, the optimal combination of Eps and Minpts parameters was obtained.
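The genetic search over (Eps, Minpts) pairs with the settings listed above (population 50, 50 generations, crossover rate 0.5, mutation rate 0.2, tournament selection) might look roughly like the sketch below. Several details are assumptions because the text does not specify them: the tournament size of 3, the crossover that takes Eps from one parent and Minpts from the other, the mutation that swaps in one gene from a random candidate, and the elitist tracking of the best pair. The fitness is passed in as a function. In the paper it is the silhouette coefficient of the resulting DBSCAN clustering, whereas the demo uses a toy placeholder so that the block stays self-contained.

```python
import random
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, float]  # (Eps, Minpts)


def optimise(candidates: Sequence[Params],
             fitness: Callable[[Params], float],
             pop_size: int = 50, generations: int = 50,
             cx_rate: float = 0.5, mut_rate: float = 0.2,
             seed: int = 42) -> Params:
    """Search the candidate (Eps, Minpts) space with a simple genetic algorithm."""
    rng = random.Random(seed)
    population: List[Params] = [rng.choice(candidates) for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament(size: int = 3) -> Params:
        # Tournament selection: the fittest of `size` random individuals wins.
        return max(rng.sample(population, size), key=fitness)

    for _ in range(generations):
        offspring: List[Params] = []
        while len(offspring) < pop_size:
            a, b = tournament(), tournament()
            # Crossover: Eps from one parent, Minpts from the other.
            child = (a[0], b[1]) if rng.random() < cx_rate else a
            if rng.random() < mut_rate:
                # Mutation: replace one gene with that of a random candidate.
                donor = rng.choice(candidates)
                child = (donor[0], child[1]) if rng.random() < 0.5 else (child[0], donor[1])
            offspring.append(child)
        population = offspring
        best = max(population + [best], key=fitness)  # keep the best pair seen so far
    return best


# Toy placeholder fitness: prefers Eps = 3.0 and Minpts = 4.0 (not the paper's fitness).
grid = [(e, m) for e in (1.0, 2.0, 3.0, 4.0) for m in (2.0, 4.0, 6.0)]
toy_fitness = lambda p: -(abs(p[0] - 3.0) + abs(p[1] - 4.0))
print(optimise(grid, toy_fitness))
```

Because crossover recombines genes across pairs, the search can return an Eps and a Minpts that were not paired in the initial candidate list. Whether the paper's GA allows this is not stated.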

3.3. Evaluation Metrics

In this study, the clustering performance under different features was evaluated using two commonly used clustering evaluation metrics: the Davies-Bouldin Index (DBI) and the Calinski-Harabasz Index (CHI) [22]. DBI measures the average similarity between each cluster and its most similar cluster; it is non-negative, with a range of [0, +∞), and smaller values indicate more compact and well-separated clusters. CHI is essentially the ratio of between-cluster dispersion to within-cluster dispersion, with a range of [0, +∞), where higher values indicate better clustering performance.
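For reference, both indices can be computed directly from their usual textbook definitions. The sketch below is an illustration with hypothetical function names. It uses the mean distance to the centroid as the cluster scatter in DBI and squared Euclidean distances in CHI, which is the common convention but is not confirmed by the text.

```python
import math
from typing import List, Tuple

XY = Tuple[float, float]


def dist(p: XY, q: XY) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def centroid(pts: List[XY]) -> XY:
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))


def davies_bouldin(clusters: List[List[XY]]) -> float:
    """Mean over clusters of the worst (s_i + s_j) / d(c_i, c_j) ratio."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


def calinski_harabasz(clusters: List[List[XY]]) -> float:
    """Between-cluster dispersion over within-cluster dispersion, each per degree of freedom."""
    all_pts = [p for c in clusters for p in c]
    n, k = len(all_pts), len(clusters)
    overall = centroid(all_pts)
    between = sum(len(c) * dist(centroid(c), overall) ** 2 for c in clusters)
    within = sum(dist(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))


tight = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0), (12.0, 0.0)]]
loose = [[(0.0, 0.0), (6.0, 0.0)], [(10.0, 0.0), (16.0, 0.0)]]
print(davies_bouldin(tight), calinski_harabasz(tight))
print(davies_bouldin(loose), calinski_harabasz(loose))
```

The tight pair of clusters scores a lower DBI and a higher CHI than the loose pair, which is the direction of "better" used when the feature sets are compared.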

To objectively assess the recognition performance of the two-stage joint clustering algorithm, this study used QGIS software to label the agricultural machinery trajectory dataset, where 0 represents road travel trajectory points and 1 represents field operation trajectory points. The evaluation of the trajectory recognition results was conducted using five metrics: Accuracy, Precision, Recall, F1-score [23], and Runtime. The values of all metrics in the experimental results were the averages obtained from the sample dataset of 337 trajectories.
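The four recognition metrics follow from the confusion counts of the binary labels (1 for field operation points, 0 for road travel points). The following minimal sketch treats field points as the positive class, which is an assumption, and uses a hypothetical function name.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F1 with label 1 (field) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0]  # one field point mistaken for road
print(classification_metrics(truth, predicted))
```

In this toy case one missed field point lowers Recall to 0.75 while Precision stays at 1.0. The reported figures are averages of such per-trajectory values over the 337 sample trajectories.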

3.4. Results and Analysis

3.4.1. Experiment on Clustering with Different Features

To investigate the impact of different feature selections on trajectory recognition results, clustering was performed using the trajectory point position coordinates $(x, y)$, speed, and the number of neighboring points as features. It was found that when the radius of the neighboring area was set to 0.1 times the average distance between the trajectory points of the agricultural machinery on the given day, the distinction in the number of neighboring points between field operation trajectory points and road travel trajectory points was most effective. When the neighboring area radius was too large, the number of neighboring points for road travel trajectories gradually increased; conversely, when it was too small, the number of neighboring points for field operation trajectories decreased sharply. Both situations led to an increased overlap in the number of neighboring points between field operation and road travel trajectories, impacting the differentiation of trajectory points.

Figure 5 presents the analysis of neighboring points for 337 sample trajectories. Figure 5a shows the distribution of trajectory points using the number of neighboring points as a feature, where field operation trajectories have more neighboring points, predominantly ranging from 40 to 1000, exhibiting high-density clustering characteristics. In contrast, road travel trajectories have fewer neighboring points, typically ranging from 0 to 40, with a relatively dispersed distribution. Figure 5b is a statistical chart of the number of neighboring points for trajectory points. The median number of neighboring points for field operation trajectory points is 246, significantly higher than the median of 18 for road travel trajectory points. This further illustrates the spatial clustering of field operation trajectories and the dispersal of road travel trajectory points. Therefore, using the number of neighboring points with a neighboring area radius set to 0.1 times the average distance of trajectory points as a feature provides good differentiation and can be used to determine the movement state of trajectory points.

Table 1 presents a comparison of clustering performance under different features (in the table, the results of this method under each metric are highlighted in bold; the same applies below). The goal of optimal feature selection is to enhance clustering distinction and accuracy. As shown in Table 1, clustering using the position coordinates $(x, y)$ and the number of neighboring points as features yields a DBI of 0.31, indicating low similarity between the trajectory point clusters and higher distinction. The CHI reaches 11,645.21, indicating that the between-cluster distance is significantly greater than the within-cluster distance, resulting in better clustering performance. In comparison, when using only the position coordinates $(x, y)$ as features, the DBI is 0.43, and the CHI is 6264.66. When using position coordinates $(x, y)$, speed, and the number of neighboring points as features, the DBI is 0.66, and the CHI is only 2079.35, indicating poor clustering performance. This demonstrates that more features do not necessarily improve clustering performance; excessive features may introduce noise and reduce clustering quality. The best distinction is achieved when using position coordinates and the number of neighboring points as features.

3.4.2. Ablation Experiments

To validate the effectiveness of the SGA adaptive parameter determination method and the multi-stage joint approach for trajectory recognition, ablation experiments were conducted with DBSCAN as the core algorithm. Table 2 presents the results of the ablation experiments, where SGA-DBSCAN represents the DBSCAN algorithm with adaptively determined parameters, and K-SGA-DBSCAN is the proposed two-stage joint clustering method.

As shown in Table 2, the DBSCAN algorithm operates at high speed but, due to its sensitivity to parameters and inability to adaptively adjust them, achieves only 78.75% in Accuracy. The performance of trajectory recognition improved significantly after modifying the parameter determination method. Specifically, relative to DBSCAN, SGA-DBSCAN showed an increase of 12.24 percentage points in Accuracy (90.99%), 4.31 percentage points in Precision (94.59%), 0.45 percentage points in Recall (82.52%), and 2.12 percentage points in F1-score (87.69%). However, the computational complexity involved in generating the parameter combination list based on trajectory characteristics led to a significant increase in Runtime for SGA-DBSCAN, reaching 1531.0 s.

To address this issue, K-SGA-DBSCAN employs a two-stage joint clustering method to reduce time complexity. The Runtime was shortened by 466.8 s, while the Accuracy, Precision, Recall, and F1-score for trajectory recognition were improved to 91.55%, 95.41%, 89.86%, and 92.41%, respectively. These results demonstrate that the two-stage joint clustering method based on the improved DBSCAN algorithm significantly enhances the performance of agricultural machinery trajectory recognition.
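The runtime figures above can be cross-checked with a short calculation. It uses only the two numbers stated in the text (1531.0 s for SGA-DBSCAN and a saving of 466.8 s); the variable names are illustrative.

```python
sga_dbscan_runtime_s = 1531.0  # single-stage SGA-DBSCAN runtime from the text
runtime_saving_s = 466.8       # reduction reported for K-SGA-DBSCAN

two_stage_runtime_s = sga_dbscan_runtime_s - runtime_saving_s
relative_reduction = runtime_saving_s / sga_dbscan_runtime_s

print(round(two_stage_runtime_s, 1))       # 1064.2
print(round(100 * relative_reduction, 1))  # 30.5
```

The resulting reduction of about 30.5% agrees with the "approximately 30%" shorter Runtime quoted in the abstract.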

Figure 6 illustrates the process of field and road segmentation for a sample trajectory after undergoing two-stage joint clustering. As shown in the figure, it is evident that the S1 clustering stage divides large-scale agricultural machinery movement trajectories into four high-density, small-scale trajectory clusters, with each cluster containing dense field operation trajectory points and their surrounding road travel trajectory points. The S2 stage introduces an SGA adaptive parameter determination method that adapts to different motion trajectory characteristics, overcoming the subjectivity and inaccuracy of manual parameter setting. S2 clustering effectively distinguishes road travel trajectory points at the junctions between fields and roads, thereby more accurately reflecting the actual movement trajectories.

3.4.3. Comparison with Clustering Algorithms

The effectiveness of the two-stage joint clustering method was compared with three classical clustering algorithms, Mean-Shift [24], OPTICS [25], and KANN-DBSCAN [26], to evaluate the performance differences in agricultural machinery trajectory recognition. Table 3 shows the recognition results of the sample dataset under these four clustering methods. As indicated in Table 3, K-SGA-DBSCAN consistently outperforms the other methods across all performance metrics. Specifically, the Accuracy of K-SGA-DBSCAN exceeded that of Mean-Shift (82.07%), OPTICS (87.16%), and KANN-DBSCAN (87.65%) by 9.48, 4.39, and 3.90 percentage points, respectively. In terms of the F1-score, it improved by 6.92, 4.23, and 2.76 percentage points, respectively.

Figure 7 shows the comparison of trajectory recognition results using the four clustering methods. It is evident from the figure that the results clustered by the K-SGA-DBSCAN method more accurately reflect the actual operational states of the trajectories. Specifically, K-SGA-DBSCAN effectively identifies trajectory points at the boundary between fields and roads and is less sensitive to noise trajectories within the field. Mean-Shift and OPTICS perform well in recognizing high-density field operation trajectory points but misidentify road travel trajectory points at the field-road boundaries. The KANN-DBSCAN algorithm improves the recognition accuracy of road travel trajectory points but performs poorly in recognizing field trajectories.

3.4.4. Comparison with Deep Learning Methods

To further evaluate the accuracy and practicality of the two-stage joint clustering method in recognizing agricultural machinery movement trajectories, a comparison was made between the proposed method and deep learning approaches. The experimental platform featured an Intel® Xeon® Platinum 8481C @2.50 GHz CPU, 80 GB of RAM, 80 GB of disk space, and an RTX 4090D GPU with 24 GB of VRAM. The software environment included Python 3.8, with PyTorch 1.10.0 as the deep learning framework and CUDA version 11.33. The dataset comprised 1036 original agricultural machinery movement trajectories, with the test set consisting of the 337 sample trajectories used in the two-stage clustering experiments, while the remaining 699 trajectories were divided into a training set and validation set at an 8:2 ratio.
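The data split described above can be sketched as follows. This is an illustration only: the integer trajectory IDs, the choice of which IDs form the test set, the random seed, and the rounding of the 8:2 ratio are all assumptions, since the source states only the totals (1036, 337, 699) and the ratio.

```python
import random

def split_dataset(trajectory_ids, test_ids, train_ratio=0.8, seed=0):
    """Hold out a fixed test set, then split the rest into train/validation."""
    test_set = set(test_ids)
    remaining = [t for t in trajectory_ids if t not in test_set]
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    rng.shuffle(remaining)
    n_train = round(len(remaining) * train_ratio)  # rounding rule is assumed
    return remaining[:n_train], remaining[n_train:], sorted(test_set)

all_ids = list(range(1036))   # hypothetical IDs for the 1036 trajectories
test_ids = all_ids[:337]      # hypothetical choice of the 337 test trajectories
train, val, test = split_dataset(all_ids, test_ids)
print(len(train), len(val), len(test))
```

With this rounding, the 699 remaining trajectories split into 559 for training and 140 for validation; the exact counts used in the study are not given in the text.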

In this study, the two-stage joint clustering method was compared with two mainstream deep learning models for sequence data, LSTM [27] and Transformer [28], to evaluate performance in agricultural machinery movement trajectory recognition. The experimental results are shown in Table 4. K-SGA-DBSCAN achieved a Precision of 95.41%, which is 6.72 percentage points higher than LSTM (88.69%) and 6.17 percentage points higher than Transformer (89.24%). Its F1-score was 92.41%, surpassing the LSTM and Transformer models by 3.79 and 2.02 percentage points, respectively. In terms of Accuracy (91.55%) and Recall (89.86%), K-SGA-DBSCAN performed comparably to the LSTM model (Accuracy 91.62%, Recall 88.55%) and was slightly lower than the Transformer model (Accuracy 92.71%, Recall 91.56%).

These findings suggest that the two-stage joint clustering method achieves comparable recognition accuracy to the LSTM and Transformer models. However, its significant advantage lies in not requiring extensive sample-labeled training and avoiding trajectory data conversion issues, thus demonstrating stronger practicality and application potential in agricultural machinery movement trajectory recognition.
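For readers who want to reproduce the four metrics compared in Table 4 on their own labeled trajectories, the standard point-level definitions can be sketched as below. Treating "field" as the positive class, the label strings, and the toy label lists are assumptions of this sketch, not details confirmed by the source.

```python
def binary_metrics(y_true, y_pred, positive="field"):
    """Accuracy, Precision, Recall, and F1-score for a two-class labeling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy example (hypothetical labels): 6 field points and 4 road points,
# with one field point missed and one road point mislabeled as field.
y_true = ["field"] * 6 + ["road"] * 4
y_pred = ["field"] * 5 + ["road"] + ["field"] + ["road"] * 3
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because the reported values are means over the 337 test trajectories, the mean F1-score need not equal the harmonic mean of the mean Precision and mean Recall.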

Figure 8 presents the agricultural machinery trajectory recognition results of K-SGA-DBSCAN, LSTM, and Transformer overlaid on real remote sensing imagery. The results identified by K-SGA-DBSCAN agree more closely with the corresponding map background. LSTM fails to fully recognize both field operation and road travel trajectory points. Transformer performs well in recognizing field operation trajectories and has a faster processing speed. However, for trajectory points at the field-road junction, the K-SGA-DBSCAN algorithm still exhibits a significant advantage.

4. Conclusions

In this paper, a two-stage joint clustering method, K-SGA-DBSCAN, was proposed to classify field operation and road travel trajectories of agricultural machinery in continuous movement trajectories. This method integrates the strengths of K-Means and DBSCAN: K-Means first performs an initial clustering to reduce the data scale, and DBSCAN then performs the recognition, with its Eps and Minpts parameters determined by the adaptive parameter determination method, SGA. The proposed approach overcomes the inherent sensitivity to parameters found in traditional clustering algorithms, exhibiting high robustness and adaptability.

Experiments on a dataset of 337 sample agricultural machinery trajectories demonstrated that K-SGA-DBSCAN outperforms traditional methods in terms of recognition performance. Specifically, the method achieved Accuracy, Precision, Recall, and F1-score values of 91.55%, 95.41%, 89.86%, and 92.41%, respectively. It effectively improved the recognition accuracy of noise trajectory points, turning points, and the points at the field-road junctions, especially in complex trajectory data. Additionally, in comparison with LSTM and Transformer models, the proposed method achieved comparable recognition accuracy while offering superior computational efficiency and practicality for large-scale agricultural machinery trajectory data. Unlike deep learning methods, this approach does not require large-scale sample labeling or data transformation, making it both cost-effective and practical for real-world applications.

In conclusion, the K-SGA-DBSCAN method represents a significant advancement in agricultural machinery trajectory recognition, providing a highly efficient and accurate solution for field-road classification. This method holds strong potential for large-scale applications in agricultural monitoring and precision farming. Our future studies will focus on optimizing algorithm parameters and further enhancing its performance across various operational scenarios. Additionally, exploring the integration of this approach with deep learning models and extending its capabilities for handling different types of agricultural machinery trajectories will be key directions for advancing the method’s applicability and effectiveness.

Author Contributions

Conceptualization, H.L. and Z.M.; methodology, S.Z. and H.L.; validation, S.Z. and H.L.; formal analysis, X.C. and S.Z.; data curation, X.C. and S.Z.; writing—original draft preparation, S.Z. and H.L.; writing—review and editing, S.Z. and H.L.; visualization, S.Z. and H.L.; supervision, H.L. and Z.M.; project administration, H.L. and Z.M.; funding acquisition, H.L. and Z.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, Grant No. 31971800.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The authors declare that all data supporting the findings of this study are available within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Liu, H.; Ye, X.; Meng, Z.; Zhou, L.; Sun, Z. An Agricultural Machinery Operation Monitoring System Based on IoT. In Proceedings of the 2019 International Conference on Data Science and Information Technology (DSIT 2019); International Association of Applied Science and Engineering: Bloomington, IL, USA, 2019; p. 5.
2. Wei, W.; Xiao, M.; Duan, W.; Wang, H.; Zhu, Y.; Zhai, C.; Geng, G. Research Progress on Autonomous Operation Technology for Agricultural Equipment in Large Fields. Agriculture 2024, 14, 1473.
3. Li, D.; Liu, X.; Zhou, K.; Sun, R.; Wang, C.; Zhai, W.; Wu, C. Discovering Spatiotemporal Characteristics of the Trans-Regional Harvesting Operation Using Big Data of GNSS Trajectories in China. Comput. Electron. Agric. 2023, 211, 108003.
4. Yang, L.; Wang, X.; Li, Y.; Xie, Z.; Xu, Y.; Han, R.; Wu, C. Identifying Working Trajectories of the Wheat Harvester In-Field Based on K-Means Algorithm. Agriculture 2022, 12, 1837.
5. Hua, M.; Song, J.; Wang, X.; Zhang, C.; Li, S.; Zhai, C. Research status and prospect of IoT technology for agricultural equipment. Jiangsu Agric. Sci. 2024, 52, 17–27.
6. Wang, P.; Meng, Z.; Yin, Y.; Fu, W.; Chen, J.; Wei, X. Automatic recognition algorithm of field operation status based on spatial track of agricultural machinery and corresponding experiment. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2015, 31, 56–61.
7. Xiao, J.; Liu, H.; Wei, X.; Chen, J.; Wang, P.; Meng, Z. Segmentation of agricultural machinery trajectories based on space-time cube. Jiangsu Agric. Sci. 2018, 46, 244–247.
8. Li, Y.; Zhao, B.; Wang, C.; Xu, M.; Wei, L.; Pang, Z. Land Division Method for Agricultural Machinery Operation Based on DBSCAN and BP_Adaboost. Trans. Chin. Soc. Agric. Mach. 2023, 54, 37–44.
9. Poteko, J.; Eder, D.; Noack, P.O. Identifying Operation Modes of Agricultural Vehicles Based on GNSS Measurements. Comput. Electron. Agric. 2021, 185, 106105.
10. Chen, Y.; Li, G.; Zhang, X.; Jia, J.; Zhou, K.; Wu, C. Identifying Field and Road Modes of Agricultural Machinery Based on GNSS Recordings: A Graph Convolutional Neural Network Approach. Comput. Electron. Agric. 2022, 198, 107082.
11. Huang, J.; Wan, C.; Wu, C. Field & Road-CENet: Semantic Segmentation for Agricultural Machinery Trajectory Segmentation. Res. Sq. 2023, Preprint (Version 1).
12. Chen, Y.; Quan, L.; Zhang, X.; Zhou, K.; Wu, C. Field-Road Classification for GNSS Recordings of Agricultural Machinery Using Pixel-Level Visual Features. Comput. Electron. Agric. 2023, 210, 107937.
13. Lacour, S.; Burgun, C.; Perilhon, C.; Descombes, G.; Doyen, V. A Model to Assess Tractor Operational Efficiency from Bench Test Data. J. Terramechanics 2014, 54, 1–18.
14. Liu, H.; Meng, Z.; Wang, P.; Wei, X.; Han, Y. Buffer algorithms for operation area measurement based on global navigation satellite system trajectories of agricultural machinery. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2015, 31, 180–184.
15. Sujatha, R.; Isakki, P. A Study on Crop Yield Forecasting Using Classification Techniques. In Proceedings of the 2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE’16), Kovilpatti, India, 7–9 January 2016; pp. 1–4.
16. Bereznicka, J.; Wicki, L. Do operating subsidies increase labour productivity in Polish farms. Stud. Agric. Econ. 2021, 123, 114–121.
17. Ikotun, A.M.; Ezugwu, A.E.; Abualigah, L.; Abuhaija, B.; Heming, J. K-Means Clustering Algorithms: A Comprehensive Review, Variants Analysis, and Advances in the Era of Big Data. Inf. Sci. 2023, 622, 178–210.
18. Lambora, A.; Gupta, K.; Chopra, K. Genetic Algorithm—A Literature Review. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 380–384.
19. Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. ACM Trans. Database Syst. (TODS) 2017, 42, 1–21.
20. Du, P.; Li, F.; Shao, J. Multi-Agent Reinforcement Learning Clustering Algorithm Based on Silhouette Coefficient. Neurocomputing 2024, 596, 127901.
21. Shao, Z. Method for determining sample size in sampling survey. Stat. Decis. 2012, 22, 12–14.
22. Davies, D.L.; Bouldin, D.W. A Cluster Separation Measure. IEEE Trans. Pattern Anal. Mach. Intell. 1979, PAMI-1, 224–227.
23. Prelipcean, A.C.; Gidofalvi, G.; Susilo, Y.O. Measures of transport mode segmentation of trajectories. Int. J. Geogr. Inf. Sci. 2016, 30, 1763–1784.
24. Comaniciu, D.; Meer, P. Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
25. Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering points to identify the clustering structure. ACM Sigmod Rec. 1999, 28, 49–60.
26. Li, W.; Yan, S.; Jiang, Y.; Zhang, S.; Wang, C. Research on method of self-adaptive determination of DBSCAN algorithm parameters. Comput. Eng. Appl. 2019, 55, 1–7.
27. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
28. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30.

Figure 1. Spatial distribution of agricultural machinery movement trajectories.

Figure 2. Technology roadmap.

Figure 3. Flow chart of the SGA-DBSCAN algorithm.

Figure 4. The structure of the monitoring terminal AMT-3011.

Figure 5. Distribution of the number of neighboring points for trajectory points: (a) distribution of trajectory points using the number of neighboring points as a feature, with field operation trajectories showing higher density and road travel trajectories showing dispersion; (b) statistical distribution of the number of neighboring points, highlighting the significant difference in median values between field operation and road travel trajectories.

Figure 6. Example of the two-stage joint clustering process. In S1, the blue points represent the clustered small-scale trajectory clusters. In S2, the green points indicate field operation trajectories, while the red points represent road travel trajectories.

Figure 7. Comparison of trajectory recognition results using the four clustering methods: (a) Mean-Shift; (b) OPTICS; (c) KANN-DBSCAN; (d) K-SGA-DBSCAN. In the figure, green points represent field operation trajectory points, blue points indicate road travel trajectory points, and red circles highlight misidentified trajectory points.

Figure 8. Comparison of trajectory recognition results of three methods on remote sensing imagery: (a) LSTM; (b) Transformer; (c) K-SGA-DBSCAN. In the figure, green points represent field operation trajectory, blue points indicate road travel trajectory, and red circles highlight misidentified trajectory points. The bottom image is a magnified view of the area within the white rectangular box in the top image.

Table 1. Comparison of clustering performance under different features.

Feature	DBI	CHI
[(x,y)]	0.43	6264.66
[speed]	0.54	3405.15
[NeighborPts ¹]	0.38	5955.54
[(x,y), speed]	0.62	2507.83
[(x,y),NeighborPts]	0.31	11,645.21
[speed, NeighborPts]	0.54	2643.24
[(x,y), speed, NeighborPts]	0.66	2079.35

¹ NeighborPts represents the number of neighboring points.

Table 2. Ablation experiments.

Method	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Runtime (s)
DBSCAN	78.75	90.28	82.07	85.57	1.8
SGA-DBSCAN	90.99	94.59	82.52	87.69	1531.0
K-SGA-DBSCAN	91.55	95.41	89.86	92.41	1064.2

Table 3. Performance comparison of trajectory recognition using four clustering methods.

Method	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
Mean-Shift	82.07 (9.48↓)	90.53 (4.88↓)	81.78 (8.08↓)	85.49 (6.92↓)
OPTICS	87.16 (4.39↓)	92.23 (3.18↓)	85.24 (4.62↓)	88.18 (4.23↓)
KANN-DBSCAN	87.65 (3.90↓)	93.57 (1.84↓)	86.68 (3.18↓)	89.65 (2.76↓)
K-SGA-DBSCAN	91.55	95.41	89.86	92.41

Table 4. Performance comparison of trajectory recognition using different methods.

Method	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
LSTM	91.62 (0.07↑)	88.69 (6.72↓)	88.55 (1.31↓)	88.62 (3.79↓)
Transformer	92.71 (1.16↑)	89.24 (6.17↓)	91.56 (1.70↑)	90.39 (2.02↓)
K-SGA-DBSCAN	91.55	95.41	89.86	92.41


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
