 
 
Article

SHNN-CAD+: An Improvement on SHNN-CAD for Adaptive Online Trajectory Anomaly Detection

Graphics and Imaging Lab, Universitat de Girona, Campus Montilivi, 17071 Girona, Spain
* Author to whom correspondence should be addressed.
Sensors 2019, 19(1), 84; https://doi.org/10.3390/s19010084
Submission received: 14 November 2018 / Revised: 16 December 2018 / Accepted: 22 December 2018 / Published: 27 December 2018

Abstract
To perform anomaly detection on trajectory data, we study the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector (SHNN-CAD) approach and propose an enhanced version called SHNN-CAD+. SHNN-CAD was introduced based on the theory of conformal prediction to deal with the problem of online detection. Unlike most related approaches, which require several unintuitive parameters, SHNN-CAD has the advantage of being parameter-light, which enables easy reproduction of experiments. We propose to adaptively determine the anomaly threshold during the online detection procedure instead of predefining it without any prior knowledge, which makes the algorithm more usable in practical applications. We also present a modified Hausdorff distance measure that takes the direction difference into account while reducing the computational complexity. In addition, anomaly detection becomes more flexible and accurate via a re-do strategy. Extensive experiments on both real-world and synthetic data show that SHNN-CAD+ outperforms SHNN-CAD with regard to accuracy and running time.

1. Introduction

Thanks to advanced location-aware sensors, massive trajectory data are generated every day, which calls for effective information processing techniques. The main objective of anomaly detection is to pick out anomalous data that differ significantly from the patterns that frequently occur in the data. Many applications benefit from automatic anomaly detection of trajectory data, such as video surveillance [1,2,3], airspace monitoring [4], landfall forecasts [5], and so on.
A variety of approaches have been proposed for trajectory anomaly detection [6]; however, most of them suffer from high computational cost or difficult parameter selection [7], making experimental results hard to reproduce. Usually, the trajectory of a moving object is collected and stored as a sequence of sample points recording location and timestamp information, but complementary information about the data is lacking. Here, complementary information refers to knowledge such as the number of patterns included, which trajectories are labeled with which patterns, and what the abnormal patterns are, all of which can help the analysis of the data. To find this information automatically, unsupervised approaches are commonly applied. In this case, it is not straightforward to finely tune the parameters of these approaches to cope with different kinds of data. Although some approaches try to estimate the parameters, either empirically through extensive experiments or by introducing additional assisted parameters and rules, parameter setting remains non-trivial across different data distributions. Moreover, for online handling of massive datasets, low computational complexity is of great importance, so complex and time-consuming pre-processing of the data should be avoided.
Based on the theory of conformal prediction, Laxhammar et al. first proposed its application to anomaly detection [8], then introduced the Similarity based Nearest Neighbour Conformal Anomaly Detector (SNN-CAD) [9] for online learning and automatic anomaly detection, and finally presented a relatively complete algorithm called the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector (SHNN-CAD) [10], which refines the description and details of their previous works with a more comprehensive discussion and explanation of the algorithm. SHNN-CAD has the following main advantages. First, it deals with raw data, which avoids the information loss of dimension reduction and the over-fitting of modeling. Second, it uses the direct Hausdorff distance to calculate the similarity between trajectories. As is well known, selecting a proper distance measure is still a challenge when the trajectories in a set have unequal lengths (trajectory length refers to the number of sample points) due to differences in sampling rate, sampling duration, and moving speed. The direct Hausdorff distance is a good choice here, as it is parameter-free and able to handle unequal lengths. Third, SHNN-CAD is parameter-light and makes no assumptions about the data structure. The authors provided a method to adjust the anomaly threshold based on the desired alarm rate or the expected frequency of anomalies. Fourth, SHNN-CAD can perform online anomaly detection, which accommodates a growing data size.
In this paper, we propose SHNN-CAD+ to enhance the performance of SHNN-CAD. Compared with the previous approach, SHNN-CAD+ has the following improvements:
  • Applying the Hausdorff distance directly to trajectory data has two problems: high computational cost, since it visits every pair of sample points in the two trajectories, and insensitivity to direction, because the distance between two trajectories is defined as the distance between two sample points from the corresponding trajectories under a certain criterion. In [10], a Voronoi diagram is used to speed up the Hausdorff distance calculation, but it is complicated to implement. Alternatively, a direction attribute can be added when computing the distance, but this feature extension increases the computational cost. To solve this, a modified distance measure based on the directed Hausdorff distance is proposed to calculate the difference between trajectories. The modified measure is also fast to compute, which meets the requirement of performing online learning in a fast manner.
  • According to the description in [10], when the data size is quite small, a newly arrived trajectory can be regarded as abnormal; however, as time evolves, this trajectory may gain enough similar neighbors to be identified as normal. Our solution is to introduce a re-do step into the detection procedure to identify anomalous data more accurately.
  • The anomaly threshold is a critical parameter since it controls the sensitivity to true anomalies and the error rate. As mentioned above, in [10] the threshold is manually selected, which relies on user experience. Instead of predefining the anomaly threshold, an adaptive, data-driven method is proposed to make the algorithm more parameter-light and hence more easily applicable in practice.
In addition, compared with the work in [10], this paper expands the experiments in two aspects:
  • To evaluate the performance of anomaly detection, the F1-score is used in [10] to compare SHNN-CAD with different approaches. We propose to apply more performance measures, such as precision, recall, accuracy, and false alarm rate, in order to analyze the behaviour of anomaly detection algorithms comprehensively.
  • One important advantage of the Hausdorff distance is that it can deal with trajectories with different numbers of sample points. However, in the experiments evaluating SHNN-CAD [10], all the testing data have the same number of sample points. In this paper, the experiments are enriched by introducing more datasets with unequal-length trajectories.
The rest of this paper is organized as follows. Following this introduction, Section 2 reviews unsupervised anomaly detection approaches for trajectory data. Section 3 gives a brief description of SHNN-CAD, interprets where SHNN-CAD can be improved for practical use, and then explains how SHNN-CAD+ addresses these limitations. Section 4 presents extensive experiments on both real and synthetic datasets and discusses the performance of the proposed improvement strategies. Finally, concluding remarks and future work are given in Section 5.

2. Related Work

In the last few years, a considerable body of research has sought efficient ways to detect outliers in trajectory data [6,11,12]. Since, in reality, the available trajectories are usually raw data without any prior knowledge, this section focuses on unsupervised anomaly detection approaches, which can be grouped into two categories: clustering-based and non-clustering-based. Table 1 gives an overview of the approaches discussed below.
For clustering-based approaches, all the trajectories are clustered to obtain patterns, and outliers are detected along with or after the clustering procedure owing to their large difference from the learnt patterns. The density-based spatial clustering of applications with noise (DBSCAN) algorithm [4,13] is of particular interest for both clustering and anomaly detection, since it can discover clusters of arbitrary shape while also reporting outliers. However, the selection of two essential parameters limits its broad applicability. Concretely, DBSCAN requires two parameters, Eps and MinPts, which control the similarity between trajectories and the density of a cluster. The first suffers from the need to determine a proper distance measure, which is still an open challenge [26]. The second fails to support a good result when the densities of different clusters vary greatly. In addition, without prior information about the data, predefining specific parameters is difficult. Birant and Kut [14] proposed Spatial-temporal DBSCAN (ST-DBSCAN) to improve DBSCAN by additionally handling the time attribute, but computing the similarity on both the spatial and temporal dimensions increases the computational cost. Since density-based algorithms are known to struggle with varied densities in data, Zhu et al. [15] proposed a multi-dimensional scaling (DScale) method that readjusts the computed distance to address this issue.
Kumar et al. [16] proposed iVAT+ and clusiVAT+ for trajectory analysis along with outlier detection. These approaches group the trajectories into clusters by partitioning the Minimum Spanning Tree (MST). To build the MST, the nondirectional dynamic time warping (DTW) distance between trajectories is used as the weight of the corresponding edge. Clusters containing very few trajectories are taken as irregular patterns, so the trajectories they include are outliers. Obviously, user input is necessary to determine how few is "few".
Given the clusters, a testing trajectory is marked as anomalous if the difference between it and the closest cluster center (also called centroid or medoid) exceeds an anomaly threshold. Representative distance measures developed and applied in different applications include Euclidean distance (ED), Hausdorff distance, dynamic time warping (DTW), and longest common subsequence (LCSS) [26]. As the threshold is difficult to determine directly for varied practical situations, some approaches based on probabilistic models have been proposed, where the distance is usually taken as the trajectory likelihood. In [17], kernel density estimation (KDE) is used to assess each incoming sample point of an aircraft trajectory in progress; the sample point is classified as abnormal or as belonging to a certain cluster depending on whether its probability is small. Guo et al. [18] proposed to apply the Shannon entropy to adaptively identify whether a testing trajectory is normal. The normalized distances between the testing trajectory and the cluster centers obtained by the Information Bottleneck (IB) method form the probability distribution used to compute the Shannon entropy, which measures the information used to detect abnormality.
Some approaches attempt to detect outliers without a clustering procedure. In 2005, Keogh et al. [19] introduced the definition of the time series discord and proposed the HOT Symbolic Aggregate ApproXimation (SAX) algorithm for finding the subsequence (defined as the discord) of a time series that is most different from all other subsequences. The authors proposed to search for the discord by comparing the distance of each possible subsequence to its nearest non-self match using a brute-force algorithm. Although the brute-force algorithm is intuitive and simple, its time complexity is very high, which drove them to improve the process heuristically. This definition was later improved by Yankov et al. [20] and applied to different kinds of time series, including trajectory data, by treating each trajectory as a candidate subsequence.
Lee et al. [21] presented an efficient partition-and-detection framework and developed the trajectory outlier detection algorithm TRAOD. Each trajectory is partitioned into a set of non-overlapping line segments based on the minimum description length (MDL) principle, and the outlying line segments of a trajectory are then picked out. Due to the distance measure applied, the approach is able to detect both positional and angular outliers. The novelty is that this algorithm can detect an outlying line segment rather than only the whole trajectory.
In [22], Guo et al. proposed a group-based signal filtering approach for trajectory abstraction, where outliers are filtered out in an iterative procedure. Unlike clustering algorithms, every trajectory may participate in more than one group, and in the first phase (matching) a trajectory that has few similar items is considered an outlier. The approach further picks out outliers in the second phase (detecting): it applies a three-dimensional probability distribution function to represent the trajectory data and then computes the Shannon entropy for outlier detection.
Banerjee et al. [23] designed Maximal ANomalous sub-TRAjectories (MANTRA) to solve the problem of mining maximal temporally anomalous sub-trajectories in the field of road network management. The type of trajectory data studied is specific: the network-constrained trajectory, a connected path in the road network. Thus, the method cannot easily be applied to other anomaly detection applications.
Kanarachos et al. [25] proposed a systematic algorithm combining wavelets, neural networks, and the Hilbert transform for anomaly detection in time series. Being parameter-less makes the algorithm applicable in real-world scenarios; for example, the anomaly threshold is derived from the receiver operating characteristic without any assumption about the data distribution.
Yuan et al. [24] tackled specific abnormal events to warn drivers of danger. Both the location and direction of the moving object are taken into account, contributing to a sparse reconstruction framework and to a motion descriptor via a Bayesian model, respectively. Instead of dealing with raw trajectory data, this work is based on video data, where the object motion (trajectory) is represented by the pixel change between frames.

3. SHNN-CAD+: An Improvement of SHNN-CAD

First, the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector (SHNN-CAD) [10] is briefly described. Afterwards, we discuss several factors that influence its performance. Finally, we introduce SHNN-CAD+.

3.1. SHNN-CAD Based Anomaly Detection

Laxhammar and Falkman proposed to perform online anomaly detection based on conformal prediction [27], which estimates the p-value of each given label for a new observation, using a non-conformity measure (NCM) to quantify the difference from the known observations. Three similar algorithms were introduced successively [8,9,10], of which SHNN-CAD is the last and most complete.
Considering that the conformal predictor provides valid prediction performance at an arbitrary confidence level, Laxhammar and Falkman first defined the conformal anomaly detector (CAD). Concretely, given a training set $T = \{x_1, x_2, \ldots, x_l\}$, a specified NCM, and a pre-set anomaly threshold $\epsilon$, the corresponding nonconformity scores $\alpha_1, \alpha_2, \ldots, \alpha_{l+1}$ are first computed. Then the p-value of $x_{l+1}$, $p_{l+1}$, is determined as the ratio of the number of trajectories whose nonconformity scores are greater than or equal to that of $x_{l+1}$ to the total number of trajectories:
$$p_{l+1} = \frac{\left|\left\{\alpha_i \mid \alpha_i \geq \alpha_{l+1},\ 1 \leq i \leq l+1\right\}\right|}{l+1}, \qquad (1)$$
where $|\cdot|$ denotes the number of elements in a set. If $p_{l+1} < \epsilon$, then $x_{l+1}$ is identified as a conformal anomaly; otherwise, $x_{l+1}$ is grouped into the normal set. Clearly, the NCM is an essential factor that influences the quality of anomaly detection. The authors applied the k-nearest neighbors (kNN) and the directed Hausdorff distance (DHD) to construct the NCM, which yields the Similarity based Nearest Neighbour Conformal Anomaly Detector (SNN-CAD) [9]. That is, the sum of distances between an observation $x_i$ and its $k$ nearest neighbors, $N_{Neighbor}$, is defined as the nonconformity score of this observation. Thus, the nonconformity score of $x_i$ is given by
$$\alpha_i = \sum_{x_j \in N_{Neighbor}} d_h(x_i, x_j),$$
where $d_h(\cdot,\cdot)$ measures the distance between observations. In addition, since the new observation can be a single sample point, a line segment, or a full trajectory, SNN-CAD was re-introduced as SHNN-CAD.
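To make the two formulas above concrete, here is a minimal Python sketch of the DH-kNN nonconformity score and the p-value computation (the published implementation is in MATLAB; the function names and the distance-matrix representation are our assumptions):

```python
import numpy as np

def nonconformity_scores(dist_matrix, k):
    """DH-kNN NCM: the score of each observation is the sum of the
    distances to its k nearest neighbors."""
    scores = []
    for i in range(len(dist_matrix)):
        d = np.delete(dist_matrix[i], i)       # exclude the self-distance
        scores.append(np.sort(d)[:k].sum())    # sum of the k smallest distances
    return np.array(scores)

def p_value(scores):
    """Equation (1): fraction of observations whose nonconformity score is
    greater than or equal to that of the newest observation (last entry)."""
    return np.sum(scores >= scores[-1]) / len(scores)
```

An observation would then be flagged as a conformal anomaly when this p-value falls below the anomaly threshold.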
Usually, unsupervised algorithms do not use any training set, and outliers are defined as observations that are far more infrequent than normal patterns [28]. In this paper we also adopt the "training set" concept as used in [10], assuming that the training set contains only normal instances. In practical applications, thanks to the advantage that SHNN-CAD needs only a small volume of data as a training set, these data can be chosen by users to make the algorithm work more effectively.

3.2. Discussion of SHNN-CAD

SHNN-CAD is not fully capable of detecting outliers adaptively and efficiently. The reasons are threefold, as interpreted below.
First, using the directed Hausdorff distance (DHD) to quantify the distance between trajectories cannot capture differences in direction. DHD has the advantage of handling trajectories with different numbers of sample points, and it can compute a distance for a single sample point or a line segment, which is important for the sequential anomaly detection of SHNN-CAD. Nonetheless, DHD was originally designed for point sets with no order between points, as its definition shows. Given two point sets $A = \{a_1, a_2, \ldots, a_m\}$ and $B = \{b_1, b_2, \ldots, b_n\}$, the DHD from $A$ to $B$ is defined as
$$d_h(A, B) = \max_{1 \leq i \leq m} \min_{1 \leq j \leq n} d_p(a_i, b_j),$$
where $m$ and $n$ are the numbers of points in sets $A$ and $B$, respectively (without loss of generality, assuming $m \geq n$ henceforth). $d_p(a_i, b_j)$ returns the distance between points $a_i$ and $b_j$, which is usually the Euclidean distance [29]. Computing the DHD matrix of the data is the most time-consuming part of SHNN-CAD, and the time complexity of DHD, $O(mn)$, is high when computed in the traditional way that visits every pair of sample points from the two trajectories, which is almost impractical for large real-world datasets. Alternatively, in [10], the authors adopted the algorithm proposed by Alt [30], which uses a Voronoi diagram to reduce the time complexity to $O((m+n)\log(m+n))$, but this algorithm requires pre-processing each trajectory by representing each sample point relative to its former neighbor. Moreover, although a trajectory is recorded as a collection of sample points, the order between points must be considered since it encodes the moving direction. For example, a car driving in a lane in the wrong direction should be detected as abnormal, yet DHD is not sensitive to the direction attribute. As suggested in the literature, the direction at each sample point can be extracted and added to the feature matrix before computing the distance, but the extra feature increases the computational cost of the distance measure. Furthermore, the direction attribute is generally computed as the angle between a line segment and the horizontal axis, which may introduce noise into the feature matrix.
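For reference, the brute-force DHD follows directly from the definition above; this Python sketch (the `dhd` name and point-tuple representation are ours) makes the $O(mn)$ cost and the asymmetry of the measure explicit:

```python
import math

def dhd(A, B):
    """Directed Hausdorff distance from point set A to point set B:
    for every point of A, find its nearest point in B, then take the
    largest of these nearest-point distances. Visits all m*n pairs."""
    return max(min(math.dist(a, b) for b in B) for a in A)
```

Note that `dhd(A, B)` and `dhd(B, A)` generally differ, which is why the measure is called "directed".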
Second, SHNN-CAD is designed for online learning and anomaly detection, which is highly desirable in practical applications such as video surveillance. When a new observation is added to the database, SHNN-CAD decides whether it is abnormal based on its p-value. As can be seen in Equation (1), the p-value of an observation counts the number of trajectories from the training set that have a nonconformity score greater than or equal to its own. Obviously, the greater the p-value, the closer the observation is to its k nearest neighbors, and thus the higher its probability of being normal. According to the mechanism of SHNN-CAD, once a trajectory arrives, it is identified as normal or not, and the training set is then updated for the next testing trajectory. As shown in Figure 1a, the red trajectory has no similar items; thus it is detected as abnormal, assuming that its p-value is smaller than the predefined anomaly threshold. However, unlike the case of a fixed dataset, the neighbors of an observation in online analysis are dynamic. In Figure 1b, the red trajectory has several similar items (in blue), which means that its p-value may become greater than the anomaly threshold, so it should be considered normal. In conclusion, ignoring the influence of time evolution may introduce errors in online anomaly detection.
Third, SHNN-CAD has two settings for the anomaly threshold. The first one, $\frac{1}{l+1}$, deals with the problem of zero sensitivity when the dataset size $l$ is small. The second is a predefined $\epsilon$ ($\epsilon > \frac{1}{l+1}$). In the first case, zero sensitivity means that, as long as $l < 1/\epsilon$, an actually abnormal observation that has the smallest possible p-value, $\frac{1}{l+1}$, is still identified as normal if only the threshold $\epsilon$ is used. Essentially, the first threshold labels a new observation as abnormal by default if it has the smallest p-value, $\frac{1}{l+1}$. However, this strategy causes the opposite problem of zero precision: for example, an actually normal observation that happens to have the smallest p-value is classified as abnormal. In the second case, in theory, when the anomaly threshold equals the a priori unknown probability of the abnormal class, $\lambda$, SHNN-CAD achieves perfect performance. However, for the unsupervised CAD, no background information is available to help tune $\epsilon$.

3.3. SHNN-CAD+

According to the previous discussion, three improvement strategies are proposed to enhance the anomaly detection performance; each is explained in detail below. Finally, the pseudocode of the SHNN-CAD+ method for detecting a new arrival is given and described.
First, as mentioned above, directly using the directed Hausdorff distance (DHD) as the trajectory distance measure suffers from both direction neglect and high computational complexity. The first problem arises because DHD treats the two trajectories as unordered point sets, ignoring the order between sample points and thus degrading accuracy. The second arises because all pairwise distances between the points of the two sets must be computed. To address these issues, we propose to modify DHD by introducing a constraint window. The DHD with constraint window, DHD($\omega$), from trajectory $A$ to trajectory $B$ is defined as
$$d_{hw}(A, B) = \begin{cases} \max\limits_{1 \leq i \leq n+\omega} \min\limits_{|j-i| \leq \omega} d_p(a_i, b_j), & m - n \leq \omega \\[6pt] \max\left( \max\limits_{1 \leq i \leq n+\omega} \min\limits_{|j-i| \leq \omega} d_p(a_i, b_j),\ \max\limits_{n+\omega+1 \leq i \leq m} d_p(a_i, b_n) \right), & m - n > \omega \end{cases}$$
where $\omega$ denotes the size of the constraint window. Considering that unequal-length trajectories can differ greatly in speed, $\omega$ is set to $m - n$ in this paper so that similar trajectories are constrained to be homogeneous in speed. Obviously, each sample point in trajectory $A$ visits at most $2\omega + 1$ sample points in trajectory $B$, resulting in a linear time complexity $O((m+n)\omega)$, where $\omega \ll m, n$. On the other hand, the search space is limited to sample points that follow the temporal order, which not only takes direction into account but also improves the accuracy of the distance measure, since an unconstrained search may introduce wrong matchings between two trajectories. Note that the direction at a sample point is an important attribute for detecting abnormal events in trajectory data in practical applications, such as vehicles driving in the wrong direction.
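Under the definition above, DHD($\omega$) can be sketched as follows (a Python sketch with 0-based indices; `dhd_w` is our illustrative name, and the fallback to the last point of $B$ implements the second case of the definition):

```python
import math

def dhd_w(A, B, w=None):
    """Windowed directed Hausdorff distance from A to B: sample point a_i
    is only matched against points b_j with |j - i| <= w, so the temporal
    order (and hence the direction) of the samples is respected.
    Defaults to w = |m - n| as suggested in the text."""
    m, n = len(A), len(B)
    if w is None:
        w = abs(m - n)
    worst = 0.0
    for i, a in enumerate(A):
        if i - w > n - 1:                  # window lies past the end of B
            d = math.dist(a, B[-1])        # compare with the last point of B
        else:
            lo, hi = max(0, i - w), min(n - 1, i + w)
            d = min(math.dist(a, b) for b in B[lo:hi + 1])
        worst = max(worst, d)
    return worst
```

Unlike plain DHD, a trajectory and its reversal are now far apart, since each point may only match points at a similar position in the sequence.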
Figure 2 gives an example to illustrate the distance computation. Trajectories A and B have 5 and 8 sample points, respectively, and they have opposite directions, indicated by the arrows at their end points. The dashed lines visualized in red, green, and blue represent the point-to-point distances required to compute the distance between the two trajectories. The shortest distance between a sample point and its corresponding points from the other trajectory is shown in blue or red, and the red line indicates the maximum of all these shortest distances, namely the distance between A and B. Supposing a window size of 2, the distances from A to B and from B to A by DHD($\omega$) are computed as illustrated in Figure 2a and Figure 2b, respectively. Figure 2c shows the distance computation by DHD; the distances from A to B and from B to A are the same (all the blue lines are equal). In contrast, DHD($\omega$) saves computational cost. Additionally, because the order of sample points is taken into account, DHD($\omega$) captures the difference between trajectories more accurately than DHD with respect to different features in trajectory data.
Second, considering that in online learning an outlier may turn into a normal trajectory once it has enough similar trajectories, we propose to apply a re-do step. To save time, not all outliers are rechecked with every new arrival. According to the theory of the conformal anomaly detector, the anomaly threshold $\epsilon$ indicates the expected proportion of outliers in the dataset. Thus, if the number of outliers grows larger than expected, several previously detected outliers may now be detected as normal. For example, when the size of the training data is $l$, if the new arrival $x_{l+1}$ is identified as an outlier but the number of outliers is greater than the expected $(l+1) \cdot \epsilon$, the previous outliers are rechecked to see whether they can be moved to the normal class. In particular, this strategy helps to solve SHNN-CAD's zero-precision problem when the data size is small. Regarding the use of a predefined anomaly threshold, it is more reasonable to treat the new arrival as abnormal when its p-value is too small, meaning that it has few similar items in the training set. Via the re-do strategy, the outliers are picked out gradually as new trajectories arrive. We do not recheck the normal ones, because their number of similar trajectories can only increase or remain the same. In fact, for large data volumes, the normal trajectories are usually pruned to discard redundant information or trained into mixtures of models to avoid high computational cost, which will be considered in future work.
Third, automated identification of outliers requires a data-adaptive anomaly threshold instead of explicit adjustment for each kind of trajectory dataset. Unlike SHNN-CAD, when the training set $N_l$ has size $l$, we define the threshold for the new arrival as the minimum p-value over the normal trajectories in $N_l$:
$$\epsilon_l = \min_{i \in N_l} p_i,$$
where $p_i$ is the p-value of the $i$th trajectory. This setting is intuitive and straightforward, since the p-values of all other trajectories in the training set are greater than or equal to the defined anomaly threshold, which allows them to remain normal. Obviously, the determination of $\epsilon$ depends on the current training set, which makes the approach more applicable to different datasets.
Algorithm 1 shows how SHNN-CAD+ processes a new arrival. Compared with SHNN-CAD, the previously detected outliers are also taken as input to perform the re-do strategy. Given the input, two zero distance arrays are built for the new arrival $x_{l+m+1}$ (Lines 1 and 2). Lines 3–10 compute the nonconformity scores of all trajectories by summing the distances to the k nearest neighbors. The element $D_{i,j}$ denotes the distance from the $i$th trajectory to its $j$th neighbor, obtained through the modified Hausdorff distance. Then the p-values are calculated in Lines 11–12 according to Equation (1). Differing from SHNN-CAD, the anomaly threshold $\epsilon$ is dynamically updated depending on the training set (Line 13). From Lines 14 to 25, the new arrival is identified as an outlier or not with the defined $\epsilon$, and the outlier and training sets are updated correspondingly. If the size of the outlier set exceeds the expected value, the re-do strategy is performed on each previous outlier to check whether it can be moved to the normal set (Lines 17–22).
Algorithm 1: Adaptive Online Trajectory Anomaly Detection with SHNN-CAD+
Sensors 19 00084 i001
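Since Algorithm 1 appears only as an image here, the core decision of one online step, with the adaptive threshold, might be sketched as follows (a simplified, hypothetical rendering under our own naming; the published implementation is in MATLAB, and the re-do pass over previously flagged outliers is indicated only in the comments):

```python
import numpy as np

def detect_step(train, x_new, k, dist):
    """Decide whether x_new is anomalous given the current normal set.
    `dist` is any trajectory distance, e.g. a windowed Hausdorff distance.
    Returns (is_outlier, p_new, eps)."""
    data = train + [x_new]
    D = np.array([[dist(a, b) for b in data] for a in data])
    # DH-kNN nonconformity score: sum of distances to the k nearest neighbors
    scores = np.array([np.sort(np.delete(D[i], i))[:k].sum()
                       for i in range(len(data))])
    # p-value of every trajectory, as in Equation (1)
    p = np.array([np.mean(scores >= s) for s in scores])
    eps = p[:-1].min()   # adaptive threshold: minimum p-value in the training set
    # A full implementation would, on detection, also re-check previously
    # flagged outliers whenever their count exceeds (l + 1) * eps.
    return p[-1] < eps, float(p[-1]), float(eps)
```

With a tight cluster of normal trajectories, a far-away arrival gets a low p-value and is flagged, while an arrival inside the cluster is accepted.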

4. Experiments

In this section, we present the experiments conducted. The MATLAB implementation is available at [31]. First, the proposed DHD($\omega$) is compared with the typical DHD. Second, the performance of applying DHD($\omega$) in the anomaly detection measure is analyzed. Finally, the improvement on SHNN-CAD is evaluated. All the experimental results in this paper were obtained with MATLAB 2018a running on a Windows machine with an Intel Core i7 2.40 GHz CPU and 8 GB RAM.

4.1. Comparison of Distance Measure

To evaluate the distance-measuring performance of DHD($\omega$), we adopt 10-fold cross-validation with the 1-Nearest Neighbor (1NN) classifier, which has been demonstrated to work well for this purpose [26,32]. 1NN is parameter-free, and its classification error ratio depends only on the performance of the distance measure. Initially, the dataset is randomly divided into 10 groups. Each group is then taken in turn as the testing set, with the rest serving as the training set for the 1NN classifier. Finally, each testing trajectory is classified into the same cluster as its nearest neighbor in the training set. The 10-fold cross-validation is repeated 100 times to obtain the average error ratio. The classification error ratio of each run is calculated as follows:
$$\text{classification error ratio} = \frac{1}{10} \sum_{i=1}^{10} \frac{\text{number of trajectories wrongly classified}}{\text{number of trajectories in the } i\text{th testing set}}$$
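Given a precomputed pairwise distance matrix and cluster labels, this evaluation protocol can be sketched as follows (a hypothetical helper, not the authors' code):

```python
import numpy as np

def cv_error_ratio(dist_matrix, labels, n_folds=10, seed=0):
    """Average 1NN classification error over a random n-fold split.
    Each fold in turn is the testing set; every testing trajectory takes
    the label of its nearest neighbor among the remaining trajectories."""
    D = np.asarray(dist_matrix, dtype=float)
    labels = np.asarray(labels)
    idx = np.random.default_rng(seed).permutation(len(labels))
    errors = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)        # indices outside this fold
        wrong = sum(labels[train[np.argmin(D[i, train])]] != labels[i]
                    for i in fold)
        errors.append(wrong / len(fold))
    return float(np.mean(errors))
```

Averaging this quantity over repeated random splits yields the reported average error ratio; a better distance measure yields a lower value.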
One thousand synthetic datasets, one simulated dataset, and one real trajectory dataset are utilized in this experiment. Synthetic Trajectories I (numbered "I" to distinguish it from the datasets in Section 4.3) was generated by Piciarelli et al. [33] and includes 1000 datasets. In each dataset, 250 trajectories are equally divided into 5 clusters and the remaining 10 trajectories are abnormal (abnormal trajectories are not considered in this experiment); see an example in Figure 3a. Each trajectory is recorded as the locations of 16 sample points. The simulated dataset CROSS and the real dataset LABOMNI were contributed by Morris and Trivedi [34,35]. The trajectories in CROSS simulate a four-way traffic intersection, as shown in Figure 3b. CROSS includes 1999 trajectories evenly distributed over 19 clusters, with the number of sample points varying from 5 to 23. The trajectories in LABOMNI are from humans walking through a lab, as shown in Figure 3c. LABOMNI has 209 trajectories from 15 clusters, with the number of sample points varying from 30 to 624.
Table 2 compares the two distance measures in terms of the average and standard deviation (std) of the classification error ratio. Due to limited space, only the average result over the 1000 datasets of Synthetic Trajectories I is given. From the results on all datasets, we can see that DHD(ω) measures the difference between trajectories better than DHD when only location information is available. In particular, on the real dataset LABOMNI the classification performance improves considerably with DHD(ω), which demonstrates that DHD(ω) captures the difference between trajectories more accurately than DHD. Compared with Synthetic Trajectories I and CROSS, the trajectories in LABOMNI are more complex in two respects: First, the number of sample points varies more dramatically; second, trajectories with opposite directions are closer in location (for example, in CROSS trajectories following the same traffic rules are distributed in different lanes). The final column of Table 2 reports the p-value of the Kruskal-Wallis test [36] for the null hypothesis that the results from the two distance measures are similar. The results on Synthetic Trajectories I and LABOMNI differ significantly at the 1% significance level, while the difference on CROSS is not as pronounced. Since Synthetic Trajectories I alone includes 1000 datasets, we conclude that the results of DHD(ω) and DHD are significantly different overall.
In addition, we compare the distance measures on time series data, a more general setting since trajectory data is a specific form of time series [37]. The results, shown in Table A1 (Appendix A), are consistent with expectations.
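To make the contrast between the two measures concrete, the following sketch implements the standard directed Hausdorff distance and a window-constrained variant. The exact window rule of the paper's DHD(ω) is defined outside this excerpt, so the proportional index alignment and the parameter `w` used here are assumptions for illustration only:

```python
import math

def dhd(A, B):
    """Plain directed Hausdorff distance: every point of A is matched
    against every point of B (O(|A||B|) point comparisons)."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def dhd_w(A, B, w=2):
    """Sketch of a window-constrained directed Hausdorff distance.
    Assumption: point i of A is only compared against the points of B
    whose index lies within +/- w of the proportionally aligned index,
    so matching respects travel direction and costs O(w|A|) instead of
    O(|A||B|).  The exact window rule of DHD(w) in the paper may differ."""
    na, nb = len(A), len(B)
    worst = 0.0
    for i, a in enumerate(A):
        j = round(i * (nb - 1) / (na - 1)) if na > 1 else 0  # aligned index in B
        lo, hi = max(0, j - w), min(nb, j + w + 1)
        worst = max(worst, min(math.dist(a, b) for b in B[lo:hi]))
    return worst
```

For two identical paths traversed in opposite directions, `dhd` returns 0 while `dhd_w` does not, reflecting the direction sensitivity of the modified measure.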

4.2. Comparison of Anomaly Detection Measures

The directed Hausdorff k-nearest neighbors nonconformity measure (DH-kNN NCM) is the core of SHNN-CAD: it computes the nonconformity score used to calculate the p-value of a testing trajectory for classification. In this section, we evaluate the performance of DH-kNN NCM when DHD(ω) is used. The same datasets and criterion as in [10] are adopted for comparative analysis; in addition, a real dataset from [18] is tested.
The 1000 synthetic trajectory datasets from Section 4.1 are the first group of testing data; note that here the outliers in each dataset are also included, see Figure 4a. The second dataset, consisting of 238 recorded video trajectories, is provided by Lazarević et al. [38]. Each trajectory includes 5 sample points and is labeled as normal or abnormal; only 2 trajectories are abnormal, as shown in Figure 4b. The Aircraft Dataset used by Guo et al. [18] includes 325 aircraft trajectories with between 102 and 1023 sample points, of which 5 trajectories are labeled as abnormal, as shown in Figure 4c. For each dataset, the nonconformity scores of all trajectories are calculated and sorted. The accuracy of anomaly detection is calculated as the proportion of true outliers among the top n nonconformity scores, where n is the number of outliers in the dataset according to the groundtruth.
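The ranking-based evaluation above can be sketched as follows. The score definition mirrors the DH-kNN idea (sum of distances to the k nearest neighbors); treat it as an illustrative approximation of the paper's NCM rather than its exact formula:

```python
import numpy as np

def dh_knn_scores(dist, k=2):
    """Sketch of a DH-kNN-style nonconformity score: for each trajectory,
    the sum of distances to its k nearest neighbors in the dataset
    (self-distance on the diagonal is excluded)."""
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)
    return np.sort(d, axis=1)[:, :k].sum(axis=1)

def top_n_accuracy(scores, is_outlier):
    """Proportion of true outliers among the n highest nonconformity
    scores, where n is the number of labeled outliers."""
    n = int(np.sum(is_outlier))
    top = np.argsort(scores)[::-1][:n]
    return np.sum(is_outlier[top]) / n
```

With a well-behaved distance measure, trajectories far from every cluster receive the largest scores and surface at the top of the ranking.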
Table 3 shows the accuracy of DH-kNN NCM and of the version using DHD(ω) on the 1002 datasets. Due to the space limit, only the average result of the 1000 datasets in Synthetic Trajectories I is given. For Synthetic Trajectories I, using DHD(ω) improves the detection quality of DH-kNN NCM regardless of the number of nearest neighbors k; with both DHD and DHD(ω), the anomaly detection measure works best when k = 2. For the Recorded Video Trajectories and Aircraft Trajectories, replacing DHD with DHD(ω) achieves the same detection result.

4.3. Comparison of Online Anomaly Detection

In order to demonstrate the efficiency and reliability of the proposed improvements, we compare the performance of SHNN-CAD+ with SHNN-CAD on the same datasets used in [10], and further introduce additional datasets.
The synthetic trajectories [39] used in [10] for online anomaly detection were created by Laxhammar with the trajectory generator software written by Piciarelli et al. [40]. This collection, Synthetic Trajectories II, includes 100 datasets of 2000 trajectories each. Every trajectory has 16 sample points recorded with location attributes and a 1% probability of being abnormal. To expand the experimental data, we reuse Synthetic Trajectories I, renamed Synthetic Trajectories III [41]; the trajectories in each of its datasets are randomly reordered, since they were originally organized by cluster, which is not common in practical applications. Moreover, although the Hausdorff distance has the advantage of handling trajectories with different numbers of sample points, all the testing data in the experiments of [10] have equal length. We therefore produce a collection of datasets whose trajectories have various lengths, called Synthetic Trajectories IV [41]. The trajectory generator software [40] is enhanced to produce trajectories with between 20 and 100 sample points. For each dataset, 2000 normal trajectories from 10 equal-size clusters are first generated with randomness parameter 0.7 and reordered to simulate a real scene. Then 1000 abnormal trajectories are generated. Finally, each normal trajectory is independently replaced with probability λ by an abnormal one. The collection has 3 groups of datasets with λ equal to 0.005, 0.01, and 0.02, respectively, and each group contains 100 trajectory datasets.
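The replacement step that injects anomalies with probability λ can be sketched as follows (the function and parameter names are ours, not from the generator software [40]):

```python
import random

def mix_anomalies(normal, abnormal, lam, seed=0):
    """Build a test stream as in Synthetic Trajectories IV: each normal
    trajectory is independently replaced by an abnormal one with
    probability lam.  Returns the stream and its groundtruth labels
    (1 = abnormal)."""
    rng = random.Random(seed)
    pool = list(abnormal)             # reserve of generated anomalies
    stream, labels = [], []
    for t in normal:
        if pool and rng.random() < lam:
            stream.append(pool.pop())
            labels.append(1)
        else:
            stream.append(t)
            labels.append(0)
    return stream, labels
```

With 2000 normal trajectories and λ = 0.01, roughly 20 trajectories per dataset end up abnormal, matching the expected anomaly rate.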
In [10], the F1-score is used to compare the overall performance of online learning and anomaly detection. The F1-score is the harmonic mean of precision and recall (the latter also called detection rate in the field of anomaly detection). In addition, we analyze the false alarm rate and accuracy for a comprehensive evaluation from different aspects [42]. Precision indicates the proportion of true outliers among the trajectories detected as abnormal. Recall is the percentage of outliers that are detected. Accuracy is the ratio of correctly classified (normal or abnormal) trajectories. The false alarm rate measures the rate of wrongly flagging normal trajectories as outliers. These performance measures are calculated as follows:
$$\text{precision} = \frac{\text{number of anomalies detected}}{\text{number of objects classified as anomalies}}$$

$$\text{recall (detection rate)} = \frac{\text{number of anomalies detected}}{\text{total number of anomalies labeled in groundtruth}}$$

$$F_1 = \frac{2\cdot\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}$$

$$\text{accuracy} = \frac{\text{number of trajectories correctly classified}}{\text{total number of trajectories}}$$

$$\text{false alarm rate} = \frac{\text{number of normal trajectories classified as abnormal}}{\text{total number of normal trajectories labeled in groundtruth}}$$
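The five measures follow directly from the confusion counts; a minimal helper (positive class = abnormal):

```python
def detection_metrics(tp, fp, fn, tn):
    """The five performance measures computed from the confusion counts
    of an anomaly detector: tp/fp = true/false detections of anomalies,
    fn = missed anomalies, tn = normal trajectories kept as normal."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                       # detection rate
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    false_alarm_rate = fp / (fp + tn)
    return precision, recall, f1, accuracy, false_alarm_rate
```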
Firstly, the performance of SHNN-CAD and SHNN-CAD+ is compared on the aforementioned 1400 trajectory datasets, with k set to 2 as suggested in [10]. The average results for each collection of trajectory data are given in Table 4, where the best score of each performance measure is highlighted. Since SHNN-CAD requires a pre-set anomaly threshold ϵ, we define ϵ based on the real probability of anomaly λ; for Synthetic Trajectories III, λ is 10/260, as each dataset has 10 outliers and 250 normal trajectories. The table shows that SHNN-CAD achieves its best F1 score on most datasets when ϵ is close to λ, which is consistent with expectations and with the description in [10]. Even compared to these best cases, SHNN-CAD+ achieves better results. It is important to point out that in unsupervised anomaly detection λ is unknown, so no information is available to guide the choice of ϵ; indeed, a different ϵ value is needed for each collection of datasets. In addition, SHNN-CAD+ achieves better accuracy on all the datasets. Thus, SHNN-CAD is less applicable in real-world applications than SHNN-CAD+, which uses an adaptive anomaly threshold. Furthermore, the average running time for processing the 2000 equal-length trajectories of a Synthetic Trajectories II dataset is 39.85 s for SHNN-CAD+. For comparison, when the typical implementation of DHD, which computes the distance between every pair of sample points, is plugged into the anomaly detection procedure of SHNN-CAD, the running time is 128.08 s. For the 2000 unequal-length trajectories of Synthetic Trajectories IV, the running time grows to 109.34 s for SHNN-CAD+ and 556.28 s for SHNN-CAD with the typical DHD implementation. Note that optimized implementations (such as the Voronoi-diagram-based one suggested in [10]) would improve these results.
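For reference, the conformal classification rule that the pre-set threshold ϵ plugs into can be sketched as follows; this is the standard conformal anomaly detector p-value and a simplification of the online procedure of [10]:

```python
import numpy as np

def conformal_p_value(train_scores, test_score):
    """Conformal anomaly detection p-value: the fraction of nonconformity
    scores (the training scores plus the test score itself) that are at
    least as large as the test score."""
    scores = np.append(np.asarray(train_scores, dtype=float), test_score)
    return float(np.mean(scores >= test_score))

def classify_abnormal(train_scores, test_score, epsilon=0.01):
    """Flag the trajectory as abnormal when its p-value falls below the
    fixed threshold epsilon, as in SHNN-CAD (SHNN-CAD+ instead adapts
    the threshold during online detection)."""
    return conformal_p_value(train_scores, test_score) < epsilon
```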
Next, in order to assess the relative contribution of each proposed improvement strategy, we conduct experiments with three objectives. First (Objective 1), to verify that DHD(ω) helps to improve the performance of SHNN-CAD, SHNN-CAD is implemented with DHD(ω) as the distance between trajectories. Second (Objective 2), to demonstrate the rationality of the re-do step, SHNN-CAD is equipped with a re-do step in the anomaly detection procedure. Third (Objective 3), to prove the effectiveness of the adaptive anomaly threshold, the predefined ϵ of SHNN-CAD is replaced with the adaptive one. The results are listed in Table 5. For Objectives 1 and 2, the results are compared with those of SHNN-CAD in Table 4. In the case of Objective 1, the use of DHD(ω) clearly improves the anomaly detection behaviour on most performance measures for all the datasets. In the case of Objective 2, only the recall on some datasets is lower than that of SHNN-CAD, meaning that some outliers are missed; however, the comprehensive F1 score indicates that the overall performance is promising. In the case of Objective 3, the results are compared with those of SHNN-CAD in Table 4 for the ϵ closest to the corresponding λ. Clearly, for most datasets, the adaptive anomaly threshold makes up for the shortcomings of a predefined one and strengthens the anomaly detection capability. Comparing with SHNN-CAD+ in Table 4, all the improvement strategies work together to accomplish the enhancement of SHNN-CAD.

5. Conclusions

Based on the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector (SHNN-CAD), which detects outliers in trajectory data online, we have presented an enhanced version, called SHNN-CAD+, that improves the anomaly detection performance. The proposal includes three improvement strategies: First, modifying the typical point-based Hausdorff distance to make it suitable for trajectory data and faster to compute; second, adding a re-do step to avoid false positives in the initial stages of the algorithm; third, defining a data-adaptive, dynamic anomaly threshold rather than a pre-set, fixed one. Experimental results have shown that the presented approach outperforms SHNN-CAD. Since the training set grows considerably over time, further research will focus on incremental learning that prunes the historical data for future processing.

Author Contributions

Conceptualization, Y.G.; Writing—original draft, Y.G.; Writing—review & editing, A.B.

Funding

This research has been funded in part by grants from the Spanish Government (Nr. TIN2016-75866-C3-3-R) and from the Catalan Government (Nr. 2017-SGR-1101). Yuejun Guo acknowledges the support from Secretaria d’Universitats i Recerca del Departament d’Empresa i Coneixement de la Generalitat de Catalunya and the European Social Fund.

Acknowledgments

The datasets used in this paper are public. We thank Piciarelli et al., Morris and Trivedi, Lazarević et al., and Chen et al. for making, respectively, the collection of synthetic datasets and the trajectory generator, CROSS and LABOMNI, the recorded video trajectory dataset, and the UCR time series archive publicly available.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
HD: Hausdorff distance
DHD: Directed Hausdorff distance
DHD(ω): Directed Hausdorff distance with constraint window
kNN: k-nearest neighbors
CAD: Conformal anomaly detector
NCM: Non-conformity measure
SNN-CAD: Similarity based Nearest Neighbour Conformal Anomaly Detector
SHNN-CAD: Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector
SHNN-CAD+: Enhanced version of the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector

Appendix A. 10-Fold Cross Validation Results on 65 Time Series Datasets

As interpreted in [37], “A sequence composed by a series of nominal symbols from a particular alphabet is usually called a temporal sequence and a sequence of continuous, real-valued elements, is known as a time series”. According to this definition, trajectory data is a specific kind of time series. We test the performance of DHD(ω) on 65 public time series datasets collected by Chen et al. [43]. The average and standard deviation (Std) of the classification error ratio on these datasets are given in Table A1, where DHD(ω) performs better than DHD on average.
Table A1. Classification Error Ratio (%) on Time Series Datasets. The best performance of each dataset is in bold.
No. | Dataset | Data Size | Clusters | DHD Average | DHD Std | DHD(ω) Average | DHD(ω) Std
--- | ------- | --------- | -------- | ----------- | ------- | -------------- | ----------
1 | 50words | 905 | 50 | 86.52 | 0.0300 | 40.77 | 0.0391
2 | Adiac | 781 | 37 | 71.33 | 0.0603 | 36.48 | 0.0452
3 | ArrowHead | 210 | 3 | 50.00 | 0.0985 | 10.00 | 0.0613
4 | Beef | 60 | 5 | 55.00 | 0.2364 | 50.00 | 0.2606
5 | BeetleFly | 40 | 2 | 40.00 | 0.1748 | 22.50 | 0.2486
6 | BirdChicken | 40 | 2 | 22.50 | 0.3217 | 20.00 | 0.1581
7 | Car | 120 | 4 | 58.33 | 0.1521 | 29.17 | 0.0982
8 | CBF | 930 | 3 | 60.54 | 0.0608 | 3.44 | 0.0167
9 | Coffee | 56 | 2 | 25.33 | 0.1501 | 1.67 | 0.0527
10 | Computers | 500 | 2 | 26.40 | 0.0610 | 37.80 | 0.0614
11 | Cricket_X | 780 | 12 | 78.08 | 0.0293 | 48.33 | 0.0897
12 | Cricket_Y | 780 | 12 | 80.51 | 0.0550 | 50.51 | 0.0510
13 | Cricket_Z | 780 | 12 | 78.33 | 0.0333 | 48.08 | 0.0397
14 | DiatomSizeReduction | 322 | 4 | 8.39 | 0.0392 | 0.00 | 0.0000
15 | DistalPhalanxOutlineAgeGroup | 539 | 3 | 33.19 | 0.0637 | 23.38 | 0.0583
16 | DistalPhalanxOutlineCorrect | 876 | 2 | 34.60 | 0.0509 | 23.29 | 0.0425
17 | DistalPhalanxTW | 539 | 6 | 43.21 | 0.0563 | 29.68 | 0.0488
18 | Earthquakes | 461 | 2 | 32.34 | 0.0735 | 36.04 | 0.0797
19 | ECG200 | 200 | 2 | 31.50 | 0.1180 | 11.50 | 0.0914
20 | ECG5000 | 5000 | 5 | 13.70 | 0.0173 | 6.68 | 0.0114
21 | ECGFiveDays | 884 | 2 | 3.51 | 0.0125 | 0.00 | 0.0000
22 | FaceAll | 2247 | 14 | 64.49 | 0.0278 | 6.54 | 0.0149
23 | FaceFour | 112 | 4 | 50.91 | 0.1621 | 17.05 | 0.0911
24 | FacesUCR | 2247 | 14 | 64.53 | 0.0337 | 6.19 | 0.0135
25 | FISH | 350 | 7 | 69.43 | 0.0687 | 17.43 | 0.0888
26 | Gun_Point | 200 | 2 | 39.50 | 0.1012 | 2.00 | 0.0258
27 | Ham | 214 | 2 | 45.28 | 0.0894 | 27.62 | 0.0831
28 | Haptics | 463 | 5 | 69.77 | 0.0424 | 64.37 | 0.0510
29 | Herring | 128 | 2 | 45.45 | 0.1015 | 46.79 | 0.1380
30 | InsectWingbeatSound | 2200 | 11 | 87.91 | 0.0221 | 41.50 | 0.0263
31 | ItalyPowerDemand | 1096 | 2 | 32.39 | 0.0485 | 5.75 | 0.0206
32 | LargeKitchenAppliances | 750 | 3 | 51.33 | 0.0594 | 60.53 | 0.0623
33 | Lighting2 | 121 | 2 | 36.28 | 0.1149 | 35.64 | 0.1137
34 | Lighting7 | 143 | 7 | 52.43 | 0.0966 | 54.48 | 0.1559
35 | Meat | 120 | 3 | 10.83 | 0.0883 | 7.50 | 0.0730
36 | MedicalImages | 1140 | 10 | 47.72 | 0.0310 | 25.88 | 0.0262
37 | MiddlePhalanxOutlineAgeGroup | 554 | 3 | 44.42 | 0.0928 | 31.41 | 0.0662
38 | MiddlePhalanxOutlineCorrect | 891 | 2 | 39.73 | 0.0648 | 27.84 | 0.0480
39 | MiddlePhalanxTW | 553 | 6 | 49.19 | 0.0419 | 45.94 | 0.0434
40 | MoteStrain | 1272 | 2 | 27.51 | 0.0384 | 12.57 | 0.0285
41 | OliveOil | 60 | 4 | 26.67 | 0.1610 | 16.67 | 0.1361
42 | OSULeaf | 442 | 6 | 65.16 | 0.0443 | 38.71 | 0.0926
43 | PhalangesOutlinesCorrect | 2658 | 2 | 37.77 | 0.0409 | 24.08 | 0.0261
44 | Plane | 210 | 7 | 21.90 | 0.0784 | 2.86 | 0.0246
45 | ProximalPhalanxOutlineAgeGroup | 605 | 3 | 29.40 | 0.0554 | 23.64 | 0.0620
46 | ProximalPhalanxOutlineCorrect | 891 | 2 | 30.18 | 0.0493 | 18.30 | 0.0541
47 | ProximalPhalanxTW | 605 | 6 | 33.88 | 0.0560 | 26.42 | 0.0824
48 | RefrigerationDevices | 750 | 3 | 35.73 | 0.0439 | 63.60 | 0.0507
49 | ScreenType | 750 | 3 | 51.20 | 0.0467 | 65.60 | 0.0474
50 | ShapeletSim | 200 | 2 | 38.50 | 0.1435 | 47.00 | 0.0919
51 | ShapesAll | 1198 | 60 | 83.48 | 0.0225 | 21.37 | 0.0302
52 | SmallKitchenAppliances | 750 | 3 | 49.87 | 0.0706 | 59.33 | 0.0587
53 | SonyAIBORobotSurface | 621 | 2 | 18.69 | 0.0346 | 1.45 | 0.0119
54 | Strawberry | 983 | 2 | 7.63 | 0.0183 | 4.17 | 0.0170
55 | SwedishLeaf | 1122 | 15 | 67.21 | 0.0525 | 17.02 | 0.0359
56 | Symbols | 1000 | 6 | 77.90 | 0.0370 | 4.40 | 0.0222
57 | synthetic_control | 600 | 6 | 73.50 | 0.0552 | 3.33 | 0.0192
58 | ToeSegmentation1 | 252 | 2 | 30.51 | 0.0697 | 26.09 | 0.0912
59 | ToeSegmentation2 | 166 | 2 | 19.85 | 0.0784 | 21.25 | 0.1409
60 | Trace | 200 | 4 | 25.00 | 0.0943 | 6.50 | 0.0626
61 | TwoLeadECG | 1162 | 2 | 8.86 | 0.0269 | 0.95 | 0.011
62 | Wine | 111 | 2 | 5.45 | 0.0878 | 4.55 | 0.0643
63 | WordsSynonyms | 905 | 25 | 82.75 | 0.0458 | 38.12 | 0.0489
64 | Worms | 225 | 5 | 59.57 | 0.0839 | 69.94 | 0.1074
65 | WormsTwoClass | 225 | 2 | 40.45 | 0.1212 | 50.61 | 0.0490
  | Average of all datasets | | | 44.36 | 0.0744 | 26.50 | 0.0640

References

1. Haritaoglu, I.; Harwood, D.; Davis, L.S. W4: Real-Time Surveillance of People and Their Activities. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 809–830.
2. Majecka, B. Statistical Models of Pedestrian Behaviour in the Forum. Master’s Thesis, School of Informatics, University of Edinburgh, Edinburgh, UK, 2009.
3. Wang, Q.; Chen, M.; Nie, F.; Li, X. Detecting Coherent Groups in Crowd Scenes by Multiview Clustering. IEEE Trans. Pattern Anal. Mach. Intell. 2018.
4. Gariel, M.; Srivastava, A.N.; Feron, E. Trajectory Clustering and an Application to Airspace Monitoring. IEEE Trans. Intell. Transp. Syst. 2011, 12, 1511–1524.
5. Powell, M.D.; Aberson, S.D. Accuracy of United States Tropical Cyclone Landfall Forecasts in the Atlantic Basin (1976–2000). Bull. Am. Meteorol. Soc. 2001, 82, 2749–2768.
6. Meng, F.; Yuan, G.; Lv, S.; Wang, Z.; Xia, S. An Overview on Trajectory Outlier Detection. Artif. Intell. Rev. 2018.
7. Keogh, E.; Lonardi, S.; Ratanamahatana, C.A. Towards Parameter-free Data Mining. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 22–25 August 2004; pp. 206–215.
8. Laxhammar, R.; Falkman, G. Conformal Prediction for Distribution-independent Anomaly Detection in Streaming Vessel Data. In Proceedings of the International Workshop on Novel Data Stream Pattern Mining Techniques, Washington, DC, USA, 25 July 2010; pp. 47–55.
9. Laxhammar, R.; Falkman, G. Sequential Conformal Anomaly Detection in Trajectories Based on Hausdorff Distance. In Proceedings of the International Conference on Information Fusion, Chicago, IL, USA, 5–8 July 2011; pp. 1–8.
10. Laxhammar, R.; Falkman, G. Online Learning and Sequential Anomaly Detection in Trajectories. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 1158–1173.
11. Gupta, M.; Gao, J.; Aggarwal, C.C.; Han, J. Outlier Detection for Temporal Data: A Survey. IEEE Trans. Knowl. Data Eng. 2014, 26, 2250–2267.
12. Parmar, J.D.; Patel, J.T. Anomaly Detection in Data Mining: A Review. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2017, 7, 32–40.
13. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; pp. 226–231.
14. Birant, D.; Kut, A. ST-DBSCAN: An Algorithm for Clustering Spatial-Temporal Data. Data Knowl. Eng. 2007, 60, 208–221.
15. Zhu, Y.; Ming, K.T.; Angelova, M. A Distance Scaling Method to Improve Density-Based Clustering. In Advances in Knowledge Discovery and Data Mining; Phung, D., Tseng, V.S., Webb, G.I., Ho, B., Ganji, M., Rashidi, L., Eds.; Springer: Cham, Switzerland, 2018; pp. 389–400.
16. Kumar, D.; Bezdek, J.C.; Rajasegarar, S.; Leckie, C.; Palaniswami, M. A Visual-Numeric Approach to Clustering and Anomaly Detection for Trajectory Data. Vis. Comput. 2017, 33, 265–281.
17. Annoni, R.; Forster, C.H.Q. Analysis of Aircraft Trajectories Using Fourier Descriptors and Kernel Density Estimation. In Proceedings of the International IEEE Conference on Intelligent Transportation Systems, Anchorage, AK, USA, 16–19 September 2012; pp. 1441–1446.
18. Guo, Y.; Xu, Q.; Li, P.; Sbert, M.; Yang, Y. Trajectory Shape Analysis and Anomaly Detection Utilizing Information Theory Tools. Entropy 2017, 19, 323.
19. Keogh, E.; Lin, J.; Fu, A. HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence. In Proceedings of the IEEE International Conference on Data Mining, Houston, TX, USA, 27–30 November 2005; pp. 226–233.
20. Yankov, D.; Keogh, E.; Rebbapragada, U. Disk Aware Discord Discovery: Finding Unusual Time Series in Terabyte Sized Datasets. Knowl. Inf. Syst. 2008, 17, 241–262.
21. Lee, J.G.; Han, J.; Li, X. Trajectory Outlier Detection: A Partition-and-Detect Framework. In Proceedings of the International Conference on Data Engineering, Cancun, Mexico, 7–12 April 2008; pp. 140–149.
22. Guo, Y.; Xu, Q.; Luo, X.; Wei, H.; Bu, H.; Sbert, M. A Group-Based Signal Filtering Approach for Trajectory Abstraction and Restoration. Neural Comput. Appl. 2018, 29, 371–387.
23. Banerjee, P.; Yawalkar, P.; Ranu, S. MANTRA: A Scalable Approach to Mining Temporally Anomalous Sub-trajectories. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1415–1424.
24. Yuan, Y.; Wang, D.; Wang, Q. Anomaly Detection in Traffic Scenes via Spatial-Aware Motion Reconstruction. IEEE Trans. Intell. Transp. Syst. 2017, 18, 1198–1209.
25. Kanarachos, S.; Christopoulos, S.R.G.; Chroneos, A.; Fitzpatrick, M.E. Detecting anomalies in time series data via a deep learning algorithm combining wavelets, neural networks and Hilbert transform. Expert Syst. Appl. 2017, 85, 292–304.
26. Ding, H.; Trajcevski, G.; Scheuermann, P.; Wang, X.; Keogh, E. Querying and Mining of Time Series Data: Experimental Comparison of Representations and Distance Measures. VLDB Endow. 2008, 1, 1542–1552.
27. Gammerman, A.; Vovk, V. Hedging Predictions in Machine Learning. Comput. J. 2007, 50, 151–163.
28. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly Detection: A Survey. ACM Comput. Surv. 2009, 41, 15:1–15:58.
29. Faloutsos, C.; Ranganathan, M.; Manolopoulos, Y. Fast Subsequence Matching in Time-Series Databases. SIGMOD Rec. 1994, 23, 419–429.
30. Alt, H. The Computational Geometry of Comparing Shapes. In Efficient Algorithms: Essays Dedicated to Kurt Mehlhorn on the Occasion of His 60th Birthday; Albers, S., Alt, H., Näher, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 235–248.
31. Guo, Y. Matlab Code of Experiments. Available online: http://gilabparc.udg.edu/trajectory/experiments/Experiments.zip (accessed on 13 December 2018).
32. Tan, P.N.; Steinbach, M.; Kumar, V. Introduction to Data Mining; Pearson Education: Noida, India, 2007.
33. Piciarelli, C.; Micheloni, C.; Foresti, G.L. Synthetic Trajectories by Piciarelli et al. 2008. Available online: https://avires.dimi.uniud.it/papers/trclust/ (accessed on 14 December 2018).
34. Morris, B.; Trivedi, M. Trajectory Clustering Datasets. 2009. Available online: http://cvrr.ucsd.edu/bmorris/datasets/dataset_trajectory_clustering.html (accessed on 14 December 2018).
35. Morris, B.; Trivedi, M. Learning Trajectory Patterns by Clustering: Experimental Studies and Comparative Evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 312–319.
36. Kruskal, W.H.; Wallis, W.A. Use of Ranks in One-Criterion Variance Analysis. Am. Stat. Assoc. 1952, 47, 583–621.
37. Antunes, C.M.; Oliveira, A.L. Temporal Data Mining: An Overview. In Proceedings of the KDD Workshop on Temporal Data Mining, San Francisco, CA, USA, 26–29 August 2001; Volume 1, pp. 1–13.
38. Lazarević, A. First Set of Recorded Video Trajectories. 2007. Available online: https://www-users.cs.umn.edu/~lazar027/inclof/ (accessed on 14 December 2018).
39. Laxhammar, R. Synthetic Trajectories by Laxhammar. Available online: https://www.researchgate.net/publication/236838887_Synthetic_trajectories (accessed on 14 December 2018).
40. Piciarelli, C.; Micheloni, C.; Foresti, G.L. Synthetic Trajectory Generator. 2008. Available online: https://avires.dimi.uniud.it/papers/trclust/create_ts2.m (accessed on 14 December 2018).
41. Guo, Y. Synthetic Trajectories by Guo. Available online: http://gilabparc.udg.edu/trajectory/data/SyntheticTrajectories.zip (accessed on 14 December 2018).
42. Nandeshwar, A.; Menzies, T.; Nelson, A. Learning patterns of university student retention. Expert Syst. Appl. 2011, 38, 14984–14996.
43. Chen, Y.; Keogh, E.; Hu, B.; Begum, N.; Bagnall, A.; Mueen, A.; Batista, G. The UCR Time Series Classification Archive. 2015. Available online: www.cs.ucr.edu/~eamonn/time_series_data/ (accessed on 14 December 2018).
Figure 1. Plots of the trajectory change in dataset. As time goes on, from (a) to (b), the number of trajectories increases, and the red trajectory has several similar items (in blue color).
Sensors 19 00084 g001
Figure 2. Plots of the distance between two trajectories A and B by directed Hausdorff distance with window (DHD( ω )) and directed Hausdorff distance (DHD). (a) distance from A to B by DHD( ω ). (b) distance from B to A by DHD( ω ). (c) distance from A to B and from B to A by DHD.
Sensors 19 00084 g002
Figure 3. Plots of trajectory datasets used for the evaluation of distance measures. Trajectories in the same cluster have the same color.
Sensors 19 00084 g003
Figure 4. Plots of trajectory datasets used for the evaluation of anomaly detection measures. The trajectories colored with red are abnormal. (a) One dataset “TS1” from synthetic datasets. (b) Recorded Video trajectory Dataset. (c) Aircraft trajectory Dataset.
Sensors 19 00084 g004
Table 1. Overview of anomaly detection in several related works.
Ref. | Category of Algorithm | Type of Data | Threshold | Evaluation Measure
---- | --------------------- | ------------ | --------- | ------------------
[4] | clustering-based (DBSCAN) | trajectory data | implied | discuss with data managers
[13] | clustering-based (DBSCAN) | point set | implied | compare result with groundtruth, running time
[14] | clustering-based (ST-DBSCAN) | spatial-temporal data | implied | running time complexity, interpret results in application
[15] | clustering-based (DBSCAN/OPTICS/DP) | point set | implied | F-measure
[16] | clustering-based (iVAT+/clusiVAT+) | trajectory data | predefined | partition accuracy, false alarm, true positive
[17] | clustering-based (DBSCAN+KDE) | trajectory data | predefined | 10-fold cross validation test, interpret results in application
[18] | clustering-based (IB+Shannon entropy) | trajectory data | automatic | accuracy, precision, recall, F-measure
[19] | non-clustering-based (HOT SAX) | time series | automatic | interpret results with data, running time complexity
[20] | non-clustering-based (disk aware algorithm) | time series | automatic | running time
[21] | non-clustering-based (TRAOD) | trajectory data | predefined | pruning power, accuracy of pruning, speedup ratio
[22] | non-clustering-based (trajectory abstraction) | trajectory data | predefined | degree of redundancy, informativeness, precision, recall
[23] | non-clustering-based (MANTRA) | trajectory data | predefined | growth rate of running time/number of anomalous edges, accuracy, 5-fold cross validation, F-measure
[24] | non-clustering-based (anomaly detection in traffic scenes) | video data | automatic | pixel-wise receiver operating characteristic (ROC), area under ROC
[25] | non-clustering-based (an algorithm combining wavelets, neural networks and Hilbert transform) | time series | automatic | false positive/alarm rate, true positive rate (hit rate), interpret results with data
Table 2. Classification Error Ratio (%) on Different Trajectory Datasets and the corresponding p-value.
Datasets | DHD Average | DHD Std | DHD(ω) Average | DHD(ω) Std | p-Value
-------- | ----------- | ------- | -------------- | ---------- | -------
Synthetic Trajectories I (average) | 0.1634 | 0.0032 | 0.1566 | 0.0031 | < 0.001
CROSS | 0.6100 | 0.0792 | 0.5937 | 0.0694 | 0.2182
LABOMNI | 31.23 | 1.37 | 10.20 | 0.87 | < 0.001
Table 3. Accuracy (%) of anomaly detection on different trajectory datasets.
Table 3. Accuracy (%) of anomaly detection on different trajectory datasets.
DatasetsNonconformity Measures# of Most Similar Neighbors Considered
k = 1 k = 2 k = 3 k = 4 k = 5
Synthetic Trajectories I (average)DH-kNN NCM96.4297.0997.0596.9596.77
using DHD( ω )96.4597.8597.8197.7497.65
Recorded Video TrajectoriesDH-kNN NCM100.00100.00100.00100.00100.00
using DHD( ω )100.00100.00100.00100.00100.00
Aircraft TrajectoriesDH-kNN NCM80.0080.0080.0080.0080.00
using DHD( ω )80.0080.0080.0080.0080.00
Table 4. Five Performance Measures (%) of Online Learning and Anomaly Detection on Different Trajectory Datasets. Note that we found a mistake in Table 3 of [10], where the reported result of the Sequential Hausdorff Nearest-Neighbor Conformal Anomaly Detector (SHNN-CAD) differs from the description of Algorithm 2 in [10]. The F1 results of 53.52, 74.61, and 61.68 for SHNN-CAD with ϵ = 0.005, 0.01, and 0.02, respectively, are obtained under the condition that a testing trajectory is classified as abnormal if its p-value ≤ ϵ. In the table below, we follow the rule of Algorithm 2 in [10] for SHNN-CAD. The best performance for each collection of datasets is in bold.
| Trajectory Datasets | Approach | Precision | Recall | F1 | Accuracy | False Alarm Rate |
| --- | --- | --- | --- | --- | --- | --- |
| Synthetic Trajectories II (λ = 0.01) | SHNN-CAD, ϵ = 0.005 | **98.70** | 40.39 | 54.75 | 99.39 | **0.01** |
| | SHNN-CAD, ϵ = 0.01 | 87.15 | 77.48 | 79.80 | 99.63 | 0.13 |
| | SHNN-CAD, ϵ = 0.02 | 50.24 | **94.59** | 64.35 | 98.98 | 0.98 |
| | SHNN-CAD+ | 88.41 | 89.64 | **86.38** | **99.77** | 0.13 |
| Synthetic Trajectories III (λ = 0.038) | SHNN-CAD, ϵ = 0.03 | **97.34** | 55.51 | 67.92 | 98.19 | **0.10** |
| | SHNN-CAD, ϵ = 0.04 | 91.52 | 73.95 | 79.36 | 98.63 | 0.38 |
| | SHNN-CAD, ϵ = 0.05 | 80.01 | **83.40** | **79.83** | 98.39 | 1.01 |
| | SHNN-CAD+ | 84.75 | 82.70 | 79.39 | **98.68** | 0.70 |
| Synthetic Trajectories IV (λ = 0.005) | SHNN-CAD, ϵ = 0.004 | **90.97** | 54.53 | 63.78 | 99.73 | **0.03** |
| | SHNN-CAD, ϵ = 0.005 | 83.82 | 65.04 | 69.40 | 99.75 | 0.07 |
| | SHNN-CAD, ϵ = 0.01 | 52.63 | 88.21 | 64.01 | 99.53 | 0.41 |
| | SHNN-CAD+ | 78.43 | **91.76** | **81.47** | **99.82** | 0.15 |
| Synthetic Trajectories IV (λ = 0.01) | SHNN-CAD, ϵ = 0.005 | **99.17** | 37.31 | 52.39 | 99.33 | **0.00** |
| | SHNN-CAD, ϵ = 0.01 | 89.31 | 74.31 | 79.31 | 99.61 | 0.11 |
| | SHNN-CAD, ϵ = 0.02 | 52.27 | **92.09** | 65.60 | 99.01 | 0.92 |
| | SHNN-CAD+ | 88.64 | 89.52 | **85.66** | **99.75** | 0.14 |
| Synthetic Trajectories IV (λ = 0.02) | SHNN-CAD, ϵ = 0.01 | **98.99** | 45.79 | 61.64 | 98.91 | **0.01** |
| | SHNN-CAD, ϵ = 0.02 | 87.47 | 83.88 | **84.62** | 99.42 | 0.26 |
| | SHNN-CAD, ϵ = 0.03 | 63.18 | **93.02** | 74.53 | 98.78 | 1.10 |
| | SHNN-CAD+ | 95.36 | 78.75 | 81.45 | **99.48** | 0.10 |
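The decision rule and the five measures reported in Tables 4 and 5 follow the standard confusion-matrix definitions; a minimal sketch (function and variable names are ours, not from the paper):

```python
def classify(p_value, epsilon):
    """Conformal anomaly decision: flag a testing trajectory as abnormal
    when its p-value is at most the anomaly threshold epsilon."""
    return p_value <= epsilon

def measures(tp, fp, tn, fn):
    """Return (precision, recall, F1, accuracy, false alarm rate) in percent,
    given true/false positive and true/false negative counts, where
    'positive' means classified as abnormal."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # detection rate of abnormal trajectories
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    far = fp / (fp + tn)                         # normal trajectories wrongly flagged
    return tuple(round(100 * m, 2) for m in (precision, recall, f1, accuracy, far))
```

Because abnormal trajectories are rare, accuracy stays near 100% for every method, which is why precision, recall, F1, and the false alarm rate are the more informative columns.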
Table 5. Five Performance Measures (%) of Proposed Improvement Strategies on Different Trajectory Datasets.
| Trajectory Datasets | Approach | Precision | Recall | F1 | Accuracy | False Alarm Rate |
| --- | --- | --- | --- | --- | --- | --- |
| Synthetic Trajectories II (λ = 0.01) | Objective 1, ϵ = 0.005 | 98.86 | 40.23 | 54.89 | 99.38 | 0.01 |
| | Objective 1, ϵ = 0.01 | 87.93 | 77.91 | 80.38 | 99.64 | 0.12 |
| | Objective 1, ϵ = 0.02 | 50.77 | 95.16 | 64.89 | 98.99 | 0.97 |
| | Objective 2, ϵ = 0.005 | 98.70 | 40.39 | 54.75 | 99.39 | 0.01 |
| | Objective 2, ϵ = 0.01 | 87.15 | 77.48 | 79.80 | 99.63 | 0.13 |
| | Objective 2, ϵ = 0.02 | 50.24 | 94.59 | 64.35 | 98.98 | 0.98 |
| | Objective 3 | 87.08 | 87.21 | 84.97 | 99.75 | 0.13 |
| Synthetic Trajectories III (λ = 0.038) | Objective 1, ϵ = 0.03 | 97.01 | 55.78 | 67.99 | 98.21 | 0.10 |
| | Objective 1, ϵ = 0.04 | 91.86 | 74.31 | 79.81 | 98.66 | 0.37 |
| | Objective 1, ϵ = 0.05 | 80.41 | 83.94 | 80.30 | 98.43 | 1.00 |
| | Objective 2, ϵ = 0.03 | 97.34 | 55.51 | 67.92 | 98.19 | 0.10 |
| | Objective 2, ϵ = 0.04 | 91.52 | 73.95 | 79.36 | 98.63 | 0.38 |
| | Objective 2, ϵ = 0.05 | 80.01 | 83.40 | 79.83 | 98.39 | 1.01 |
| | Objective 3 | 85.10 | 82.25 | 79.02 | 98.64 | 0.72 |
| Synthetic Trajectories IV (λ = 0.005) | Objective 1, ϵ = 0.004 | 93.88 | 58.22 | 67.45 | 99.75 | 0.02 |
| | Objective 1, ϵ = 0.005 | 87.23 | 69.39 | 73.28 | 99.78 | 0.06 |
| | Objective 1, ϵ = 0.01 | 52.75 | 91.08 | 65.01 | 99.54 | 0.41 |
| | Objective 2, ϵ = 0.004 | 90.97 | 54.53 | 63.78 | 99.73 | 0.03 |
| | Objective 2, ϵ = 0.005 | 83.82 | 65.04 | 69.40 | 99.75 | 0.07 |
| | Objective 2, ϵ = 0.01 | 52.63 | 88.21 | 64.01 | 99.53 | 0.41 |
| | Objective 3 | 77.90 | 86.10 | 78.30 | 99.79 | 0.14 |
| Synthetic Trajectories IV (λ = 0.01) | Objective 1, ϵ = 0.005 | 99.53 | 37.74 | 53.20 | 99.34 | 0.00 |
| | Objective 1, ϵ = 0.01 | 92.35 | 77.94 | 82.84 | 99.68 | 0.08 |
| | Objective 1, ϵ = 0.02 | 53.80 | 94.76 | 67.59 | 99.07 | 0.89 |
| | Objective 2, ϵ = 0.005 | 99.17 | 37.31 | 52.39 | 99.33 | 0.00 |
| | Objective 2, ϵ = 0.01 | 89.31 | 74.31 | 79.31 | 99.61 | 0.11 |
| | Objective 2, ϵ = 0.02 | 52.27 | 92.09 | 65.60 | 99.01 | 0.92 |
| | Objective 3 | 87.55 | 84.08 | 82.63 | 99.69 | 0.14 |
| Synthetic Trajectories IV (λ = 0.02) | Objective 1, ϵ = 0.01 | 99.60 | 45.80 | 61.70 | 98.92 | 0.01 |
| | Objective 1, ϵ = 0.02 | 90.18 | 86.59 | 87.38 | 99.52 | 0.20 |
| | Objective 1, ϵ = 0.03 | 65.68 | 95.28 | 77.02 | 98.91 | 1.02 |
| | Objective 2, ϵ = 0.01 | 98.99 | 45.79 | 61.64 | 98.91 | 0.01 |
| | Objective 2, ϵ = 0.02 | 87.47 | 83.88 | 84.62 | 99.42 | 0.26 |
| | Objective 2, ϵ = 0.03 | 63.18 | 93.02 | 74.53 | 98.78 | 1.10 |
| | Objective 3 | 95.05 | 72.66 | 77.94 | 99.37 | 0.10 |

Guo, Y.; Bardera, A. SHNN-CAD+: An Improvement on SHNN-CAD for Adaptive Online Trajectory Anomaly Detection. Sensors 2019, 19, 84. https://doi.org/10.3390/s19010084