Time Series Classification with Shapelet and Canonical Features
Abstract
1. Introduction
2. Related Work
2.1. Shapelet-Based Classification Methods
2.2. Other Types of Classification Methods
3. Definition and Background
3.1. Definitions
3.2. Shapelet Transformation Technology
Algorithm 1. shapeletTransform(D, min, max, k)
Input: Dataset D, Shapelet minimum length min, Shapelet maximum length max, number of Shapelets k
Output: Transformed dataset D'
1. Shapelets ← findBestKShapelets(D, min, max, k);
2. D' ← {F1, F2, ⋯, Fn};
3. for Ti in D
4.     Fi ← null;
5.     for Shapelet Sj in Shapelets
6.         dist, S' ← subDist(Sj, Ti);
7.         Fi.add(dist);
8.     end for
9.     Fi.add(yi);
10. end for
11. return D';
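To make Algorithm 1 concrete, the following is a minimal Python sketch of the shapelet transform. It assumes the k shapelets have already been found (the findBestKShapelets search is not shown), that series are one-dimensional NumPy arrays, and that subDist is a plain Euclidean distance without z-normalisation; all function names are illustrative, not the paper's code.

```python
import numpy as np

def sub_dist(shapelet, series):
    """subDist in Algorithm 1: minimum Euclidean distance between the
    shapelet and every equal-length subsequence of `series`, plus the
    best-matching subsequence itself."""
    m = len(shapelet)
    best_dist, best_sub = np.inf, None
    for start in range(len(series) - m + 1):
        sub = series[start:start + m]
        d = np.linalg.norm(sub - shapelet)
        if d < best_dist:
            best_dist, best_sub = d, sub
    return best_dist, best_sub

def shapelet_transform(X, y, shapelets):
    """Build the transformed dataset D': one row per series, one distance
    feature per shapelet, with the class label appended as the last column."""
    rows = []
    for series, label in zip(X, y):
        feats = [sub_dist(s, series)[0] for s in shapelets]
        rows.append(feats + [label])
    return np.array(rows)
```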
3.3. Typical Time Series Characteristics (Catch22)
4. Algorithm
4.1. Random Shapelet Transformation Embedded with Typical Time Series Features
Algorithm 2. randShapeletTransform-CF(D, min, max, k, a, shift)
Input: Dataset D, Shapelet minimum length min, Shapelet maximum length max, number of Shapelets k, number of typical time series features a, Shapelet maximum offset shift
Output: Randomly selected Shapelets and their locations Locations, randomly selected feature indices Atts, transformed dataset D'
1. Shapelets, Locations ← randSampShapelets(D, min, max, k);
2. Atts ← randSampAttIndices({1, 2, ⋯, 25}, a);
3. D' ← {F1, F2, ⋯, Fn};
4. for Ti in D
5.     Fi ← null;
6.     for Shapelet Sj in Shapelets
7.         dist, S' ← restrictedSubDist(Sj, Ti, shift, Locations[j]);
8.         Fi.add(dist);
9.         for c ← 1 to a
10.             val ← computeFeature(S', Atts[c]);
11.             Fi.add(val);
12.         end for
13.     end for
14.     Fi.add(yi);
15. end for
16. return Shapelets, Locations, Atts, D';
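A corresponding sketch of Algorithm 2, under the same assumptions: the random shapelets and their recorded locations come from a sampling step that is not shown, and compute_feature(subsequence, index) is a caller-supplied stand-in for the 25 candidate features (the 22 catch22 characteristics plus mean, standard deviation and slope). The restriction of the distance search to a window of ±shift around the shapelet's original location is the part that differs from Algorithm 1; the names below are illustrative.

```python
import numpy as np

def restricted_sub_dist(shapelet, series, shift, loc):
    """restrictedSubDist in Algorithm 2: like subDist, but only start
    positions within `shift` of the shapelet's original location `loc`
    are searched."""
    m = len(shapelet)
    last = len(series) - m
    lo = max(0, min(loc - shift, last))
    hi = max(0, min(loc + shift, last))
    best_dist, best_sub = np.inf, None
    for start in range(lo, hi + 1):
        sub = series[start:start + m]
        d = np.linalg.norm(sub - shapelet)
        if d < best_dist:
            best_dist, best_sub = d, sub
    return best_dist, best_sub

def rand_shapelet_transform_cf(X, y, shapelets, locations, att_indices,
                               shift, compute_feature):
    """For every (series, shapelet) pair, emit the restricted distance plus
    the selected canonical features of the best-matching subsequence."""
    rows = []
    for series, label in zip(X, y):
        feats = []
        for shp, loc in zip(shapelets, locations):
            dist, best_sub = restricted_sub_dist(shp, series, shift, loc)
            feats.append(dist)
            feats.extend(compute_feature(best_sub, c) for c in att_indices)
        rows.append(feats + [label])
    return np.array(rows)
```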
4.2. Ensemble Classification Model Building
Algorithm 3. buildRSFCF(D, min, max, k, a, shift, r)
Input: Dataset D, Shapelet minimum length min, Shapelet maximum length max, number of Shapelets k, number of typical time series features a, maximum offset shift, number of decision trees r
Output: RSFCF classification model
1. RSFCFModel ← {Tree1, Tree2, ⋯, Treer};
2. for i ← 1 to r
3.     Shapelets, Locations, Atts, D' ← randShapeletTransform-CF(D, min, max, k, a, shift);
4.     Treei ← buildTimeSeriesTree(D');
5.     Treei.add(Shapelets);
6.     Treei.add(Locations);
7.     Treei.add(Atts);
8. end for
9. return RSFCFModel;
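A minimal sketch of the ensemble builder in Algorithm 3, reusing rand_shapelet_transform_cf from the sketch above. scikit-learn's DecisionTreeClassifier is used here as a stand-in for the paper's buildTimeSeriesTree, and the shapelet sampler and feature function are passed in as callables; these substitutions are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # stand-in for buildTimeSeriesTree

def build_rsfcf(X, y, min_len, max_len, k, a, shift, r,
                rand_samp_shapelets, compute_feature, seed=0):
    """Algorithm 3: train r trees, each on its own random shapelet sample
    and random subset of feature indices; every tree stores what it needs
    to replay the identical transform at prediction time."""
    rng = np.random.default_rng(seed)
    model = []
    for _ in range(r):
        shapelets, locations = rand_samp_shapelets(X, min_len, max_len, k)
        atts = rng.choice(np.arange(1, 26), size=a, replace=False)  # a of the 25 feature indices
        D = rand_shapelet_transform_cf(X, y, shapelets, locations, atts,
                                       shift, compute_feature)
        tree = DecisionTreeClassifier().fit(D[:, :-1], D[:, -1])
        model.append({"tree": tree, "shapelets": shapelets,
                      "locations": locations, "atts": atts})
    return model
```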
Algorithm 4. classification(T, RSFCFModel, shift)
Input: Time series to be classified T, classification model RSFCFModel, Shapelet maximum offset shift
Output: Category y of time series T
1. Y ← null;
2. for Treei in RSFCFModel
3.     T' ← instanceTransform(T, Treei.Shapelets, Treei.Locations, Treei.Atts, shift);
4.     yi ← Treei.classifyInstance(T');
5.     Y.add(yi);
6. end for
7. y ← majorityVote(Y);
8. return y;
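The matching prediction step from Algorithm 4, again reusing restricted_sub_dist from the earlier sketch; instanceTransform is inlined and the final label is a plain majority vote over the per-tree predictions.

```python
from collections import Counter

import numpy as np

def classify(series, model, shift, compute_feature):
    """Algorithm 4: transform the unseen series once per tree with that
    tree's stored shapelets, locations and feature indices, then take a
    majority vote over the per-tree predictions."""
    votes = []
    for member in model:
        feats = []
        for shp, loc in zip(member["shapelets"], member["locations"]):
            dist, best_sub = restricted_sub_dist(shp, series, shift, loc)
            feats.append(dist)
            feats.extend(compute_feature(best_sub, c) for c in member["atts"])
        row = np.asarray(feats, dtype=float).reshape(1, -1)
        votes.append(member["tree"].predict(row)[0])
    return Counter(votes).most_common(1)[0][0]
```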
4.3. Time Complexity Analysis
5. Experimental Analysis
5.1. Parameter Settings
5.2. Ablation Experiment
5.3. Accuracy Comparison
5.4. Efficiency Comparison
5.5. Design Strategy Validation
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Parameter | Description | Value |
---|---|---|
min | Shapelet minimum length | 3
max | Shapelet maximum length | l
k | Number of Shapelets randomly selected for each tree | 
a | Number of typical time series features used in the data transformation | 8
shift | Maximum offset of a Shapelet | l/10
r | Number of trees in the RSFCF classification model | 500
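Read as a configuration, the table above corresponds to something like the hypothetical helper below, assuming l denotes the training series length; the value of k is left to the caller because the table does not give one.

```python
def default_params(series_len, k):
    """Hypothetical RSFCF parameter set mirroring the table above
    (series_len plays the role of l)."""
    return {
        "min": 3,                   # Shapelet minimum length
        "max": series_len,          # Shapelet maximum length l
        "k": k,                     # Shapelets sampled per tree (value not given in the table)
        "a": 8,                     # typical time series features per shapelet match
        "shift": series_len // 10,  # maximum Shapelet offset l/10
        "r": 500,                   # number of trees in the RSFCF model
    }
```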
Dataset | Accuracy (%) | Dataset | Accuracy (%) | Dataset | Accuracy (%) | Dataset | Accuracy (%) | Dataset | Accuracy (%) |
---|---|---|---|---|---|---|---|---|---|
ACSF1 | 86.00 | ECG200 | 86.00 | Herring | 67.19 | PhaOC | 82.52 | Strwbe | 96.76 |
Adiac | 77.75 | ECG5000 | 94.53 | House20 | 97.48 | Phoneme | 38.19 | SwdLeaf | 95.20 |
ArrHead | 81.71 | ECG5D | 100.00 | InlSka | 47.45 | PigAirP | 40.38 | Symbols | 98.69 |
Beef | 76.67 | ElecDev | 74.57 | InEPGRT | 100.00 | PigArtP | 97.60 | SynCtl | 99.33 |
BeetFly | 85.00 | EOGHS | 65.47 | InEPGST | 100.00 | PigCVP | 60.58 | ToeSeg1 | 96.93 |
BirdChi | 90.00 | EOGVS | 56.63 | InWSnd | 65.35 | Plane | 100.00 | ToeSeg2 | 89.23 |
BME | 100.00 | EthLevel | 57.60 | ItPwDem | 96.02 | PowCons | 99.44 | Trace | 100.00 |
Car | 90.00 | FaceA | 76.51 | LKitApp | 80.00 | PrPhOAG | 82.93 | TwLECG | 99.82 |
CBF | 100.00 | FaceF | 100.00 | Light2 | 77.05 | PrPhOC | 87.63 | TwPatt | 99.80 |
Chtown | 97.38 | FacesU | 92.63 | Light7 | 75.34 | PrPhTW | 80.49 | UMD | 97.22 |
ChConc | 70.89 | 50Words | 78.68 | Mallat | 98.59 | RefDev | 58.13 | UWaAll | 97.54 |
CinCECG | 93.70 | Fish | 97.14 | Meat | 93.33 | Rock | 86.00 | UWaX | 82.72 |
Coffee | 100.00 | FordA | 92.27 | MdImg | 74.87 | ScrType | 52.00 | UWaY | 76.33 |
Comput | 72.80 | FordB | 77.90 | MdPhOAG | 62.99 | SeHGCh2 | 94.50 | UWaZ | 77.44 |
CriketX | 76.67 | FreeRT | 100.00 | MdPhOC | 83.16 | SeHMCh2 | 88.89 | Wafer | 99.74 |
CriketY | 81.54 | FreeST | 99.93 | MdPhTW | 58.44 | SeHSCh2 | 93.56 | Wine | 70.37 |
CriketZ | 83.08 | GunP | 99.33 | MixSRT | 95.75 | ShpSim | 98.33 | WordSyn | 70.53 |
Crop | 77.00 | GunPAS | 99.05 | MixSST | 94.14 | ShpAll | 85.67 | Worms | 75.32 |
DiaSRed | 85.29 | GunPMF | 99.68 | MtStr | 93.37 | SKitApp | 81.87 | WormsTC | 79.22 |
DiPhOAG | 74.82 | GunPOY | 100.00 | NoECGT1 | 93.18 | SmthSub | 98.67 | Yoga | 89.47 |
DiPhOC | 78.26 | Ham | 76.19 | NoECGT2 | 94.71 | SonyRS1 | 83.19 | ||
DiPhTW | 69.06 | HandOut | 92.70 | OliOil | 83.33 | SonyRS2 | 85.73 | ||
Earthqu | 74.82 | Haptics | 53.57 | OSULeaf | 75.62 | StarCur | 98.13 |
 | RSFCF (8 Features) | Catch22 (22 Features) | Mean/Std/Slope (3 Features) |
---|---|---|---|
Accuracy (%) | 84.52 | 84.50 | 82.52 |
Compared Algorithm | Win/Draw/Loss |
---|---|
HIVE-COTE | 36/12/64 |
STC | 65/10/37 |
ResNet | 54/13/45 |
CIF | 62/19/31 |
WEASEL | 63/11/38 |
Proximity Forest | 64/14/34 |
gRSF | 60/12/40 |