Efficient Ensemble Classification for Multi-Label Data Streams with Concept Drift
Abstract
1. Introduction
- (1)
- To address the issue of concept drift, a change detection mechanism is employed to react quickly to sudden concept changes. Moreover, a periodic weighting mechanism is adopted to cope with gradual concept drift.
- (2)
- To deal with the problem of recurrent concept drift, the Jensen–Shannon divergence is selected as the metric for measuring the difference between the distributions of two consecutive windows, which hold the older and the more recent instances, respectively (its standard definition is given after this list). Moreover, a novel ensemble classification framework is presented, which maintains a pool of classifiers (each classifier representing one of the observed concepts) and predicts the labels of incoming instances using a weighted majority voting rule. Once a change is detected, a new concept is identified and a new classifier is learned and added to the pool.
- (3)
- Label dependency is taken into account by pruning away infrequent label combinations, which enhances classification performance.
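For reference, the Jensen–Shannon divergence named in contribution (2) has the standard definition below; the paper's distance d(W1, W2) in Equation (2) presumably instantiates it with the empirical distributions of the two windows:

$$\mathrm{JSD}(P \parallel Q) = \tfrac{1}{2}\, D_{\mathrm{KL}}(P \parallel M) + \tfrac{1}{2}\, D_{\mathrm{KL}}(Q \parallel M), \qquad M = \tfrac{1}{2}(P + Q),$$

where $D_{\mathrm{KL}}(P \parallel M) = \sum_i P(i) \log \frac{P(i)}{M(i)}$ is the Kullback–Leibler divergence. Unlike the KL divergence, the JSD is symmetric and bounded, which makes it convenient as a change-detection statistic.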
2. Related Work
2.1. Basic Concepts and Notations
2.2. Multi-Label Learning Algorithms
2.3. Classifiers for Multi-Label Data Streams
3. Proposed Method
3.1. Change Detection Based on Jensen–Shannon Divergence
Algorithm 1 Two-window-based change detection
input: data stream S, window size n
output: ChangeAlarm
01: initialize t = 0;
02: set reference window W1 = {xt+1, …, xt+n};
03: set current window W2 = {xt+n+1, …, xt+2n};
04: for each incoming instance in S do
05:   calculate d(W1, W2) according to Equation (2);
06:   calculate ε according to Theorem 2;
07:   if d(W1, W2) > ε then
08:     t ← current time;
09:     report a change alarm at time t;
10:     clear both windows and go to step 02;
11:   else if d(W1, W2) == 0 then
12:     report a recurring concept;
13:   else
14:     slide W2 by one point;
15:   end if
16: end for
17: end
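A minimal sketch of the two-window scheme above, assuming a univariate numeric stream whose windows are summarized by histograms. The paper derives the threshold ε from Theorem 2, which is not reproduced in this outline, so the sketch takes a user-supplied `threshold` instead:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (in bits) between two count vectors."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log2(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def two_window_detector(stream, n, threshold, n_bins=10):
    """Compare a fixed reference window W1 against a current window W2
    that slides one point at a time; alarm when the histogram divergence
    exceeds `threshold`."""
    stream = np.asarray(stream, dtype=float)
    # Bins fixed from the full stream for simplicity; a real detector
    # would maintain them incrementally.
    edges = np.linspace(stream.min(), stream.max(), n_bins + 1)
    hist = lambda w: np.histogram(w, bins=edges)[0]
    w1 = hist(stream[:n])                       # reference window W1
    for t in range(n, len(stream) - n + 1):
        w2 = hist(stream[t:t + n])              # current window W2
        if js_divergence(w1, w2) > threshold:
            return t + n                        # change alarm time
    return None                                 # no change detected

# Toy stream: the mean shifts from 0 to 3 at t = 2000.
rng = np.random.default_rng(0)
s = np.concatenate([rng.normal(0, 1, 2000), rng.normal(3, 1, 2000)])
print(two_window_detector(s, n=500, threshold=0.1))  # alarm near the drift at t = 2000
```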
3.2. Ensemble Classifier for Multi-Label Data Streams Using Change Detection
3.2.1. The Framework of Our Method
3.2.2. Exploiting Label Dependency
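The outline gives no prose for this subsection beyond contribution (3)'s note that infrequent label combinations are pruned. A minimal sketch of one plausible reading, with a hypothetical `min_count` cutoff (the paper's actual pruning rule and threshold are not stated here):

```python
from collections import Counter

def prune_infrequent_labelsets(labelsets, min_count=5):
    """Keep only instances whose label combination occurs at least
    `min_count` times in the window; rare combinations are pruned so the
    classifier concentrates on dependable label co-occurrences."""
    counts = Counter(frozenset(ls) for ls in labelsets)
    frequent = {ls for ls, c in counts.items() if c >= min_count}
    return [ls for ls in labelsets if frozenset(ls) in frequent]

labels = [{"sports"}, {"sports", "news"}, {"sports"}, {"weather"}]
print(prune_infrequent_labelsets(labels, min_count=2))  # [{'sports'}, {'sports'}]
```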
3.2.3. Weighting Mechanism
Algorithm 2 MLAW algorithm
input: S: multi-label data stream, k: number of ensemble members
output: E: ensemble of k weighted classifiers
01: begin
02:   E ← Ø;
03:   for each instance xt ∈ S do
04:     W ← W ∪ {xt};
05:     if a change is detected then
06:       create a new classifier C′;
07:       update the weights of all classifiers in E;
08:       if |E| < k then
09:         E ← E ∪ {C′};
10:       else
11:         prune the worst classifier and add C′ in its place;
12:       end if
13:     else if a recurring concept is detected then
14:       reuse the corresponding classifier in E;
15:     end if
16:   end for
17: end
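A skeleton of the loop in Algorithm 2, assuming hypothetical callables for the pieces the pseudocode only names: `change_detected` wraps the detector of Algorithm 1, `is_recurring` returns the index of a pool member matching a recurring concept (or None), `train` fits a base multi-label learner on the window, and `evaluate` scores a member on recent data to serve as its weight:

```python
def mlaw_loop(stream, k, change_detected, is_recurring, train, evaluate):
    """Skeleton of the MLAW ensemble maintenance loop (Algorithm 2)."""
    pool = []                                   # list of (classifier, weight)
    window = []
    for x in stream:
        window.append(x)
        if not change_detected(window):
            continue
        idx = is_recurring(window, pool)
        if idx is not None:                     # recurring concept: reuse member
            clf, _ = pool[idx]
            pool[idx] = (clf, evaluate(clf, window))
        else:                                   # new concept: learn a new member
            new_clf = train(window)
            pool = [(c, evaluate(c, window)) for c, _ in pool]  # re-weight all
            if len(pool) >= k:                  # pool full: prune the worst
                pool.remove(min(pool, key=lambda m: m[1]))
            pool.append((new_clf, evaluate(new_clf, window)))
        window = []  # assumption: restart the window after a change
    return pool

def predict(pool, x):
    """Weighted majority vote; each clf maps an instance to a NumPy
    0/1 label vector."""
    total = sum(w for _, w in pool)
    score = sum(w * clf(x) for clf, w in pool) / total
    return score >= 0.5                         # relevance threshold per label
```

The `evaluate` callable stands in for the periodic weighting mechanism of Section 3.2.3, whose exact formula is not given in this outline.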
4. Experiments
4.1. Datasets
4.2. Evaluation Metrics
- (1)
- Hamming loss: The Hamming loss averages the standard 0/1 classification error over the m labels and hence corresponds to the proportion of labels whose relevance is incorrectly predicted. It is defined in Equation (6); the standard form is reproduced after this list.
- (2)
- (3)
- The F1 measure is the harmonic mean of precision and recall; its formula is also given after this list.
- (4)
- Log-loss differs from the other measures in that it punishes larger errors more harshly, and thus provides a good contrast to them [20]; its usual form is given after this list.
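For reference, the standard forms of these metrics are as follows (Hamming loss corresponds to Equation (6); P and R denote precision and recall; the log-loss shown is the common instance- and label-averaged multi-label variant, though others exist):

$$\mathrm{HammingLoss} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{m}\,\bigl|\,Y_i \,\triangle\, Z_i\,\bigr|, \qquad F_1 = \frac{2\,P\,R}{P + R},$$

$$\mathrm{LogLoss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{m}\Bigl[\,y_{ij}\log \hat{p}_{ij} + (1-y_{ij})\log(1-\hat{p}_{ij})\,\Bigr],$$

where $Y_i$ and $Z_i$ are the true and predicted label sets of instance $i$, $\triangle$ is the symmetric difference, $m$ is the number of labels, $y_{ij}\in\{0,1\}$ indicates the relevance of label $j$, and $\hat{p}_{ij}$ is the predicted probability of that relevance.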
4.3. Methods and Results
4.3.1. Parameter Sensitiveness
4.3.2. Comparative Performance Study
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Cohen, L.; Avrahami-Bakish, G.; Last, M.; Kandel, A.; Kipersztok, O. Real-time data mining of non-stationary data streams from sensor networks. Inf. Fusion 2008, 9, 344–353. [Google Scholar] [CrossRef]
- Bhuiyan, H.; Ashiquzzaman, A.; Juthi, T.I. A Survey of existing E-mail spam filtering methods considering machine learning techniques. Glob. J. Comput. Sci. Technol. 2018, 18, 21–29. [Google Scholar]
- Costa, K.A.; Papa, J.P.; Lisboa, C.O.; Munoz, R.; Albuquerque, V.H.C. Internet of Things: A survey on machine learning-based intrusion detection approaches. Comput. Netw. 2019, 151, 147–157. [Google Scholar] [CrossRef]
- Livieris, I.E.; Kiriakidou, N.; Kanavos, A.; Tampakas, V.; Pintelas, P. On Ensemble SSL Algorithms for Credit Scoring Problem. Informatics 2018, 5, 40. [Google Scholar] [CrossRef]
- Gama, J. Knowledge Discovery from Data Streams; Chapman & Hall/CRC: London, UK, 2010. [Google Scholar]
- Domingos, P.; Hulten, G. Mining high-speed data streams. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2000), Boston, MA, USA, 20–23 August 2000; pp. 71–80. [Google Scholar]
- Read, J.; Bifet, A.; Holmes, G.; Pfahringer, B. Scalable and efficient multi-label classification for evolving data streams. Mach. Learn. 2012, 88, 243–272. [Google Scholar] [CrossRef] [Green Version]
- Tsymbal, A. The Problem of Concept Drift: Definitions and Related Work; Technical Report; Department of Computer Science, Trinity College: Dublin, Ireland, 2004. [Google Scholar]
- Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A. A survey on concept drift adaptation. ACM Comput. Surv. 2014, 46, 44:1–44:37. [Google Scholar] [CrossRef]
- Livieris, I.E.; Kanavos, A.; Tampakas, V.; Pintelas, P. A Weighted Voting Ensemble Self-Labeled Algorithm for the Detection of Lung Abnormalities from X-Rays. Algorithms 2019, 12, 64. [Google Scholar] [CrossRef]
- Webb, G.I.; Hyde, R.; Cao, H.; Nguyen, H.L.; Petitjean, F. Characterizing concept drift. Data Mining Knowl. Discov. 2016, 30, 964–994. [Google Scholar] [CrossRef] [Green Version]
- Zhang, M.L.; Zhou, Z.H. A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 2014, 26, 1819–1837. [Google Scholar] [CrossRef]
- Zhou, Z.H. Ensemble Methods: Foundations and Algorithms; Chapman and Hall/CRC: London, UK, 2012. [Google Scholar]
- Tsoumakas, G.; Katakis, I.; Vlahavas, I. Mining Multi-Label Data. Data Mining and Knowledge Discovery Handbook; Springer: Boston, MA, USA, 2010; pp. 667–685. [Google Scholar]
- Clare, A.; King, R.D. Knowledge discovery in multi-label phenotype data. In Proceedings of the Fifth European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2001), Freiburg, Germany, 3–5 September 2001; pp. 42–53. [Google Scholar]
- Zhang, M.; Zhou, Z.H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007, 40, 2038–2048. [Google Scholar] [CrossRef]
- Zhang, M.L.; Zhou, Z.H. Multi-label neural networks with applications to functional genomics and text categorization. IEEE Trans. Knowl. Data Eng. 2006, 18, 1338–1351. [Google Scholar] [CrossRef]
- Schapire, R.E.; Singer, Y. BoosTexter: A boosting-based system for text categorization. Mach. Learn. 2000, 39, 135–168. [Google Scholar] [CrossRef]
- Read, J.; Pfahringer, B.; Holmes, G.; Frank, E. Classifier chains for multi-label classification. Mach. Learn. 2011, 85, 333–359. [Google Scholar] [CrossRef] [Green Version]
- Cheng, W.; Hüllermeier, E. Combining instance-based learning and logistic regression for multilabel classification. Mach. Learn. 2009, 76, 211–225. [Google Scholar] [CrossRef] [Green Version]
- Zhang, M.; Zhang, K. Multi-label learning by exploiting label dependency. In Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010), Washington, DC, USA, 25–28 July 2010; pp. 999–1008. [Google Scholar]
- Dembczynski, K.; Waegeman, W.; Cheng, W.; Hüllermeier, E. On label dependence and loss minimization in multi-label classification. Mach. Learn. 2012, 88, 5–45. [Google Scholar] [CrossRef]
- Qu, W.; Zhang, Y.; Zhu, Y.J. Mining multi-label concept-drifting data streams using dynamic classifier ensemble. In Proceedings of the First Asian Conference on Machine Learning (ACML 2009, LNCS 5828), Nanjing, China, 2–4 November 2009; pp. 308–321. [Google Scholar]
- Kong, X.; Yu, P.S. An ensemble-based approach to fast classification of multi-label data streams. In Proceedings of the Seventh International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2011), Orlando, FL, USA, 15–18 October 2011; pp. 95–104. [Google Scholar]
- Read, J.; Bifet, A.; Pfahringer, B. Efficient Multi-Label Classification for Evolving Data Streams; Technical Report; University of Waikato: Hamilton, New Zealand, 2011. [Google Scholar]
- Xioufis, E.S.; Spiliopoulou, M.; Tsoumakas, G. Dealing with concept drift and class imbalance in multi-label stream classification. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011), Barcelona, Spain, 16–22 July 2011; pp. 1583–1588. [Google Scholar]
- Shi, Z.; Wen, Y.; Feng, C. Drift detection for multi-label data streams based on label grouping and entropy. In Proceedings of the Fourteenth International Conference on Data Mining Workshop (ICDM 2014), Shenzhen, China, 14 December 2014; pp. 724–731. [Google Scholar]
- Osojnik, A.; Panov, P.; Džeroski, S. Multi-label classification via multi-target regression on data streams. Mach. Learn. 2017, 106, 745–770. [Google Scholar]
- Roseberry, M.; Cano, A. Multi-label kNN Classifier with Self Adjusting Memory for Drifting Data Streams. In Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications, Dublin, Ireland, 10–14 September 2018; pp. 23–37. [Google Scholar]
- Büyükçakir, A.; Bonab, H.; Can, F. A novel online stacked ensemble for multi-label stream classification. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 1063–1072. [Google Scholar]
- Gama, J.; Medas, P.; Castillo, G.; Rodrigues, P.P. Learning with drift detection. In Proceedings of the Seventeenth Brazilian Symposium on Artificial Intelligence (SBIA 2004, LNCS 3171), São Luis, Maranhão, Brazil, 29 September–1 October 2004; pp. 286–295. [Google Scholar]
- Bifet, A.; Gavaldà, R. Learning from time-changing data with adaptive windowing. In Proceedings of the Seventh SIAM International Conference on Data Mining (SDM 2007), Minneapolis, MN, USA, 26–28 April 2007; pp. 443–448. [Google Scholar]
- Ross, G.J.; Adams, N.M.; Tasoulis, D.K. Exponentially weighted moving average charts for detecting concept drift. Pattern Recognit. Lett. 2012, 33, 191–198. [Google Scholar] [CrossRef] [Green Version]
- Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Statist. 1951, 22, 79–86. [Google Scholar] [CrossRef]
- Sun, Y.; Wang, Z.; Liu, H. Online Ensemble Using Adaptive Windowing for Data Streams with Concept Drift. Int. J. Distrib. Sens. Netw. 2016, 1–9. [Google Scholar] [CrossRef]
- Bifet, A.; Holmes, G.; Kirkby, R.; Pfahringer, B. MOA: Massive online analysis. J. Mach. Learn. Res. 2010, 11, 1601–1604. [Google Scholar] [CrossRef]
| Dataset | Domain | \|D\| | \|X\| | \|L\| | LC |
|---|---|---|---|---|---|
| TMC2007-500 | Text | 28,596 | 500b | 22 | 2.219 |
| Mediamill | Video | 43,907 | 120n | 101 | 4.376 |
| IMDB | Text | 95,424 | 1001b | 28 | 1.999 |
| Random Tree | Synthetic (tree-based) | 100,000 | 30b | 6 | 1.8 → 3.0 |
| RBF | Synthetic (RBF-based) | 100,000 | 80n | 25 | 1.5 → 3.5 |

\|D\| = number of instances; \|X\| = number of attributes (b = binary, n = numeric); \|L\| = number of labels; LC = average label cardinality (arrows mark values that drift over the course of the synthetic streams).
Sizes of windows:

| Dataset | 500 | 750 | 1000 | 1250 | 1500 | 1750 | 2000 |
|---|---|---|---|---|---|---|---|
| TMC2007-500 | 0.0524 | 0.0498 | 0.0579 | 0.0603 | 0.0599 | 0.0625 | 0.0694 |
| Mediamill | 0.0132 | 0.0126 | 0.0109 | 0.0218 | 0.0179 | 0.0230 | 0.0289 |
| IMDB | 0.0993 | 0.0963 | 0.0986 | 0.0850 | 0.0911 | 0.0979 | 0.0985 |
| Random Tree | 0.2030 | 0.1183 | 0.1035 | 0.1891 | 0.2100 | 0.2301 | 0.2354 |
| RBF | 0.0753 | 0.0698 | 0.0650 | 0.0642 | 0.0653 | 0.0668 | 0.0697 |
Sizes of windows:

| Dataset | 500 | 750 | 1000 | 1250 | 1500 | 1750 | 2000 |
|---|---|---|---|---|---|---|---|
| TMC2007-500 | 71.69 | 72.31 | 72.23 | 72.01 | 71.61 | 72.01 | 71.25 |
| Mediamill | 76.68 | 77.59 | 78.69 | 78.12 | 78.02 | 77.12 | 77.01 |
| IMDB | 81.16 | 81.25 | 81.36 | 81.54 | 80.95 | 81.53 | 82.01 |
| Random Tree | 93.16 | 93.30 | 93.26 | 93.10 | 93.01 | 93.05 | 90.00 |
| RBF | 88.58 | 87.90 | 89.89 | 89.05 | 89.49 | 89.71 | 89.69 |
Ranks are given in parentheses.

| Dataset | ML-SAM-knn | MHT | MLOzaBag | MLAW |
|---|---|---|---|---|
| TMC2007-500 | 0.0475 (2) | 0.1706 (4) | 0.0521 (3) | 0.0509 (1) |
| Mediamill | 0.1013 (3) | 0.0665 (2) | 0.0392 (1) | 0.1039 (4) |
| IMDB | 0.0456 (1) | 0.1095 (3) | 0.1304 (4) | 0.0986 (2) |
| Random Tree | 0.0541 (2) | 0.0614 (4) | 0.0589 (3) | 0.0135 (1) |
| RBF | 0.0811 (2) | 0.1400 (4) | 0.0934 (3) | 0.0650 (1) |
| Average rank | 2.00 | 3.40 | 2.80 | 1.80 |
| Dataset | ML-SAM-knn | MHT | MLOzaBag | MLAW |
|---|---|---|---|---|
| TMC2007-500 | 0.213 (3) | 0.170 (4) | 0.521 (2) | 0.673 (1) |
| Mediamill | 0.197 (4) | 0.265 (3) | 0.392 (1) | 0.301 (2) |
| IMDB | 0.154 (2) | 0.195 (1) | 0.134 (3) | 0.129 (4) |
| Random Tree | 0.550 (4) | 0.574 (3) | 0.689 (2) | 0.792 (1) |
| RBF | 0.678 (3) | 0.590 (4) | 0.746 (1) | 0.703 (2) |
| Average rank | 3.20 | 3.00 | 1.80 | 2.00 |
| Dataset | ML-SAM-knn | MHT | MLOzaBag | MLAW |
|---|---|---|---|---|
| TMC2007-500 | 7.9 (3) | 7.5 (2) | 8.2 (4) | 6.2 (1) |
| Mediamill | 15.2 (3) | 9.8 (1) | 19.8 (4) | 13.0 (2) |
| IMDB | 5.3 (1) | 19.0 (4) | 7.1 (3) | 9.3 (2) |
| Random Tree | 6.4 (2) | 17.5 (4) | 7.6 (3) | 3.8 (1) |
| RBF | 3.7 (2) | 5.8 (4) | 4.6 (3) | 2.7 (1) |
| Average rank | 2.20 | 3.00 | 3.40 | 1.40 |
| Dataset | ML-SAM-knn | MHT | MLOzaBag | MLAW |
|---|---|---|---|---|
| TMC2007-500 | 0.214 (3) | 0.026 (4) | 0.285 (2) | 0.457 (1) |
| Mediamill | 0.028 (4) | 0.034 (3) | 0.042 (2) | 0.103 (1) |
| IMDB | 0.105 (1) | 0.021 (4) | 0.087 (2) | 0.016 (4) |
| Random Tree | 0.173 (3) | 0.103 (4) | 0.261 (1) | 0.254 (2) |
| RBF | 0.389 (2) | 0.274 (4) | 0.378 (3) | 0.436 (1) |
| Average rank | 2.60 | 3.80 | 2.00 | 1.80 |
| Dataset | ML-SAM-knn | MHT | MLOzaBag | MLAW |
|---|---|---|---|---|
| TMC2007-500 | 8 (1) | 14 (2) | 45 (4) | 36 (3) |
| Mediamill | 34 (1) | 71 (2) | 457 (4) | 89 (3) |
| IMDB | 285 (1) | 680 (4) | 359 (2) | 543 (3) |
| Random Tree | 38 (3) | 19 (4) | 87 (2) | 9 (1) |
| RBF | 113 (1) | 133 (2) | 201 (4) | 152 (3) |
| Average rank | 1.40 | 2.40 | 3.20 | 2.60 |