A Novel Electricity Theft Detection Scheme Based on Text Convolutional Neural Networks
Abstract
1. Introduction
- (1) We analyze the structure of electricity consumption data and transform it into a two-dimensional time series. This structure preserves the complete power consumption information of users, i.e., consumption patterns at multiple time scales, such as the consumption during the same period on different days and the total daily consumption across days.
- (2) We propose a novel electricity theft detection method based on TextCNN. The proposed method extracts features at different time scales from the two-dimensional time series. To improve the accuracy and efficiency of training and detection, we design the detection network based on TextCNN. To evaluate its performance, we conduct extensive experiments on residential and industrial datasets from a province in China and on the public Irish residential dataset.
- (3) We propose a data augmentation method to expand the training data, in view of the shortage of electricity theft samples. Experimental analysis indicates that, with a properly configured augmentation process, the method effectively improves detection accuracy.
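The transformation described in contribution (1) can be sketched as a simple reshape of a user's flat meter-reading sequence into a day-by-timeslot matrix. The window length and sampling rate below are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np

# Hypothetical setup: a half-hourly meter yields 48 readings per day,
# and we take a 28-day window, so one user's record is a flat vector.
days, readings_per_day = 28, 48
flat_series = np.random.default_rng(0).random(days * readings_per_day)

# Reshape the 1D sequence into a 2D matrix: one row per day, one column
# per time slot. Rows expose the daily consumption pattern; columns expose
# the consumption at the same period on different days.
consumption_2d = flat_series.reshape(days, readings_per_day)

print(consumption_2d.shape)  # (28, 48)
```

Convolution kernels sliding over this matrix can then capture both intraday and day-to-day patterns, which is the motivation for the two-dimensional structure.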
2. Methodology
2.1. Data Structure Analysis
2.2. CNN Structure Analysis
2.2.1. Basic Introduction to CNN
2.2.2. Differences between CNN and TextCNN
3. Proposed Approach
3.1. Data Preprocessing
3.2. Proposed Neural Network Structure Based on TextCNN
3.2.1. Convolutional Layer
3.2.2. Pooling Layer
3.2.3. Fully-Connected Layer
3.2.4. Parameters of the Proposed Neural Network
3.3. Data Augmentation
4. Experimental Settings
4.1. Datasets
- (a) Residential user dataset
- (b) Industrial user dataset
- (c) Irish residential user dataset
4.2. Baselines
- Logistic regression (LR). Logistic regression is a statistical model for classification problems with a binary dependent variable: it models the class probabilities directly. Its coefficients, which explain the relationship between the inputs and the output, are estimated by maximum likelihood.
- Support vector machine (SVM). A support vector machine is a supervised learning model that can be used for classification. It uses the kernel trick to map the input into a high-dimensional feature space implicitly, then constructs a hyperplane in that space, which serves as the decision boundary for classification.
- Deep neural network (DNN). A deep neural network is a feedforward neural network with multiple hidden layers. Through the neurons in its hidden layers, a DNN can model complex non-linear relationships, which makes it suitable for classification problems. The backpropagation algorithm is used to update the weights, as it efficiently computes the gradient of the loss function with respect to the network weights.
- One-dimensional CNN (1D-CNN). The 1D-CNN is a classifier similar to the proposed model, except that the input is the user's one-dimensional electricity consumption sequence rather than the two-dimensional structure. The network structure is otherwise the same as the proposed model described in Section 3.2.
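The classical baselines above can be sketched with scikit-learn. The data here are random placeholders rather than the real datasets of Section 4.1, and the configurations mirror the baseline parameter table in this section:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 200 users x 48 half-hourly readings, binary labels
# (1 = theft, 0 = normal). Real experiments use the datasets of Section 4.1.
rng = np.random.default_rng(42)
X = rng.random((200, 48))
y = rng.integers(0, 2, 200)

# Baselines configured per the parameter table (L1-penalized LR with the
# liblinear solver, RBF-kernel SVM, and a 3-hidden-layer DNN of 100/60/60).
baselines = {
    "LR": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "SVM": SVC(C=1.0, kernel="rbf"),
    "DNN": MLPClassifier(hidden_layer_sizes=(100, 60, 60), max_iter=500),
}

for name, model in baselines.items():
    model.fit(X, y)
    print(name, model.score(X, y))
```

Each baseline receives the flattened 1-D consumption vector; only the proposed TextCNN-based model consumes the 2-D structure.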
4.3. Metrics
5. Results and Analysis
5.1. Performance Comparison
5.2. Parameter Study
5.3. Data Augmentation Analysis
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
| Datasets | (a) | (b) | (c) |
|---|---|---|---|
| Time span | 1 October 2015–31 March 2019 | 1 October 2015–31 March 2019 | 1 January 2009–31 December 2010 |
| Total users | 4627 | 8144 | 5000 |
| Normal users | 3564 | 8052 | 5000 |
| Electricity thieves | 1063 | 92 | 0 |
| Baselines | Data Dimension | Parameters |
|---|---|---|
| LR | 1-D | Penalty: L1; solver: liblinear; inverse of regularization strength: 1 |
| SVM | 1-D | Regularization parameter: 1.0; kernel: RBF |
| DNN | 1-D | Hidden layers: 3; neurons per hidden layer: 100, 60, 60 |
| 1D-CNN | 1-D | Same as the parameters of the proposed method |
| Proposed method | 2-D | Introduced in Section 3.2.4 |
| Confusion Matrix | Actual Negative (Normal) | Actual Positive (Theft) |
|---|---|---|
| Classified negative (normal) | True Negative (TN) | False Negative (FN) |
| Classified positive (theft) | False Positive (FP) | True Positive (TP) |
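The evaluation metrics of Section 4.3 (accuracy AR, precision PR, recall RR, and F1) follow directly from the confusion-matrix counts. A small helper, with purely illustrative counts:

```python
def metrics(tn, fn, fp, tp):
    """Compute AR, PR, RR, and F1 from confusion-matrix counts."""
    ar = (tp + tn) / (tp + tn + fp + fn)  # accuracy rate
    pr = tp / (tp + fp)                   # precision rate
    rr = tp / (tp + fn)                   # recall rate
    f1 = 2 * pr * rr / (pr + rr)          # harmonic mean of PR and RR
    return ar, pr, rr, f1

# Hypothetical counts for illustration (not taken from the paper's results):
ar, pr, rr, f1 = metrics(tn=90, fn=4, fp=6, tp=20)
print(ar, pr, rr, f1)
```

Because theft samples are the rare positive class, PR, RR, and F1 are more informative than AR alone, which motivates reporting all four in the results tables below.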
Training = 50%

| Model | (a) AR | (a) PR | (a) RR | (a) F1 | (b) AR | (b) PR | (b) RR | (b) F1 | (c) AR | (c) PR | (c) RR | (c) F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.851 | 0.623 | 0.488 | 0.547 | 0.577 | 0.487 | 0.442 | 0.463 | 0.867 | 0.706 | 0.827 | 0.762 |
| SVM | 0.694 | 0.733 | 0.022 | 0.042 | 0.523 | 1.000 | 0.023 | 0.045 | 0.793 | 0.886 | 0.223 | 0.356 |
| DNN | 0.843 | 0.596 | 0.429 | 0.487 | 0.514 | 0.384 | 0.349 | 0.366 | 0.844 | 0.655 | 0.543 | 0.575 |
| 1D-CNN | 0.871 | 0.689 | 0.439 | 0.536 | 0.714 | 0.913 | 0.488 | 0.636 | 0.843 | 0.682 | 0.677 | 0.787 |
| Proposed CNN | 0.830 | 0.956 | 0.601 | 0.738 | 0.795 | 0.945 | 0.634 | 0.759 | 0.870 | 0.719 | 0.803 | 0.835 |

Training = 60%

| Model | (a) AR | (a) PR | (a) RR | (a) F1 | (b) AR | (b) PR | (b) RR | (b) F1 | (c) AR | (c) PR | (c) RR | (c) F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.851 | 0.614 | 0.519 | 0.562 | 0.654 | 0.586 | 0.531 | 0.557 | 0.898 | 0.748 | 0.888 | 0.812 |
| SVM | 0.700 | 0.917 | 0.027 | 0.052 | 0.676 | 0.727 | 0.094 | 0.167 | 0.810 | 0.838 | 0.290 | 0.431 |
| DNN | 0.851 | 0.618 | 0.429 | 0.497 | 0.732 | 0.375 | 0.562 | 0.450 | 0.863 | 0.683 | 0.600 | 0.635 |
| 1D-CNN | 0.890 | 0.730 | 0.500 | 0.594 | 0.719 | 0.944 | 0.531 | 0.680 | 0.812 | 0.610 | 0.615 | 0.744 |
| Proposed CNN | 0.846 | 0.931 | 0.669 | 0.779 | 0.834 | 0.952 | 0.720 | 0.819 | 0.919 | 0.825 | 0.876 | 0.897 |

Training = 70%

| Model | (a) AR | (a) PR | (a) RR | (a) F1 | (b) AR | (b) PR | (b) RR | (b) F1 | (c) AR | (c) PR | (c) RR | (c) F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.852 | 0.633 | 0.496 | 0.556 | 0.678 | 0.714 | 0.400 | 0.513 | 0.907 | 0.725 | 0.930 | 0.815 |
| SVM | 0.700 | 0.833 | 0.030 | 0.058 | 0.660 | 0.500 | 0.094 | 0.158 | 0.833 | 0.793 | 0.324 | 0.460 |
| DNN | 0.855 | 0.636 | 0.411 | 0.494 | 0.833 | 0.692 | 0.360 | 0.474 | 0.856 | 0.660 | 0.624 | 0.636 |
| 1D-CNN | 0.875 | 0.778 | 0.398 | 0.527 | 0.735 | 0.833 | 0.600 | 0.698 | 0.840 | 0.750 | 0.500 | 0.750 |
| Proposed CNN | 0.893 | 0.839 | 0.690 | 0.757 | 0.844 | 0.952 | 0.756 | 0.850 | 0.920 | 0.785 | 0.966 | 0.904 |

Training = 80%

| Model | (a) AR | (a) PR | (a) RR | (a) F1 | (b) AR | (b) PR | (b) RR | (b) F1 | (c) AR | (c) PR | (c) RR | (c) F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.837 | 0.610 | 0.472 | 0.532 | 0.652 | 0.529 | 0.529 | 0.529 | 0.917 | 0.712 | 0.977 | 0.824 |
| SVM | 0.695 | 0.778 | 0.035 | 0.066 | 0.660 | 0.615 | 0.094 | 0.163 | 0.833 | 0.684 | 0.302 | 0.419 |
| DNN | 0.856 | 0.630 | 0.434 | 0.511 | 0.784 | 0.522 | 0.706 | 0.600 | 0.857 | 0.653 | 0.630 | 0.635 |
| 1D-CNN | 0.859 | 0.800 | 0.359 | 0.496 | 0.714 | 0.909 | 0.588 | 0.741 | 0.833 | 0.705 | 0.574 | 0.762 |
| Proposed CNN | 0.723 | 0.908 | 0.742 | 0.816 | 0.901 | 0.958 | 0.841 | 0.896 | 0.958 | 0.857 | 1.000 | 0.947 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Feng, X.; Hui, H.; Liang, Z.; Guo, W.; Que, H.; Feng, H.; Yao, Y.; Ye, C.; Ding, Y. A Novel Electricity Theft Detection Scheme Based on Text Convolutional Neural Networks. Energies 2020, 13, 5758. https://doi.org/10.3390/en13215758