A Knowledge Graph Method towards Power System Fault Diagnosis and Classification

Li, Cheng; Wang, Bo

doi:10.3390/electronics12234808

by Cheng Li ¹,* and Bo Wang ²

¹ School of Engineering Mathematics and Technology, University of Bristol, Bristol BS8 1QU, UK

² School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(23), 4808; https://doi.org/10.3390/electronics12234808

Submission received: 15 October 2023 / Revised: 19 November 2023 / Accepted: 22 November 2023 / Published: 28 November 2023

(This article belongs to the Special Issue Knowledge Engineering and Data Mining Volume II)

Abstract: As the scale and complexity of electrical grids continue to expand, the need for robust fault detection techniques becomes increasingly urgent. This paper addresses the limitations of traditional fault detection approaches, such as dependence on human experience, low efficiency, and a lack of logical relationships. In response, this study presents a cascaded model that combines a Random Forest classifier with knowledge reasoning. The proposed method identifies six basic fault types with high efficiency and accuracy. This approach not only simplifies fault detection and handling but also improves their interpretability. The paper begins by constructing a power fault simulation model based on the IEEE 14-bus system. Subsequently, a Random Forest classification model is developed and compared with other commonly used models, namely Support Vector Machines (SVMs), k-Nearest Neighbor (KNN), and Naïve Bayes, using metrics such as the F1-score, accuracy, and confusion matrices. Our results show that the Random Forest classifier outperforms the other models, particularly on small-sample datasets, with an accuracy of 90%. We then apply knowledge mining technology to create a comprehensive knowledge graph of power faults. Finally, we use the TransE model for knowledge reasoning to improve interpretability, assist decision making, and validate the reliability of the approach.

Keywords: power fault diagnosis; knowledge graph; random forest; TransE

1. Introduction

The modern world is heavily reliant on electrical power, making it a critical component of our infrastructure [1]. However, power systems are susceptible to various types of faults, including short circuits, voltage fluctuations, equipment failures, and environmental disturbances [2]. These faults can lead to power outages and equipment damage, and even pose risks to public safety [3]. The economic and social consequences of power faults are significant, emphasizing the need for robust fault detection and diagnosis methods.

Efficient fault detection techniques are essential to mitigate the impact of power faults. Rapid identification and localization of faults allow power grid operators to take corrective actions promptly, reducing downtime and minimizing economic losses [4]. Furthermore, early fault detection can prevent cascading failures and ensure the stability of the entire power system. Therefore, the development of accurate and reliable fault detection methods is of utmost importance in the field of power system engineering.

Various methods have been proposed for power fault detection over the years [5], encompassing a wide spectrum of approaches. These range from traditional techniques like rule-based systems, which rely on predefined heuristics and thresholds, to more advanced methodologies such as machine learning and data-driven methods [6]. Expert systems, for instance, are designed to harness human knowledge by encoding expert rules, enabling them to detect faults with a certain degree of expertise [7].

Support Vector Machines (SVMs), on the other hand, provide a potent tool for classification tasks, including fault detection, by identifying optimal hyperplanes for separating different classes [8]. Recent years have witnessed the ascendancy of Convolutional Neural Networks (CNNs) and Graph Neural Networks (GNNs), particularly in the realms of image- and graph-based data, where their capabilities have shone in power fault detection [9,10].

Naive Bayes, rooted in the principles of Bayes’ theorem, employs probabilistic reasoning to make predictions based on statistical probabilities, making it a valuable addition to the arsenal of fault detection methods [11]. On the other hand, the simplicity and effectiveness of k-Nearest Neighbor (KNN) lie in its ability to classify data points based on the majority class among their nearest neighbors, a feature that makes it an attractive option for many applications, including power fault detection [12].

Despite their promise, each of these methods comes with its own set of limitations that can impede their effectiveness. Traditional rule-based systems, characterized by their reliance on predefined heuristics, can struggle to adapt to dynamic conditions and are often ill suited to the management of complex, interconnected power systems. Deep learning techniques like CNNs and GNNs frequently require large quantities of labeled data for training, posing significant challenges in the context of power system fault detection. Furthermore, the performance of KNN can be highly sensitive to the choice of distance metric and the number of neighbors selected, making it important to fine-tune these parameters for optimal results. Additionally, one common drawback shared by many of these methods is their lack of interpretability, making it challenging to gain insights into the underlying causes of detected faults.

While these methods have advanced the field of power fault detection, the need for fault detection techniques that are both interpretable and adaptable to the complex, evolving nature of modern power systems remains unmet. The inability to gain a deep understanding of why certain faults are detected and the challenges in adapting existing methods to new conditions underscore the necessity for innovative approaches.

This paper introduces a novel approach to power fault detection that aims to overcome the limitations of existing methods. As shown in Figure 1, our proposed method leverages the power of knowledge graphs, which allows us to represent and model the complex relationships and dependencies within power systems. By capturing the knowledge of experts and historical data, we create a structured representation of the power grid, making it easier to identify and analyze faults.

To enhance the accuracy and robustness of our approach, we integrate knowledge graphs with the power of random forests, a machine learning ensemble method. This combination allows us to harness the strengths of both approaches, leveraging the structured knowledge encoded in the graph while benefiting from the predictive power of random forests. The result is a powerful and adaptable fault detection model that can handle the intricacies of modern power systems.

In the subsequent sections of this paper, we will delve into the details of our proposed knowledge-graph-based fault detection method, its implementation, and its performance evaluation. We believe that this approach holds the potential to significantly improve the reliability and efficiency of power fault diagnosis, ultimately contributing to the stability and resilience of our electrical power systems.

2. Power Fault Model

With the continuous development of the power industry, power systems keep growing in scale. Because the scale and complexity of a real power system limit direct experimentation, and to keep the experiments safe and reliable, we use MATLAB simulations to model and analyze the operation of the power system. The IEEE 14-bus system is a standard test system commonly used to simulate and analyze phenomena such as stability, power flow, and power faults [13]. It consists of 14 nodes, including 4 generator nodes, 3 load nodes, and 7 switch nodes. Simulation gives a more intuitive view of how the power system operates and makes it easy to measure data under various fault conditions [14].

2.1. Structures and Elements

The main components of the power grid are generators, transformers, switchgear, and transmission lines [15]. Generators convert other energy sources into electricity. Transformers raise or lower voltage and current. Switchgear controls the flow of electricity and keeps the equipment operating properly. Transmission lines carry electricity from the power plant to the substation and then through the distribution grid to the consumer. All of these components must remain highly stable and reliable during operation to ensure the normal operation of the power system.

In this paper, we have extracted the constituent elements of the grid, including the generators, transformers, transmission lines, low-voltage lines, and power-using elements, and then modelled these structures.

2.2. Fault Classification

A power failure refers to a problem or malfunction in the power supply system that prevents electricity from being transmitted or supplied effectively. As shown in Figure 2, power failures can typically be divided into two categories: short circuit faults and open circuit faults [16].

2.2.1. Short Circuit Fault

A short circuit fault is an electrical fault that occurs when there is an unintended low-resistance pathway for the electrical current to flow, leading to overheating, sparks, fire, or even electrical shock hazards. Short circuit faults can be classified into two main types.

Symmetrical Faults: Also known as balanced faults, symmetrical faults occur when all three phases of an electrical system experience the same level of fault. This can happen due to reasons such as conductor insulation failure, direct contact between conductors of different phases, or faulty equipment.

Unsymmetrical Faults: Unsymmetrical faults, also known as unbalanced faults, occur when the fault affects one or more phases of the electrical system differently. This can occur due to reasons such as phase-to-phase or phase-to-ground faults, faulty insulation, or equipment failure.

2.2.2. Open Circuit Fault

An open circuit fault refers to a discontinuity or break in an electrical circuit which prevents the flow of current. This interruption can occur in one or multiple conductors within the circuit, leading to different types of open circuit faults.

One-conductor open fault: In this type of fault, a single conductor in the electrical circuit is broken or disconnected, causing a gap in the flow of electricity. As a result, current cannot pass through the affected conductor, and the circuit becomes incomplete. This fault could be due to a broken wire, a loose connection, or a faulty component.

Two-conductor open fault: Unlike the previous fault, a two-conductor open fault involves the simultaneous breakage or disconnection of two conductors in the circuit. As a result, two separate gaps are created, preventing the flow of current through both conductors. This fault could occur due to multiple causes, such as physical damage, incorrect wiring, or faulty components.

2.3. Model Construction

The power system model contains a three-phase source connected to a three-phase series RLC load. The source is connected to a three-phase transformer via a three-phase PI section line, and a further three-phase PI section line connects onward to a three-phase transformer. Finally, the three-phase fault and three-phase breaker blocks simulate the various faults, and the real-time parameters are measured with an oscilloscope to analyze the voltage of each phase sequence during the fault [17].

3. Power Fault Graph

The knowledge graph of a power fault is a comprehensive representation of various aspects related to electricity disruptions [18]. It encompasses the causes, consequences, and preventive measures associated with power outages. Through interconnected nodes and links, the graph organizes and visualizes the intricate network of factors, such as equipment malfunctions, natural disasters, and human errors, that can result in power failures. This knowledge graph provides a valuable resource for understanding and analyzing power disruptions, aiding in their prevention and facilitating prompt resolution when they occur [19].

3.1. Graph Construction

Constructing a power fault knowledge graph involves four essential steps: data acquisition, knowledge extraction, integration, and quality improvement [20,21]. The process is shown in Figure 3.

Data acquisition involves gathering information from diverse sources such as books, the internet, and expert experience to obtain domain-specific knowledge on power faults.

Knowledge extraction employs techniques like crawling, parsing, and fact extraction. Crawling systematically browses sources, parsing organizes and structures data, and fact extraction identifies specific factual information related to power faults.

Integration combines the extracted information to construct a coherent and interconnected knowledge graph. This process includes knowledge linking, establishing relationships between information, and knowledge fusion, merging diverse knowledge sources to ensure consistency and avoid redundancy.

Quality improvement enhances the constructed graph’s overall quality. It includes knowledge completion and correction. Knowledge completion fills gaps or missing information by leveraging additional sources or expert input, while knowledge correction rectifies inaccuracies or errors for reliable information.
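
As a rough illustration of the integration step, the sketch below stores extracted facts as (head, relation, tail) triples and performs a very simple form of knowledge fusion: entity names are normalized, known aliases are merged, and duplicate facts collapse into one. The entity names, the relation names, and the `ALIASES` table are hypothetical examples and are not taken from the authors' graph.

```python
from collections import defaultdict

# Hypothetical alias table for knowledge fusion: different surface forms
# of the same entity are mapped to one canonical name.
ALIASES = {
    "lg fault": "single line-to-ground fault",
    "single line to ground fault": "single line-to-ground fault",
}

def normalize(name):
    """Lower-case, collapse whitespace, and map known aliases to a canonical name."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, key)

def build_graph(raw_triples):
    """Fuse raw (head, relation, tail) facts into a de-duplicated adjacency map."""
    graph = defaultdict(set)
    for head, relation, tail in raw_triples:
        graph[(normalize(head), relation)].add(normalize(tail))
    return graph

# Facts as they might arrive from different sources (illustrative only).
raw = [
    ("LG fault", "has_cause", "Insulation failure"),
    ("Single line to ground fault", "has_cause", "insulation failure"),
    ("LG fault", "has_solution", "Isolate faulted phase"),
]
graph = build_graph(raw)
```

Here the first two facts are fused into a single edge because both heads and both tails normalize to the same canonical names. A production pipeline would need far more robust entity linking than a lookup table.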

3.2. Power Fault Knowledge Graph Framework

The electric power domain’s knowledge graph framework consists of four indispensable components: the equipment entity graph, the concept graph, the business logic graph, and the fault case graph [22]. The equipment entity graph comprehensively captures detailed information about physical assets and devices within the power system, thus enabling a holistic view of the infrastructure.

Incorporating semantic knowledge, the concept graph augments the power fault knowledge graph by defining relationships and associations between equipment entities, significantly enhancing reasoning capabilities. Meanwhile, the business logic graph integrates operational rules and regulations, providing guidance for the power system’s operation, monitoring, and maintenance practices.

Furthermore, the fault case graph serves as a repository of historical and simulated data, housing fault records, potential causes, and diagnostic outcomes. This repository facilitates timely fault detection, diagnosis, and resolution.

Collectively, these components synergize to create a comprehensive and dynamic power fault knowledge graph. This knowledge graph is pivotal in optimizing system performance and bolstering the reliability of power systems.

4. Random Forest Algorithm

In recent years, machine learning techniques have gained significant attention in the field of power fault detection due to their ability to handle complex and non-linear patterns in power system data. The Random Forest (RF) classifier, a popular ensemble learning algorithm, has shown promise in this regard for its ability to handle high-dimensional power fault data [23]. This section explores the application of RF classifiers for power fault detection. We delve into the principle of the algorithm, discuss parameter settings, and present experimental results, including comparative experiments with other algorithms, along with the evaluation of their performance.

4.1. Principle of the Algorithm

Random Forest is an ensemble learning algorithm that is widely used for classification tasks. It is based on the idea of decision trees and combines multiple trees to make robust predictions. The key principles of the Random Forest algorithm are as follows.

Decision trees: Random Forest is built upon decision trees, which are simple models that partition data into subsets based on feature values. Each tree learns from a random subset of the training data and features, which makes the ensemble less prone to overfitting [24].

Bootstrap aggregating (bagging): Random Forest employs a technique known as bagging, where multiple decision trees are trained independently on different subsets of the training data, each sampled with replacement. This diversity helps reduce variance and improve overall accuracy.

Random feature selection: Another crucial aspect of Random Forest is the random selection of a subset of features at each node of the tree. This randomness further reduces the correlation among the individual trees and leads to better generalization.

Voting or averaging: Random Forest combines the predictions of the individual trees by majority voting (for classification) or averaging (for regression), resulting in a final prediction. Fault classification uses majority voting.
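
The four principles above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: each "tree" is a one-node decision stump, so choosing a random feature subset per tree coincides with choosing it per node, and the toy data, labels, and parameter names (`n_trees`, `n_features`, `seed`) are assumptions made for the example.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label (first seen wins ties)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, feature_ids):
    """One-node decision tree: choose the split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        return (feature_ids[0], float("inf"), majority(y), majority(y))
    return best[1:]

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Bagging plus random feature selection over decision stumps."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]          # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote over the individual stump predictions."""
    votes = [l_lab if row[f] <= thr else r_lab for f, thr, l_lab, r_lab in forest]
    return majority(votes)

# Toy two-feature data (invented for illustration).
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [1.0, 1.1], [1.1, 0.9]]
y = ["normal", "normal", "normal", "fault", "fault", "fault"]
forest = fit_forest(X, y)
```

Calling `predict(forest, [0.05, 0.05])` returns the majority vote of the 25 stumps. In practice a library implementation with full-depth trees would be used instead of stumps.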

4.2. Parameter Settings

To effectively apply Random Forest to power fault detection, appropriate parameter settings must be chosen. Some key parameters are listed below (Table 1).

4.3. Experimental Design

To assess the effectiveness of Random Forest in power fault detection, we conducted experiments and compared its performance with other algorithms commonly used in this domain, including Support Vector Machines (SVMs), k-Nearest Neighbor (KNN), and the Bayesian classifier (BC). Due to the small sample size, a four-fold cross-validation technique was employed to maximize data utilization and mitigate overfitting. The performance of each algorithm was evaluated using various metrics such as the accuracy, precision, recall, F1-score, and confusion matrix (shown in Figures 4–7).
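
For readers who want to reproduce the evaluation metrics, the following sketch computes a confusion matrix, accuracy, and per-class precision, recall, and F1-score, and splits sample indices into four validation folds. The labels and predictions are invented for illustration. The fold splitter is a simple interleaved split without shuffling or stratification, which is an assumption, since the fold construction is not described here.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def per_class_scores(cm, k):
    """Precision, recall, and F1-score for class index k."""
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(len(cm))) - tp
    fn = sum(cm[k]) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def k_fold_indices(n_samples, k=4):
    """Split sample indices into k nearly equal, non-overlapping validation folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]

# Invented labels and predictions for three fault classes.
labels = ["LG", "LL", "LLG"]
y_true = ["LG", "LG", "LL", "LL", "LLG", "LLG"]
y_pred = ["LG", "LL", "LL", "LL", "LLG", "LG"]
cm = confusion_matrix(y_true, y_pred, labels)
folds = k_fold_indices(66, k=4)  # 66 samples, as in the simulated dataset
```

With this toy input, `cm` is `[[1, 1, 0], [0, 2, 0], [1, 0, 1]]`, so four of the six samples lie on the diagonal.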

All simulation experiments were conducted in Python (PyCharm Community Edition 2022) and MATLAB 2022b on a computer with a 5.4 GHz Intel(R) Core(TM) i9-13900 CPU, 16.0 GB of RAM, and 64-bit Windows 11.

4.4. Results

We obtained 66 sets of fault data after simulation using a power fault model. We used 30% of the data as a test set and the rest as a training set. The results are as follows (Table 2).

Our experimental results indicate that Random Forest classifiers exhibit superior performance in power fault detection compared to the alternative algorithms. In conclusion, the application of Random Forest classifiers in power fault detection has shown promising results, owing to its robustness, generalization capabilities, and ease of parameter tuning.

4.5. Statistical Analysis

To test whether the performance of the RF algorithm differs significantly from that of the other algorithms, we employed the Friedman test. Subsequently, Nemenyi’s follow-up test was conducted to determine significant differences between algorithm pairs. Table 3 presents the results of Friedman’s test for the accuracy, recall, precision, and F1-score metrics.

The p-values in Table 3 indicate significant differences (p < 0.05) among the algorithms in terms of classification performance on the four data subsets. Based on this observed significance, the Nemenyi follow-up test examines specific algorithm pairs. Figures 8–11 display the Nemenyi test results, revealing that RF significantly outperforms the other algorithms in accuracy, recall, and the F1-score. Notably, RF exhibits superior performance compared to SVMs and the other two algorithms across various metrics.
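
To make the Friedman test concrete, the sketch below computes the Friedman chi-square statistic from a table of scores, where each row is one data subset and each column is one algorithm. The scores are invented for illustration and are not the values behind Table 3. Ties receive average ranks and no tie correction is applied, which is a simplifying assumption.

```python
def average_ranks(scores):
    """Rank one row of scores (higher is better, rank 1 = best), averaging ties."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(table):
    """table[i][j] is the score of algorithm j on data subset i."""
    n, k = len(table), len(table[0])
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = (12 * n / (k * (k + 1))
            * (sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4))
    return chi2, mean_ranks

# Invented accuracies: columns are RF, SVM, KNN, BC; rows are four data subsets.
scores = [
    [0.90, 0.85, 0.80, 0.75],
    [0.92, 0.86, 0.81, 0.70],
    [0.88, 0.84, 0.79, 0.74],
    [0.91, 0.83, 0.82, 0.72],
]
chi2, mean_ranks = friedman_statistic(scores)
```

In this toy table the algorithms are ranked identically on every subset, so the mean ranks are 1, 2, 3, and 4 and the statistic is approximately 12.0. This exceeds the chi-square critical value of 7.815 for three degrees of freedom at the 0.05 level, although the chi-square approximation is rough for so few subsets.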

In conclusion, the statistical analyses confirm the superiority of the proposed RF algorithm over alternative methods, demonstrating an enhanced classification performance.

5. Knowledge-Based Decision Making

In this section, we apply the TransE algorithm, a knowledge reasoning technique, to the power fault knowledge graph constructed previously. Through the training process, we acquire embedded representations of the entities and relationships present in the knowledge graph. These embedded representations encapsulate essential features and semantics of the entities and relationships, enabling effective reasoning and analysis [25,26].

5.1. TransE Algorithm

The widely used TransE algorithm [27] maps entities and relations to low-dimensional vector representations, enabling semantic representation and inference in knowledge graphs. Inspired by translation invariance in word vectors, TransE represents the relationship between two entities as the vector difference between them, resulting in a simple and efficient algorithm. During model training, the algorithm learns semantic information from the graph. Given a triple (h, r, t), the goal of TransE is to make the vector h + r approximate the vector t, so that the relation r acts as a translation from the head entity h to the tail entity t. This relationship is depicted in Figure 12.

5.2. Knowledge Reasoning

To utilize the obtained embeddings, we use the scoring function, denoted as f_r(h, t), which takes a fault type as the head entity h [28]. The relation r can be selected from fault characteristics, suggested solutions, or historical cases, depending on the analysis requirements. By inputting the fault type and relation into the scoring function, we obtain a score for each potential tail entity t:

f_{r}(h, t) = -\lVert \vec{h} + \vec{r} - \vec{t} \rVert_{1/2}    (1)

Here the subscript 1/2 indicates that either the L1 or the L2 norm can be used. A higher (less negative) score means a more plausible triple.

The tail entity selected on the basis of the obtained scores offers abundant information about the specific fault being investigated. By leveraging this approach, we enhance the comprehensive understanding of power faults, enabling more effective fault diagnosis, analysis, and decision making [29].
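
The ranking step described above can be sketched as follows: the score of a candidate tail is the negative L1 or L2 distance between h + r and t, and candidates are sorted from most to least plausible. The two-dimensional embeddings, the entity names, and the relation names are hypothetical values chosen so that the arithmetic is easy to follow; trained embeddings would be learned from the knowledge graph.

```python
def score(h, r, t, norm=1):
    """TransE score: negative L1 (norm=1) or L2 (norm=2) distance of h + r from t."""
    diff = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    if norm == 1:
        return -sum(abs(d) for d in diff)
    return -(sum(d * d for d in diff) ** 0.5)

def rank_tails(head, relation, entity_vecs, relation_vecs, norm=1):
    """Return candidate tail entities sorted from highest to lowest score."""
    h, r = entity_vecs[head], relation_vecs[relation]
    candidates = [e for e in entity_vecs if e != head]
    return sorted(candidates,
                  key=lambda e: score(h, r, entity_vecs[e], norm),
                  reverse=True)

# Hypothetical 2-D embeddings (not trained values).
entity_vecs = {
    "LG fault": [0.0, 0.0],
    "insulation failure": [1.0, 0.1],
    "isolate faulted phase": [0.1, 1.0],
    "LL fault": [2.0, 2.0],
}
relation_vecs = {"has_cause": [1.0, 0.0], "has_solution": [0.0, 1.0]}

causes = rank_tails("LG fault", "has_cause", entity_vecs, relation_vecs)
solutions = rank_tails("LG fault", "has_solution", entity_vecs, relation_vecs)
```

For the `has_cause` query, h + r equals [1.0, 0.0], which lies closest to the vector of "insulation failure", so that entity is ranked first. The same head with `has_solution` ranks "isolate faulted phase" first.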

5.3. Result

After training with the TransE algorithm, we obtained the embedding representations of the power fault knowledge graph. On receiving the fault type passed by the Random Forest classifier, the scoring function calculates the scores of the possible outcomes and takes the highest-scoring outcome as the output. We use MR, MRR, HITS@1, and HITS@3 as evaluation metrics, and Table 4 displays the results of this approach [30].

The results show that this method performs well. The model achieved a Mean Rank (MR) of 1.672, meaning that correct answers are, on average, ranked near the top of the candidate list. Furthermore, the HITS@3 score of 0.867 shows that the correct answer ranks within the top three in most cases. We attribute this to the fact that the relationships in the power fault knowledge graph are mostly one-to-one, which TransE models well.
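
For reference, the four ranking metrics can be computed from the rank that the correct tail entity receives in each test query. The sketch below uses an invented list of ranks, not the ranks behind Table 4.

```python
def ranking_metrics(ranks, ks=(1, 3)):
    """Mean Rank, Mean Reciprocal Rank, and HITS@k from 1-based ranks."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # lower is better
        "MRR": sum(1 / r for r in ranks) / n,    # higher is better
    }
    for k in ks:
        # Share of queries whose correct answer is within the top k.
        metrics[f"HITS@{k}"] = sum(r <= k for r in ranks) / n
    return metrics

# Invented ranks of the correct tail entity for four test queries.
metrics = ranking_metrics([1, 1, 2, 4])
```

With these ranks, MR is 2.0, MRR is 0.6875, HITS@1 is 0.5, and HITS@3 is 0.75.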

6. Power Fault Diagnosis and Decision-Making Model

The results of these models were integrated into a single framework for fault classification, detection, and decision support in electric power systems. The primary dataset comprises collected and preprocessed power fault data, which are passed to the Random Forest classifier. This classification step acts as the initial gatekeeper: it determines the specific category of the fault, allowing for precise diagnosis.

Subsequently, the identified fault type becomes the input to the knowledge reasoning model. This model enables operators and engineers to access domain-specific insights and recommendations, including historical patterns, potential root causes, recommended mitigation strategies, and expert guidance on handling similar incidents. This fusion of data-driven classification and knowledge-driven reasoning helps decision makers make well-informed choices and respond swiftly to electrical power faults.

This comprehensive end-to-end solution optimizes the entire workflow, streamlining the process from raw fault data to effective solutions. In doing so, it significantly elevates the operational efficiency, reduces downtime, and ultimately contributes to the reliability and sustainability of the electric power infrastructure.
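
The cascade described in this section can be outlined as a two-stage function: a classifier maps the measured features to a fault type, and the fault type is then used to query the knowledge graph. In this sketch the classifier is a hypothetical threshold rule standing in for the trained Random Forest, and the knowledge lookup is a plain dictionary standing in for TransE-based reasoning. All names, thresholds, and facts are invented for illustration.

```python
def toy_classifier(phase_voltages):
    """Hypothetical stand-in for the trained classifier (per-unit voltages)."""
    return "LLL fault" if max(phase_voltages) < 0.2 else "no fault"

def diagnose(features, classify, knowledge_graph, relations):
    """Stage 1: classify the fault. Stage 2: look up related knowledge."""
    fault_type = classify(features)
    report = {"fault_type": fault_type}
    for relation in relations:
        tails = knowledge_graph.get((fault_type, relation), set())
        report[relation] = sorted(tails)
    return report

# Invented knowledge graph fragment keyed by (head, relation).
knowledge_graph = {
    ("LLL fault", "has_cause"): {"conductor insulation failure"},
    ("LLL fault", "has_solution"): {"trip three-phase breaker"},
}

report = diagnose([0.10, 0.12, 0.09], toy_classifier, knowledge_graph,
                  ["has_cause", "has_solution"])
```

Because `diagnose` only requires a callable classifier and a mapping, either stage can be replaced independently, which mirrors the modular design discussed later in the paper.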

7. Discussion and Limitations

7.1. Discussion

We propose a combined model that automatically outputs the fault type, together with related information about the fault, from the observed time-series voltage data. One of the primary contributions of our research is the improvement in fault diagnosis accuracy. Our integrated model, which employs a Random Forest classifier as the fault detection algorithm, achieved an accuracy of about 90% in our experiments, outperforming the SVM, KNN, and Bayesian classifiers it was compared with.

Another important result is the successful incorporation of domain knowledge into our fault diagnosis process. Our model’s knowledge base includes a comprehensive repository of historical data, industry standards, and expert-generated rules. This ensures that our system not only detects faults but also provides a more contextually relevant understanding of the issues at hand. We believe this integration of domain knowledge sets our approach apart from other purely data-driven methods and significantly contributes to the interpretability and reliability of our results.

Our method also excels in terms of interpretability, which is essential for power grid operators who need to understand the reasoning behind fault diagnoses. The system provides detailed explanations for its decisions, showing how it arrived at a particular diagnosis. This transparency allows operators to trust and act upon the provided insights, reducing the risk of human–machine miscommunication.

Furthermore, our solution’s scalability is a key feature. We have designed the system architecture to be modular and flexible, allowing it to seamlessly incorporate improvements in fault detection and knowledge-reasoning algorithms. As these algorithms evolve and improve, our model’s accuracy and effectiveness will continue to increase. This scalability ensures that our system can adapt to the evolving demands of the power industry, which is characterized by an ever-expanding power grid and the constant need for efficient fault diagnosis and handling. It also makes our solution future-proof, positioning it to play a pivotal role in the ongoing transformation of the power sector.

7.2. Limitations

Limited by experimental conditions, the fault data in this paper come mainly from simulations of power system faults in the IEEE 14-bus system, so the data quality is limited. However, owing to the robustness and generalization ability of the Random Forest classifier, we expect the performance to be maintained when the model is trained with better-quality data. In addition, we have only explored the voltage data observed under the six basic fault types. Although these cover most fault situations, other fault types, such as complex faults caused by combinations of several basic fault types, still exist, and the method is limited in such situations.

8. Conclusions

In conclusion, our study has introduced a novel combined model that leverages the synergy between knowledge graphs and machine learning algorithms, achieving a higher level of integration and automation compared to prior research efforts. This combined model integrates two specialized sub-models for fault detection and knowledge reasoning, both of which have demonstrated a strong performance.

The key contributions of our research lie in providing an end-to-end solution for the classification, diagnosis, and handling of most basic fault types. Because the model is modular, its sub-models can be specialized and replaced independently, allowing easy adaptation to various fault scenarios.

While our model showcases advantages in interpretability and scalability, there is room for improvement. Future research directions include enhancing the quality and scale of our dataset, improving the knowledge reasoning model to accommodate complex relationships, refining the fault detection algorithm for greater accuracy, and extending the model’s capabilities to diagnose and handle more complex fault types. Addressing these areas in future work will further bolster the model’s performance and expand its applicability to a broader range of fault scenarios, contributing to the ongoing advancement of fault detection and handling in power systems.

Author Contributions

Conceptualization, B.W.; Formal analysis, C.L.; Investigation, C.L.; Data curation, C.L.; Writing—original draft, C.L.; Project administration, B.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy and confidentiality concerns.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:

LL fault	Line-to-Line fault
LG fault	Single Line-to-Ground fault
LLG fault	Double Line-to-Ground fault
LLL fault	Three-phase short circuit fault

References

1. Nadour, M.; Essadki, A.; Nasser, T. Improving low-voltage ride-through capability of a multimegawatt DFIG based wind turbine under grid faults. Prot. Control Mod. Power Syst. 2020, 5, 33.
2. LUe, Y.H.; Fan, M. Analysis of the harmful effects to buried oil pipeline from power line short-circuit fault. J. China Univ. Posts Telecommun. 2012, 19, 124–128.
3. Das, T.; Roy, R.; Mandal, K.K. Impact of the penetration of distributed generation on optimal reactive power dispatch. Prot. Control Mod. Power Syst. 2020, 5, 31.
4. Nagpal, M.; Henville, C. Impact of power-electronic sources on transmission line ground fault protection. IEEE Trans. Power Deliv. 2017, 33, 62–70.
5. Raza, A.; Benrabah, A.; Alquthami, T.; Akmal, M. A review of fault diagnosing methods in power transmission systems. Appl. Sci. 2020, 10, 1312.
6. Vaish, R.; Dwivedi, U.; Tewari, S.; Tripathi, S.M. Machine learning applications in power system fault diagnosis: Research advancements and perspectives. Eng. Appl. Artif. Intell. 2021, 106, 104504.
7. Husain, Z. Fuzzy logic expert system for incipient fault diagnosis of power transformers. Int. J. Electr. Eng. Inform. 2018, 10, 300–317.
8. Fei, S.W.; Zhang, X.B. Fault diagnosis of power transformer based on support vector machine with genetic algorithm. Expert Syst. Appl. 2009, 36, 11352–11357.
9. Han, J.; Miao, S.; Li, Y.; Yang, W.; Yin, H. Fault diagnosis of power systems using visualized similarity images and improved convolution neural networks. IEEE Syst. J. 2021, 16, 185–196.
10. Liao, W.; Yang, D.; Wang, Y.; Ren, X. Fault diagnosis of power transformers using graph convolutional network. CSEE J. Power Energy Syst. 2020, 7, 241–249.
11. Lu, D.; Ning, Q.; Yang, X. Fault diagnosis of rolling bearing based on KNN-naive Bayesian algorithm. Comput. Meas. Control 2018, 26, 21–23.
12. Chen, Q.; Li, Q.; Wu, J.; He, J.; Mao, C.; Li, Z.; Yang, B. State Monitoring and Fault Diagnosis of HVDC System via KNN Algorithm with Knowledge Graph: A Practical China Power Grid Case. Sustainability 2023, 15, 3717.
13. Pattanaik, P.P.; Panigrahi, C.K. Stability and fault analysis in a power network considering IEEE 14 bus system. In Proceedings of the 2018 2nd International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 19–20 January 2018; pp. 1134–1138.
14. Mahapatra, S.; Singh, M. Analysis of symmetrical fault in IEEE 14 bus system for enhancing over current protection scheme. Int. J. Future Gener. Commun. Netw. 2016, 9, 51–62.
15. Wu, Y.N.; Chen, J.; Liu, L.R. Construction of China’s smart grid information system analysis. Renew. Sustain. Energy Rev. 2011, 15, 4236–4241.
16. Mousa, M.; Abdelwahed, S.; Kluss, J. Review of fault types, impacts, and management solutions in smart grid systems. Smart Grid Renew. Energy 2019, 10, 98.
17. Sakhnini, J.; Karimipour, H.; Dehghantanha, A. Smart grid cyber attacks detection using supervised learning and heuristic feature selection. In Proceedings of the 2019 IEEE 7th International Conference on Smart Energy Grid Engineering (SEGE), Oshawa, ON, Canada, 12–14 August 2019; pp. 108–112.
18. Wang, J.; Wang, X.; Ma, C.; Kou, L. A survey on the development status and application prospects of knowledge graph in smart grids. IET Gener. Transm. Distrib. 2021, 15, 383–407.
19. Li, J.; Li, X.; Gao, T.; Zhang, J.; Zhang, B. Research and application of fault handling based on power grid multivariate information knowledge graph. Power Inf. Commun. Technol. 2021, 19, 30–38.
20. Liu, P.; Tian, B.; Liu, X.; Gu, S.; Yan, L.; Bullock, L.; Ma, C.; Liu, Y.; Zhang, W. Construction of Power Fault Knowledge Graph Based on Deep Learning. Appl. Sci. 2022, 12, 6993.
Liu, S.; Yang, H.; Li, J.; Kolmanič, S. Preliminary study on the knowledge graph construction of Chinese ancient history and culture. Information 2020, 11, 186. [Google Scholar] [CrossRef]
Pu, T.; Tan, Y.; Peng, G.; Xu, H.; Zhang, Z. Construction and application of knowledge graph in the electric power field. Power Syst. Technol. 2021, 45, 2080–2091. [Google Scholar]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Ali, J.; Khan, R.; Ahmad, N.; Maqsood, I. Random forests and decision trees. Int. J. Comput. Sci. Issues (IJCSI) 2012, 9, 272. [Google Scholar]
Chen, X.; Jia, S.; Xiang, Y. A review: Knowledge reasoning over knowledge graph. Expert Syst. Appl. 2020, 141, 112948. [Google Scholar] [CrossRef]
Zhang, Q.; Jia, Q.; Wang, Y. Question Answering Based Assisted Decision for Electric Power Fault Diagnosis. In Proceedings of the 2020 IEEE 5th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), Chengdu, China, 10–13 April 2020; pp. 194–198. [Google Scholar]
Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26, 2787–2795. [Google Scholar]
Mohamed, S.K.; Novácek, V.; Vandenbussche, P.Y.; Muñoz, E. Loss Functions in Knowledge Graph Embedding Models. DL4KG@ ESWC 2019, 2377, 1–10. [Google Scholar]
Pezeshkpour, P.; Tian, Y.; Singh, S. Investigating robustness and interpretability of link prediction via adversarial modifications. arXiv 2019, arXiv:1905.00563. [Google Scholar]
Dai, Y.; Wang, S.; Xiong, N.N.; Guo, W. A survey on knowledge graph embedding: Approaches, applications and benchmarks. Electronics 2020, 9, 750. [Google Scholar] [CrossRef]

Figure 1. Random Forest–TransE combined model.

Figure 2. Basic power fault types.

Figure 3. Power fault KG construction diagram.

Figure 4. Confusion matrix of the RF model.

Figure 5. Confusion matrix of the SVM model.

Figure 6. Confusion matrix of the KNN model.

Figure 7. Confusion matrix of the BC model.

Figure 8. Nemenyi test results of accuracy.

Figure 9. Nemenyi test results of recall.

Figure 10. Nemenyi test results of precision.

Figure 11. Nemenyi test results of F1 score.

Figure 12. TransE algorithm schematic diagram.

Table 1. Parameter settings of the Random Forest classifier.

Parameter Name	Parameter Setting
n_estimators	100
criterion	gini
max_depth	10
min_samples_split	2
min_samples_leaf	1
min_weight_fraction_leaf	0
max_features	auto
max_leaf_nodes	50
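
The parameter names in Table 1 match the keyword arguments of scikit-learn's `RandomForestClassifier`, so the configuration might be reproduced roughly as follows. This is a sketch under the assumption that scikit-learn is the library in use, which the table itself does not state. The helper name `build_rf` is hypothetical. Note also that the string `"auto"` for `max_features` was removed in scikit-learn 1.3; for classifiers it was equivalent to `"sqrt"`, which the sketch uses instead.

```python
# Table 1 settings expressed as keyword arguments (assumes scikit-learn).
RF_PARAMS = {
    "n_estimators": 100,
    "criterion": "gini",
    "max_depth": 10,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
    "min_weight_fraction_leaf": 0.0,
    # Table 1 lists "auto"; for classifiers that meant sqrt(n_features),
    # and the string was removed in scikit-learn 1.3, so "sqrt" is used here.
    "max_features": "sqrt",
    "max_leaf_nodes": 50,
}


def build_rf(params=RF_PARAMS):
    """Hypothetical helper: build the classifier from the Table 1 settings."""
    # Imported lazily so the parameter dictionary is usable without scikit-learn.
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**params)
```

With scikit-learn installed, `build_rf().fit(X_train, y_train)` would train the forest, where `X_train` and `y_train` stand for whatever feature matrix and fault labels are available.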

Table 2. Experimental results of four methods.

Method	Accuracy	Recall	Precision	F1-Score
RF	0.9	0.9	0.94	0.912
SVM	0.8	0.82	0.875	0.805
KNN	0.667	0.685	0.72	0.675
BC	0.75	0.77	0.85	0.737
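
The four metrics in Table 2 can all be derived from a confusion matrix such as those in Figures 4 to 7. The sketch below assumes macro-averaging over classes, which this section does not confirm. The reported F1-scores are at least consistent with per-class averaging: for RF, the harmonic mean of the reported precision and recall would be about 0.920 rather than 0.912. The function name and the 3-class matrix are hypothetical and are not data from the paper.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged recall, precision and F1 for a square
    confusion matrix (rows = true class, columns = predicted class)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    recalls, precisions, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        true_k = sum(cm[k])                       # samples whose true class is k
        pred_k = sum(cm[i][k] for i in range(n))  # samples predicted as class k
        r = tp / true_k if true_k else 0.0
        p = tp / pred_k if pred_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        recalls.append(r)
        precisions.append(p)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / n,
        "precision": sum(precisions) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [
    [4, 1, 0],
    [0, 5, 0],
    [1, 0, 4],
]
toy_result = macro_metrics(toy_cm)
print(toy_result)
```

Because F1 is averaged per class here, it generally differs from the harmonic mean of the macro precision and macro recall.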

Table 3. Friedman test of the classification capability of all algorithms.

	Accuracy	Recall	Precision	F1-Score
$χ^{2}$	11.09	12	10.79	9.29
df	3	3	3	3
p	0.0111	0.0073	0.0128	0.0255
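
The p-values in Table 3 can be cross-checked from the reported statistics, assuming the usual chi-square approximation of the Friedman statistic with df = 3 (four methods minus one). The sketch uses only the standard library and a closed-form upper-tail probability that holds for exactly three degrees of freedom. The function and variable names are hypothetical.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of
    freedom: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Chi-square statistics as reported in Table 3.
TABLE3_CHI2 = {
    "Accuracy": 11.09,
    "Recall": 12.0,
    "Precision": 10.79,
    "F1-Score": 9.29,
}

p_values = {metric: chi2_sf_df3(stat) for metric, stat in TABLE3_CHI2.items()}
for metric, p in p_values.items():
    print(f"{metric}: p = {p:.4f}")
```

The recomputed values should land close to the tabulated ones, with small differences attributable to rounding of the reported statistics. All four fall below 0.05.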

Table 4. Experimental results of TransE.

Metric	MR	MRR	HITS@3	HITS@10
Test sample	1.672	0.817	0.867	0.912
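
The metrics in Table 4 are standard link-prediction measures computed from the rank that the model assigns to the correct entity for each test triple: mean rank (MR, lower is better), mean reciprocal rank (MRR) and HITS@k (both higher is better). A minimal sketch of these definitions follows, assuming 1-based ranks. The function name and the rank list are hypothetical and are not taken from the paper's experiment.

```python
def ranking_metrics(ranks, ks=(3, 10)):
    """Compute MR, MRR and HITS@k from 1-based ranks of the correct entity."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # mean rank
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
    }
    for k in ks:
        # Fraction of test triples whose correct entity is ranked in the top k.
        metrics[f"HITS@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Hypothetical ranks for five test triples, for illustration only.
toy_ranks = [1, 1, 2, 5, 12]
toy_metrics = ranking_metrics(toy_ranks)
print(toy_metrics)
```

An MR near 1 together with a high MRR, as reported in Table 4, indicates that the correct entity is usually ranked first or close to it.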

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Li, C.; Wang, B. A Knowledge Graph Method towards Power System Fault Diagnosis and Classification. Electronics 2023, 12, 4808. https://doi.org/10.3390/electronics12234808
