1.1. Research Progress of Power System Fault Processing Technology
As power systems grow in complexity and variety, the number of faults in China’s power system is increasing. This poses a serious threat to the production, operation, maintenance, and security of the power system, and to the stable development of the national economy. At present, the power system is moving toward intelligence and automation, and a large number of sensors are used for equipment condition monitoring. Mining equipment status information from the accumulated massive data, and using it to detect or predict faults, is of great significance for the safe and stable operation of the power grid. Power system fault diagnosis technologies have developed rapidly, and various methods diagnose faults from different perspectives. These methods can be divided into expert systems [1,2], artificial neural networks [3,4], Bayesian networks [5], Petri nets [6], analytic models [7,8], information fusion [9,10], and other methods.
The expert system simulates the knowledge and experience of experts to reason and judge, and finally obtains the explanation closest to the actual situation. With the continuous expansion of the power system scale, constructing and maintaining the knowledge base has become increasingly complex. When information is missing or confused, the accuracy of expert system diagnosis is greatly reduced [1,2].
An artificial neural network (ANN) is a technology that processes information by simulating the human nervous system. An ANN obtains fault diagnosis results from its input information, but it has difficulty explaining the whole fault development process [3]. In recent years, ANNs have been widely used in fault area division and fault location, which has greatly improved learning efficiency. However, their interpretability in the fault diagnosis of large power grids needs further improvement [4].
The Bayesian network can reasonably explain the behavior of the system and can represent and reason about knowledge in the presence of uncertainty. When complete information is available, a Bayesian network provides accurate fault diagnosis; however, when only a single piece of information is available, or when the information contains omissions or errors, this method cannot diagnose a fault accurately. The method quantifies the probabilities of the collected information, processes the uncertain information, and then uses pattern recognition to establish a correlated Bayesian network model that determines the faulty components and fault locations [5].
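The sensitivity to missing information can be illustrated with a minimal sketch that applies Bayes' rule to one component and one alarm. The function name and all probability values below are invented for illustration and are not taken from [5]; a real Bayesian network would link many components, protections, and breakers.

```python
def posterior_fault(prior, p_alarm_if_fault, p_alarm_if_ok, alarm_seen=True):
    """Return P(fault | observation) for a single component and a single alarm."""
    if alarm_seen:
        like_fault, like_ok = p_alarm_if_fault, p_alarm_if_ok
    else:
        # The alarm was not received (it did not fire, or the message was lost).
        like_fault, like_ok = 1.0 - p_alarm_if_fault, 1.0 - p_alarm_if_ok
    evidence = like_fault * prior + like_ok * (1.0 - prior)
    return like_fault * prior / evidence


# Hypothetical values: 1% prior fault rate, 95% alarm sensitivity,
# 2% false-alarm rate.
with_alarm = posterior_fault(0.01, 0.95, 0.02, alarm_seen=True)
without_alarm = posterior_fault(0.01, 0.95, 0.02, alarm_seen=False)
print(round(with_alarm, 3), round(without_alarm, 5))
```

With these made-up numbers, the posterior fault probability rises from 1% to roughly 32% when the alarm arrives and falls to about 0.05% when it does not, so a lost alarm message is enough to hide a real fault.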
The analytic model method of fault diagnosis is mainly based on the switching information of the power system. The action and alarm information of the protection devices is analyzed, and the optimal solution is found with an intelligent optimization algorithm [7]. When solving the model, there can be multiple optimal solutions, so the most accurate diagnosis is difficult to obtain [8].
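The problem of multiple optimal solutions can be shown with a toy sketch. It assumes a simple objective, namely the number of mismatches between the alarms a fault hypothesis would be expected to trigger and the alarms actually observed, and it replaces the intelligent algorithm with exhaustive search because the example is tiny. The component names, alarm names, and the objective itself are hypothetical and are not the models of [7,8].

```python
from itertools import combinations

# Hypothetical mapping from a faulted component to the alarms it should trigger.
EXPECTED_ALARMS = {
    "line_A": {"main_relay_A", "backup_relay", "breaker_1"},
    "line_B": {"main_relay_B", "backup_relay", "breaker_1"},
}


def mismatch(hypothesis, observed):
    """Count alarms that are expected but missing, or observed but unexplained."""
    expected = set()
    for component in hypothesis:
        expected |= EXPECTED_ALARMS[component]
    return len(expected ^ observed)


def best_hypotheses(observed):
    """Return every fault hypothesis that attains the minimum mismatch."""
    components = sorted(EXPECTED_ALARMS)
    candidates = [
        frozenset(combo)
        for size in range(len(components) + 1)
        for combo in combinations(components, size)
    ]
    scores = {h: mismatch(h, observed) for h in candidates}
    best = min(scores.values())
    return sorted(sorted(h) for h, score in scores.items() if score == best)


# Both main relay signals are missing, so two hypotheses tie.
print(best_hypotheses({"backup_relay", "breaker_1"}))
# With complete information the optimum is unique.
print(best_hypotheses({"main_relay_A", "backup_relay", "breaker_1"}))
```

In the first call, a fault on `line_A` and a fault on `line_B` explain the observed alarms equally well, so the objective alone cannot separate them.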
Information fusion has emerged in recent years as a new technology for fault diagnosis, and it is also an important direction for future development. The main advantage of this method is that it can analyze power system faults from multiple angles and avoid diagnosis errors caused by relying on a single source of information or by information loss [9]. With the continuous improvement of big data technology, power system fault diagnosis based on multi-source information fusion has also made progress [10].
In the case of complex power system faults, there are many uncertain factors, which lead to errors in the above diagnosis methods. At present, the handling of complex faults still relies mainly on the experience of operation and maintenance personnel. Manually determining the fault type and providing the fault solution depends heavily on prior knowledge, which restricts the efficiency and accuracy of complex fault handling.
1.2. Power System Fault Processing Technology Based on Deep Learning Technology
In recent years, with the rapid development of computer technology, research interest in deep learning has surged. Deep learning can process large amounts of data and extract deep features from the input data. It has been applied to healthcare, image recognition, and natural language processing with great success [11]. Applied to fault diagnosis, deep learning combines feature extraction and classification and depends less on professional knowledge. Deep learning is widely used in power systems, for example in power consumption prediction, wind and solar energy prediction, power disturbance detection and classification, fault detection, energy management, and energy optimization [12,13,14]. Its strong feature extraction ability provides a new research direction for power system fault diagnosis [15].
In 2012, Google took the lead in introducing the concept of knowledge graphs. As a special kind of knowledge base, a knowledge graph extracts entities and the relationships between them from unstructured knowledge and stores them in the form of a directed graph, which expresses unstructured knowledge in a standardized way. A knowledge graph can be regarded as a semantic network composed of nodes and edges. Each entity or concept is a node in the knowledge graph [16]. A concept is the abstraction of a class of phenomena or things, such as system operation and maintenance personnel; an entity is a specific person or thing, such as a named member of the operation and maintenance staff. An edge connecting two nodes represents a relationship between them. Among the many knowledge representation methods, knowledge graphs stand out for two characteristics: strong expressive ability and ease of extension. By combining the expressive ability and rich relationship information of a knowledge graph with relationship reasoning rules from power system production, operation, and maintenance, log data can be mined to support various reasoning applications, such as inconsistency detection, inference completion, knowledge discovery, auxiliary reasoning, and decision-making.
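The node-and-edge structure described above can be sketched as a list of (head, relation, tail) triples stored in an adjacency map. All entity names, relation names, and helper functions below are hypothetical and chosen only to mirror the concept/entity distinction in the text; this is a minimal sketch, not the representation used by any cited system.

```python
from collections import defaultdict

# Hypothetical triples: (head node, relation on the directed edge, tail node).
triples = [
    ("Zhang San", "instance_of", "O&M personnel"),        # entity -> concept
    ("Transformer T1", "instance_of", "transformer"),     # entity -> concept
    ("Transformer T1", "has_fault", "winding overheating"),
    ("winding overheating", "handled_by", "Zhang San"),
]


def build_graph(triples):
    """Index triples by head node so outgoing edges can be looked up directly."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def neighbors(graph, node, relation=None):
    """Return tail nodes reachable from `node`, optionally filtered by relation."""
    return [t for r, t in graph.get(node, []) if relation is None or r == relation]


graph = build_graph(triples)

# Two-hop query: who handles the faults recorded for Transformer T1?
handlers = [
    person
    for fault in neighbors(graph, "Transformer T1", "has_fault")
    for person in neighbors(graph, fault, "handled_by")
]
print(handlers)
```

Extending the graph only requires appending another triple, which is one way to read the ease of extension noted above.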
Knowledge graphs can structure heterogeneous knowledge and capture the correlations between pieces of knowledge. They can be used to address the diversity, complexity, and isolation of fault data, and to standardize the storage of fault knowledge for fault diagnosis. In addition, knowledge graphs have the advantage of interpretability, which makes it possible to give a reasonable explanation of fault causes and solutions. At present, knowledge graph technology is not widely used in the field of fault diagnosis. Li Jinxing et al. [17] studied fault treatment based on power information knowledge graphs and proposed a fault diagnosis method. A BiLSTM-CRF model was used to recognize entities in power domain knowledge and to realize the construction and application of a power grid domain knowledge graph. Feng et al. [18] proposed an intelligent question answering system for fault diagnosis of power information acquisition systems based on knowledge graph technology, which can efficiently traverse and search nodes and paths, thus significantly improving reasoning efficiency. Meng et al. [19] proposed the BERT–BiLSTM–CRF model to extract knowledge from Chinese technical literature and construct a knowledge graph of electric power equipment faults.
Power equipment fault and defect texts contain many entities related to equipment fault information, with various categories and large differences in features. It is difficult to extract all entities from the text with a single dictionary or machine learning method, and some entity values are open-ended, so no extraction method can be applied directly. In addition, there is little research on text processing in this field, and universal text data sets are still lacking, so data sets must be constructed by the researchers themselves. Because the entities in power equipment fault and defect texts are so diverse, the relationships between them are more complicated, which greatly increases the difficulty of judging these relationships accurately. A log system usually contains a large variety of fault-related information, and log analysis helps researchers collect fault information and perform association analysis. Therefore, making full use of prior fault knowledge in the power field, integrating the abnormal logs generated during power system operation, and constructing a complete and practical power system fault knowledge graph is of great significance for improving the fault handling capacity and level of intelligence of the power system.
At present, the construction of power grid fault knowledge graphs mainly draws on distribution network equipment account data, fault handling data, dispatching regulation data, and distribution network defect data, most of which are unstructured data written in Chinese. In the construction of a Chinese knowledge graph, the word segmentation system can directly determine the boundaries of most entities and concepts, so its performance plays a key role. Chinese word segmentation systems perform nearly perfectly on standard data sets (such as news corpora), but their performance in specific fields without annotations is unsatisfactory, especially in professional fields, because of the large number of unknown words. To solve this problem, this paper introduces system logs as the source of fault data. Logs are semi-structured data that carry a large quantity of rich information and are convenient for automatic processing. A new BBIGRU-GA-CRF model is proposed. The model first identifies and extracts power equipment entities from the pre-processed logs and Chinese technical literature, and then extracts the semantic relationships between the entities using a relationship classification method based on dependency parsing. Finally, the extracted knowledge is stored as triples in the Neo4j database and visualized as a graph. Through these steps, a Chinese knowledge graph of power equipment faults is established.
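The final storage step can be sketched as follows. The snippet turns one extracted triple into a Cypher `MERGE` statement with query parameters. The `Entity` label, the `name` property, the function name, and the sample triple are assumptions made for illustration; the paper does not specify its graph schema. Cypher does not accept a relationship type as a query parameter, so the sketch sanitizes the relation text and quotes it with backticks.

```python
import re


def triple_to_cypher(head, relation, tail):
    """Build a Cypher MERGE statement and parameters for one (head, relation, tail)."""
    # Collapse non-word characters to underscores; backticks are removed too,
    # so the backtick-quoted type below cannot be broken out of.
    rel_type = re.sub(r"\W+", "_", relation).strip("_").upper()
    if not rel_type:
        raise ValueError(f"cannot derive a relationship type from {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:`{rel_type}`]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple, as it might come out of the extraction steps.
query, params = triple_to_cypher("Transformer T1", "has fault", "winding overheating")
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` means that repeated log entries mentioning the same entity do not produce duplicate nodes. The resulting query and parameters could then be passed to a Neo4j client, for example through the `session.run` method of the official Python driver.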