A survey of the main research databases in electrical engineering found articles that use cluster analysis to characterize power quality phenomena. The following is a summary of each of these works. In [
7], a method was proposed for evaluating power quality events under different network operating conditions. The measured data may depend on load changes, generation and network configuration. For this reason, the author uses clustering techniques to divide the acquired data into groups that reflect the operating conditions. In work [
8], a technique based on graphical cluster analysis was developed for implementation in a smart power quality analyzer that monitors electrical networks. In the presence of a fault, the equipment starts the measurement procedure and higher-order statistics are calculated in the time domain to allow classification. The results showed the division of events into two groups (voltage sags and transients), with an accuracy of 80%. The paper [
9] presents an algorithm that uses the k-means method to recognize and classify voltage sags in measurement data from a large power grid in Shenzhen (China). The results showed that nearly all voltage sag disturbances can be classified into 11 clusters that probably represent the characteristics and causes of most events occurring in typical distribution systems. In [
10], a method is presented for determining the optimal number of groups to be formed in power quality measurement data, using a data mining algorithm based on the minimum message length (MML) technique. Three different databases were used to test the proposed method, and the results confirmed its effectiveness in finding the optimal number of groups. A new approach to identify the severity profile of busbar voltage sags was introduced in [
11]. Voltage sag data caused by faults at all nodes of the system are separated into clusters using the k-means technique. The method identifies the buses with the lowest occurrence of severe events, allowing sensitive loads to be installed at those points of the system. In addition, knowing the most affected buses, the allocation of mitigation devices such as dynamic voltage restorers (DVRs) can be better evaluated. Reference [
12] presents a hybrid model for power quality analysis composed of a modified fuzzy min-max (FMM) neural network combined with a modified clustering tree (CT) technique. The results were compared with those obtained with other clustering algorithms, indicating better accuracy for the proposed method. A methodology for detecting and classifying power quality disturbances using the Stockwell transform was developed in [
13]. The disturbances were generated in MATLAB according to the IEEE 1159 standard. Several signal characteristics were extracted from the S-transform-based multiresolution analysis. These characteristics are used to classify the disturbances by the fuzzy c-means clustering method. The effectiveness of the proposed algorithm was verified in several case studies, which showed an accuracy of 99%. Reference [
14] proposes a new method for reducing the training set size for the k-nearest neighbors (KNN) algorithm. The proposed method is based on an iterative process. Experimental results showed that the accuracy after sample reduction by the recursive process did not differ from that obtained with the original training set, while the classification of a new signal became faster. For a signal from a real measuring device, the classification time was reduced from 1.35 s to 0.09 s. The work [
15] proposes a method to comprehensively evaluate power quality based on the maximum tree (MT) algorithm for fuzzy clustering. For the test, four indicators were selected: voltage deviation, frequency variation, voltage unbalance and harmonics. The results achieved in a practical case demonstrated the viability of the method, which provides some scientifically based guidelines for the consumer to select the electricity utility and adjust the price paid for energy according to the quality offered. The paper [
16] proposes a methodology to locate the source of voltage sags. Initially, cluster analysis is used to divide voltage signal data measured at different nodes into groups. Then, a set of decision rules is defined using the partial decision trees algorithm, which compares the characteristics of each cluster and determines which group the location of the disturbance source fits into. The IEEE 34-bus test feeder system was used to evaluate the methodology, and the results showed a hit rate greater than 98%. The work [
17] proposes and evaluates an alternative methodology to characterize and classify voltage sags. PCA and k-means clustering are applied to identify RMS voltage patterns and reduce the number of RMS voltage profiles representative of the events considered. Real data from 300 events collected at a wind farm in Spain were used to validate the methodology. The proposed methodology proved efficient for assessing a large number of events. The paper [
18], based on a statistical procedure that considers the correlation between the index and the number of equipment trips, proposes a methodology to determine sensitivity regions and weighting factors different from those established in [
6]. Therefore, it proposes an improvement of the standard [
6]. The research conducted in [
19] presents a methodology for clustering distribution systems considering the variables related to voltage sags. The methodology comprises four steps: (i) selection of the variables by their correlation with the frequency of voltage sags; (ii) cluster analysis with several methods, to investigate which is the most appropriate; (iii) evaluation of the methods that generated the best clusters through analysis of variance between the response and the generated membership; and (iv) robustness analysis, in which small amounts of noise are added to the input variables to observe which method remains the most accurate. The results showed that Ward’s method was the most appropriate for the database considered. In the paper [
20], principal component analysis (PCA) is applied to reduce 32 input variables (with some level of redundancy) to seven principal components (PCs), which account for 97.9% of the information in the original variables. Clusters of substations are then formed from these PCs using Ward’s method with the Euclidean distance between elements. The resulting clusters allowed the distribution systems to be classified into three categories according to the number of voltage sag occurrences (high, medium and low). Studies conducted in [
21] present a novel methodology to increase discriminatory power in the estimation of voltage sag patterns using ellipsoidal functions. Ward’s method was used to form clusters of substations with similar voltage sag levels; three distinct groups were found, with small, medium and large numbers of voltage sags. The work [
21] is an evolution of that presented in [
20]. The method showed results that are more precise, stable and reliable.
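
To make the Ward-type clustering mentioned in [19]-[21] concrete, the following is a minimal Python sketch and not the implementation used in any of the cited works. It merges, at each step, the two clusters whose union gives the smallest increase in within-cluster sum of squares (Ward's criterion with Euclidean distance). The function name `ward_clusters` and the annual sag counts per substation are hypothetical and serve only as an illustration; in practice the feature vectors could be PC scores or any other set of variables.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion.

    points: list of equal-length numeric tuples (feature vectors).
    k: number of clusters to keep.
    Returns a list of k sorted lists of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            (ma, ca), (mb, cb) = clusters[a], clusters[b]
            na, nb = len(ma), len(mb)
            d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
            # Increase in within-cluster sum of squares if a and b are merged.
            cost = na * nb / (na + nb) * d2
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        # The centroid of the merged cluster is the size-weighted mean.
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (ma + mb, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]


# Hypothetical annual voltage sag counts for eight substations (one feature).
sag_counts = [(3.0,), (5.0,), (4.0,), (48.0,), (52.0,), (50.0,), (120.0,), (115.0,)]
groups = ward_clusters(sag_counts, 3)
print(sorted(groups))
```

With these illustrative values, the three groups correspond to substations with low, medium and high numbers of sags. The pairwise search is quadratic in the number of clusters at each step, which is acceptable for a small set of substations; for larger data sets a library implementation of hierarchical clustering would be preferable.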