2. Related Work and Contribution
The presence of distributed generation (DG) can lead to incorrect operation of the system protection. The challenges that protection faces in a DG environment are variations in the fault current magnitude, blinding of protection, unintentional islanding, and loss of mains [1]. To address these challenges, advanced methods have been implemented alongside conventional ones. These methods are divided into event estimation-based, fuzzy-based, field transform-based, and intelligent fault detection methods [8].
Conventional fault detection methods use fixed thresholds to detect faults, making them unsuitable for a microgrid that accommodates DG. Overcurrent protection, the basic protection method, could fail to detect a fault in a microgrid in either operating mode, grid-connected or islanded. Namely, false tripping and failure to trip, both caused by the change in the fault current level, are the main issues affecting overcurrent protection [2]. Differential protection, another traditional distribution system protection method, was adopted in the microgrid environment in [9]. It proved applicable, but the difficulty of determining a multi-terminal protection zone with several inputs limits its suitability. For example, a power line with a varying number of connected sources and loads requires continuous monitoring and updating of the trip thresholds.
Event estimation-based protection schemes compare analytically obtained models with real-world measurements to detect faults. In [10], the fault current transient derivative equations are derived for faults at busbars and feeders and used for the threshold selection. Since the observed system is DC, the current derivative magnitude during faults is significant, which makes detection simpler. Moreover, the advantage of this protection scheme over traditional differential schemes is that it works with a multi-terminal system. However, the analytical model must be very accurate, or the protection might fail.
Fuzzy logic offers another, logic-based approach to system protection. The method's fundamentals were explained in [11], where it was also applied to transmission line fault identification. The identification procedure, based on eight rules, enabled the differentiation of line-to-line, line-to-ground, and line-to-line involving ground faults. Furthermore, ten types of short-circuit faults (phase-to-phase, phase-to-ground, two-phase-to-ground, and three-phase) occurring on transmission lines are successfully detected using fuzzy logic [12]. Line loading and different fault resistances were considered and shown not to impact the fault identification, which is effective in over 97% of cases. Type-2 fuzzy logic is a generalization of type-1 fuzzy logic and offers a significant level of imprecision modeling [13], making it a better choice for complex systems such as microgrids. It is successfully applied for microgrid protection in both islanded and grid-tied modes of operation [14]. The presented protection method successfully detects and classifies faults and determines the fault direction. This protection strategy is immune to the fault location and type and can protect a microgrid even after a single-phase trip. However, according to [15], fuzzy logic lacks real-time response, which is crucial for system protection.
Field transform-based methods transform signals from the time domain to a domain that can provide a clearer insight into the data characteristics. Commonly used methods that transform signals to the frequency domain are the Short-time Fourier transform (STFT), the S-transform (ST), and the Hilbert-Huang transform (HHT). The STFT was introduced to overcome the drawbacks of the Fourier transform by applying it to a confined interval of the signal. The amplitude-frequency characteristic obtained by the STFT is used for fault detection in most STFT-based fault detection methods. HVDC (High Voltage Direct Current) protection against pole-to-pole faults based on the STFT is proposed in [16]. The current of the system is monitored and decomposed into frequency components. The standard deviation of the side lobes obtained from the amplitude-frequency characteristic is used for fault detection. During transient states, the high-frequency components gain in amplitude and increase the standard deviation. In the case of a fault, the increase is significant compared to that caused by a load change. A similar method was used in [17], but instead of the standard deviation, amplitudes at specific frequencies are used as the indicators of transients. Again, the current change causes the high-frequency amplitudes to increase and, if the threshold is reached, the trip signal is sent to the circuit breakers. The amplitude and frequency provided by the STFT enable detection of under/over-voltage and under/over-frequency in an AC microgrid [18]. The voltage signal is monitored in real time, and changes in its magnitude and frequency indicate disturbances.
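The spectral-spread idea described above can be illustrated with a minimal sketch. It is not the implementation from [16]: the function names, the sample frames, and the threshold are hypothetical, and the standard deviation is taken over all non-DC bins of a single frame as a simple stand-in for the side lobes.

```python
import cmath
import math
import statistics


def dft_magnitudes(frame):
    """Magnitudes of the one-sided DFT of a real-valued frame, scaled by 1/N."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags


def transient_indicator(frame):
    """Population standard deviation of the non-DC bins (assumed indicator)."""
    return statistics.pstdev(dft_magnitudes(frame)[1:])


# Made-up DC current frames in amperes, 32 samples each.
steady = [10.0] * 32
load_step = [10.0] * 16 + [12.0] * 16
fault = [10.0] * 16 + [60.0] * 16

THRESHOLD = 1.0  # assumed value; in practice it would be tuned per system

for name, frame in [("steady", steady), ("load step", load_step), ("fault", fault)]:
    value = transient_indicator(frame)
    print(f"{name}: indicator={value:.3f}, trip={value > THRESHOLD}")
```

In this toy case the indicator scales with the size of the current step, so a threshold between the load-step and fault values separates the two events.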
The S-transform is an extension of the Gabor transform, a special case of the STFT, and of the wavelet transform [19]. ST will, much like STFT, provide the frequency spectrum of the signal. ST-based islanding detection for distributed generation is proposed in [19]. The negative sequence voltage and current are transformed, and the energy spectral content is obtained. Next, the cumulative sum of the energy content of consecutive samples is calculated. Load changes and trips of other DG units produce a less significant change in the cumulative sum, which enables the threshold determination. A protection method for both islanded and grid-connected microgrid operating modes is proposed in [20]. ST provided the frequency spectrum, in which the high frequencies were shown to be efficient and robust fault indicators, independent of the fault parameters. Moreover, the computational burden of the proposed method is reduced by using a simplified version of ST, suitable for online calculation.
HHT is an adaptive method for time-frequency representation applied to non-stationary signals [21]. It was applied for AC microgrid protection in [22] and compared to ST. The differential current was processed by HHT, and its differential energy was calculated and used as a fault detection parameter. The thresholds are divided into three ranges: for grid-connected operation, islanded operation, and high impedance faults (HIFs). After a comparative evaluation, the authors concluded that HHT is as effective as ST. Multiterminal system protection, as stated before, has to provide protection under continuously changing system states. In [23], HHT was applied for distance protection of a multiterminal HVDC system. As in other DC system fault detection methods, the transform is used to detect high-frequency components during transients. Voltage is the input, and the output is the distance from the circuit breaker to the fault location. The role of the HHT is to provide the instantaneous amplitude and frequency of the signal components, which are later averaged and used for distance estimation. The algorithm was also implemented for real-time testing, where it showed a 10% error. It should be taken into account that signal noise is present during the real-time testing and that the method does not use any communication.
The wavelet transform (WT) in its discrete form (DWT) is also a popular choice among researchers. Wavelets are analyzing functions that adjust their time width to the frequency [24]. The transform decomposes the signal to produce a set of coefficients, which are later used for fault detection. WT-based transmission line distance protection [25] uses one decomposition level containing high frequencies for disturbance detection and two levels for phasor estimation. High frequencies are again used for fault detection, similar to the STFT, ST, and HHT-based protection methods. The db1 mother wavelet is used for detection and db4 for estimation. Once the current disturbance is detected, the impedance is calculated from the estimated current and voltage phasors. The method proved effective, with the ability to detect HIFs. WT was also applied for DC microgrid protection in [26], where the second derivative of the current is subjected to the transform. The energy of the level 2 WT coefficients is extracted for the fault indication. However, compared to DWT, the wavelet packet transform (WPT) provides a more precise analysis [27]. Both DWT and WPT use high-pass and low-pass filters to extract components. At every decomposition level of DWT, only the low-frequency component is decomposed again by a low-pass and a high-pass filter. In the case of WPT, both components are decomposed, making it more accurate. In [27], WPT-based fault detection for a photovoltaic system is proposed. The energy of the level 2 coefficients (500–750 Hz) of the voltage and impedance is used as a fault indicator. Recently, another WT method, called the un-decimated wavelet transform (UWT), was introduced for fault detection. It was used in [8], where the authors find UWT more suitable than DWT and WPT for real-time fault detection. It is also stated that the method is less sensitive to noise.
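The structural difference between DWT and WPT described above can be sketched with the Haar (db1) filter pair. This is a minimal illustration on made-up samples, not the filter bank used in [27]; the helper names are hypothetical.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return the (low-pass, high-pass) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return low, high


def dwt(signal, levels):
    """DWT: only the low-pass branch is split again at each level."""
    details = []
    approx = signal
    for _ in range(levels):
        approx, detail = haar_split(approx)
        details.insert(0, detail)
    return [approx] + details  # [A_L, D_L, ..., D_1]


def wpt(signal, levels):
    """WPT: both branches are split at each level (nodes in filter-bank order)."""
    nodes = [signal]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes


def energy(band):
    """Sum of squared coefficients, the kind of quantity used as a fault feature."""
    return sum(c * c for c in band)


x = [0.0, 1.0, 0.0, -1.0, 0.5, 2.0, -0.5, -2.0]  # made-up samples
dwt_bands = dwt(x, 2)
wpt_bands = wpt(x, 2)
print("DWT bands:", len(dwt_bands), [len(b) for b in dwt_bands])
print("WPT bands:", len(wpt_bands), [len(b) for b in wpt_bands])
```

At level 2 the DWT yields three bands of unequal width, while the WPT yields four bands of equal width, which is the finer partition of the high-frequency range referred to in the text.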
Considering the amount of information obtained by transforming a signal into the frequency domain, ST, WT, and HHT provide a more detailed observation than the STFT. The frequency resolution of the STFT depends strongly on the window size. Since the window size of the STFT is fixed, a long window will have a problem detecting short perturbations, while a short window will poorly depict low frequencies. This problem is overcome by ST, which uses a time- and frequency-dependent window function, and by WT, which has variable frequency resolution. These features are better suited for the analysis of non-stationary signals, which is the case when detecting faults and disturbances. However, choosing a suitable "mother wavelet" and the appropriate decomposition level is a challenge when using WT. HHT uses an adaptive basis function, making it suitable for the analysis of not only non-stationary but also non-linear signals. However, mode mixing in the Empirical Mode Decomposition (EMD) part of HHT presents a problem when intermittent waves occur on a lower-frequency signal [28]. In addition, more elaborate methods have a more complex implementation, which increases the response time. In contrast, the STFT uses a simple algorithm based on the DFT. The DFT has already been implemented in digital protection relays [29], so it can easily be adapted for this purpose.
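The window-size trade-off discussed above follows from two simple relations: for a window of N samples at sampling rate f_s, the DFT bin spacing is f_s/N and the window duration is N/f_s. The sketch below tabulates both for a few window lengths; the sampling rate is an assumed value, not one taken from the paper.

```python
def stft_resolution(sample_rate_hz, window_size):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate_hz / window_size, window_size / sample_rate_hz


SAMPLE_RATE_HZ = 10_000  # assumed sampling rate, for illustration only

for n in (16, 32, 64, 128):
    df, dt = stft_resolution(SAMPLE_RATE_HZ, n)
    print(f"N={n:3d}: bin spacing = {df:7.2f} Hz, window duration = {dt * 1e3:5.2f} ms")
```

Doubling the window length halves the bin spacing but doubles the time span over which a short perturbation is averaged, so the product of the two quantities stays constant.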
Over the past few years, intelligent classifiers have established themselves as a reliable choice for fault detection, usually combined with one of the field transform-based methods. Decision trees, artificial neural networks, naive Bayes, and support vector machines are often used as classifiers. An example of the direct application of intelligent classifiers for fault detection is presented in [30]. The inputs of the artificial neural network (ANN) are voltage and current time signals, and the outputs are binary variables that indicate whether a fault is detected and the direction of the fault. The method proved reliable, with an accuracy of 99% and a section identification accuracy of 100%. However, the time signal is usually first transformed using field transform-based methods. Features are then selected from the transformed signal and fed to the intelligent classifier. For example, when WT is the feature provider, its output coefficients are used for feature selection. DWT provided features to k-Nearest Neighbours (k-NN) [31] and Bayes [32] classifiers for power system fault detection. Both the Bayes and the k-NN classifiers proved capable of detecting HIFs among other transients. For microgrid fault detection, DWT is combined with a support vector machine (SVM) in [33] and with a decision tree (DT)/random forest (RF) in [34]. The SVM-based protection uses the standard deviation of the coefficients obtained by DWT as the classifier input. Moreover, a single SVM and an SVM ensemble were tested, and the ensemble proved more effective. The DT and RF-based protection used the change in energy, Shannon entropy, and standard deviation of the DWT coefficients as features. Both methods proved accurate, but RF faced implementation issues. In [35], HHT provided features in the form of the energy distribution and the standard deviation of the signal component amplitude and phase. An ANN classifier was used and achieved 92.85% accuracy. The same classifier was used with ST for fault detection in a radial distribution system. Again, various features, such as the maximum amplitude and frequency of the S-matrix and its standard deviation and entropy, were extracted and used for model training. A simple form of ANN, the feed-forward neural network (FFNN), was used as a classifier for ST-based fault detection in a distribution system in [36]. The features used included the standard deviation of the S-matrices along with their means and skewness. In [37], the microgrid protection used STFT to extract features from the voltage signal and DT/RF for detection and classification. The features were extracted from the main frequency contour and include the average, root mean square (RMS), and kurtosis. The method proved to be very accurate. Finally, the clustering and classification of pulsed loads on a naval shipboard power system presented in [38] also uses the STFT of the current signal for feature vector extraction.
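Statistical features of the kind listed above (average, RMS, kurtosis) can be computed from a contour of magnitudes in a few lines. The following is a minimal sketch with a hypothetical helper name and made-up values; it uses the non-excess definition of kurtosis (fourth central moment divided by the squared variance), which may differ from the definition used in [37].

```python
import math


def contour_features(values):
    """Average, RMS, and non-excess kurtosis of a sequence of magnitudes."""
    n = len(values)
    mean = sum(values) / n
    rms = math.sqrt(sum(v * v for v in values) / n)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    kurtosis = m4 / (m2 * m2) if m2 > 0 else float("nan")
    return {"mean": mean, "rms": rms, "kurtosis": kurtosis}


# Made-up contour: mostly flat with one spike, as a transient might produce.
contour = [1.0, 1.0, 1.0, 1.0, 5.0]
features = contour_features(contour)
print(features)
```

A feature vector for a classifier would then be assembled from such values, one set per analysis window.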
The contribution of this paper is the development of a protection method based on the Short-time Fourier transform (STFT) and intelligent classification. Since the STFT has scarcely been investigated as a feature provider, in this work it is combined with different classifiers. The employed classifiers are logistic regression, naive Bayes, k-nearest neighbours (k-NN), DT, SVM, and AdaBoost. PV-based microgrids with battery energy storage systems are becoming increasingly common, which is why such a microgrid was selected as a case study.
Section 3 presents the STFT-based fault detection method and the applied machine learning methods. In Section 4, the microgrid simulation setup is described together with the STFT parameters and the classifier evaluation method used. Section 5 presents the results of the proposed method, and Section 6 concludes the paper.
5. Results
The proposed STFT-based protection method was applied to the PV-based microgrid described above. The pole-to-pole fault was investigated by short-circuiting the poles through resistances from 0.1 to 20 Ω, and the load change by applying a step change to the current reference of the DC-DC converter. The results presented in Tables 2–8 show the weighted F1-score of the intelligent classifiers, in percent, for different taper and window sizes. The tables are separated by taper size, since this value determines the effect of the windowing on the original signal, and thus on the features. The window size should also have a significant impact, since a longer window provides more features. Note that a longer window with a fixed hop size limits feature variation.
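For reference, the weighted F1-score averages the per-class F1-scores, each weighted by the number of true samples of that class. The sketch below computes it from scratch; the class names and predictions are made up and do not reproduce any result from the tables.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1-scores averaged with weights equal to class support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * (tp + fn) / total  # weight by the class support
    return score


# Hypothetical labels: 6 normal, 2 fault, and 2 load-change samples.
y_true = ["normal"] * 6 + ["fault"] * 2 + ["load"] * 2
y_pred = ["normal"] * 5 + ["load"] + ["fault"] * 2 + ["load", "normal"]

print(f"weighted F1 = {100 * weighted_f1(y_true, y_pred):.2f}%")
```

Weighting by support keeps a frequent class from being drowned out by a rare one, which matters when fault samples are much scarcer than normal-operation samples.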
Table 2 shows the results for the rectangular window (taper size of 0). Here the window function has no effect on the data, i.e., the data is multiplied by one. The absence of windowing does not seem to affect the nonparametric methods, as they show a score of over 97%, except for AdaBoost. Decision Tree achieved the best score of 98.74%. The k-NN is slightly lower at 98.63%. Both scores are achieved for window size 128. As far as the parametric methods are concerned, the score of logistic regression ranks with the nonparametric methods. Naive Bayes has a problem with longer windows, while the first two window sizes show good results.
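The role of the taper can be made concrete with a tapered-cosine (Tukey) window, which reduces to a rectangular window for a taper of 0 and to a Hanning window for a taper of 1. That the paper uses exactly this window family is an assumption based on the description of its two limiting cases; the function below is a minimal stdlib sketch, not the authors' code.

```python
import math


def tukey(n, alpha):
    """Symmetric tapered-cosine window of length n with taper fraction alpha."""
    if n == 1:
        return [1.0]
    if alpha <= 0:
        return [1.0] * n  # rectangular window: data multiplied by one
    alpha = min(alpha, 1.0)
    edge = alpha * (n - 1) / 2.0  # width of each cosine-tapered edge, in samples
    window = []
    for i in range(n):
        if i < edge:
            w = 0.5 * (1 + math.cos(math.pi * (i / edge - 1)))
        elif i > (n - 1) - edge:
            w = 0.5 * (1 + math.cos(math.pi * (i - (n - 1) + edge) / edge))
        else:
            w = 1.0  # flat middle section
        window.append(w)
    return window


for alpha in (0.0, 0.5, 1.0):
    print(alpha, [round(w, 3) for w in tukey(16, alpha)])
```

As the taper grows, a larger share of each frame is attenuated toward zero at the edges, which reduces spectral leakage at the cost of de-emphasizing samples near the frame boundaries.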
Increasing the taper size from 0 to 0.15 affects the edges of the window function, which are now brought to zero. This change has the biggest impact on the Naive Bayes classifier, whose performance drops by 4% to 45%. Logistic regression and SVM experience a performance decrease for window size 16, while AdaBoost improves its performance for the same window size. The other classifiers experience only a slight change in performance.
The increase of the taper size from 0.15 to 0.35 does not seem to have much impact on the nonparametric methods, with the exception of AdaBoost, which now demonstrates better results in two out of four window sizes, and SVM, which lost its high performance for window size 16. For this taper size, the k-NN shows the best result, with Decision tree closely following. Naive Bayes remains fairly inaccurate, as can be seen from Table 4. Logistic regression also kept high performance for all windows except size 16.
Increasing the taper size to 0.50 yields remarkable results, as the k-NN and Decision tree scores rise above 99% for window size 128. Logistic regression and SVM again increased their scores above 90%. AdaBoost also increased its score above 90% for three out of four windows, with a significant reduction in the score for window size 32 compared to a previous taper size. Naive Bayes increased its score to 96.79% for window size 32, but the score for the other windows remains unsatisfactory.
A further increase of the taper size to 0.75 results in only a slight change for the nonparametric methods, again with the exception of AdaBoost. Logistic regression also shows a slight change, with a decrease of 1% and 2% for window sizes 128 and 16. Naive Bayes is still unable to achieve satisfactory results, as its score is below 85%. AdaBoost has continued the trend of increasing its score, with the lowest score being 88.25%, which is an acceptable result. k-NN, Decision tree, and SVM remain consistent with their previous scores.
As the taper size approaches 1, all classifiers except Naive Bayes score above 90% (Table 7). k-NN and DT still have the best scores, followed by logistic regression and SVM. AdaBoost achieves a score above 90% for all window sizes for the first time. In contrast, Naive Bayes scored worse for three of the four window sizes.
Finally, for a taper size of 1 (Hanning window), all classifiers except one score over 93% for all window sizes. The decision tree shows the best score, 99.33%, for window size 128. The k-NN, with a score of 99.28% for the same window size, is very close to the best score. The scores of SVM and AdaBoost increased by about five percentage points for window size 16 compared to a smaller taper size. Naive Bayes recorded a significant increase in score, achieving over 88% for three out of four window sizes.
The Decision tree classifier achieves the best overall score, 99.33%, for the Hanning window function (taper size of 1) and window size 128. The k-NN comes very close with the second best result, 99.28%, for the same settings. Both classifiers proved to be consistent and reliable, independent of the window and taper size, with a score of over 97% for all the examined cases. The SVM is also reliable, with a score over 96% for window sizes 32, 64, and 128 for all taper sizes. However, for window size 16, the score varies from 89.63% to 97.35%. Logistic regression, although a parametric method, performed similarly to the SVM. Its score was above 95% for window sizes 32, 64, and 128, but in the range of 83.81–97.30% for window size 16. AdaBoost behaves differently for different window sizes. Window sizes 16 and 64 achieve their best results at one taper size, window size 32 at another, and window size 128 at a third. Its best overall score is 97.23%, for the Hanning window and window size 64. Naive Bayes also depends on the taper and window size. The best score is achieved for window sizes 16, 32, and 64, where it exceeds 89% in a few cases. For window size 128, however, the best overall score is only 59.30%.
The interpretability of parametric methods can reveal information about the features. The high score of logistic regression implies that the classes of data points are linearly separable. Furthermore, the basic assumption underlying the Naive Bayes classifier is that all features are independent. Its poor performance implies that the features used in this work are not independent, which is reasonable, since the STFT is used for feature extraction.
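One plausible source of this dependence is spectral leakage: a tapered window spreads the energy of a single component over neighbouring DFT bins, so the magnitudes of adjacent bins rise and fall together. The sketch below illustrates this on a synthetic tone with made-up amplitudes; it is a toy demonstration, not an analysis of the paper's features.

```python
import cmath
import math


def hann(n):
    """Symmetric Hanning window of length n."""
    return [0.5 * (1 - math.cos(2 * math.pi * i / (n - 1))) for i in range(n)]


def bin_magnitude(frame, k):
    """Magnitude of DFT bin k of a real-valued frame."""
    n = len(frame)
    return abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


N = 32
window = hann(N)
amplitudes = [1.0, 1.5, 0.8, 2.0, 1.2, 3.0]  # made-up per-frame tone amplitudes

bin4, bin5 = [], []
for amp in amplitudes:
    # A tone centred on bin 4, windowed before the DFT.
    frame = [amp * math.sin(2 * math.pi * 4 * t / N) * window[t] for t in range(N)]
    bin4.append(bin_magnitude(frame, 4))
    bin5.append(bin_magnitude(frame, 5))

correlation = pearson(bin4, bin5)
print(f"correlation between adjacent bins: {correlation:.4f}")
```

Features that move together like this violate the independence assumption, so a Naive Bayes model effectively counts the same evidence more than once.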