Calculation Method of Theoretical Line Loss in Low-Voltage Grids Based on Improved Random Forest Algorithm
Abstract
1. Introduction
- Reasonable characteristic factors of low-voltage grids are constructed according to their physical and operational characteristics. The concept of power supply torque is proposed for the first time.
- The random forest algorithm is improved by modifying the property classifying process of the decision tree and optimizing the weight factor allocation method when data are missing. This resolves the high requirement for characteristic data integrity and improves the accuracy of the model when a large amount of sample data is missing.
2. Analysis of Characteristic Factors
2.1. Influence Mechanism of Theoretical Line Loss
2.2. Qualitative Influence Analysis of the Factors
- (1) Static line factors
- (2) Dynamic operation factors
- (3) Fusion factors
2.3. Construction of Characteristic Factors
3. Construction of the Method
3.1. Model Based on Traditional Random Forest
- (1) Extract data samples randomly from the original dataset, and repeat the process multiple times to obtain multiple training datasets.
- (2) Input the corresponding dataset into each decision tree and select the classifying property for each node of the tree: randomly select a subset of properties from the data properties, and then choose the optimum classifying property from that subset. Normally, equals to the integer closest to . Because the decision tree is the individual learner in the random forest algorithm, the learning capability of the random forest depends on the performance of the decision tree. The decision tree is built in the following steps:
- (a) If the label values of all the data in the dataset S are the same, a decision tree with only one node is generated, and the node value is that label value.
- (b) If the property set is empty, or all the data in S have the same values on the property set, a decision tree with only one node is generated, and the node value is the label value of the majority of the data samples in S.
- (c) Select the optimum classifying property subset from the property set.
- (d) Traverse all the values of the selected property subset and, for each value, form the sub-dataset of all the data in S that take that value.
- (e) If a sub-dataset is empty, mark it as a node whose value is the label value of the majority of the data samples in S.
- (f) If a sub-dataset is not empty, treat it as the input dataset together with its property set, and repeat steps (a)~(e) until the decision tree is generated.
- (3) For regression, an averaging strategy can be applied: the output values of all the decision trees are averaged to give the final output value. For classification, a voting strategy can be applied: the classes output by all the decision trees are compared, and the one with the most votes is taken as the final output value.
- (4) Based on historical records and measurement data of various low-voltage grids from the consumption data collection system, marketing system, production management system, and geographic information system, the characteristic factor data can be calculated for each low-voltage grid from the definitions and calculation principles of the characteristic factors. Abnormal characteristic data should be cleaned at the same time. The cleaned sample data are then fed into the random forest algorithm for training to establish a theoretical line loss model of low-voltage grids, and the detailed theoretical line loss calculation is carried out with the established model. A detailed algorithm flow chart is given in Figure 6.
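Steps (1) to (3) can be illustrated with a minimal regression sketch in pure Python. It is not the paper's implementation: it assumes numeric features and uses binary threshold splits that minimise the squared error (CART-style) instead of the value-by-value partition of steps (a) to (f), and the names `fit_forest`, `predict_forest`, `k`, `n_trees` and `max_depth` are hypothetical.

```python
import random
from statistics import mean

def bootstrap(rows, rng):
    """Step (1): draw len(rows) samples with replacement."""
    return [rng.choice(rows) for _ in rows]

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, features):
    """Return (score, feature, threshold) with the lowest total child SSE."""
    best = None
    for f in features:
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, n_features, k, rng, depth=0, max_depth=5):
    labels = [y for _, y in rows]
    # Leaf when all labels agree or the (assumed) depth limit is reached.
    if len(set(labels)) == 1 or depth >= max_depth:
        return mean(labels)
    # Step (2): only a random subset of k properties is considered per node.
    split = best_split(rows, rng.sample(range(n_features), k))
    if split is None:
        return mean(labels)
    _, f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t,
            build_tree(left, n_features, k, rng, depth + 1, max_depth),
            build_tree(right, n_features, k, rng, depth + 1, max_depth))

def predict_tree(tree, x):
    # Internal nodes are tuples; leaves are plain numbers.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def fit_forest(rows, n_trees=20, k=1, seed=0):
    """rows: list of (feature_tuple, label) pairs."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    return [build_tree(bootstrap(rows, rng), n_features, k, rng)
            for _ in range(n_trees)]

def predict_forest(forest, x):
    """Step (3): average the outputs of all trees (regression)."""
    return mean(predict_tree(tree, x) for tree in forest)
```

In this sketch each sample would be a tuple of characteristic factor values paired with the observed line loss; a classification variant would replace the final average with a majority vote.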
3.2. Improved Random Forest Algorithm
3.3. Evaluation of the Algorithm
4. Results and Discussion
4.1. Data Preparation
4.2. Analysis Based on Traditional Random Forest in High Cleaning Rate
4.3. Analysis Based on Traditional Random Forest in Lower Cleaning Rate
4.4. Analysis Based on Improved Random Forest in Lower Cleaning Rate
4.5. Discussion
- (1) According to the definitions and calculation principles of the electrical characteristics of low-voltage grids, reasonable ranges for the various characteristic factors are established, and abnormal sample data outside these ranges are cleaned. By changing the sample data cleaning rule, the model training effects of the random forest algorithm under cleaning rates of 95.57% and 46.44% are compared. The accuracy errors of the model are 1.3239 and 1.7319, respectively.
- (2) The modified random forest algorithm solves the problem of missing characteristic factors. When the model is trained with the modified random forest algorithm at a sample data cleaning rate of 46.44%, the model accuracy error is only 1.2161, lower than that of the other approaches.
- (3) The correlation between the model calculated value and the observed value reached 0.6711 when the improved random forest algorithm was used at the lower sample cleaning rate, which was much higher than in the other two situations (0.4522 and 0.4366). This also indicates the good calculation accuracy of the improved random forest algorithm across different line loss intervals.
- (4) More characteristic samples can be preserved when the modified random forest algorithm is used to handle samples with missing characteristics than when forced cleaning is used. Better accuracy can therefore be obtained during model training and calculation, which demonstrates that the method proposed in this paper is more effective for calculating and analyzing the theoretical line loss of low-voltage grids.
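One common way to keep samples with a missing characteristic, instead of cleaning them out, is to pass them down both branches of a split with fractional weights. The sketch below illustrates that idea under stated assumptions: it follows the C4.5-style fractional weighting scheme, not necessarily the weight factor allocation used in this paper, and the names `split_with_missing` and `weighted_mean` are hypothetical.

```python
def split_with_missing(rows, f, t):
    """Split weighted samples on feature f at threshold t.

    rows: list of (features, label, weight); features[f] may be None.
    Samples with a known value go to one side. Samples with a missing
    value go to both sides, with their weight divided in proportion to
    the known weight on each side (assumed C4.5-style allocation).
    """
    known = [r for r in rows if r[0][f] is not None]
    missing = [r for r in rows if r[0][f] is None]
    left = [r for r in known if r[0][f] <= t]
    right = [r for r in known if r[0][f] > t]

    w_known = sum(w for _, _, w in known)
    # With no known values at all, fall back to an even split.
    p_left = sum(w for _, _, w in left) / w_known if w_known else 0.5

    for x, y, w in missing:
        if p_left > 0:
            left.append((x, y, w * p_left))
        if p_left < 1:
            right.append((x, y, w * (1 - p_left)))
    return left, right

def weighted_mean(rows):
    """Weighted leaf value for regression."""
    total = sum(w for _, _, w in rows)
    return sum(y * w for _, y, w in rows) / total
```

With this scheme the total sample weight is conserved across the two branches, so a sample with one missing characteristic still contributes to training through its remaining characteristics.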
5. Conclusions
- (1) Reasonable electrical characteristic factors of low-voltage grids were constructed according to the physical and operational influencing mechanism of theoretical line loss. The concept of power supply torque was proposed for the first time to analyze the influence mechanism more accurately by coupling the physical factors and the operational factors.
- (2) The random forest algorithm was improved by modifying the property classifying process of the decision tree and optimizing the weight factor allocation method when sample data are missing. This resolved the high requirement for characteristic data integrity and improved the accuracy of the model when a large amount of sample data is missing. When the sample data cleaning rate changes from 95.57% to 46.44%, the accuracy error of the traditional random forest increases from 1.3239 to 1.7319. However, the accuracy error of the improved random forest is only 1.2161 when the classifying process of the decision tree is modified.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ding, X.H.; Luo, Y.F.; Liu, W.; Shi, L.Z. Proposals on improving the current methods for calculating line losses of distribution network. Autom. Electr. Power Syst. 2001, 25, 57–60.
- Fu, X.Q.; Chen, H.Y. Energy losses estimation using equivalent time of average current loss method. Trans. China Electrotech. Soc. 2015, 30, 377–382.
- Zhang, K.K.; Yang, X.Y.; Bu, C.R.; Ru, W.; Liu, C.J.; Yang, Y.; Chen, Y. Theoretical analysis on distribution network loss based on load measurement and counter measures to reduce the loss. Proc. CSEE 2013, 33, 59–63.
- Liu, T.L.; Wang, S.; Zhang, Z.; Zhu, J.F. Newton-Raphson method for theoretical line loss calculation of low-voltage distribution transformer district by using the load electrical energy. Power Syst. Prot. Control 2015, 43, 143–148.
- Zhang, Y.; Wu, Y.F.; Zhang, F.; Yao, X.D.; Liu, A.; Tang, L.; Mo, J.G. A real-time three-phase line loss calculation method for distribution network based on feeder terminal unit. Energy Rep. 2022, 8, 146–152.
- Zhang, Y.; Zhu, Y.; Bai, X.Q.; Hua, W. CIM-based Data-sharing Scheme for Online Calculation of Theoretical Line Loss. Energy Procedia 2012, 16, 1619–1626.
- Marcio, A.R.; André, L.V.G.; Miguel, E.M.U.; Eduardo, C.G.; Leonardo, M.O.Q. Technical loss estimation approach in power distribution systems using load model in frequency domain. Electr. Power Syst. Res. 2022, 209, 107982.
- Bao, H.; Ma, Q. Physical distribution mechanism of network loss for power system. Proc. CSEE 2005, 25, 82–86.
- Pablo, A.; Matias, A.K.; Selina, K. Flexibility management in the low-voltage distribution grid as a tool in the process of decarbonization through electrification. Energy Rep. 2022, 8, 248–256.
- Kim, Y.J. Development and analysis of a sensitivity matrix of a three-phase voltage unbalance factor. IEEE Trans. Power Syst. 2018, 33, 3192–3195.
- Dai, Z.; Lin, W. Adaptive estimation of three-phase grid voltage parameters under unbalanced faults and harmonic disturbances. IEEE Trans. Power Electron. 2017, 32, 5613–5627.
- Karami, E.; Gharehpetian, G.B.; Madrigal, M.; Chavez, J.D.J. Dynamic phasor-based analysis of unbalanced three-phase systems in presence of harmonic distortion. IEEE Trans. Power Syst. 2018, 33, 6642–6654.
- Tan, Y.; Wang, Z. Incorporating unbalanced operation constraints of three-phase distributed generation. IEEE Trans. Power Syst. 2019, 34, 2449–2452.
- Xu, C.; Song, X.; Tao, Y.; Yang, Q.Q. Research on influencing factors of line loss rate of regional distribution network based on apriori-interpretative structural model. Energy Rep. 2022, 8, 53–64.
- Xi, C.; Song, C.H.; Wang, T.R. Spatiotemporal analysis of line loss rate: A case study in China. Energy Rep. 2021, 7, 7048–7059.
- Wen, F.S.; Han, Z.X. The calculation of energy losses in distribution systems based upon a clustering algorithm and an artificial neural network model. Proc. CSEE 1993, 13, 41–50.
- Jiang, H.L.; An, M.; Liu, X.J.; Zhao, X.; Zhang, J.H. The calculation of energy losses in distribution systems based on RBF network with dynamic clustering algorithm. Proc. CSEE 2005, 25, 35–39.
- Li, Y.; Liu, L.P.; Li, B.Q.; Yi, J.; Wang, Z.Z.; Tian, S.M. Calculation of Line Loss Rate in Transformer District Based on Improved K-Means Clustering Algorithm and BP Neural Network. Proc. CSEE 2016, 36, 4543–4552.
- Ma, L.Y.; Liu, J.H.; Lu, Z.G.; Wang, H.Y.; Yuan, Q.F.; Yang, L.P. Theoretical line loss calculation method of low voltage transform district based on deep belief network. Electr. Power Autom. Equip. 2020, 40, 7.
- Wang, S.X.; Zhou, K.; Su, Y. Line loss rate estimation method of transformer district based on random forest algorithm. Electr. Power Autom. Equip. 2017, 37, 39–45.
- Zhao, Q.M. The Calculation of Line Loss Rate in Transformer District Based on Affinity Propagation Algorithm and Random Forest Regression. Proc. CSU-EPSA 2020, 32, 94–98.
- Hu, W.; Guo, Q.; Wang, W.; Wang, W.; Song, S. Loss reduction strategy and evaluation system based on reasonable line loss interval of transformer area. Appl. Energy 2022, 306, 118123.
- Bernard, S.; Adam, S.; Heutte, L. Dynamic random forests. Pattern Recognit. Lett. 2012, 33, 1580–1586.
- Bonissone, P.; Cadenas, J.M.; Garrido, M.C.; Díaz-Valladares, R.A. A fuzzy random forest. Int. J. Approx. Reason. 2010, 51, 729–747.
- Ibrahim, I.A.; Khatib, T. A novel hybrid model for hourly global solar radiation prediction using random forests technique and firefly algorithm. Energy Convers. Manag. 2017, 138, 413–425.
- Ristin, M.; Guillaumin, M.; Gall, J.; Van Gool, L. Incremental learning of random forests for large-scale image classification. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 490–503.
- Anaissi, A.; Kennedy, P.J.; Goyal, M.; Catchpoole, D.R. A balanced iterative random forest for gene selection from microarray data. BMC Bioinform. 2013, 14, 261.
Method | No Additional Measurement Equipment Required | No Complete Grid and Line Parameters Required | Factors Can Be Interpreted by Circuit Theory | Feature Integrity Requirement | Complexity of the Model | Accuracy of the Model |
---|---|---|---|---|---|---|
Ref. [5] | × | × | √ | High | Low | High |
Ref. [14] | × | √ | × | High | High | High |
Refs. [16,17] | √ | √ | × | Low | Moderate | Low |
Ref. [19] | √ | √ | √ | Moderate | Moderate | Moderate |
Refs. [20,21] | √ | √ | × | Moderate | High | Moderate |
Proposed method | √ | √ | √ | Low | Low | High |
Characteristic Factor | Characteristic Definition |
---|---|
Power supply radius | Physical distance from furthest load point to distribution transformer |
Total line length | Sum of total low-voltage line length in low-voltage grids |
User numbers | Total user numbers in low-voltage grids, including single-phase users and three-phase users |
Load rate | Ratio of power consumption capacity to rating capacity of distribution transformer |
Three-phase unbalance degree | Unbalance degree of three-phase current in three-phase power system, which is the relative deviation between the maximum and the mean value of the three-phase current in a transformer |
Load shape factor | Ratio of daily current RMS value to average value on distribution transformer side in low-voltage grids |
Transformer daily average power factor | Ratio of active power to apparent power in low-voltage grids |
Power supply torque | Product of the average power supply distance of the loads in the low-voltage grid and the average power consumption capacity per user |
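The definitions in the table above translate directly into small formulas. The following sketch is one possible reading: the function and parameter names are hypothetical, and expressing the load rate and the unbalance degree in percent is an assumption suggested by the value ranges used for cleaning, not something the definitions state.

```python
from math import sqrt
from statistics import mean

def load_rate(load_capacity, rated_capacity):
    """Power consumption capacity over transformer rated capacity, in percent (assumed)."""
    return 100.0 * load_capacity / rated_capacity

def three_phase_unbalance(i_a, i_b, i_c):
    """Relative deviation of the maximum phase current from the mean, in percent (assumed)."""
    avg = (i_a + i_b + i_c) / 3.0
    return 100.0 * (max(i_a, i_b, i_c) - avg) / avg

def load_shape_factor(daily_currents):
    """RMS value of the daily current samples divided by their average."""
    rms = sqrt(mean(c * c for c in daily_currents))
    return rms / mean(daily_currents)

def power_factor(active_power, apparent_power):
    """Active power over apparent power."""
    return active_power / apparent_power

def power_supply_torque(avg_supply_distance, avg_user_capacity):
    """Average supply distance times average per-user consumption capacity."""
    return avg_supply_distance * avg_user_capacity
```

A perfectly flat daily current profile gives a load shape factor of 1, and balanced phase currents give an unbalance degree of 0, which is a quick sanity check on any implementation of these factors.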
Characteristic Factors | Cleaning Rules |
---|---|
Power supply radius | [200, 800] |
Total line length | [500, 10,000] |
User numbers | [50, 500] |
Load rate | [5, 60] |
Three-phase unbalance degree | [0, 200] |
Load shape factor | (0, 20] |
Power factor | [0.8, 1] |
Power supply torque | [0, 16,000] |
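The interval rules in the table above can be applied as a simple filter. A minimal sketch, assuming each sample is a dictionary of characteristic factor values: the key names are hypothetical, the units are whatever the table implies (it does not state them), and reporting the removed share of samples is an assumption about how a cleaning rate might be computed.

```python
# (low, high, low_inclusive) per characteristic factor, copied from the table.
# Only the load shape factor has an open lower bound: (0, 20].
RULES = {
    "power_supply_radius": (200, 800, True),
    "total_line_length": (500, 10000, True),
    "user_numbers": (50, 500, True),
    "load_rate": (5, 60, True),
    "three_phase_unbalance": (0, 200, True),
    "load_shape_factor": (0, 20, False),
    "power_factor": (0.8, 1, True),
    "power_supply_torque": (0, 16000, True),
}

def in_range(value, rule):
    """True if value lies inside the interval; a missing value (None) fails."""
    if value is None:
        return False
    low, high, low_inclusive = rule
    above = value >= low if low_inclusive else value > low
    return above and value <= high

def clean(samples, rules=RULES):
    """Return (kept_samples, removed_fraction) under forced cleaning."""
    kept = [s for s in samples
            if all(in_range(s.get(name), rule) for name, rule in rules.items())]
    removed_fraction = 1 - len(kept) / len(samples) if samples else 0.0
    return kept, removed_fraction
```

Note that this forced cleaning drops any sample with a missing factor, which is exactly the loss of samples that a missing-value-tolerant model is meant to avoid.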
Method | Cleaning Rate | RMSE | Correlation Coefficient |
---|---|---|---|
BPNN | 95.97% | 1.9875 | 0.4021 |
SVM | 95.97% | 1.7621 | 0.4978 |
KNN | 95.97% | 2.0255 | 0.3687 |
RF | 95.97% | 1.3239 | 0.4522 |
RF | 46.44% | 1.7319 | 0.4366 |
Proposed Method | 46.44% | 1.2639 | 0.6733 |
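The two metrics in the table above can be computed from paired observed and calculated line loss values. A minimal sketch, assuming the correlation coefficient is the Pearson coefficient (the table does not name the type); the function names are hypothetical.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root mean square error between observed and calculated values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson(observed, predicted):
    """Pearson correlation coefficient (assumed to be the one reported)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)
```

A lower RMSE and a correlation closer to 1 both indicate better agreement between the calculated and observed line loss, which is how the rows of the table are compared.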
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, L.; Zhou, G.; Zhang, J.; Zeng, Y.; Li, L. Calculation Method of Theoretical Line Loss in Low-Voltage Grids Based on Improved Random Forest Algorithm. Energies 2023, 16, 2971. https://doi.org/10.3390/en16072971