Q-RPL: Q-Learning-Based Routing Protocol for Advanced Metering Infrastructure in Smart Grids
Abstract
1. Introduction
Contribution and Organization
- We propose a novel routing strategy based on a Reinforcement Learning technique, Q-learning, to improve the routing decisions of RPL in AMI deployments.
- Our approach balances RL with traditional routing metrics: parent selection is guided by the Q-learning algorithm, while traditional metrics such as the Expected Transmission Count (ETX) and the Received Signal Strength Indicator (RSSI) are used to enhance the Q-learning policy and the exploration-exploitation strategy.
- We conducted simulations using smart meter locations from two real deployments in the cities of Montreal and Barcelona to evaluate the performance of our proposed routing strategy and compare it with other benchmark protocols. The results show a significant improvement in key performance metrics such as PDR, average end-to-end delay, and compliant factor.
- Q-RPL bridges the gap between traditional routing methods and advanced Machine Learning techniques, offering insights into how these two domains can be effectively combined for improved network performance and reliability. Our approach, although used here with RPL, could potentially be adapted to other routing protocols used in the same context.
2. Related Work
3. RPL Parent Selection Background
4. Q-RPL: Q-Learning-Based Routing Protocol Design
4.1. Q-Learning Algorithm
4.2. State–Action Space Design
4.3. Reward and Policy Design
- First attempt success. The highest reward is granted for a packet that reaches its next hop on the first attempt, without re-transmissions. This scenario represents the ideal case, where the routing decision leads to an efficient and effective outcome.
- Success on the first re-transmission. If the first transmission attempt fails but the packet is delivered on the first re-transmission, a smaller positive reward is given. This represents a less optimal successful transmission.
- Success on the second re-transmission. A reward of 0 is given for a packet that reaches the next hop on the second re-transmission. This situation indicates a less efficient routing decision.
- Success on the third re-transmission. A small negative reward is given for a packet that reaches the next hop on the third re-transmission. This situation warrants a slight penalty.
- Failure to transmit. If the packet fails to reach the next hop after all attempts, the largest penalty is applied. This condition represents an action failure and is penalized accordingly.
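The five reward cases above can be sketched as a small lookup. This is only an illustration: the excerpt states the value 0 for the second re-transmission and the relative ordering of the outcomes, so every other number below, along with the names `ASSUMED_REWARDS`, `ASSUMED_FAILURE_REWARD`, and `reward`, is a hypothetical placeholder rather than the authors' setting.

```python
# Hypothetical reward values: only the ordering of the outcomes and the
# value 0 for the second re-transmission come from the text.
ASSUMED_REWARDS = {
    0: 1.0,    # delivered on the first attempt (assumed value)
    1: 0.5,    # delivered on the first re-transmission (assumed value)
    2: 0.0,    # delivered on the second re-transmission (value from the text)
    3: -0.5,   # delivered on the third re-transmission (assumed value)
}
ASSUMED_FAILURE_REWARD = -1.0  # every attempt failed (assumed value)


def reward(delivered: bool, retransmissions: int) -> float:
    """Map a link-layer transmission outcome to a reward."""
    if not delivered:
        return ASSUMED_FAILURE_REWARD
    if retransmissions not in ASSUMED_REWARDS:
        raise ValueError("at most 3 re-transmissions are modelled")
    return ASSUMED_REWARDS[retransmissions]
```

The limit of three re-transmissions matches the MAC setting listed in the simulation parameters; any monotonically decreasing set of values would preserve the intent of the design.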
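Algorithm 1 $\epsilon$-Greedy Policy.
Require:
Initialize: $\epsilon$, $\epsilon_{min}$, $\epsilon_{decay}$
while the learning process is ongoing do
  Choose a random number $x$, $x \in [0, 1]$
  if $x < \epsilon$ then
    Choose a random action $a$: $a \in A$
  else
    Choose the best-known action $a$: $a = \arg\max_{a'} Q(s, a')$
  end if
  Decrease $\epsilon$ gradually: $\epsilon \leftarrow \max(\epsilon_{min}, \epsilon \cdot \epsilon_{decay})$
end while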
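A minimal Python sketch of the policy in Algorithm 1 follows. It assumes that the three exploration values listed under the Learning Layer in the simulation settings (1, 0.05, 0.95) are the initial exploration rate, its floor, and a multiplicative decay factor; the multiplicative form of the decay and the function names `epsilon_greedy` and `decay_epsilon` are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def decay_epsilon(epsilon, epsilon_min=0.05, decay=0.95):
    """Multiplicative decay with a floor (assumed form of the schedule)."""
    return max(epsilon_min, epsilon * decay)


# Usage: start fully exploratory and decay after every decision.
epsilon = 1.0
for _ in range(100):
    action = epsilon_greedy([0.2, 0.7, 0.1], epsilon)
    epsilon = decay_epsilon(epsilon)
```

Under these assumed values the floor is reached after roughly 59 decisions, after which the node exploits its best-known action about 95% of the time.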
4.4. Integration into RPL
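Algorithm 2 Enhanced Parent Selection Algorithm for RPL.
Require: $P_i$, a set of $n$ candidate parents of node $i$.
Initialize: $Q(s, a)$, $\alpha$, $\gamma$, $\epsilon$, $\epsilon_{min}$, $\epsilon_{decay}$.
for each new routing decision do
  Choose next-hop based on $\epsilon$-greedy policy:
  Choose a random number $x$, $x \in [0, 1]$
  if $x < \epsilon$ then
    Choose a random action $a$: $a \in P_i$
  else
    Choose the best-known action $a$: $a = \arg\max_{a'} Q(s, a')$
    if there is a tie in Q-values then
      Identify all actions with max Q-value
      Choose the action with the lowest ETX value
    end if
  end if
  Transmit packet to chosen next-hop
  Observe outcome and calculate reward $r$:
  if the transmission fails then
    assign the failure reward to $r$
  else
    get $r$ according to the re-transmission attempt
  end if
  Update Q-value for the state–action pair: $Q(s, a) \leftarrow Q(s, a) + \alpha [r + \gamma \max_{a'} Q(s', a') - Q(s, a)]$
  Update $\epsilon$: $\epsilon \leftarrow \max(\epsilon_{min}, \epsilon \cdot \epsilon_{decay})$
end for
Monitor RSSI of nodes in $P_i$:
if $RSSI_k < RSSI_{th}$ then
  if $\epsilon == \epsilon_{min}$ AND $k$ == preferred parent then
    trigger exploration by resetting $\epsilon$ to initial value
  else
    remove $k$ from $P_i$
  end if
end if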
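The main steps of Algorithm 2 can be sketched in Python as below. Several simplifications are assumptions rather than details taken from the paper: each node is treated as a single state with one Q-value per candidate parent, the update is the standard Q-learning rule, the RSSI threshold of -95 dBm is an arbitrary placeholder, and the names `choose_parent`, `update_q`, and `monitor_rssi` are hypothetical. The learning rate and discount factor are the values listed in the simulation settings.

```python
import random

ALPHA, GAMMA = 0.3, 0.6  # learning rate and discount factor from the settings table


def choose_parent(q, etx, epsilon, rng=random):
    """Epsilon-greedy choice over candidate parents; ETX breaks Q-value ties."""
    parents = list(q)
    if rng.random() < epsilon:
        return rng.choice(parents)
    best_q = max(q.values())
    tied = [p for p in parents if q[p] == best_q]
    return min(tied, key=lambda p: etx[p])


def update_q(q, parent, reward):
    """One Q-learning step, treating the node as a single state (assumption)."""
    target = reward + GAMMA * max(q.values())
    q[parent] += ALPHA * (target - q[parent])


def monitor_rssi(q, rssi, preferred, epsilon, epsilon_init=1.0,
                 epsilon_min=0.05, rssi_threshold=-95.0):
    """Apply the RSSI rule to every candidate and return the new epsilon."""
    for k in list(q):
        if rssi[k] < rssi_threshold:
            if epsilon == epsilon_min and k == preferred:
                epsilon = epsilon_init  # weak preferred parent: explore again
            else:
                del q[k]                # weak candidate: drop it from the set
    return epsilon
```

Breaking ties with ETX means that before any learning has happened, when all Q-values are equal, the greedy branch falls back to the link-quality ordering that standard RPL would use.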
5. Performance Evaluation
5.1. Simulation Settings
5.2. Montreal Scenario
5.3. Barcelona Scenario
5.4. General Discussion
6. Technical/Critical Analysis and Recommendations for Deployment
6.1. Technical Analysis
6.2. Critical Analysis
6.3. Recommendation for Deployment
- Initial testbed trials: Begin with small-scale experiments on actual hardware to understand how the protocol performs outside of simulation. This step is crucial for identifying any unforeseen issues that were not apparent during the simulation study.
- Adaptation to hardware constraints: If initial evaluations indicate that the current learning algorithm exceeds the device’s operational limits, adapt it so that it runs efficiently within the available resources without overwhelming device capabilities.
- Incremental deployment: Gradually increase the scale of deployment while continuously monitoring system performance. This step allows strategies to be adjusted in response to real-world challenges and complexities as they arise.
- Performance monitoring and optimization: Continuously collect and analyze performance data to tune the protocol settings.
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AEs | Alarm Events |
AMI | Advanced Metering Infrastructure |
AUC | Area Under the Curve |
CF | Compliant Factor |
CTP | Collection Tree Protocol |
DAO | Destination Advertisement Object |
DAP | Data Aggregation Points |
DIO | DODAG Information Object |
DIS | DODAG Information Solicitation |
DODAG | Destination-Oriented Directed Acyclic Graph |
ETX | Expected Transmission Count |
EWMA | Exponential Weighted Moving Average |
GBDT | Gradient Boosted Decision Tree |
GPSR | Greedy Perimeter Stateless Routing |
HWMP | Hybrid Wireless Mesh Protocol |
HYDRO | Hybrid Routing Protocol |
IoT | Internet of Things |
LLN | Low-Power and Lossy Network |
LOADng | Lightweight On-Demand Ad hoc Distance-vector Routing Protocol–Next Generation |
MAC | Medium Access Control |
ML | Machine Learning |
MR | Meter Reading |
MRHOF | Minimum Rank with Hysteresis Objective Function |
OLSR | Optimized Link State Routing Protocol |
PDR | Packet Delivery Ratio |
PQ | Power Quality |
PRR | Packet Reception Ratio |
QoS | Quality of Service |
RF | Random Forest |
RL | Reinforcement Learning |
RPL | Routing Protocol for Low-Power and Lossy Networks |
RSSI | Received Signal Strength Indicator |
SGs | Smart Grids |
UCB | Upper Confidence Bound |
Wi-SUNs | Wireless Smart Utility Networks |
WRF-RPL | Weighted Random Forward RPL |
WSNs | Wireless Sensor Networks |
References
- IEEE P802.11s Draft D; IEEE Draft Standard for Information Technology-Telecommunications and Information Exchange between Systems-Local and Metropolitan Area Networks-Specific Requirements-Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications-Amendment 10: Mesh Networking. IEEE: Piscataway, NJ, USA, 2021.
- IEEE 802.15.4g; IEEE Standard for Local and Metropolitan Area Networks-Part 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs) Amendment 3: Physical Layer (PHY) Specifications for Low-Data-Rate, Wireless, Smart Metering Utility Networks. IEEE: Piscataway, NJ, USA, 2011.
- Harada, H.; Mizutani, K.; Fujiwara, J.; Mochizuki, K.; Obata, K.; Okumura, R. IEEE 802.15.4g based Wi-SUN communication systems. IEICE Trans. Commun. 2017, 100, 1032–1043.
- Chang, K.H.; Mason, B. The IEEE 802.15.4g standard for smart metering utility networks. In Proceedings of the 2012 IEEE Third International Conference on Smart Grid Communications (SmartGridComm), Tainan, Taiwan, 5–8 November 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 476–480.
- Karp, B.; Kung, H.T. GPSR: Greedy Perimeter Stateless Routing for Wireless Networks. In Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, New York, NY, USA, 6–11 August 2000; MobiCom ’00. pp. 243–254.
- Fonseca, R.; Gnawali, O.; Jamieson, K.; Kim, S.; Levis, P.; Woo, A. The collection tree protocol (CTP). TinyOS TEP 2006, 123, 1–14.
- Clausen, T.; Dearlove, C.; Jacquet, P.; Herberg, U. The Optimized Link State Routing Protocol Version 2; RFC 7181; Internet Engineering Task Force (IETF): Fremont, CA, USA, 2014.
- Winter, T.; Thubert, P.; Brandt, A.; Hui, J.; Kelsey, R.; Levis, P.; Pister, K.; Struik, R.; Vasseur, J.P. IPv6 Routing Protocol for Low-Power and Lossy Networks; RFC 6550; Internet Engineering Task Force (IETF): Fremont, CA, USA, 2012.
- Dawson-Haggerty, S.; Tavakoli, A.; Culler, D. Hydro: A hybrid routing protocol for low-power and lossy networks. In Proceedings of the 2010 First IEEE International Conference on Smart Grid Communications, Gaithersburg, MD, USA, 4–6 October 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 268–273.
- Clausen, T.; Yi, J.; Herberg, U. Lightweight on-demand ad hoc distance-vector routing-next generation (LOADng): Protocol, extension, and applicability. Comput. Netw. 2017, 126, 125–140.
- Joshi, A.; Bahr, M. HWMP specification. IEEE P802 2006, 11, 802–811.
- Darabkh, K.A.; Al-Akhras, M.; Zomot, J.N.; Atiquzzaman, M. RPL routing protocol over IoT: A comprehensive survey, recent advances, insights, bibliometric analysis, recommendations, and future directions. J. Netw. Comput. Appl. 2022, 207, 103476.
- Iyer, G.; Agrawal, P.; Monnerie, E.; Cardozo, R.S. Performance analysis of wireless mesh routing protocols for smart utility networks. In Proceedings of the 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm), Brussels, Belgium, 17–20 October 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 114–119.
- Iyer, G.; Agrawal, P.; Cardozo, R.S. Performance comparison of routing protocols over smart utility networks: A simulation study. In Proceedings of the 2013 IEEE Globecom Workshops (GC Wkshps), Atlanta, GA, USA, 9–13 December 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 969–973.
- Ho, Q.D.; Gao, Y.; Rajalingham, G.; Le-Ngoc, T. Performance and applicability of candidate routing protocols for smart grid’s wireless mesh neighbor area networks. In Proceedings of the 2014 IEEE International Conference on Communications (ICC), Sydney, NSW, Australia, 10–14 June 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 3682–3687.
- Ghaleb, B.; Al-Dubai, A.Y.; Ekonomou, E.; Alsarhan, A.; Nasser, Y.; Mackenzie, L.M.; Boukerche, A. A Survey of Limitations and Enhancements of the IPv6 Routing Protocol for Low-Power and Lossy Networks: A Focus on Core Operations. IEEE Commun. Surv. Tutorials 2019, 21, 1607–1635.
- Lamaazi, H.; Benamar, N. A comprehensive survey on enhancements and limitations of the RPL protocol: A focus on the objective function. Ad Hoc Netw. 2020, 96, 102001.
- Santos, C.L.D.; Mezher, A.M.; León, J.P.A.; Barrera, J.C.; Guerra, E.C.; Meng, J. ML-RPL: Machine Learning-based routing protocol for Wireless Smart Grid Networks. IEEE Access 2023, 11, 57401–57414.
- Mezher, A.M.; Dueñas Santos, C.L.; Rebollo-Monedero, D.; Cárdenas-Barrera, J.; Aguilar Igartua, M.; Meng, J.; Castillo Guerra, E. GNB-RPL: Gaussian Naïve Bayes for RPL Routing Protocol in Smart Grid Communications. In Proceedings of the 19th ACM International Symposium on QoS and Security for Wireless and Mobile Networks, Montreal, QC, Canada, 30 October–3 November 2023; pp. 53–60.
- Mezher, A.M.; Dueñas Santos, C.L.; Astudillo Leon, J.P.; Cárdenas-Barrera, J.; Meng, J.; Castillo Guerra, E. Are ML Models Scenario-Independent in Enhancing Routing Efficiency for Smart Grid Networks? In Proceedings of the Int’l ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks, New York, NY, USA, 30 October–2 November 2023; PE-WASUN ’23. pp. 83–90.
- Pister, K.; Dejean, N.; Barthel, D. Routing Metrics Used for Path Calculation in Low-Power and Lossy Networks; RFC 6551; Internet Engineering Task Force (IETF): Fremont, CA, USA, 2012.
- Thubert, P. (Ed.) Objective Function Zero for the Routing Protocol for Low-Power and Lossy Networks (RPL); RFC 6552; Internet Engineering Task Force (IETF): Fremont, CA, USA, 2012; pp. 5–48.
- Gnawali, O.; Levis, P. The Minimum Rank with Hysteresis Objective Function; RFC 6719; Internet Engineering Task Force (IETF): Fremont, CA, USA, 2012.
- Mardini, W.; Aljawarneh, S.; Al-Abdi, A. Using Multiple RPL Instances to Enhance the Performance of New 6G and Internet of Everything (6G/IoE)-Based Healthcare Monitoring Systems. Mob. Netw. Appl. 2021, 26, 952–968.
- Bhandari, K.S.; Ra, I.H.; Cho, G. Multi-Topology Based QoS-Differentiation in RPL for Internet of Things Applications. IEEE Access 2020, 8, 96686–96705.
- Musaddiq, A.; Zikria, Y.B.; Zulqarnain; Kim, S.W. Routing protocol for Low-Power and Lossy Networks for heterogeneous traffic network. Eurasip J. Wirel. Commun. Netw. 2020, 2020, 1–23.
- Acevedo, P.D.; Jabba, D.; Sanmartin, P.; Valle, S.; Nino-Ruiz, E.D. WRF-RPL: Weighted Random Forward RPL for High Traffic and Energy Demanding Scenarios. IEEE Access 2021, 9, 60163–60174.
- Mishra, S.N.; Khatua, M. Achieving Hard Reliability in RPL for Mission-Critical IoT Applications. In Proceedings of the 2022 IEEE 8th World Forum on Internet of Things (WF-IoT), Yokohama, Japan, 26 October–11 November 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
- Gaddour, O.; Koubâa, A.; Baccour, N.; Abid, M. OF-FL: QoS-aware fuzzy logic objective function for the RPL routing protocol. In Proceedings of the 2014 12th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, WiOpt 2014, Hammamet, Tunisia, 12–16 May 2014; pp. 365–372.
- Harshavardhana, T.G.; Vineeth, B.S.; Anand, S.V.; Hegde, M. Power control and cross-layer design of RPL objective function for low power and lossy networks. In Proceedings of the 2018 10th International Conference on Communication Systems and Networks, COMSNETS 2018, Bengaluru, India, 3–7 January 2018; pp. 214–219.
- Darabkh, K.A.; Al-Akhras, M.; Ala’F, K.; Jafar, I.F.; Jubair, F. An innovative RPL objective function for broad range of IoT domains utilizing fuzzy logic and multiple metrics. Expert Syst. Appl. 2022, 205, 117593.
- Prajapati, V.K.; Sharma, T.; Awasthi, L.K. Data Dissemination Framework for Optimizing Overhead in IoT-Enabled Systems Using Tabu-RPL. SN Comput. Sci. 2024, 5, 343.
- Shetty, S.P.; Shetty, M.; Kishore, V.; Shetty, P. Trickle timer modification for RPL in Internet of things. Soft Comput. 2024, 28, 2621–2635.
- Duenas Santos, C.L.; Astudillo León, J.P.; Mezher, A.M.; Cardenas Barrera, J.; Meng, J.; Castillo Guerra, E. RPL+: An Improved Parent Selection Strategy for RPL in Wireless Smart Grid Networks. In Proceedings of the 19th ACM International Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks, Montreal, QC, Canada, 24–28 October 2022; pp. 75–82.
- Raschka, S.; Mirjalili, V. Python Machine Learning: Machine Learning and Deep Learning with Python, Scikit-Learn, and TensorFlow; Packt Publishing Ltd.: Birmingham, UK, 2017.
- Sun, Y.; Peng, M.; Zhou, Y.; Huang, Y.; Mao, S. Application of machine learning in wireless networks: Key techniques and open issues. IEEE Commun. Surv. Tutorials 2019, 21, 3072–3108.
- Ridwan, M.A.; Radzi, N.A.M.; Abdullah, F.; Jalil, Y. Applications of machine learning in networking: A survey of current issues and future challenges. IEEE Access 2021, 9, 52523–52556.
- Tang, F.; Mao, B.; Kawamoto, Y.; Kato, N. Survey on machine learning for intelligent end-to-end communication toward 6G: From network access, routing to traffic control and streaming adaption. IEEE Commun. Surv. Tutorials 2021, 23, 1578–1598.
- Kim, B.S.; Suh, B.; Seo, I.J.; Lee, H.B.; Gong, J.S.; Kim, K.I. An Enhanced Tree Routing Based on Reinforcement Learning in Wireless Sensor Networks. Sensors 2023, 23, 223.
- Zahedy, N.; Barekatain, B.; Quintana, A.A. RI-RPL: A new high-quality RPL-based routing protocol using Q-learning algorithm. J. Supercomput. 2023, 80, 7691–7749.
- Alilou, M.; Babazadeh Sangar, A.; Majidzadeh, K.; Masdari, M. QFS-RPL: Mobility and energy aware multi path routing protocol for the internet of mobile things data transfer infrastructures. Telecommun. Syst. 2024, 85, 289–312.
- Rabet, I.; Fotouhi, H.; Alves, M.; Vahabi, M.; Björkman, M. ACTOR: Adaptive Control of Transmission Power in RPL. Sensors 2024, 24, 2330.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998; p. 22447.
- Raschka, S.; Liu, Y.H.; Mirjalili, V.; Dzhulgakov, D. Machine Learning with PyTorch and Scikit-Learn: Develop Machine Learning and Deep Learning Models with Python; Packt Publishing Ltd.: Birmingham, UK, 2022.
- IEEE Std 802.15.4-2020 (Revision of IEEE Std 802.15.4-2015); IEEE Standard for Low-Rate Wireless Networks. IEEE: Piscataway, NJ, USA, 2020; pp. 1–800.
- Kim, H.S.; Cho, H.; Kim, H.; Bahk, S. DT-RPL: Diverse bidirectional traffic delivery through RPL routing protocol in low power and lossy networks. Comput. Netw. 2017, 126, 150–161.
- OMNeT++ Discrete Event Simulator. Available online: https://omnetpp.org/ (accessed on 10 July 2024).
- Adday, G.H.; Subramaniam, S.K.; Zukarnain, Z.A.; Samian, N. Investigating and Analyzing Simulation Tools of Wireless Sensor Networks: A Comprehensive Survey. IEEE Access 2024, 12, 22938–22977.
- The ns-3 Network Simulator Project. ns-3 Network Simulator. Available online: https://www.nsnam.org/ (accessed on 29 April 2024).
- Bartolozzi, L.; Pecorella, T.; Fantacci, R. ns-3 RPL module: IPv6 routing protocol for low power and lossy networks. In Proceedings of the 5th International ICST Conference on Simulation Tools and Techniques, Desenzano del Garda, Italy, 19–23 March 2012; pp. 359–366.
- Chen, Y.b.; Hou, K.M.; Chanet, J.P.; El Gholami, K. A RPL based Adaptive and Scalable Data-collection Protocol module for NS-3 simulation platform. In Proceedings of the NICST 2013 New Information Communication Science and Technology for Sustainable Development: France-China International Workshop, Clermont-Ferrand, France, 18–20 September 2013; p. 8.
- El Ghomali, K.; Elkamoun, N.; Hou, K.M.; Chen, Y.; Chanet, J.P.; Li, J.J. A new WPAN Model for NS-3 simulator. In Proceedings of the NICST 2013 New Information Communication Science and Technology for Sustainable Development: France-China International Workshop, Clermont-Ferrand, France, 18–20 September 2013; p. 8.
- Nagai, Y.; Guo, J.; Orlik, P.; Sumi, T.; Rolfe, B.A.; Mineno, H. Sub-1 GHz frequency band wireless coexistence for the internet of things. IEEE Access 2021, 9, 119648–119665.
- Leon, J.P.A.; Rico-Novella, F.J.; De La Cruz Llopis, L.J. Predictive Traffic Control and Differentiation on Smart Grid Neighborhood Area Networks. IEEE Access 2020, 8, 216805–216821.
- León, J.P.A.; Santos, C.L.D.; Mezher, A.M.; Barrera, J.C.; Meng, J.; Guerra, E.C. Exploring the potential, limitations, and future directions of wireless technologies in smart grid networks: A comparative analysis. Comput. Netw. 2023, 235, 109956.
Work | Key Contribution | Methodology | Performance Metrics | Limitations | Main Results |
---|---|---|---|---|---|
[24] | Uses multiple RPL instances to manage diverse traffic types in 6G/IoE health systems. | Standard RPL with hop count and ETX, tested in Cooja. | Improves packet delivery and significantly reduces latency. | Standard RPL limitations may not meet all QoS needs in dynamic environments. | Enhanced diverse traffic management and efficiency in health applications. |
[25] | Multi-topology RPL enhances QoS through novel parent selection under diverse traffic. | Multi-attribute decision-making for IoT parent selection. | Assesses delay, packet loss, PDR, throughput, queue loss ratio, routing overhead, and energy. | Limited simulation time may affect long-term adaptability in diverse network conditions. | Improves QoS, shows scalability in large networks. |
[26] | QWL-RPL targets heterogeneous traffic with dynamic queue and workload-based routing. | Utilizes queue length and MAC transmission rates for parent selection to minimize congestion. | Evaluates overhead, PRR, delay, and jitter under various loads. | Omission of link quality in routing risks suboptimal paths. | Reduces congestion significantly and improves PRR, delay, and jitter, but may overlook link quality. |
[27] | WRF-RPL integrates energy and parent counts for balanced routing, enhancing network life and PDR. | Applies a weighted random forward method using energy and parent count for next-hop selection. | Evaluates network life, PDR, control overhead, and energy under high traffic. | Lacks link quality metrics, risking suboptimal paths; parent count may not reflect actual load. | Increases network life and PDR, risks uneven load distribution and inefficiencies. |
[28] | RMP-RPL uses node mobility, connectivity, and ETX for reliable multi-parent selection. | Dynamic multi-path selection based on mobility, connectivity, and ETX. | Evaluates PDR, end-to-end delay, and overhead, showing notable improvements. | May increase congestion due to multiple simultaneous transmissions and is untested across variable traffic loads. | Enhances PDR and reduces delays, suitable for mission-critical applications with scalability concerns. |
[31] | FL-HELR-OF uses fuzzy logic in a cross-layer architecture to dynamically select parents using hop count, energy, latency, and RSSI. | Fuzzy logic integrates multiple metrics into a cohesive routing decision framework. | Assesses PDR, latency, energy, overhead, and hop count, showing major improvements. | Requires complex setup and tuning, with potential scalability challenges. | Exceeds standard RPL in performance, improving reliability and efficiency in diverse IoT setups. |
[34] | RPL+ uses Random Forest to refine parent selection by analyzing and weighting routing metrics. | Employs Random Forest for analyzing ETX, MAC losses, channel utilization, and throughput to optimize routing. | Evaluates PDR and end-to-end delay, noting improvements in network responsiveness. | Static weights in decision-making may limit adaptability to network changes. | Shows significant PDR improvements, enhancing reliability and efficiency compared to standard RPL. |
[18] | ML-RPL uses CatBoost GBDT to enhance routing decisions via comprehensive metrics. | Model trained on smart meter data deployments, and predicts optimal routes based on probability. | Improvements in PDR and end-to-end delay demonstrate enhanced network efficiency. | Depends on training data quality, affecting adaptability in new scenarios. | Outperforms standard RPL and RPL+, especially in dynamic conditions. |
[19] | GNB-RPL employs Gaussian Naive Bayes to optimize routing in smart grids for enhanced scalability. | Applies Gaussian Naive Bayes based on smart meter data from Montreal for routing decisions. | Shows marked improvements in packet delivery and reduced delays across varied loads. | Performance variability due to Naive Bayes’ simplistic assumptions about feature independence. | Enhances scalability and reduces training data needs, suitable for dynamic, large networks. |
[20] | Analyzes supervised ML methods in RPL, focusing on scenario-dependent performance. | Assesses CatBoost and Naive Bayes in varied scenarios for routing effectiveness. | Uses AUC to evaluate ML model predictions for routing success. | Effectiveness decreases in scenarios that differ from the training environment; requires frequent retraining. | Recommends exploring RL for adaptable, scenario-independent routing in dynamic settings. |
[39] | Employs Q-learning in an RL-based protocol to optimize parent node selection in WSNs. | Uses Q-learning to dynamically choose the best parent based on network data and performance metrics. | Shows improvements in PDR, delay, and energy consumption over linear methods. | Needs more detail on reward function and concerns about scalability with periodic messages. | Surpasses traditional linear weighted methods, enhancing responsiveness and efficiency in dynamic WSNs. |
[40] | RI-RPL employs Q-learning to optimize IoT routing by addressing dynamic conditions and instant negative impacts of path selection. | Utilizes Q-learning to adapt to network changes, improving parent selection with real-time data. | Measures success delivery ratio, latency, energy, throughput, and data loss, noting major improvements. | Uniform weight in the reward function could impact decision accuracy; stability may reduce responsiveness after convergence. | Enhances service delivery, reduces delays, and boosts energy efficiency over previous methods. |
[32] | Tabu-RPL employs Tabu Search to dynamically optimize routing in IoT, reducing network overhead. | Uses metaheuristic search based on ETX and residual energy to adapt routing to network conditions. | Evaluates network overhead, energy, PDR, and delay, showing significant reductions and improvements. | Details on overhead reduction and adaptability in diverse networks are not fully explained. | Achieves a 30% reduction in overhead, significantly enhancing energy efficiency and PDR. |
[33] | EE-trickle modifies the trickle timer to boost energy efficiency and PDR in RPL. | Optimizes listening and transmission intervals to minimize energy use while maintaining performance. | Achieves notable reductions in energy per node and enhancements in PDR through simulations and testbeds. | May overlook challenges in dynamic environments, especially in link quality management. | Demonstrated enhanced energy efficiency and better PDR compared to the standard trickle method. |
[41] | QFS-RPL combines Q-learning and FSR to optimize routing for mobile nodes in RPL. | Applies Q-learning and FSR for dynamic adjustments to network changes and mobile node management. | Improves PDR, latency, throughput, and control overhead, as well as energy efficiency in mobile settings. | Performs similarly to standard RPL in static conditions and lacks extensive testing in dynamic environments. | Improves mobile performance by effectively managing mobility and load, but does not exceed standard RPL in static conditions. |
[42] | ACTOR employs a UCB-based RL strategy for dynamic power management in RPL, boosting throughput in dense networks. | Uses UCB to dynamically adjust transmission power, optimizing power levels for better performance. | Significantly improves end-to-end delay, packet delivery, and energy consumption in dense networks. | Depends mainly on ETX for routing decisions, which may affect adaptability. | Enhances throughput and energy efficiency, stabilizes network topology with fewer parent switches. |
State–Action Pair (s, a) | Q-Value |
---|---|
⋮ | ⋮ |
⋮ | ⋮ |
Parameter | Value |
---|---|
Network simulator | OMNeT++ v6.0.1 & INET Framework v4.4.1 |
Simulation runs | 10 per scenario per traffic load |
Simulation time | 5 h |
Smart meters | 200 (Montreal), 355 (Barcelona) |
Collectors | 1 (Montreal), 1 (Barcelona) |
Channel characteristics | Path loss, exponent = 3.6 |
Channel characteristics | Shadowing, Lognormal, standard deviation = 7.4 |
PHY Layer | Standard, 802.15.4g |
PHY Layer | Frequency band, 2.4 GHz |
PHY Layer | Transmission rate, 115 Kbps |
PHY Layer | Transmission power, 14 dBm |
PHY Layer | Reception sensitivity, −100 dBm |
PHY Layer | Energy detection, −90 dBm |
PHY Layer | Min interference power, −120 dBm |
MAC Layer | Standard, 802.15.4g |
MAC Layer | Operation mode, Mesh |
MAC Layer | ACK, Enable |
MAC Layer | Max re-transmission, 3 |
MAC Layer | Backoff procedure, Exponential |
MAC Layer | Min backoff exponent, 3 |
MAC Layer | Max backoff exponent, 8 |
Learning Layer | Learning rate, $\alpha$ = 0.3 |
Learning Layer | Discount factor, $\gamma$ = 0.6 |
Learning Layer | $\epsilon$ = 1, $\epsilon_{min}$ = 0.05, $\epsilon_{decay}$ = 0.95 |
Packet size & sending interval | Application dependent, according to Table 4. |
Traffic Load | Applications | Sending Period | Payload (Bytes) | Percentage of Meters |
---|---|---|---|---|
1 | Meter reading (MR) | Every 1 h | 400 | 100% |
1 | Alarm events (AEs) | Every 1 h | 278 | 25% |
1 | Power Quality (PQ) | Every 1 h | 278 | 25% |
2 | Meter reading (MR) | Every 30 min | 400 | 100% |
2 | Alarm events (AEs) | Every 1 h | 278 | 50% |
2 | Power Quality (PQ) | Every 1 h | 278 | 50% |
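To give a feel for the two traffic loads, the table can be turned into a rough estimate of the application payload generated per hour in each scenario. The sketch below uses only the figures in the traffic table and the meter counts from the simulation settings; it ignores protocol headers, acknowledgements, and re-transmissions, and the names `TRAFFIC_LOADS`, `METERS`, and `payload_bytes_per_hour` are illustrative.

```python
# (application, reports per hour, payload in bytes, fraction of meters),
# transcribed from the traffic table.
TRAFFIC_LOADS = {
    1: [("MR", 1, 400, 1.00), ("AEs", 1, 278, 0.25), ("PQ", 1, 278, 0.25)],
    2: [("MR", 2, 400, 1.00), ("AEs", 1, 278, 0.50), ("PQ", 1, 278, 0.50)],
}
METERS = {"Montreal": 200, "Barcelona": 355}


def payload_bytes_per_hour(load: int, meters: int) -> float:
    """Application payload generated per hour by all meters (no overhead)."""
    return sum(rate * size * share * meters
               for _, rate, size, share in TRAFFIC_LOADS[load])


for city, count in METERS.items():
    for load in TRAFFIC_LOADS:
        print(city, load, payload_bytes_per_hour(load, count))
```

For the 200-meter Montreal scenario this gives 107,800 bytes per hour under traffic load 1 and 215,600 bytes per hour under traffic load 2, so the second load doubles the offered payload.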
Application | Network Transit Time |
---|---|
MR | 2000 ms |
AEs | 500 ms |
PQ | 750 ms |
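The transit-time limits above lend themselves to a simple per-application compliance check. The sketch below assumes that a packet counts as compliant when its end-to-end delay does not exceed the limit for its application; the exact definition of the compliant factor is not given in this excerpt, so `compliant_fraction` is a hypothetical helper rather than the metric used in the evaluation.

```python
# Network transit time limits in milliseconds, from the table above.
TRANSIT_TIME_MS = {"MR": 2000, "AEs": 500, "PQ": 750}


def compliant_fraction(delays_ms, app):
    """Fraction of delivered packets whose delay meets the limit (assumed rule)."""
    limit = TRANSIT_TIME_MS[app]
    if not delays_ms:
        return 0.0
    return sum(d <= limit for d in delays_ms) / len(delays_ms)


# Usage with made-up delays for alarm-event packets.
print(compliant_fraction([100, 600, 400, 900], "AEs"))
```

With the made-up delays in the usage line, two of the four alarm packets arrive within 500 ms.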
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Duenas Santos, C.L.; Mezher, A.M.; Astudillo León, J.P.; Cardenas Barrera, J.; Castillo Guerra, E.; Meng, J. Q-RPL: Q-Learning-Based Routing Protocol for Advanced Metering Infrastructure in Smart Grids. Sensors 2024, 24, 4818. https://doi.org/10.3390/s24154818