A Novel Approach for Target Attraction and Obstacle Avoidance of a Mobile Robot in Unknown Environments Using a Customized Spiking Neural Network
Abstract
1. Introduction
2. Methodology
2.1. The Mobile Robot System
2.2. Sensor Types and Specifications
2.3. Network Architecture and Training Algorithm
2.4. The Izhikevich Model Used for Neuron Modeling
2.5. Performance Evaluation Metrics
Algorithm 1. SNN Algorithm with STDP

1: Determine fired neurons.
2: Determine motor torques.
3: Update robot location.
4: Update target and obstacle positions in the robot's view.
5: Apply currents to sensory neurons based on robot location.
6: Update STDP (Spike-Timing-Dependent Plasticity) traces based on fired neurons.
7: if Target Error < specific threshold then
8:   Set dopamine to 1.
9: if Obstacle Distance < specific threshold then
10:   Set dopamine to −1.
11: for each j in fired neurons do
12:   Update the synaptic weight derivation according to the LTP (Long-Term Potentiation) and LTD (Long-Term Depression) rules.
13: Update membrane potentials based on input currents.
14: Update synaptic weights from the synaptic weight derivation and the dopamine level.
15: Decrease STDP traces, dopamine, and the synaptic weight derivation exponentially.
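For concreteness, the following is a minimal Python sketch of a single 1 ms step of Algorithm 1, assuming a fully connected network and using the Izhikevich dynamics and decay constants from the parameter tables below. The array names, reward/punishment thresholds, and network layout are our own illustrative assumptions, not the authors' implementation.

```python
import numpy as np

N, W_MAX = 300, 4.0              # e.g., 100 sensory + 200 motor neurons; max weight 4
v = np.full(N, -65.0)            # membrane potentials (mV)
u = 0.2 * v                      # recovery variables (b * v)
w = np.ones((N, N))              # synaptic weights, initialized to 1
elig = np.zeros((N, N))          # "synaptic weight derivation" (eligibility trace)
stdp = np.zeros(N)               # per-neuron STDP trace
dopamine = 0.0

def step(i_ext, target_error, obstacle_dist):
    """One 1 ms simulation step following Algorithm 1 (illustrative thresholds)."""
    global v, u, w, elig, stdp, dopamine
    fired = v >= 30.0                            # 1: determine fired neurons
    v[fired], u[fired] = -65.0, u[fired] + 8.0   # Izhikevich reset (c, u + d)
    # ... steps 2-5: motor torques, robot pose, and sensory currents i_ext ...
    stdp[fired] = 1.0                            # 6: refresh STDP traces of fired neurons
    if target_error < 0.1:                       # 7-8: reward when nearing the target
        dopamine = 1.0
    if obstacle_dist < 0.1:                      # 9-10: punishment near an obstacle
        dopamine = -1.0
    elig[:, fired] += 1.0 * stdp[:, None]        # 11-12: LTP for pre-before-post pairs
    elig[fired, :] += -1.1 * stdp[None, :]       #        LTD for post-before-pre pairs
    i_syn = w.T @ fired.astype(float)            # synaptic input from presynaptic spikes
    v += 0.04 * v**2 + 5 * v + 140 - u + i_ext + i_syn  # 13: membrane update
    u += 0.02 * (0.2 * v - u)
    np.clip(w + dopamine * elig, 0.0, W_MAX, out=w)     # 14: dopamine-gated weight update
    stdp *= 0.99; dopamine *= 0.995; elig *= 0.95       # 15: exponential decay
```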
3. Results and Discussion
3.1. Parameters in SNN Learning Algorithm
3.2. Optimizing Control Systems for Mobile Robots
- Step 1: Time Frame
- Step 2: Experiment Initialization
- Step 3: Sensor Calculations
- Step 4: Reward–Punishment Mechanism
- Step 5: Motor Babbling for Exploration (see the sketch after this list)
- Step 6: Firing Patterns and Synaptic Weight Adjustment
- Step 7: Motor Neurons and Location Calculation
- Step 8: Iterative Experimentation
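Step 5 (motor babbling) drives early exploration by injecting random currents into the motor neurons so the robot acts before its synapses have learned anything useful. The sketch below shows one common way to implement this; the amplitude, geometric decay schedule, and uniform noise model are our assumptions, not values from the paper.

```python
import numpy as np

def babble_current(experiment_idx: int, n_motor: int = 200,
                   i_max: float = 20.0, decay: float = 0.95,
                   rng=np.random.default_rng(0)):
    """Random exploratory current for the motor neurons (Step 5, motor babbling).

    Exploration shrinks geometrically across experiments so that, late in
    training, behavior is driven by the learned synaptic weights instead.
    """
    amplitude = i_max * decay ** experiment_idx
    return rng.uniform(-amplitude, amplitude, size=n_motor)
```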
3.3. The Robot’s Navigation and Obstacle Avoidance Performance
3.4. Synaptic Weight at Training Time
- The synaptic connection from the left target sensor to the right motor strengthens from the beginning of training, gradually increasing until it reaches a weight of 3.45.
- The connection from the left target sensor to the left motor remains stable from training time 200 onward, with its weight varying between 1 and 2.2.
- Similarly, the connection from the right target sensor to the left motor, from training time 400, fluctuates between 1.02 and 2.09.
- The connection from the right obstacle sensor to the left motor, from training time 600, varies between 1 and 3.52 over time.
- Conversely, the connection from the left obstacle sensor to the right motor, from training time 800, gradually weakens to 0.2.
- The connection from the left obstacle sensor to the left motor remains stable from training time 1000, fluctuating within the range of 1 to 1.2.
- Finally, the connection from the right obstacle sensor to the right motor starts from the beginning of training and increases to a weight of 1.8 over time.
- These dynamic adjustments in synaptic weights illustrate the capacity of the SNN to continually refine its connections, enabling the robot to adapt and learn from sensory input and motor actions as it navigates a complex environment.
4. Verification with Counterparts
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- García, J.; Shafie, D. Teaching a humanoid robot to walk faster through Safe Reinforcement Learning. Eng. Appl. Artif. Intell. 2020, 88, 103360. [Google Scholar] [CrossRef]
- Wang, H.; Yuan, S.; Guo, M.; Chan, C.Y.; Li, X.; Lan, W. Tactical driving decisions of unmanned ground vehicles in complex highway environments: A deep reinforcement learning approach. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 235, 1113–1127. [Google Scholar] [CrossRef]
- Adams, C.S.; Rahman, S.M. Design and Development of an Autonomous Feline Entertainment Robot (AFER) for Studying Animal-Robot Interactions. In Proceedings of the SoutheastCon 2021, Atlanta, GA, USA, 10–13 March 2021; pp. 1–8. [Google Scholar]
- Dooraki, A.R.; Lee, D.J. An innovative bio-inspired flight controller for quad-rotor drones: Quad-rotor drone learning to fly using reinforcement learning. Robot. Auton. Syst. 2021, 135, 103671. [Google Scholar] [CrossRef]
- Randazzo, M.; Ruzzenenti, A.; Natale, L. YARP-ROS inter-operation in a 2D navigation task. Front. Robot. AI 2018, 5, 5. [Google Scholar] [CrossRef] [PubMed]
- Panigrahi, P.K.; Bisoy, S.K. Localization strategies for autonomous mobile robots: A review. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6019–6039. [Google Scholar] [CrossRef]
- y Arcas, B.A.; Fairhall, A.L.; Bialek, W. Computation in a single neuron: Hodgkin and Huxley revisited. Neural Comput. 2003, 15, 1715–1749. [Google Scholar] [CrossRef] [PubMed]
- Burkitt, A.N. A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input. Biol. Cybern. 2006, 95, 1–19. [Google Scholar] [CrossRef]
- Izhikevich, E.M. Which model to use for cortical spiking neurons? IEEE Trans. Neural Netw. 2004, 15, 1063–1070. [Google Scholar] [CrossRef]
- Izhikevich, E.M. Simple Model of Spiking Neurons. IEEE Trans. Neural Netw. 2003, 14, 1569–1572. [Google Scholar] [CrossRef]
- Gerstner, W.; Kistler, W.M.; Naud, R.; Paninski, L. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition; Cambridge University Press: Cambridge, UK, 2014. [Google Scholar]
- de Ponte Müller, F. Survey on ranging sensors and cooperative techniques for relative positioning of vehicles. Sensors 2017, 17, 271. [Google Scholar] [CrossRef]
- Ko, N.Y.; Kuc, T.Y. Fusing range measurements from ultrasonic beacons and a laser range finder for localization of a mobile robot. Sensors 2015, 15, 11050–11075. [Google Scholar] [CrossRef] [PubMed]
- Azimirad, V.; Sani, M.F. Experimental study of reinforcement learning in mobile robots through spiking architecture of thalamo-cortico-thalamic circuitry of mammalian brain. Robotica 2020, 38, 1558–1575. [Google Scholar] [CrossRef]
- Lu, H.; Liu, J.; Luo, Y.; Hua, Y.; Qiu, S.; Huang, Y. An autonomous learning mobile robot using biological reward modulate STDP. Neurocomputing 2021, 458, 308–318. [Google Scholar] [CrossRef]
- Liu, J.; Lu, H.; Luo, Y.; Yang, S. Spiking neural network-based multitask autonomous learning for mobile robots. Eng. Appl. Artif. Intell. 2021, 104, 104362. [Google Scholar] [CrossRef]
- Wang, Z.; Jin, X.; Zhang, T.; Li, J.; Yu, D.; Cheong, K.H.; Chen, C.P. Expert system-based multiagent deep deterministic policy gradient for swarm robot decision making. IEEE Trans. Cybern. 2022. [Google Scholar] [CrossRef] [PubMed]
- Lobov, S.A.; Mikhaylov, A.N.; Shamshin, M.; Makarov, V.A.; Kazantsev, V.B. Spatial properties of STDP in a self-learning spiking neural network enable controlling a mobile robot. Front. Neurosci. 2020, 14, 88. [Google Scholar] [CrossRef]
- Jiang, Z.; Bing, Z.; Huang, K.; Knoll, A. Retina-based pipe-like object tracking implemented through spiking neural network on a snake robot. Front. Neurorobot. 2019, 13, 29. [Google Scholar] [CrossRef]
- Harandi, F.A.; Derhami, V.; Jamshidi, F. A new feature selection method based on task environments for controlling robots. Appl. Soft Comput. 2019, 85, 105812. [Google Scholar] [CrossRef]
- Wang, X.; Hou, Z.G.; Lv, F.; Tan, M.; Wang, Y. Mobile robots' modular navigation controller using spiking neural networks. Neurocomputing 2014, 134, 230–238. [Google Scholar] [CrossRef]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Hassabis, D. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef]
- Ge, C.; Kasabov, N.; Liu, Z.; Yang, J. A spiking neural network model for obstacle avoidance in simulated prosthetic vision. Inf. Sci. 2017, 399, 30–42. [Google Scholar] [CrossRef]
- Yu, X.; Bai, C.; Wang, C.; Yu, D.; Chen, C.P.; Wang, Z. Self-Supervised Imitation for Offline Reinforcement Learning with Hindsight Relabeling. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 7732–7743. [Google Scholar] [CrossRef]
- Yu, D.; Kang, Q.; Jin, J.; Wang, Z.; Li, X. Smoothing group L1/2 regularized discriminative broad learning system for classification and regression. Pattern Recognit. 2023, 141, 109656. [Google Scholar] [CrossRef]
- Yang, Y.; Li, J.; Peng, L. Multi-robot path planning based on a deep reinforcement learning DQN algorithm. CAAI Trans. Intell. Technol. 2020, 5, 177–183. [Google Scholar] [CrossRef]
- Lobo, J.L.; Del Ser, J.; Bifet, A.; Kasabov, N. Spiking neural networks and online learning: An overview and perspectives. Neural Netw. 2020, 121, 88–100. [Google Scholar] [CrossRef] [PubMed]
- Arena, P.; Fortuna, L.; Frasca, M.; Patané, L. Learning anticipation via spiking networks: Application to navigation control. IEEE Trans. Neural Netw. 2009, 20, 202–216. [Google Scholar] [CrossRef] [PubMed]
- Pandey, A.; Pandey, S.; Parhi, D.R. Mobile robot navigation and obstacle avoidance techniques: A review. Int. Rob. Auto J. 2017, 2, 22. [Google Scholar] [CrossRef]
- Shamsfakhr, F.; Bigham, B.S. A neural network approach to navigation of a mobile robot and obstacle avoidance in dynamic and unknown environments. Turk. J. Electr. Eng. Comput. Sci. 2017, 25, 1629–1642. [Google Scholar] [CrossRef]
- Zheng, Y.; Yan, B.; Ma, C.; Wang, X.; Xue, H. Research on obstacle detection and path planning based on visual navigation for mobile robot. J. Phys. Conf. Ser. 2020, 1601, 062044. [Google Scholar] [CrossRef]
- Benavidez, P.; Jamshidi, M. Mobile robot navigation and target tracking system. In Proceedings of the 2011 6th International Conference on System of Systems Engineering, Albuquerque, NM, USA, 27–30 June 2011; pp. 299–304. [Google Scholar]
- Diehl, P.U.; Cook, M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Front. Comput. Neurosci. 2015, 9, 99. [Google Scholar]
- Wu, Y.; Deng, L.; Li, G.; Zhu, J.; Shi, L. Spatio-temporal backpropagation for training high-performance spiking neural networks. Front. Neurosci. 2018, 12, 331. [Google Scholar] [CrossRef] [PubMed]
- Izhikevich, E.M. Dynamical Systems in Neuroscience; MIT Press: Cambridge, MA, USA, 2007. [Google Scholar]
- Subbulakshmi Radhakrishnan, S.; Sebastian, A.; Oberoi, A.; Das, S.; Das, S. A biomimetic neural encoder for spiking neural network. Nat. Commun. 2021, 12, 2143. [Google Scholar] [CrossRef] [PubMed]
- Bing, Z.; Baumann, I.; Jiang, Z.; Huang, K.; Cai, C.; Knoll, A. Supervised learning in SNN via reward-modulated spike-timing-dependent plasticity for a target reaching vehicle. Front. Neurorobot. 2019, 13, 18. [Google Scholar] [CrossRef] [PubMed]
- Ramne, M. Spiking Neural Network for Targeted Navigation and Collision Avoidance in an Autonomous Robot. Master’s Thesis, Chalmers University of Technology, Göteborg, Sweden, 2020. [Google Scholar]
- Tai, L.; Li, S.; Liu, M. A deep-network solution towards model-less obstacle avoidance. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; pp. 2759–2764. [Google Scholar]
- Liu, C.; Zheng, B.; Wang, C.; Zhao, Y.; Fu, S.; Li, H. CNN-based vision model for obstacle avoidance of mobile robot. MATEC Web Conf. 2017, 139, 7. [Google Scholar] [CrossRef]
- Yang, J.; Shi, Y.; Rong, H.J. Random neural Q-learning for obstacle avoidance of a mobile robot in unknown environments. Adv. Mech. Eng. 2016, 8, 1687814016656591. [Google Scholar] [CrossRef]
- Bing, Z.; Meschede, C.; Röhrbein, F.; Huang, K.; Knoll, A.C. A survey of robotics control based on learning-inspired spiking neural networks. Front. Neurorobot. 2018, 12, 35. [Google Scholar] [CrossRef]
Parameter | a | b | c | d |
---|---|---|---|---|
Value | 0.02 | 0.2 | −65 | 8 |
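Section 2.4 adopts the Izhikevich neuron with the regular-spiking parameters above (Izhikevich, 2003). As a quick illustration, the model can be simulated with 1 ms Euler steps; the constant input current I = 10 is our own choice for demonstration, not a value from the paper.

```python
import matplotlib.pyplot as plt

a, b, c, d = 0.02, 0.2, -65.0, 8.0   # regular-spiking parameters from the table above
v, u = c, b * c                       # initial membrane potential and recovery variable
dt, T, I = 1.0, 300, 10.0             # 1 ms Euler steps; constant input I is illustrative

trace = []
for _ in range(int(T / dt)):
    if v >= 30.0:                     # spike: threshold of 30 mV (see parameter table)
        v, u = c, u + d               # reset rules of the Izhikevich model
    v += dt * (0.04 * v**2 + 5 * v + 140 - u + I)
    u += dt * (a * (b * v - u))
    trace.append(min(v, 30.0))        # clip spike peaks for plotting

plt.plot(trace)
plt.xlabel("time (ms)")
plt.ylabel("membrane potential v (mV)")
plt.show()
```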
Parameter Description | Value |
---|---|
Number of sensory neurons | 100 |
Number of motor neurons | 200 |
Number of steps to send signals to postsynaptic neurons | 1 |
Maximum synaptic weight | 4 |
Time scale of the recovery variable (a) | 0.02 |
Sensitivity of the recovery variable to subthreshold fluctuations of the membrane potential (b) | 0.2 |
Reset value of the membrane potential (c) | −65 mV |
Reset value of the recovery variable (d) | 8 |
Initial synaptic weights | 1 |
Membrane potential threshold to spike | 30 mV |
Number of experiments | 100 |
Reward constant | 1 |
Punishment constant | −1 |
Reward/punishment decay factor per time step | 0.995 |
Time step to decrease the reward/punishment value | 1 ms |
Synaptic weight derivation decay factor per time step | 0.95 |
STDP value decay factor per time step | 0.99 |
STDP factor in the LTP part | 1 |
STDP factor in the LTD part | −1.1 |
Time step to decrease the synaptic weight derivation value | 5 ms |
Time step to encode sensor signals into the network | 10 ms |
Time step to decrease STDP values | 1 ms |
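For reference, the table's values can be gathered into a single configuration object; the key names below are our own, chosen only to keep the sketches in this article self-consistent.

```python
# Learning-algorithm parameters from the table above (key names are our own).
SNN_PARAMS = {
    "n_sensory_neurons": 100,
    "n_motor_neurons": 200,
    "postsynaptic_signal_steps": 1,
    "w_max": 4.0,                  # maximum synaptic weight
    "w_init": 1.0,                 # initial synaptic weights
    "izhikevich_a": 0.02,          # time scale of the recovery variable
    "izhikevich_b": 0.2,           # sensitivity of the recovery variable
    "izhikevich_c": -65.0,         # membrane potential reset (mV)
    "izhikevich_d": 8.0,           # recovery variable reset
    "spike_threshold_mv": 30.0,
    "n_experiments": 100,
    "reward": 1.0,
    "punishment": -1.0,
    "dopamine_decay": 0.995,       # applied every 1 ms
    "stdp_decay": 0.99,            # applied every 1 ms
    "eligibility_decay": 0.95,     # "synaptic weight derivation", applied every 5 ms
    "stdp_ltp_factor": 1.0,
    "stdp_ltd_factor": -1.1,
    "sensor_encoding_dt_ms": 10,   # sensor signals encoded every 10 ms
}
```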
Connection Label | Connection (Sensor → Motor) | Line Color in Figure 9 |
---|---|---|
tarL->MR | Left target sensor to the right motor | Blue |
tarL->ML | Left target sensor to the left motor | Orange |
tarR->MR | Right target sensor to the right motor | Green |
tarR->ML | Right target sensor to the left motor | Red |
obsL->MR | Left obstacle sensor to the right motor | Purple |
obsL->ML | Left obstacle sensor to the left motor | Brown |
obsR->MR | Right obstacle sensor to the right motor | Pink |
obsR->ML | Right obstacle sensor to the left motor | Gray |
No. | Reference | Algorithm | Reached the Target (%) | Collisions (%) |
---|---|---|---|---|
1 | Proposed method (3 obstacles) | SNN | 94 | 6 |
2 | Ramne [38] | SNN | 84 | 16 |
3 | Ramne [38] | Non-SNN | 91 | 9 |
4 | Tai et al. [39] | CNN | 80.2 | 19.98 |
5 | Liu et al. [40] | CNN-based vision model | 81.72 | 18.28 |
6 | Liu et al. [40] | CNN-based vision model | 93.21 | 6.79 |
7 | Yang et al. [41] | Backpropagation Q-learning | 91.8 | 7.4 |
Share and Cite
Abubaker, B.A.; Razmara, J.; Karimpour, J. A Novel Approach for Target Attraction and Obstacle Avoidance of a Mobile Robot in Unknown Environments Using a Customized Spiking Neural Network. Appl. Sci. 2023, 13, 13145. https://doi.org/10.3390/app132413145