Sound-Based Localization Using LSTM Networks for Visually Impaired Navigation
Abstract
1. Introduction
2. System Design
2.1. The System Architecture
2.2. The Prototype Assembly
- Ultrasound sensors (HC-SR04, Kuongshun Electronic, Shenzhen, China): Ultrasonic sensors are used to detect a person entering or leaving a room. The sensor estimates distance from the time that elapses between the ultrasonic pulse leaving the trigger and the echo returning after reflecting off a target (a minimal Python read-out sketch follows this list).
- Arduino Uno microcontroller (ATmega328P embedded chip, Microchip Technology Inc., Chandler, AZ, USA): The microcontroller collects the ultrasonic, GPS, and compass data. Arduino is a microcontroller platform with an integrated development environment and flash memory for storing user programs and data. In this study, the Arduino module reads the input data arriving from the different sensors.
- XBEE S2 module for wireless communication (771-6333, Digi International Inc., Hopkins, MN, USA): This is a radio-frequency (RF) module that uses the ZigBee mesh communication protocol (IEEE 802.15.4 PHY).
- Raspberry Pi 4 (single-board computer, Broadcom BCM2711 quad-core SoC clocked at 1.5 GHz, Raspberry Pi Foundation, Cambridge, UK): As the central core of the system, it receives data from the other modules, either wirelessly (Wi-Fi, or ZigBee via the XBEE module) or over a serial port (GPS, compass, microphone, etc.).
- GPS module (HW-658 for Raspberry Pi, Dynamic—IT Solutions, Joppa, MD, USA): This is used to detect the user's location and destination. The HW-658 GPS module is connected directly to the Arduino, which transfers the GPS data to the Raspberry Pi 4 via a USB cable.
- Digital compass (HMC5883L chip, three-axis magnetic sensor, Honeywell Aerospace Inc., Phoenix, AZ, USA): This three-axis digital compass determines the module's current heading from the magnetic field and must be kept horizontal at all times. The study used the HMC5883L compass, which is well suited to Arduino applications.
- Headset (Razer Kraken Gaming Headset, Irvine, CA, USA): The headset provides the audio link between the Raspberry Pi 4 and the user at a good level of audio quality. Its unidirectional microphone suppresses background noise and delivers clear, natural vocal tones, leaving little room for miscommunication.
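In the assembled prototype these sensors are read through the Arduino; purely as an illustration of the two calculations described above (ultrasonic time-of-flight and compass heading), here is a minimal Python sketch for a Raspberry Pi with the HC-SR04 wired directly to its GPIO header. The BCM pin numbers are hypothetical, not the prototype's actual wiring:

```python
import math
import time

import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24  # hypothetical BCM pin numbers for the HC-SR04

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def read_distance_cm():
    """Time of flight: distance = (echo pulse duration x speed of sound) / 2."""
    GPIO.output(TRIG, True)
    time.sleep(10e-6)               # 10 us trigger pulse, per the HC-SR04 datasheet
    GPIO.output(TRIG, False)

    start = stop = time.time()
    while GPIO.input(ECHO) == 0:    # wait for the echo pulse to start
        start = time.time()
    while GPIO.input(ECHO) == 1:    # wait for the echo pulse to end
        stop = time.time()

    return (stop - start) * 34300 / 2   # speed of sound is roughly 34,300 cm/s

def heading_degrees(mag_x, mag_y):
    """Heading from the HMC5883L X/Y readings, sensor held horizontal."""
    return math.degrees(math.atan2(mag_y, mag_x)) % 360

if __name__ == "__main__":
    print(f"distance: {read_distance_cm():.1f} cm")
    GPIO.cleanup()
```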
3. Method
3.1. The Indoor Navigation Algorithm
3.2. The Outdoor Navigation Algorithm
3.3. Voice Recognition Approach
3.4. LSTM Model Adoption
3.5. Dijkstra SPF Algorithm
Algorithm 1: Dijkstra SPF
Input: voice command data
Output: GPS shortest path (SPF)
1: Start
2: Initialization: dist[s] ← 0 for the source node s; Q ← {all nodes}
3: for all N ∈ Q − {s}
4:     dist[N] ← ∞ (all other distances are set to infinity)
5: while Q ≠ ∅ (while the queue is not empty)
6:     x ← min_distance(Q, dist) (extract the node in Q with the lowest distance)
7:     for all N ∈ neighbors[x]
8:         if dist[N] > dist[x] + w(x, N) (a shorter path to N has been discovered)
9:             dist[N] ← dist[x] + w(x, N) (update the shortest-path estimate)
10: return dist
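As a concrete reference for Algorithm 1, here is a minimal runnable Python version that uses heapq as the priority queue. The example graph is our own sketch: its edge weights are the walking distances reported in Section 5.3, not the authors' actual map representation:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distance from source to every node.

    graph: dict mapping node -> list of (neighbor, weight) pairs.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]                    # (distance, node) min-heap
    while queue:
        d, x = heapq.heappop(queue)
        if d > dist[x]:                      # stale entry; a shorter path was already found
            continue
        for neighbor, w in graph[x]:
            if dist[neighbor] > d + w:       # a shorter path to neighbor is discovered
                dist[neighbor] = d + w
                heapq.heappush(queue, (dist[neighbor], neighbor))
    return dist

# Illustrative graph; edge weights are the walking distances (in metres) from Section 5.3.
graph = {
    "home":        [("laundry", 214), ("supermarket", 561), ("mosque", 738)],
    "laundry":     [("home", 214), ("mosque", 584), ("supermarket", 676)],
    "supermarket": [("home", 561), ("laundry", 676), ("mosque", 940)],
    "mosque":      [("home", 738), ("laundry", 584), ("supermarket", 940)],
}
print(dijkstra(graph, "home"))  # {'home': 0, 'laundry': 214, 'supermarket': 561, 'mosque': 738}
```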
4. Simulation Protocols and Evaluation Methods
5. Results
5.1. Indoor System Test
5.2. Outdoor System Test
5.3. Outdoor Shortest Path First (SPF)
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Future Work
Appendix A
| Parameter | Value |
|---|---|
| Raspberry Pi 4 | SoC: quad-core, 64-bit @ 1.5 GHz. Networking: 2.4 GHz and 5 GHz wireless LAN. Bluetooth: 5.0. RAM: 4 GB SDRAM. GPU: Broadcom VideoCore VI. Storage: microSD. |
| Ultrasound sensor | Supply voltage: 5 V. Current consumption: 15 mA. Frequency: 40 kHz. Trigger pulse width: 10 µs. |
| Arduino Uno microcontroller | ATmega328P (embedded chip). Analog inputs: 6. Digital I/O pins: 14. Connection: USB. |
| XBEE S2 module | 1 MB / 128 KB RAM (32 KB available for MicroPython) |
| GPS module | Speed precision: <0.1 m/s. Capture sensitivity: −148 dBm. Voltage: 3.3–5 V. Location precision: <2.5 m CEP. |
| Digital compass | Voltage: 3–5 V. Measuring range: ±1.3–8 gauss. |
| Headset | Frequency response: 20 Hz–20 kHz. Impedance: 32 Ω. Power input: 50 mW. Sensitivity: 112 dB/mW. |
| Sensor Specification | Value |
|---|---|
| Sensor type | Ultrasound sensor HC-SR04 |
| Target object | Human |
| Minimum operating range | 2 cm |
| Maximum operating range | 400 cm |
| Operating frequency | 40 kHz |
| Required targeting range (study result) | Dead zone: <0.5 cm or >151 cm. Active zone: 1–150 cm. |

| Study Result | Active Zone (35 Participants) | Dead Zone (35 Participants) |
|---|---|---|
| Positive detections | 34 | 1 |
| Negative detections | 1 | 34 |
| Accuracy | 97% | |
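Assuming the reported accuracy is computed over all 70 trials in the table, it follows as (34 + 34) / 70 ≈ 0.971, i.e. the 97% shown: one missed detection in the active zone and one spurious detection in the dead zone.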
References
1. Paiva, S.; Gupta, N. Technologies and systems to improve mobility of visually impaired people: A state of the art. In Technological Trends in Improved Mobility of the Visually Impaired; Springer: Cham, Switzerland, 2020; pp. 105–123.
2. Tapu, R.; Mocanu, B.; Zaharia, T. Wearable assistive devices for visually impaired: A state of the art survey. Pattern Recognit. Lett. 2020, 137, 37–52.
3. Plikynas, D.; Žvironas, A.; Budrionis, A.; Gudauskis, M. Indoor Navigation Systems for Visually Impaired Persons: Mapping the Features of Existing Technologies to User Needs. Sensors 2020, 20, 636.
4. El-taher, F.E.-z.; Taha, A.; Courtney, J.; Mckeever, S. A Systematic Review of Urban Navigation Systems for Visually Impaired People. Sensors 2021, 21, 3103.
5. Romlay, M.R.M.; Toha, S.F.; Ibrahim, A.M.; Venkat, I. Methodologies and evaluation of electronic travel aids for the visually impaired people: A review. Bull. Electr. Eng. Inform. 2021, 10, 1747–1758.
6. Grumiaux, P.A.; Kitić, S.; Girin, L.; Guérin, A. A survey of sound source localization with deep learning methods. J. Acoust. Soc. Am. 2022, 152, 107–151.
7. Yassin, A.; Nasser, Y.; Awad, M.; Al-Dubai, A.; Liu, R.; Yuen, C.; Raulefs, R.; Aboutanios, E. Recent advances in indoor localization: A survey on theoretical approaches and applications. IEEE Commun. Surv. Tutor. 2016, 19, 1327–1346.
8. Zhang, D.; Han, J.; Cheng, G.; Yang, M.H. Weakly supervised object localization and detection: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 5866–5885.
9. Nagarajan, B.; Shanmugam, V.; Ananthanarayanan, V.; Bagavathi Sivakumar, P. Localization and indoor navigation for visually impaired using bluetooth low energy. In Smart Systems and IoT: Innovations in Computing: Proceeding of SSIC 2019; Springer: Singapore, 2019; pp. 249–259.
10. Martinez-Sala, A.S.; Losilla, F.; Sánchez-Aarnoutse, J.C.; García-Haro, J. Design, Implementation and Evaluation of an Indoor Navigation System for Visually Impaired People. Sensors 2015, 15, 32168–32187.
11. Tang, Y.; Chen, M.; Wang, C.; Luo, L.; Li, J.; Lian, G.; Zou, X. Recognition and localization methods for vision-based fruit picking robots: A review. Front. Plant Sci. 2020, 11, 510.
12. Suman, S.; Mishra, S.; Sahoo, K.S.; Nayyar, A. Vision navigator: A smart and intelligent obstacle recognition model for visually impaired users. Mob. Inf. Syst. 2022, 2022, 9715891.
13. Xiao, J.; Joseph, S.L.; Zhang, X.; Li, B.; Li, X.; Zhang, J. An assistive navigation framework for the visually impaired. IEEE Trans. Hum.-Mach. Syst. 2015, 45, 635–640.
14. Desai, D.; Mehendale, N. A review on sound source localization systems. Arch. Comput. Methods Eng. 2022, 29, 4631–4642.
15. Ashiq, F.; Asif, M.; Ahmad, M.B.; Zafar, S.; Masood, K.; Mahmood, T.; Mahmood, M.T.; Lee, I.H. CNN-based object recognition and tracking system to assist visually impaired people. IEEE Access 2022, 10, 14819–14834.
16. Tan, T.-H.; Lin, Y.-T.; Chang, Y.-L.; Alkhaleefah, M. Sound Source Localization Using a Convolutional Neural Network and Regression Model. Sensors 2021, 21, 8031.
17. Pang, C.; Liu, H.; Li, X. Multitask learning of time-frequency CNN for sound source localization. IEEE Access 2019, 7, 40725–40737.
18. Al-kafaji, R.D.; Gharghan, S.K.; Mahdi, S.Q. Localization techniques for blind people in outdoor/indoor environments. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2020; Volume 745, p. 012103.
19. Mai, C.; Xie, D.; Zeng, L.; Li, Z.; Li, Z.; Qiao, Z.; Qu, Y.; Liu, G.; Li, L. Laser Sensing and Vision Sensing Smart Blind Cane: A Review. Sensors 2023, 23, 869.
20. Messaoudi, M.D.; Menelas, B.-A.J.; Mcheick, H. Review of Navigation Assistive Tools and Technologies for the Visually Impaired. Sensors 2022, 22, 7888.
21. Khan, S.; Nazir, S.; Khan, H.U. Analysis of navigation assistants for blind and visually impaired people: A systematic review. IEEE Access 2021, 9, 26712–26734.
22. Simões, W.C.S.S.; Machado, G.S.; Sales, A.M.A.; de Lucena, M.M.; Jazdi, N.; de Lucena, V.F., Jr. A Review of Technologies and Techniques for Indoor Navigation Systems for the Visually Impaired. Sensors 2020, 20, 3935.
23. Real, S.; Araujo, A. Navigation Systems for the Blind and Visually Impaired: Past Work, Challenges, and Open Problems. Sensors 2019, 19, 3404.
24. Hasan, M.R.; Hasan, M.M.; Hossain, M.Z. How many Mel-frequency cepstral coefficients to be utilized in speech recognition? A study with the Bengali language. J. Eng. 2021, 2021, 817–827.
25. Bakouri, M.; Alsehaimi, M.; Ismail, H.F.; Alshareef, K.; Ganoun, A.; Alqahtani, A.; Alharbi, Y. Steering a Robotic Wheelchair Based on Voice Recognition System Using Convolutional Neural Networks. Electronics 2022, 11, 168.
26. Alhussein, M.; Aurangzeb, K.; Haider, S.I. Hybrid CNN-LSTM model for short-term individual household load forecasting. IEEE Access 2020, 8, 180544–180557.
27. Bakouri, M. Development of Voice Control Algorithm for Robotic Wheelchair Using MIN and LSTM Models. CMC-Comput. Mater. Contin. 2022, 73, 2441–2456.
28. Behera, R.K.; Jena, M.; Rath, S.K.; Misra, S. Co-LSTM: Convolutional LSTM model for sentiment analysis in social big data. Inf. Process. Manag. 2021, 58, 102435.
29. Lotfi, M.; Osório, G.J.; Javadi, M.S.; Ashraf, A.; Zahran, M.; Samih, G.; Catalão, J.P. A Dijkstra-inspired graph algorithm for fully autonomous tasking in industrial applications. IEEE Trans. Ind. Appl. 2021, 57, 5448–5460.
30. Bulut, O.; Shin, J.; Cormier, D.C. Learning Analytics and Computerized Formative Assessments: An Application of Dijkstra's Shortest Path Algorithm for Personalized Test Scheduling. Mathematics 2022, 10, 2230.
31. Abdulkareem, S.A.; Abboud, A.J. Evaluating python, c++, javascript and java programming languages based on software complexity calculator (halstead metrics). In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2021; Volume 1076, p. 012046.
32. Ramadhan, A.J. Wearable Smart System for Visually Impaired People. Sensors 2018, 18, 843.
33. Garcia-Macias, J.A.; Ramos, A.G.; Hasimoto-Beltran, R.; Hernandez, S.E.P. Uasisi: A modular and adaptable wearable system to assist the visually impaired. Procedia Comput. Sci. 2019, 151, 425–430.
34. Yiwere, M.; Rhee, E.J. Sound Source Distance Estimation Using Deep Learning: An Image Classification Approach. Sensors 2020, 20, 172.
35. Hu, Y.; Samarasinghe, P.N.; Gannot, S.; Abhayapala, T.D. Semi-supervised multiple source localization using relative harmonic coefficients under noisy and reverberant environments. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 3108–3123.
36. Ma, W.; Liu, X. Phased microphone array for sound source localization with deep learning. Aerosp. Syst. 2019, 2, 71–81.
37. Rahman, M.M.; Islam, M.M.; Ahmmed, S.; Khan, S.A. Obstacle and fall detection to guide the visually impaired people with real time monitoring. SN Comput. Sci. 2020, 1, 219.
38. Șipoș, E.; Ciuciu, C.; Ivanciu, L. Sensor-Based Prototype of a Smart Assistant for Visually Impaired People—Preliminary Results. Sensors 2022, 22, 4271.
39. AL-Madani, B.; Orujov, F.; Maskeliūnas, R.; Damaševičius, R.; Venčkauskas, A. Fuzzy Logic Type-2 Based Wireless Indoor Localization System for Navigation of Visually Impaired People in Buildings. Sensors 2019, 19, 2114.
| Raspberry Pi Voice Command | Room Detected (via Node IP Address) |
|---|---|
| You are going to the bedroom | Bedroom |
| You are going to the kitchen | Kitchen |
| You are going to the bathroom | Bathroom |
| Trial | Kitchen: Detected | Kitchen: Undetected | Bedroom: Detected | Bedroom: Undetected | Bathroom: Detected | Bathroom: Undetected |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 0 | 1 | 0 |
| 2 | 1 | 0 | 1 | 0 | 1 | 0 |
| 3 | 1 | 0 | 1 | 0 | 1 | 0 |
| 4 | 1 | 0 | 1 | 0 | 1 | 0 |
| 5 | 0 | 1 | 1 | 0 | 1 | 0 |
| 6 | 1 | 0 | 1 | 0 | 1 | 0 |
| 7 | 1 | 0 | 1 | 0 | 1 | 0 |
| 8 | 1 | 0 | 1 | 0 | 1 | 0 |
| 9 | 1 | 0 | 0 | 1 | 1 | 0 |
| 10 | 1 | 0 | 1 | 0 | 1 | 0 |
| 11 | 1 | 0 | 1 | 0 | 1 | 0 |
| 12 | 1 | 0 | 1 | 0 | 1 | 0 |
| 13 | 1 | 0 | 1 | 0 | 1 | 0 |
| 14 | 1 | 0 | 1 | 0 | 1 | 0 |
| 15 | 1 | 0 | 1 | 0 | 1 | 0 |

Root mean square error: 0.192
Confusion matrix (rows: predicted class; columns: actual voice command; cells: prediction ratio, %):

| Predicted \ Actual | Mosque | Laundry | Supermarket | Home |
|---|---|---|---|---|
| Mosque | 96% | 1% | 2% | 3% |
| Laundry | 1% | 98% | 2% | 2% |
| Supermarket | 2% | 2% | 97% | 1% |
| Home | 3% | 2% | 1% | 96% |
Class | Accuracy | Precision | Recall | F-Score |
---|---|---|---|---|
Mosque | 95% | 0.73 | 0.74 | 0.735 |
Laundry | 96.3% | 0.75 | 0.75 | 0.75 |
Supermarket | 98.2% | 0.77 | 0.75 | 0.76 |
Home | 94.8% | 0.75 | 0.73 | 0.74 |
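The paper does not spell out how these figures were derived; the sketch below gives the standard per-class definitions of accuracy, precision, recall, and F-score computed from a raw confusion matrix of counts. The example counts are illustrative assumptions (reusing the percentages above as counts), not the study's data:

```python
def per_class_metrics(cm, classes):
    """Per-class accuracy, precision, recall, and F-score.

    cm[i][j] = number of samples of true class i predicted as class j.
    Assumes every class occurs and is predicted at least once.
    """
    n, total = len(classes), sum(map(sum, cm))
    out = {}
    for k, name in enumerate(classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted something else
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true something else
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        out[name] = ((tp + tn) / total,            # accuracy
                     precision, recall,
                     2 * precision * recall / (precision + recall))  # F-score
    return out

classes = ["Mosque", "Laundry", "Supermarket", "Home"]
cm = [[96, 1, 2, 3], [1, 98, 2, 2], [2, 2, 97, 1], [3, 2, 1, 96]]  # illustrative counts
for name, (acc, p, r, f) in per_class_metrics(cm, classes).items():
    print(f"{name}: accuracy={acc:.3f} precision={p:.3f} recall={r:.3f} F={f:.3f}")
```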
| Planned Latitude | Planned Longitude | Actual Latitude | Actual Longitude |
|---|---|---|---|
24.89482 | 46.61831 | 24.89499 | 46.618357 |
24.894803 | 46.61832 | 24.89485 | 46.618365 |
24.894784 | 46.61832 | 24.894799 | 46.618373 |
24.894765 | 46.61833 | 24.894795 | 46.618381 |
24.894748 | 46.61834 | 24.894792 | 46.618389 |
24.894731 | 46.61835 | 24.894787 | 46.618387 |
24.894712 | 46.61836 | 24.894712 | 46.61836 |
24.894697 | 46.61837 | 24.894697 | 46.618393 |
24.894673 | 46.61838 | 24.894699 | 46.618398 |
24.894651 | 46.61839 | 24.894689 | 46.618391 |
24.894629 | 46.6184 | 24.894679 | 46.618399 |
24.894612 | 46.6184 | 24.894662 | 46.618454 |
24.894595 | 46.61841 | 24.894595 | 46.618442 |
24.894558 | 46.61844 | 24.894598 | 46.618466 |
24.894541 | 46.61844 | 24.894589 | 46.618474 |
24.894522 | 46.61846 | 24.894572 | 46.618477 |
24.894503 | 46.61847 | 24.894503 | 46.61847 |
24.894484 | 46.61848 | 24.894494 | 46.618493 |
24.89446 | 46.61849 | 24.89486 | 46.618494 |
24.894423 | 46.6185 | 24.894463 | 46.618544 |
24.894406 | 46.61852 | 24.894456 | 46.618555 |
24.894382 | 46.61852 | 24.894392 | 46.618558 |
24.894348 | 46.61849 | 24.894388 | 46.618494 |
24.894326 | 46.61846 | 24.894366 | 46.618499 |
24.894311 | 46.61842 | 24.894361 | 46.618446 |
24.894296 | 46.61839 | 24.894296 | 46.618399 |
24.894272 | 46.61835 | 24.894262 | 46.618389 |
24.89425 | 46.61833 | 24.89475 | 46.618358 |
24.894231 | 46.61831 | 24.894271 | 46.618339 |
24.894216 | 46.61826 | 24.894266 | 46.618291 |
24.894197 | 46.61819 | 24.894157 | 46.618194 |
24.89417 | 46.61814 | 24.89417 | 46.618135 |
24.894148 | 46.61808 | 24.894178 | 46.618081 |
24.894129 | 46.61803 | 24.894129 | 46.61803 |
24.89408 | 46.61803 | 24.89408 | 46.618025 |
24.894034 | 46.61804 | 24.894094 | 46.618061 |
24.894 | 46.61805 | 24.894 | 46.618084 |
24.893949 | 46.61808 | 24.893989 | 46.618098 |
24.893908 | 46.6181 | 24.893948 | 46.618142 |
24.893872 | 46.61812 | 24.893892 | 46.618151 |
24.893831 | 46.61815 | 24.893891 | 46.618178 |
24.893804 | 46.61817 | 24.893854 | 46.618189 |
24.893782 | 46.61819 | 24.893792 | 46.618198 |
24.893765 | 46.61827 | 24.893785 | 46.618296 |
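Each row of this table pairs a planned waypoint with the GPS fix actually recorded. To express the per-waypoint deviation in metres, one can apply the haversine great-circle distance; a minimal sketch using the first row of the table (the helper name haversine_m is ours):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 fixes."""
    r = 6371000.0  # mean Earth radius in metres
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# First row: planned (24.89482, 46.61831) vs. actual (24.89499, 46.618357)
print(f"deviation: {haversine_m(24.89482, 46.61831, 24.89499, 46.618357):.1f} m")
```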
Expected Tracks | Found the Shortest Distance (Dijkstra) | Could Not Find the Shortest Distance (Dijkstra) |
---|---|---|
Home to mosque | 738 m | ---- |
Home to supermarket | 561 m | ---- |
Home to laundry | 214 m | ---- |
Mosque to home | 738 m | ---- |
Laundry to home | 214 m | ---- |
Supermarket to home | 561 m | ---- |
Mosque to supermarket | 940 m | ---- |
Mosque to laundry | 584 m | ---- |
Laundry to supermarket | 676 m | ---- |
Laundry to mosque | 584 m | ---- |
Supermarket to mosque | 940 m | ---- |
Authors | Technique | Method | Average Accuracy |
---|---|---|---|
Tan et al. [16] | Interaural phase difference (IPD) | Convolutional Neural Network and Regression Model | 98.31–98.96%
Pang et al. [17] | ITD, IPD, and Microphone-array geometry | Time–frequency convolutional neural network (TF-CNN) | 90% |
Yiwere et al. [34] | Labeling of audio data | Deep Learning: An Image Classification | 88.23% |
Hu et al. [35] | Relative harmonic coefficients | Semi-supervised multi-source algorithm | 50% to 90% |
Ma et al. [36] | Phased microphone array | Deep Learning | 70.8–100%
Rahman et al. [37] | Ultrasonic and PIR motion sensor | Obstacle and Fall Detection based on Bluetooth | 98.34% |
Sipos et al. [38] | RFID reader | Obstacle detection | 40%–99% |
AL-Madani et al. [39] | Bluetooth Low Energy (BLE) beacons | Fuzzy Logic Type-2 | 98.2% |