Translating Speech to Indian Sign Language Using Natural Language Processing
Abstract
1. Introduction
1.1. Indian Sign Language
- To train people in the use of Indian Sign Language (ISL) and to promote teaching and research on the language, including bilingualism.
- To encourage hard-of-hearing students in primary, intermediate, and higher education to use Indian Sign Language as a form of instruction.
- To educate and train diverse groups, such as government officials, teachers, professionals, community leaders, and the public, on Indian Sign Language and how to use it.
- To promote and propagate Indian Sign Language in collaboration with hard-of-hearing groups and other institutions working on disabilities.
1.2. HamNoSys vs. ISL Gestures
2. Literature Survey
2.1. Sign Language in English
2.2. Sign Language in Other Languages
3. Comparison
4. Proposed Work
- Audio-to-text conversion if the input is audio.
- The tokenization of English text into words.
- Parsing the English text into phrase structure trees.
- The reordering of sentences based on Indian Sign Language grammar rules.
- Using lemmatization along with part-of-speech tagging so that synonyms of words or the root form of a word can be used if the exact word is not present in the database.
- Indian Sign Language video output.
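The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the tokenizer is a simple regex, and the reordering heuristic (moving verbs to the end, since ISL broadly follows subject-object-verb order) and the POS tags in the example are illustrative assumptions.

```python
import re

def tokenize(text):
    """Step 2: split English text into lowercase word tokens."""
    return re.findall(r"[a-zA-Z']+", text.lower())

def reorder_for_isl(tagged):
    """Step 4 (illustrative heuristic): ISL tends toward
    subject-object-verb order, so verbs are moved to the end.
    tagged: list of (word, pos) pairs."""
    verbs = [w for w, pos in tagged if pos == "VERB"]
    others = [w for w, pos in tagged if pos != "VERB"]
    return others + verbs

# Hypothetical tagged sentence: "Ram eats an apple."
tagged = [("ram", "NOUN"), ("eats", "VERB"), ("an", "DET"), ("apple", "NOUN")]
print(reorder_for_isl(tagged))  # ['ram', 'an', 'apple', 'eats']
```

A production system would replace both functions with NLTK's tokenizer and a full set of ISL grammar rules derived from the parse tree.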
5. System Architecture
6. Hidden Markov Model
7. Methodology
- Tokenization
- The removal of stop words
- Parsing
- Lemmatization
- Part-of-speech tagging
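Stop-word removal, the second step above, can be sketched as a simple filter. The stop list here is a tiny illustrative subset; an actual pipeline would use a full list such as NLTK's English stopwords corpus.

```python
# Illustrative subset of English stop words; function words like these
# typically have no separate sign and are dropped before lookup.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "at", "and"}

def remove_stop_words(tokens):
    """Keep only content-bearing tokens."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words(["What", "is", "the", "capital", "of", "India"]))
# ['What', 'capital', 'India']
```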
8. Performance Evaluation
- Promoters: responses from 9 to 10.
- Passives: responses from 7 to 8.
- Detractors: responses from 0 to 6.
- Total number of people who participated in the survey: 30.
- Total number of promoters: 26.
- Percentage of promoters: 86.6%.
- Total number of passives: 3.
- Percentage of passives: 10%.
- Total number of detractors: 1.
- Percentage of detractors: 3.33%.
- Net promoter score = percentage of promoters − percentage of detractors.
- Net promoter score = 86.6 − 3.33 = 83.27 ≈ 83.
- A net promoter score above 50 is considered excellent by the creators of the NPS metric.
9. Results and Discussion
- “New Delhi” is shown in one video, as the video for the same is present in the database. This is an example of how the system identifies multiple words/phrases in the sentence for which videos are present in the database.
- The keywords “national” and “capital” are broken down into letters, and a sign language video for each letter is shown to the user, as neither “national” nor “capital” is present in the database.
- The videos for keywords “of” and “India” are shown to the user, as the sign language videos for both are present in the database and there is no need to further break them into letters.
- “Kangaroo” is shown in one video, as the sign language video for the entire word is present in the database.
- “animal” is broken into letters, and videos for the individual letters are shown, as the sign language video for “animal” is not present in the database.
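The word-or-fingerspelling fallback described above can be sketched as a simple lookup. The database contents and function name here are hypothetical, standing in for the system's actual video store:

```python
# Hypothetical set of words/phrases with word-level sign videos;
# individual letters are assumed to always have fingerspelling clips.
VIDEO_DB = {"new delhi", "of", "india", "kangaroo"}

def signs_for(word):
    """Return the clips to play: one word-level video if present,
    otherwise one fingerspelling video per letter."""
    if word.lower() in VIDEO_DB:
        return [word.lower()]
    return list(word.lower())

print(signs_for("India"))   # ['india']
print(signs_for("animal"))  # ['a', 'n', 'i', 'm', 'a', 'l']
```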
10. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Hanke, T. HamNoSys-Representing Sign Language Data in Language Resources and Language Processing Contexts. LREC 2004, 4, 1–6. [Google Scholar]
- Prikhodko, A.; Grif, M.; Bakaev, M. Sign Language Recognition Based on Notations and Neural Networks. In Proceedings of the Communications in Computer and Information Science, Valletta, Malta, 25–27 February 2020; Volume 1242, pp. 463–478. [Google Scholar] [CrossRef]
- Sonawane, P.; Shah, K.; Patel, P.; Shah, S.; Shah, J. Speech to Indian Sign Language (ISL) Translation System. In Proceedings of the IEEE 2021 International Conference on Computing, Communication, and Intelligent Systems, ICCCIS, Greater Noida, India, 4–5 November 2021; pp. 92–96. [Google Scholar] [CrossRef]
- Tewari, Y.; Soni, P.; Singh, S.; Turlapati, M.S.; Bhuva, A. Real Time Sign Language Recognition Framework for Two Way Communication. In Proceedings of the International Conference on Communication, Information and Computing Technology, ICCICT, Mumbai, India, 25–27 June 2021. [Google Scholar] [CrossRef]
- Kunjumon, J.; Megalingam, R.K. Hand gesture recognition system for translating indian sign language into text and speech. In Proceedings of the 2019 International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 27–29 November 2019; pp. 14–18. [Google Scholar]
- Gangadia, D.; Chamaria, V.; Doshi, V.; Gandhi, J. Indian Sign Language Interpretation and Sentence Formation. In Proceedings of the 2020 IEEE Pune Section International Conference, PuneCon 2020, Pune, India, 16–18 December 2020; pp. 71–76. [Google Scholar] [CrossRef]
- Shangeetha, R.K.; Valliammai, V.; Padmavathi, S. Computer vision based approach for Indian sign language character recognition. In Proceedings of the 2012 International Conference on Machine Vision and Image Processing, MVIP, Coimbatore, India, 14–15 December 2012; pp. 181–184. [Google Scholar] [CrossRef]
- Sawant, S.N.; Kumbhar, M.S. Real time sign language recognition using PCA. In Proceedings of the 2014 IEEE International Conference on Advanced Communication, Control and Computing Technologies, ICACCCT 2014, Ramanathapuram, India, 8–10 May 2014; pp. 1412–1415. [Google Scholar] [CrossRef]
- Papadogiorgaki, M.; Grammalidis, N.; Tzovaras, D.; Strintzis, M.G. Text-to-sign language synthesis tool. In Proceedings of the 2005 13th European Signal Processing Conference, Antalya, Turkey, 4–8 September 2005. [Google Scholar]
- Sonare, B.; Padgal, A.; Gaikwad, Y.; Patil, A. Video-based sign language translation system using machine learning. In Proceedings of the 2021 2nd International Conference for Emerging Technology, INCET, Belagavi, India, 21 May 2021. [Google Scholar] [CrossRef]
- Kanvinde, A.; Revadekar, A.; Tamse, M.; Kalbande, D.R.; Bakereywala, N. Bidirectional Sign Language Translation. In Proceedings of the International Conference on Communication, Information and Computing Technology, ICCICT, Mumbai, India, 25–27 June 2021. [Google Scholar] [CrossRef]
- Qi, J.; Wang, D.; Jiang, Y.; Liu, R. Auditory features based on gammatone filters for robust speech recognition. In Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS), Beijing, China, 19–23 May 2013. [Google Scholar]
- Nair, M.S.; Nimitha, A.P.; Idicula, S.M. Conversion of Malayalam text to Indian sign language using synthetic animation. In Proceedings of the 2016 International Conference on Next Generation Intelligent Systems (ICNGIS), Kottayam, India, 1–3 September 2016. [Google Scholar]
- Qi, J.; Wang, D.; Xu, J.; Tejedor Noguerales, J. Bottleneck features based on gammatone frequency cepstral coefficients. In Proceedings of the International Speech Communication Association, Lyon, France, 25–29 August 2013. [Google Scholar]
- Grif, M.; Manueva, Y. Semantic analyses of text to translate to Russian sign language. In Proceedings of the 2016 11th International Forum on Strategic Technology, IFOST, Novosibirsk, Russia, 1–3 June 2016; pp. 286–289. [Google Scholar] [CrossRef]
- Dhanjal, A.S.; Singh, W. Comparative Analysis of Sign Language Notation Systems for Indian Sign Language. In Proceedings of the 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP), Sikkim, India, 25–28 February 2019. [Google Scholar]
- Varghese, M.; Nambiar, S.K. English to SiGML conversion for sign language generation. In Proceedings of the 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET), Kottayam, India, 21–22 December 2018. [Google Scholar]
- Raghavan, R.J.; Prasad, K.A.; Muraleedharan, R. Animation System for Indian Sign Language Communication using LOTS Notation. In Proceedings of the 2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), Bangalore, India, 10–11 October 2013. [Google Scholar]
- Dhivyasri, S.; Krishnaa Hari, K.B.; Akash, M.; Sona, M.; Divyapriya, S.; Krishnaveni, V. An efficient approach for interpretation of Indian sign language using machine learning. In Proceedings of the 2021 3rd International Conference on Signal Processing and Communication, ICPSC, Coimbatore, India, 13–14 May 2021; pp. 130–133. [Google Scholar] [CrossRef]
- Allen, J.M.; Foulds, R.A. An approach to animating sign language: A spoken english to sign english translator system. In Proceedings of the IEEE Annual Northeast Bioengineering Conference, NEBEC, Troy, NY, USA, 7–19 April 2015; Volume 30, pp. 43–44. [Google Scholar] [CrossRef]
- Patel, B.D.; Patel, H.B.; Khanvilkar, M.A.; Patel, N.R.; Akilan, T. ES2ISL: An Advancement in Speech to Sign Language Translation using 3D Avatar Animator. In Proceedings of the Canadian Conference on Electrical and Computer Engineering, London, ON, Canada, 30 August–2 September 2020. [Google Scholar] [CrossRef]
- Agarwal, A. Generating HamNoSys signs from the user’s input. In Proceedings of the 2015 1st International Conference on Next Generation Computing Technologies (NGCT), Dehradun, India, 4–5 September 2015. [Google Scholar]
- Priya, L.; Sathya, A.; Raja, S.K.S. Indian and English Language to Sign Language Translator-an Automated Portable Two Way Communicator for Bridging Normal and Deprived Ones. In Proceedings of the ICPECTS 2020–IEEE 2nd International Conference on Power, Energy, Control and Transmission Systems, Chennai, India, 10 December 2020. [Google Scholar] [CrossRef]
- Saija, K.; Sangeetha, S.; Shah, V. WordNet Based Sign Language Machine Translation: From English Voice to ISL Gloss. In Proceedings of the 2019 IEEE 16th India Council International Conference (INDICON), Rajkot, India, 13–15 December 2019. [Google Scholar]
- Ahire, P.G.; Tilekar, K.B.; Jawake, T.A.; Warale, P.B. Two-way communicator between deaf and dumb people and normal people. In Proceedings of the 1st International Conference on Computing, Communication, Control and Automation, ICCUBEA 2015, Pune, India, 26–27 February 2015; pp. 641–644. [Google Scholar] [CrossRef]
- Jamil, T. Design and Implementation of an Intelligent System to translate Arabic Text into Arabic Sign Language. In Proceedings of the Canadian Conference on Electrical and Computer Engineering, London, ON, Canada, 30 August–2 September 2020. [Google Scholar] [CrossRef]
- Suleiman, D.; Awajan, A.; Al Etaiwi, W. The Use of Hidden Markov Model in Natural ARABIC Language Processing: A survey. Procedia Comput. Sci. 2017, 113, 240–247. [Google Scholar] [CrossRef]
One Hand | Open Hand | Zero Hand | Five Hand |
---|---|---|---|
U-hand | L-hand | Y-hand | C-hand |
V-hand | C-hand | Full U-hand | Full C-hand |
Fist hand | Claw hand | Closed two hands | Closed four hands |
Hand Moving UP | Hand Moving DOWN |
---|---|
Moving left | Moving right |
Hand moving forward in a semicircle | Hand moving backward in a semicircle |
Moving left to right (slant) | Moving right to left (slant) |
Moving in a vertical circle | Moving in a horizontal circle |
Hand moving UP and DOWN (repeatedly) | Hand moving side by side (repeatedly) |
Source | Objective | Methodology | Conclusion |
---|---|---|---|
Sign Languages in English | |||
[17] | Converting the English language to SIGML representation. | The system uses NLTK for the direct mapping of English text to the HamNoSys string. | The system successfully converts input to a small set of HamNoSys strings and then represents them using SIGML. |
[18] | Converting English text to ISL using LOTS notation. | The system utilizes various NLTK processes to carry out the conversion. | The system is successful in showing English text in LOTS notation. |
[19] | Converting Indian Sign Language to English text. | The system uses various convolutional neural network techniques. | The system is able to convert sign language to the English language for a very small dataset. |
[20] | Converting English to animated sign language as per American Sign Language. | The system takes input by using various speech recognition software and then uses various algorithms to produce animated signs. | The proposed system is still in development and will map each word directly. |
[21] | A system to translate English to ISL. | The authors use various NLP and Google APIs. | The system takes English input and converts it to SIGML animation with 77% accuracy, but it is only shown converting individual words or letters. |
[22] | Generating HamNoSys signs from the user’s input. | The authors use the direct mapping of only 100 words to generate the HamNoSys signs. | The system successfully generates SIGML animation for 100 words in the ISL dictionary. |
[23] | A two-way communication system between English and ISL. | The authors use neural network techniques and the HMM model. | The authors only describe the English to ISL in this paper and the system only generates output if the input is an exact match with their database. |
[24] | WordNet-based English to ISL gloss. | The authors utilize various NLP techniques and the hidden Markov model. | The system removes the words for which it does not find any replacement in the current database. In addition, the system only performs text-to-text conversion. |
[25] | A two-way communication system between hard-of-hearing and hearing people. | The authors utilize various machine learning algorithms and NLP techniques. | The system only shows ISL signs for those words which are stored in the database and skips any other word. |
Sign Languages in Other Languages | |||
[26] | Converting Arabic text to Arabic Sign Language. | The system uses various text parsing and word processing techniques. | The system successfully converts Arabic text to Arabic sign language with 87% efficiency and then shows the corresponding sign language animatedly. |
Current State | Next State: Verb | Next State: Noun | Next State: Adj |
---|---|---|---|
Verb | P(Verb\|Verb) = 0.3 | P(Noun\|Verb) = 0.96 | P(Adj\|Verb) = 0.01 |
Noun | P(Verb\|Noun) = 0.45 | P(Noun\|Noun) = 0.05 | P(Adj\|Noun) = 0.5 |
Adj | P(Verb\|Adj) = 0.02 | P(Noun\|Adj) = 0.95 | P(Adj\|Adj) = 0.3 |
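Using the transition probabilities in the table above, a greedy one-step prediction picks the most probable next part of speech for a given current state. This is only an illustration of how the transition matrix is read; a full HMM tagger would combine transition and emission probabilities via the Viterbi algorithm.

```python
# Transition probabilities P(next | current) from the table above.
TRANSITIONS = {
    "Verb": {"Verb": 0.3,  "Noun": 0.96, "Adj": 0.01},
    "Noun": {"Verb": 0.45, "Noun": 0.05, "Adj": 0.5},
    "Adj":  {"Verb": 0.02, "Noun": 0.95, "Adj": 0.3},
}

def most_likely_next(state):
    """Greedy one-step prediction: argmax over P(next | state)."""
    nxt = TRANSITIONS[state]
    return max(nxt, key=nxt.get)

print(most_likely_next("Verb"))  # Noun
print(most_likely_next("Noun"))  # Adj
```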
Word | Lemma |
---|---|
He | He |
was | was |
playing | playing |
and | and |
eating | eating |
at | at |
same | same |
time | time |
Word | Lemma (after Integrating POS Tags) |
---|---|
He | He |
was | be |
playing | play |
and | and |
eating | eat |
at | at |
same | same |
time | time |
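The contrast between the two tables above — "was" and "playing" are only reduced to "be" and "play" once POS tags are supplied — can be sketched with a toy lemmatizer. The irregular-form table and single suffix rule below are illustrative stand-ins for NLTK's WordNetLemmatizer, whose `pos` argument produces the same effect.

```python
# Tiny illustrative lemmatizer: a few irregular verb forms plus one
# suffix rule, standing in for a WordNet-backed lemmatizer.
IRREGULAR = {"was": "be", "were": "be", "ate": "eat"}

def lemmatize(word, pos=None):
    """Return the lemma; without a POS tag, inflected verbs
    are left unchanged (mirroring the first table above)."""
    w = word.lower()
    if pos == "VERB":
        if w in IRREGULAR:
            return IRREGULAR[w]
        if w.endswith("ing"):
            return w[:-3]  # playing -> play, eating -> eat
    return word

print(lemmatize("was", pos="VERB"))      # be
print(lemmatize("playing", pos="VERB"))  # play
print(lemmatize("playing"))              # playing (no POS tag)
```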
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sharma, P.; Tulsian, D.; Verma, C.; Sharma, P.; Nancy, N. Translating Speech to Indian Sign Language Using Natural Language Processing. Future Internet 2022, 14, 253. https://doi.org/10.3390/fi14090253