Reducing the Energy Consumption of sEMG-Based Gesture Recognition at the Edge Using Transformers and Dynamic Inference
Abstract
1. Introduction
- We introduce bioformers, a set of efficient transformer architectures, which can achieve state-of-the-art accuracy on a popular sEMG-based gesture recognition dataset [7]. We perform an extensive network architecture exploration, varying several key hyper-parameters of our bioformers, such as the number of initial convolutional layers, the dimension of the input signal patches passed to the attention layers, the number of attention blocks, the dimension and number of attention heads, etc. We obtain several Pareto-optimal configurations, achieving different trade-offs in terms of accuracy versus computational complexity. Specifically, the accuracy ranges from 62.4% to 69.8%, outperforming the 66.7% obtained by a state-of-the-art CNN [24] on the same data and with the same training protocol.
- We propose a novel multi-stage dynamic inference scheme to achieve further energy reductions and to improve the flexibility of our gesture recognition system. Specifically, in the first stage, a lightweight Random Forest (RF) separates inputs corresponding to a gesture from those corresponding to a rest condition (no gesture). Only when a gesture is predicted is a small bioformer invoked to classify it. Then, based on a measure of the classification's confidence, the process either stops at this second stage or continues, invoking an additional, larger bioformer.
- When deployed on the GAP8 ultra-low-power SoC [15], bioformers achieve an execution time of 0.36–2.80 ms per classification, while consuming 19–143 μJ and requiring at most 104 kB of memory. Bioformer configurations that achieve a higher quantized accuracy than the CNN of [24] consume 7.8×–44.5× less energy per inference. Moreover, thanks to the proposed dynamic inference scheme, we obtain a system that can be configured at runtime to work in tens of different operating points, spanning a wide accuracy range (60.9–69.8%). On GAP8, dynamic solutions further reduce the average energy consumption per classification by 1.03×–1.35× at iso-accuracy compared to static bioformers.
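The three-stage scheme described above can be sketched as follows. This is a minimal illustration, assuming hypothetical classifier callables and an arbitrary confidence threshold; the actual models, thresholds, and tuning are those of Section 4.4.

```python
# Minimal sketch of the proposed three-stage dynamic inference cascade.
# `rest_detector`, `little`, and `big` are placeholder callables standing in
# for the RF and the two bioformers; the 0.8 threshold is illustrative.

def cascade_predict(window, rest_detector, little, big, threshold=0.8):
    # Stage 1: a cheap Random Forest filters out rest (no-gesture) windows.
    if rest_detector(window):
        return "rest"
    # Stage 2: the small ("little") bioformer classifies the gesture.
    probs = little(window)  # dict mapping gesture label -> probability
    label, conf = max(probs.items(), key=lambda kv: kv[1])
    if conf >= threshold:
        return label  # confident enough: stop here, saving energy
    # Stage 3: low confidence -> invoke the larger ("big") bioformer.
    probs = big(window)
    return max(probs, key=probs.get)
```

Only uncertain inputs pay for the big model, which is how the cascade lowers the average energy per classification.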
2. Background
2.1. Attention and Transformers
2.2. Surface Electromyographic Signal
3. Related Work
4. Materials and Methods
4.1. Target Dataset
4.2. Bioformer Architectures
4.2.1. MHSA Layer Details
4.2.2. Architecture Exploration
4.3. Training Protocol
4.4. Dynamic Inference
4.4.1. Rest Detector
4.4.2. Big/Little Bioformers
4.4.3. Tuning Parameters and Overheads
5. Experimental Results
5.1. Setup
5.2. Architecture Exploration
- Model 0: …, …, 1 block with 1 Conv layer in the frontend;
- Model 1: …, …, 1 block with 1 Conv layer in the frontend;
- Model 2: …, …, 1 block with 2 Conv layers in the frontend;
5.3. Dynamic Inference
5.4. Deployment on GAP8
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Sun, F.T.; Morrell, M.J.; Wharen, R.E. Responsive cortical stimulation for the treatment of epilepsy. Neurotherapeutics 2008, 5, 68–74.
- Daghero, F.; Jahier Pagliari, D.; Poncino, M. Energy-Efficient Deep Learning Inference on Edge Devices. In Hardware Accelerator Systems for Artificial Intelligence and Machine Learning; Kim, S., Deka, G.C., Eds.; Elsevier: Amsterdam, The Netherlands, 2021; Volume 122, pp. 247–301.
- Meattini, R.; Benatti, S.; Scarcia, U.; Gregorio, D.D.; Benini, L.; Melchiorri, C. An sEMG-Based Human–Robot Interface for Robotic Hands Using Machine Learning and Synergies. IEEE Trans. Components Packag. Manuf. Technol. 2018, 8, 1149–1158.
- Zheng, Z.; Wang, Q.; Yang, D.; Wang, Q.; Huang, W.; Xu, Y. L-sign: Large-vocabulary sign gestures recognition system. IEEE Trans. Hum.-Mach. Syst. 2022, 52, 290–301.
- Sharma, S.; Singh, S. Vision-based hand gesture recognition using deep learning for the interpretation of sign language. Expert Syst. Appl. 2021, 182, 115657.
- Sarma, D.; Bhuyan, M.K. Methods, databases and recent advancement of vision-based hand gesture recognition for HCI systems: A review. SN Comput. Sci. 2021, 2, 140053.
- Palermo, F.; Cognolato, M.; Gijsberts, A.; Muller, H.; Caputo, B.; Atzori, M. Repeatability of grasp recognition for robotic hand prosthesis control based on sEMG data. In Proceedings of the 2017 International Conference on Rehabilitation Robotics (ICORR), London, UK, 17–20 July 2017; pp. 1154–1159.
- Atzori, M.; Gijsberts, A.; Castellini, C.; Caputo, B.; Hager, A.G.M.; Elsig, S.; Giatsidis, G.; Bassetto, F.; Müller, H. Electromyography data for non-invasive naturally-controlled robotic hand prostheses. Sci. Data 2014, 1, 140053.
- Kaufmann, P.; Englehart, K.; Platzner, M. Fluctuating EMG signals: Investigating long-term effects of pattern matching algorithms. In Proceedings of the 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, Buenos Aires, Argentina, 31 August–4 September 2010; pp. 6357–6360.
- Benatti, S.; Casamassima, F.; Milosevic, B.; Farella, E.; Schönle, P.; Fateh, S.; Burger, T.; Huang, Q.; Benini, L. A Versatile Embedded Platform for EMG Acquisition and Gesture Recognition. IEEE Trans. Biomed. Circuits Syst. 2015, 9, 620–630.
- Milosevic, B.; Farella, E.; Benatti, S. Exploring Arm Posture and Temporal Variability in Myoelectric Hand Gesture Recognition. In Proceedings of the 2018 7th IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob), Enschede, The Netherlands, 26–29 August 2018; pp. 1032–1037.
- Hu, Y.; Wong, Y.; Wei, W.; Du, Y.; Kankanhalli, M.; Geng, W. A novel attention-based hybrid CNN-RNN architecture for sEMG-based gesture recognition. PLoS ONE 2018, 13, e0206049.
- Tsinganos, P.; Cornelis, B.; Cornelis, J.; Jansen, B.; Skodras, A. Deep Learning in EMG-based Gesture Recognition. In Proceedings of the 5th International Conference on Physiological Computing Systems, Seville, Spain, 19–21 September 2018.
- Tsinganos, P.; Cornelis, B.; Cornelis, J.; Jansen, B.; Skodras, A. Improved gesture recognition based on sEMG signals and TCN. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 1169–1173.
- Flamand, E.; Rossi, D.; Conti, F.; Loi, I.; Pullini, A.; Rotenberg, F.; Benini, L. GAP-8: A RISC-V SoC for AI at the Edge of the IoT. In Proceedings of the 2018 IEEE 29th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Milan, Italy, 10–12 July 2018; pp. 1–4.
- Betthauser, J.L.; Krall, J.T.; Kaliki, R.R.; Fifer, M.S.; Thakor, N.V. Stable Electromyographic Sequence Prediction during Movement Transitions using Temporal Convolutional Networks. In Proceedings of the 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER), San Francisco, CA, USA, 20–23 March 2019.
- Risso, M.; Burrello, A.; Jahier Pagliari, D.; Benatti, S.; Macii, E.; Benini, L.; Poncino, M. Robust and Energy-efficient PPG-based Heart-Rate Monitoring. In Proceedings of the 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea, 22–28 May 2021; pp. 1–5.
- Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All You Need. arXiv 2017, arXiv:1706.03762.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. arXiv 2020, arXiv:2005.14165.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Burrello, A.; Morghet, F.B.; Scherer, M.; Benatti, S.; Benini, L.; Macii, E.; Poncino, M.; Jahier Pagliari, D. Bioformers: Embedding transformers for ultra-low power sEMG-based gesture recognition. In Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), Antwerp, Belgium, 14–23 March 2022; pp. 1443–1448.
- Zanghieri, M.; Benatti, S.; Burrello, A.; Kartsch, V.; Conti, F.; Benini, L. Robust real-time embedded EMG recognition framework using temporal convolutional networks on a multicore IoT processor. IEEE Trans. Biomed. Circuits Syst. 2019, 14, 244–256.
- Wei, W.; Dai, Q.; Wong, Y.; Hu, Y.; Kankanhalli, M.; Geng, W. Surface-electromyography-based gesture recognition by multi-view deep learning. IEEE Trans. Biomed. Eng. 2019, 66, 2964–2973.
- Zou, Y.; Cheng, L. A Transfer Learning Model for Gesture Recognition Based on the Deep Features Extracted by CNN. IEEE Trans. Artif. Intell. 2021, 2, 447–458.
- Han, L.; Zou, Y.; Cheng, L. A Convolutional Neural Network With Multi-scale Kernel and Feature Fusion for sEMG-based Gesture Recognition. In Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics (ROBIO), Sanya, China, 27–31 December 2021; pp. 774–779.
- Hudgins, B.; Parker, P.; Scott, R.N. A new strategy for multifunction myoelectric control. IEEE Trans. Biomed. Eng. 1993, 40, 82–94.
- Englehart, K.; Hudgins, B. A robust, real-time control scheme for multifunction myoelectric control. IEEE Trans. Biomed. Eng. 2003, 50, 848–854.
- Castellini, C.; Gruppioni, E.; Davalli, A.; Sandini, G. Fine detection of grasp force and posture by amputees via surface electromyography. J. Physiol. 2009, 103, 255–262.
- Phinyomark, A.; Scheme, E.J. EMG Pattern Recognition in the Era of Big Data and Deep Learning. Big Data Cogn. Comput. 2018, 2, 21.
- Benatti, S.; Farella, E.; Gruppioni, E.; Benini, L. Analysis of Robust Implementation of an EMG Pattern Recognition based Control. In Proceedings of the International Joint Conference on Biomedical Engineering Systems and Technologies, Volume 4; SCITEPRESS—Science and Technology Publications: Loire Valley, France, 3–6 March 2014; pp. 45–54.
- Cene, V.H.; Tosin, M.; Machado, J.; Balbinot, A. Open Database for Accurate Upper-Limb Intent Detection Using Electromyography and Reliable Extreme Learning Machines. Sensors 2019, 19, 1864.
- Park, E.; Kim, D.; Kim, S.; Kim, Y.D.; Kim, G.; Yoon, S.; Yoo, S. Big/Little Deep Neural Network for Ultra Low Power Inference. In Proceedings of the 2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES + ISSS), Amsterdam, The Netherlands, 4–9 October 2015; pp. 124–132.
- Tann, H.; Hashemi, S.; Bahar, R.I.; Reda, S. Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-Off. In Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES'16), Pittsburgh, PA, USA, 2–7 October 2016; pp. 1–10.
- Yu, J.; Yang, L.; Xu, N.; Yang, J.; Huang, T. Slimmable Neural Networks. arXiv 2018, arXiv:1812.08928.
- Jahier Pagliari, D.; Macii, E.; Poncino, M. Dynamic Bit-width Reconfiguration for Energy-Efficient Deep Learning Hardware. In Proceedings of the International Symposium on Low Power Electronics and Design, Seattle, WA, USA, 23–25 July 2018; ACM: New York, NY, USA, 2018; pp. 47:1–47:6.
- Parsa, M.; Panda, P.; Sen, S.; Roy, K. Staged Inference Using Conditional Deep Learning for Energy Efficient Real-Time Smart Diagnosis. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Republic of Korea, 11–15 July 2017; pp. 78–81.
- Daghero, F.; Jahier Pagliari, D.; Poncino, M. Two-stage Human Activity Recognition on Microcontrollers with Decision Trees and CNNs. In Proceedings of the 2022 17th Conference on Ph.D Research in Microelectronics and Electronics (PRIME), Villasimius, Italy, 12–15 June 2022; pp. 173–176.
- Xie, C.; Jahier Pagliari, D.; Calimera, A. Energy-efficient and Privacy-aware Social Distance Monitoring with Low-resolution Infrared Sensors and Adaptive Inference. In Proceedings of the 2022 17th Conference on Ph.D Research in Microelectronics and Electronics (PRIME), Villasimius, Italy, 12–15 June 2022; pp. 181–184.
- Mullapudi, R.T.; Mark, W.R.; Shazeer, N.; Fatahalian, K. HydraNets: Specialized Dynamic Architectures for Efficient Inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
- Burrello, A.; Jahier Pagliari, D.; Rapa, P.M.; Semilia, M.; Risso, M.; Polonelli, T.; Poncino, M.; Benini, L.; Benatti, S. Embedding Temporal Convolutional Networks for Energy-Efficient PPG-Based Heart Rate Monitoring. ACM Trans. Comput. Healthc. 2022, 3, 19.
- Daghero, F.; Burrello, A.; Jahier Pagliari, D.; Benini, L.; Macii, E.; Poncino, M. Energy-Efficient Adaptive Machine Learning on IoT End-Nodes With Class-Dependent Confidence. In Proceedings of the 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Glasgow, UK, 23–25 November 2020; pp. 1–4.
- Burrello, A.; Scherer, M.; Zanghieri, M.; Conti, F.; Benini, L. A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes. In Proceedings of the 2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS), Barcelona, Spain, 23–25 August 2021; pp. 1–6.
- Garofalo, A.; Rusci, M.; Conti, F.; Rossi, D.; Benini, L. PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors. arXiv 2019, arXiv:1908.11263.
Work | Year | Features | Algorithm | Real-Time | Emb. | Accuracy (Inter % / Intra % / Random %) | Energy |
---|---|---|---|---|---|---|---|
Palermo [7] | 2017 | WL | RF | No | No | 25.4 / 52.4 / n.a. | n.a. |
Cene [33] | 2019 | MAV, VAR, RMS | ELM | No | No | 41.8 / 69.8 / n.a. | n.a. |
Zanghieri [24] | 2019 | Raw Data | TCN | Yes | Yes | 65.2 (q: 61.0) / 71.8 / n.a. | 0.90 mJ |
Wei [25] | 2019 | Raw Data | Multi-View CNN | No | No | 64.1 / n.a. / n.a. | n.a. |
Zou [26] | 2021 | Raw Data | Multiscale CNN | No | No | n.a. / n.a. / (97.2, 74.5, 90.3) * | n.a. |
Han [27] | 2021 | Raw Data | Multiscale CNN | No | No | n.a. / n.a. / 98.52 * | n.a. |
Bioformer [23] | 2022 | Raw Data | Transformers | Yes | Yes | 65.7 (q: 64.7) / n.a. / n.a. | 0.139 mJ |
Our Work | 2023 | Raw Data | Transformers | Yes | Yes | 69.8 (q: 67.0) / n.a. / n.a. | 0.143 mJ |
Model | Patch Size | Memory | MMAC | Lat. [ms] | Energy [mJ] | GMAC/s | GMAC/s/W | Accuracy |
---|---|---|---|---|---|---|---|---|
0 | 10 | 44.35 kB | 1.37 | 1.130 | 0.058 | 1.21 | 23.78 | 67.04% |
1 | 10 | 60.74 kB | 1.97 | 1.611 | 0.082 | 1.22 | 23.97 | 66.89% |
2 | 10 | 94.09 kB | 3.36 | 2.797 | 0.143 | 1.20 | 23.58 | 65.28% |
0 | 30 | 60.99 kB | 0.61 | 0.490 | 0.025 | 1.25 | 24.41 | 63.91% |
1 | 30 | 77.38 kB | 0.79 | 0.633 | 0.032 | 1.24 | 24.34 | 64.37% |
0 | 60 | 87.55 kB | 0.44 | 0.364 | 0.019 | 1.20 | 23.43 | 60.82% |
1 | 60 | 103.94 kB | 0.52 | 0.438 | 0.022 | 1.19 | 23.27 | 61.25% |
TCN [24] | n.a. | 461 kB | 16.00 | 21.828 | 1.113 | 0.73 | 14.37 | 61.95% |
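The throughput and efficiency columns follow directly from the measured ones: GMAC/s is the MAC count divided by the latency, average power is energy divided by latency, and GMAC/s/W is throughput divided by power. A quick sanity check on the first row (Model 0, patch size 10):

```python
# Re-deriving the throughput/efficiency columns from the measured values
# of the first table row (Model 0, patch size 10).
mmac, lat_ms, energy_mj = 1.37, 1.130, 0.058

gmac_per_s = mmac / lat_ms            # (1e6 MAC) / (1e-3 s) = 1e9 MAC/s
power_w = energy_mj / lat_ms          # mJ / ms = W (about 51 mW here)
gmac_per_s_per_w = gmac_per_s / power_w

print(round(gmac_per_s, 2))           # 1.21, matching the table
print(round(gmac_per_s_per_w, 1))     # 23.6, the table's 23.78 up to rounding
```

The small discrepancy in the last column comes from the table values themselves being rounded independently.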
Stage-1 | Stage-2 | Stage-3 | Technique | MMAC | Latency [ms] | Energy [mJ] | Accuracy |
---|---|---|---|---|---|---|---|
n.a. | n.a. | Model 0 (patch 10) | None | 1.37 | 1.130 | 0.058 | 67.04% |
n.a. | n.a. | Model 2 (patch 10) | None | 3.36 | 2.797 | 0.143 | 65.28% |
n.a. | n.a. | Model 0 (patch 60) | None | 0.44 | 0.364 | 0.019 | 60.82% |
RF | Model 2 (patch 10) | n.a. | Rest | 1.686 | 1.402 | 0.071 | 62.55% |
n.a. | Model 0 (patch 60) | Model 2 (patch 10) | Big/Little | 2.483 | 2.067 | 0.105 | 65.20% |
RF | Model 0 (patch 60) | Model 2 (patch 10) | Rest + Big/Little | 1.274 | 1.061 | 0.054 | 62.54% |
RF | Model 0 (patch 10) | n.a. | Rest | 0.685 | 0.570 | 0.029 | 64.69% |
n.a. | Model 0 (patch 60) | Model 0 (patch 10) | Big/Little | 1.319 | 1.098 | 0.056 | 67.02% |
RF | Model 0 (patch 60) | Model 0 (patch 10) | Rest + Big/Little | 0.631 | 0.525 | 0.027 | 64.72% |
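The average MMAC of the combined schemes is an expectation over where each input stops in the cascade: every input pays for all stages up to its stopping point, weighted by the fraction of inputs stopping there. A generic sketch of this computation; the stage costs and stop fractions below are illustrative placeholders, not the measured ones:

```python
def expected_cost(stage_costs, stop_fractions):
    """Expected per-inference cost of a multi-stage cascade.

    stage_costs[i]    cost of executing stage i (e.g., in MMAC)
    stop_fractions[i] fraction of inputs whose processing stops after
                      stage i (fractions must sum to 1)
    An input stopping after stage i pays for stages 0..i inclusive.
    """
    assert abs(sum(stop_fractions) - 1.0) < 1e-9
    return sum(frac * sum(stage_costs[: i + 1])
               for i, frac in enumerate(stop_fractions))

# Illustrative numbers: a hypothetical tiny RF cost, the 0.44-MMAC little
# bioformer, and the 3.36-MMAC big bioformer, with made-up stop fractions.
avg = expected_cost([0.05, 0.44, 3.36], [0.5, 0.3, 0.2])
print(round(avg, 3))  # 0.942 MMAC on average
```

When most inputs stop at the rest detector or the little model, the average cost stays close to the cheap stages, which matches the table's combined rows falling well below the big model's standalone cost.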
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xie, C.; Burrello, A.; Daghero, F.; Benini, L.; Calimera, A.; Macii, E.; Poncino, M.; Jahier Pagliari, D. Reducing the Energy Consumption of sEMG-Based Gesture Recognition at the Edge Using Transformers and Dynamic Inference. Sensors 2023, 23, 2065. https://doi.org/10.3390/s23042065