Accelerating 3D Convolutional Neural Network with Channel Bottleneck Module for EEG-Based Emotion Recognition
Abstract
1. Introduction
- We propose a novel 3D CNN model integrated with the channel bottleneck module (CNN-BN) based on the constructed 3D EEG representation.
- Extensive experiments are conducted on the DEAP dataset for the valence and arousal classification tasks. The experimental results show that our CNN-BN model outperforms baseline and state-of-the-art models while significantly reducing computational complexity.
- Our CNN-BN model, with its better parameter efficiency, has excellent potential for accelerating CNN-based emotion recognition without sacrificing classification performance.
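To make the 3D EEG representation concrete, the sketch below arranges multi-channel EEG segments into a video-like (D × H × W) volume, where D is the number of time samples and (H, W) is a 2D layout of electrode positions. This is illustrative only, not the paper's exact pipeline: the grid size and the channel-to-cell mapping here are hypothetical.

```python
import numpy as np

def eeg_to_3d(segment, positions, grid=(9, 9)):
    """Stack EEG channels into a 3D volume.

    segment: array of shape (n_channels, n_samples)
    positions: dict mapping channel index -> (row, col) grid cell
    returns: array of shape (n_samples, H, W); add a leading
             channel axis before feeding it to a 3D CNN.
    """
    n_channels, n_samples = segment.shape
    volume = np.zeros((n_samples, *grid), dtype=np.float32)
    for ch, (r, c) in positions.items():
        # Place each channel's full time series at its electrode cell
        volume[:, r, c] = segment[ch]
    return volume

# Toy usage: 4 channels, 128 samples, hypothetical electrode cells
rng = np.random.default_rng(0)
seg = rng.standard_normal((4, 128)).astype(np.float32)
pos = {0: (0, 3), 1: (0, 5), 2: (4, 0), 3: (4, 8)}
vol = eeg_to_3d(seg, pos)
print(vol.shape)  # (128, 9, 9)
```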
2. Related Work
3. Methodology
3.1. Model Overview
3.2. 3D Representation
3.3. Convolutional Neural Network with Channel Bottleneck Module
3.3.1. Convolution Block
3.3.2. Bottleneck Module
3.3.3. Dense Block
4. Experiments
4.1. Dataset
4.2. Baseline Models
4.2.1. Regular Convolutional Neural Network
4.2.2. Long Short-Term Memory (LSTM)
4.3. Experimental Settings
4.4. Performance Evaluation Metrics
5. Results and Discussion
5.1. Classification Performance
5.2. Effects of Channel Bottleneck Blocks in CNN
5.3. Performance Comparison with the State-of-the-Art Models
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- James, W. What is an Emotion? Mind 1884, 9, 188–205. [Google Scholar] [CrossRef]
- Beedie, C.J.; Terry, P.C.; Lane, A.M.; Devonport, T.J. Differential assessment of emotions and moods: Development and validation of the Emotion and Mood Components of Anxiety Questionnaire. Personal. Individ. Differ. 2011, 50, 228–233. [Google Scholar] [CrossRef]
- Poria, S.; Cambria, E.; Bajpai, R.; Hussain, A. A review of affective computing: From unimodal analysis to multimodal fusion. Inf. Fusion 2017, 37, 98–125. [Google Scholar] [CrossRef]
- Kumar, N.; Kumar, J. Measurement of Cognitive Load in HCI Systems Using EEG Power Spectrum: An Experimental Study. Procedia Comput. Sci. 2016, 84, 70–78. [Google Scholar] [CrossRef]
- Miniussi, C.; Thut, G. Combining TMS and EEG offers new prospects in cognitive neuroscience. Brain Topogr. 2010, 22, 249–256. [Google Scholar] [CrossRef] [PubMed]
- Adolphs, R.; Tranel, D.; Damasio, H.; Damasio, A. Impaired recognition of emotion in facial expressions following bilateral damage to the human amygdala. Nature 1994, 372, 669–672. [Google Scholar] [CrossRef]
- Marin-Morales, J.; Llinares, C.; Guixeres, J.; Alcaniz, M. Emotion Recognition in Immersive Virtual Reality: From Statistics to Affective Computing. Sensors 2020, 20, 5163. [Google Scholar] [CrossRef]
- Rattanyu, K.; Ohkura, M.; Mizukawa, M. Emotion Monitoring from Physiological Signals for Service Robots in the Living Space. In Proceedings of the ICCAS 2010, Goyang, Gyeonggi-do, Korea, 27–30 October 2010; pp. 580–583. [Google Scholar]
- Huang, X.H.; Kortelainen, J.; Zhao, G.Y.; Li, X.B.; Moilanen, A.; Seppanen, T.; Pietikainen, M. Multi-modal emotion analysis from facial expressions and electroencephalogram. Comput. Vis. Image Underst. 2016, 147, 114–124. [Google Scholar] [CrossRef]
- Chatterjee, M.; Zion, D.J.; Deroche, M.L.; Burianek, B.A.; Limb, C.J.; Goren, A.P.; Kulkarni, A.M.; Christensen, J.A. Voice emotion recognition by cochlear-implanted children and their normally-hearing peers. Hear. Res. 2015, 322, 151–162. [Google Scholar] [CrossRef]
- Ross, P.D.; Polson, L.; Grosbras, M.H. Developmental changes in emotion recognition from full-light and point-light displays of body movement. PLoS ONE 2012, 7, e44815. [Google Scholar] [CrossRef]
- Wu, G.; Liu, G.; Hao, M. The Analysis of Emotion Recognition from GSR Based on PSO. In Proceedings of the 2010 International Symposium on Intelligence Information Processing and Trusted Computing, Huanggang, China, 28–29 October 2010; pp. 360–363. [Google Scholar]
- Goshvarpour, A.; Abbasi, A.; Goshvarpour, A. An accurate emotion recognition system using ECG and GSR signals and matching pursuit method. Biomed. J. 2017, 40, 355–368. [Google Scholar] [CrossRef] [PubMed]
- Abadi, M.K.; Kia, M.; Subramanian, R.; Avesani, P.; Sebe, N. Decoding Affect in Videos Employing the MEG Brain Signal. In Proceedings of the 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Shanghai, China, 22–26 April 2013; pp. 1–6. [Google Scholar]
- Alhagry, S.; Fahmy, A.A.; El-Khoribi, R.A. Emotion recognition based on EEG using LSTM recurrent neural network. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 355–358. [Google Scholar] [CrossRef]
- Wang, F.; Wu, S.; Zhang, W.; Xu, Z.; Zhang, Y.; Wu, C.; Coleman, S. Emotion recognition with convolutional neural network and EEG-based EFDMs. Neuropsychologia 2020, 146, 107506. [Google Scholar] [CrossRef] [PubMed]
- Yin, Z.; Zhao, M.; Wang, Y.; Yang, J.; Zhang, J. Recognition of emotions using multimodal physiological signals and an ensemble deep learning model. Comput. Methods Programs Biomed. 2017, 140, 93–110. [Google Scholar] [CrossRef]
- Fang, Y.; Yang, H.; Zhang, X.; Liu, H.; Tao, B. Multi-Feature Input Deep Forest for EEG-Based Emotion Recognition. Front. Neurorobot. 2020, 14, 617531. [Google Scholar] [CrossRef]
- Sharma, R.; Pachori, R.B.; Sircar, P. Automated emotion recognition based on higher order statistics and deep learning algorithm. Biomed. Signal Process. Control. 2020, 58, 101867. [Google Scholar] [CrossRef]
- An, Y.; Hu, S.; Duan, X.; Zhao, L.; Xie, C.; Zhao, Y. Electroencephalogram Emotion Recognition Based on 3D Feature Fusion and Convolutional Autoencoder. Front. Comput. Neurosci. 2021, 15, 743426. [Google Scholar] [CrossRef]
- Islam, M.R.; Islam, M.M.; Rahman, M.M.; Mondal, C.; Singha, S.K.; Ahmad, M.; Awal, A.; Islam, M.S.; Moni, M.A. EEG Channel Correlation Based Model for Emotion Recognition. Comput. Biol. Med. 2021, 136, 104757. [Google Scholar] [CrossRef]
- Liu, Y.; Ding, Y.; Li, C.; Cheng, J.; Song, R.; Wan, F.; Chen, X. Multi-channel EEG-based emotion recognition via a multi-level features guided capsule network. Comput. Biol. Med. 2020, 123, 103927. [Google Scholar] [CrossRef]
- Sartipi, S.; Torkamani-Azar, M.; Cetin, M. EEG Emotion Recognition via Graph-based Spatio-Temporal Attention Neural Networks. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Mexico, 1–5 November 2021; pp. 571–574. [Google Scholar]
- Yin, Y.; Zheng, X.; Hu, B.; Zhang, Y.; Cui, X. EEG emotion recognition using fusion model of graph convolutional neural networks and LSTM. Appl. Soft Comput. 2021, 100, 106954. [Google Scholar] [CrossRef]
- Ding, Y.; Robinson, N.; Zhang, S.; Zeng, Q.; Guan, C. TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition. arXiv 2022, arXiv:2104.02935. [Google Scholar] [CrossRef]
- Chao, H.; Dong, L.; Liu, Y.; Lu, B. Emotion Recognition from Multiband EEG Signals Using CapsNet. Sensors 2019, 19, 2212. [Google Scholar] [CrossRef] [PubMed]
- Jia, Z.; Lin, Y.; Wang, J.; Feng, Z.; Xie, X.; Chen, C. HetEmotionNet. In Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China, 20–24 October 2021; pp. 1047–1056. [Google Scholar]
- Alarcao, S.M.; Fonseca, M.J. Emotions Recognition Using EEG Signals: A Survey. IEEE Trans. Affect. Comput. 2019, 10, 374–393. [Google Scholar] [CrossRef]
- Liu, H.; Zhang, Y.; Li, Y.; Kong, X. Review on Emotion Recognition Based on Electroencephalography. Front. Comput. Neurosci. 2021, 15, 758212. [Google Scholar] [CrossRef] [PubMed]
- Jia, Z.; Lin, Y.; Cai, X.; Chen, H.; Gou, H.; Wang, J. SST-EmotionNet: Spatial-Spectral-Temporal Based Attention 3D Dense Network for EEG Emotion Recognition. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2909–2917. [Google Scholar]
- Cai, J.; Xiao, R.; Cui, W.; Zhang, S.; Liu, G. Application of Electroencephalography-Based Machine Learning in Emotion Recognition: A Review. Front. Syst. Neurosci. 2021, 15, 729707. [Google Scholar] [CrossRef]
- Jenke, R.; Peer, A.; Buss, M. Feature Extraction and Selection for Emotion Recognition from EEG. IEEE Trans. Affect. Comput. 2014, 5, 327–339. [Google Scholar] [CrossRef]
- Ma, J.; Tang, H.; Zheng, W.-L.; Lu, B.-L. Emotion Recognition Using Multimodal Residual LSTM Network. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 176–183. [Google Scholar]
- Li, J.; Zhang, Z.; He, H. Hierarchical Convolutional Neural Networks for EEG-Based Emotion Recognition. Cogn. Comput. 2018, 10, 368–380. [Google Scholar] [CrossRef]
- Conneau, A.; Essid, S. Assessment of New Spectral Features for Eeg-Based Emotion Recognition. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 4698–4702. [Google Scholar]
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A Database for Emotion Analysis Using Physiological Signals. IEEE Trans. Affect. Comput. 2012, 3, 18–31. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning spatiotemporal features with 3d convolutional networks. arXiv 2015, arXiv:1412.0767. [Google Scholar]
- Mao, J.; Jain, A.K. Artificial neural networks for feature extraction and multivariate data projection. IEEE Trans. Neural Netw. 1995, 6, 296–317. [Google Scholar] [PubMed]
- Haykin, S.; Lippmann, R. Neural networks, a comprehensive foundation. Int. J. Neural Syst. 1994, 5, 363–364. [Google Scholar]
- Gonzalez, R.C.; Woods, R.E.; Eddins, S.L. Digital Image Processing Using MATLAB, 3rd ed.; Gatesmark Publishing: Knoxville, TN, USA, 2020. [Google Scholar]
- Russell, J.A. A circumplex model of affect. J. Personal. Soc. Psychol. 1980, 39, 1161. [Google Scholar] [CrossRef]
- De Boer, P.-T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A tutorial on the cross-entropy method. Ann. Oper. Res. 2005, 134, 19–67. [Google Scholar] [CrossRef]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. arXiv 2015, arXiv:1409.4842. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. arXiv 2016, arXiv:1512.03385. [Google Scholar]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. arXiv 2017, arXiv:1608.06993. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. arXiv 2015, arXiv:1411.4038. [Google Scholar]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. arXiv 2015, arXiv:1505.04597. [Google Scholar]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv 2015, arXiv:1511.00561. [Google Scholar] [CrossRef]
- Paszke, A.; Chaurasia, A.; Kim, S.; Culurciello, E. Enet: A deep neural network architecture for real-time semantic segmentation. arXiv 2016, arXiv:1606.02147. [Google Scholar]
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
| Type | Input Size (C × D × H × W) | Channels | Kernel Size / Operation | Output Size (C × D × H × W) |
|---|---|---|---|---|
| Convolution block | 1 × 128 × 64 × 64 | 64 | 7 × 3 × 3 conv | 64 × 128 × 64 × 64 |
| | 64 × 128 × 64 × 64 | – | 2 × 1 × 1 max pooling, stride 2 × 1 × 1 | 64 × 64 × 64 × 64 |
| Bottleneck block 1 | 64 × 64 × 64 × 64 | 16 | 1 × 1 × 1 conv | 16 × 64 × 64 × 64 |
| | 16 × 64 × 64 × 64 | 16 | 7 × 3 × 3 conv | 16 × 64 × 64 × 64 |
| | 16 × 64 × 64 × 64 | 128 | 1 × 1 × 1 conv | 128 × 64 × 64 × 64 |
| | 128 × 64 × 64 × 64 | – | 2 × 2 × 2 max pooling, stride 2 × 2 × 2 | 128 × 32 × 32 × 32 |
| Bottleneck block 2 | 128 × 32 × 32 × 32 | 32 | 1 × 1 × 1 conv | 32 × 32 × 32 × 32 |
| | 32 × 32 × 32 × 32 | 32 | 7 × 3 × 3 conv | 32 × 32 × 32 × 32 |
| | 32 × 32 × 32 × 32 | 256 | 1 × 1 × 1 conv | 256 × 32 × 32 × 32 |
| | 256 × 32 × 32 × 32 | – | 2 × 2 × 2 max pooling, stride 2 × 2 × 2 | 256 × 16 × 16 × 16 |
| Bottleneck block 3 | 256 × 16 × 16 × 16 | 64 | 1 × 1 × 1 conv | 64 × 16 × 16 × 16 |
| | 64 × 16 × 16 × 16 | 64 | 7 × 3 × 3 conv | 64 × 16 × 16 × 16 |
| | 64 × 16 × 16 × 16 | 256 | 1 × 1 × 1 conv | 256 × 16 × 16 × 16 |
| | 256 × 16 × 16 × 16 | – | 2 × 2 × 2 max pooling, stride 2 × 2 × 2 | 256 × 8 × 8 × 8 |
| Bottleneck block 4 | 256 × 8 × 8 × 8 | 64 | 1 × 1 × 1 conv | 64 × 8 × 8 × 8 |
| | 64 × 8 × 8 × 8 | 64 | 7 × 3 × 3 conv | 64 × 8 × 8 × 8 |
| | 64 × 8 × 8 × 8 | 256 | 1 × 1 × 1 conv | 256 × 8 × 8 × 8 |
| | 256 × 8 × 8 × 8 | – | 2 × 2 × 2 max pooling, stride 2 × 2 × 2 | 256 × 4 × 4 × 4 |
| Bottleneck block 5 | 256 × 4 × 4 × 4 | 64 | 1 × 1 × 1 conv | 64 × 4 × 4 × 4 |
| | 64 × 4 × 4 × 4 | 64 | 7 × 3 × 3 conv | 64 × 4 × 4 × 4 |
| | 64 × 4 × 4 × 4 | 256 | 1 × 1 × 1 conv | 256 × 4 × 4 × 4 |
| | 256 × 4 × 4 × 4 | – | 2 × 2 × 2 max pooling, stride 2 × 2 × 2 | 256 × 2 × 2 × 2 |
| Dense block | 2048 | – | 128-D fully connected | 128 |
| | 128 | – | 2-D fully connected | 2 |
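The shape bookkeeping in the architecture table can be verified with a short, framework-free trace. The sketch below assumes "same"-padded 3D convolutions (spatial and temporal sizes preserved) and max pooling that divides each pooled dimension by its stride; it traces tensor shapes only and is not the authors' implementation.

```python
def conv3d_shape(shape, out_channels):
    """'Same'-padded 3D conv: only the channel dimension changes."""
    c, d, h, w = shape
    return (out_channels, d, h, w)

def pool3d_shape(shape, stride):
    """Max pooling divides D, H, W by the corresponding strides."""
    c, d, h, w = shape
    sd, sh, sw = stride
    return (c, d // sd, h // sh, w // sw)

shape = (1, 128, 64, 64)                      # input: C x D x H x W
shape = conv3d_shape(shape, 64)               # convolution block, 7x3x3
shape = pool3d_shape(shape, (2, 1, 1))        # 2x1x1 max pooling

# Five bottleneck blocks: (middle channels, output channels) per the table
for mid, out in [(16, 128), (32, 256), (64, 256), (64, 256), (64, 256)]:
    shape = conv3d_shape(shape, mid)          # 1x1x1 channel reduction
    shape = conv3d_shape(shape, mid)          # 7x3x3 spatiotemporal conv
    shape = conv3d_shape(shape, out)          # 1x1x1 channel expansion
    shape = pool3d_shape(shape, (2, 2, 2))    # 2x2x2 max pooling

flattened = shape[0] * shape[1] * shape[2] * shape[3]
print(shape, flattened)  # (256, 2, 2, 2) 2048
```

The final feature map flattens to 2048 values, matching the input size of the 128-D fully connected layer in the dense block.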
| Model | Recall (Valence) | Precision (Valence) | F1-Score (Valence) | Accuracy (Valence) | Recall (Arousal) | Precision (Arousal) | F1-Score (Arousal) | Accuracy (Arousal) |
|---|---|---|---|---|---|---|---|---|
| LSTM | 0.6600 (0.0865) | 0.6641 (0.0281) | 0.6579 (0.0309) | 0.6651 (0.0052) | 0.6459 (0.0690) | 0.6491 (0.0678) | 0.6425 (0.0524) | 0.6508 (0.0529) |
| C3D | 0.9888 (0.0004) | 0.9888 (0.0004) | 0.9888 (0.0003) | 0.9890 (0.0003) | 0.9925 (0.0011) | 0.9930 (0.0022) | 0.9928 (0.0024) | 0.9929 (0.0023) |
| CNN-BN | 0.9908 (0.0019) | 0.9910 (0.0018) | 0.9909 (0.0010) | 0.9910 (0.0010) | 0.9948 (0.0011) | 0.9947 (0.0011) | 0.9947 (0.0007) | 0.9948 (0.0007) |
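For reference, the four metrics reported above can be computed from binary predictions (e.g., high vs. low valence) as follows. This is a plain-Python sketch of the standard definitions, not the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Recall, precision, F1-score, and accuracy for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return recall, precision, f1, accuracy

# Toy example with 5 trials
r, p, f1, acc = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(r, p, f1, acc)
```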
| Model | Parameters (M) | FLOPs (G) |
|---|---|---|
| C3D | 16.05 | 449.29 |
| CNN-BN | 1.11 | 22.74 |
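The parameter reduction reported above stems from the bottleneck factorization: a wide 7 × 3 × 3 convolution is replaced by a 1 × 1 × 1 channel reduction, a narrow 7 × 3 × 3 convolution, and a 1 × 1 × 1 expansion. The back-of-the-envelope arithmetic below illustrates the effect for one stage (256 channels with a 64-channel middle width, as in bottleneck blocks 3–5); it is not the paper's exact accounting, and bias terms are omitted.

```python
def conv3d_params(c_in, c_out, kernel):
    """Weight count of a 3D convolution, excluding bias terms."""
    kd, kh, kw = kernel
    return c_in * c_out * kd * kh * kw

# Plain 3D conv at full width: 256 -> 256 channels with a 7x3x3 kernel
plain = conv3d_params(256, 256, (7, 3, 3))

# Bottleneck with a 64-channel middle stage
bottleneck = (conv3d_params(256, 64, (1, 1, 1))    # 1x1x1 reduce
              + conv3d_params(64, 64, (7, 3, 3))   # 7x3x3 spatiotemporal conv
              + conv3d_params(64, 256, (1, 1, 1))) # 1x1x1 expand

print(plain, bottleneck, plain / bottleneck)
# 4128768 290816 -> the bottleneck uses ~14x fewer weights at this stage
```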
| Authors | Feature | Classifier | Valence Accuracy | Arousal Accuracy |
|---|---|---|---|---|
| Alhagry et al. [15] | Time-domain signal | LSTM | 0.8545 | 0.8565 |
| Yin et al. [17] | Power, time-domain features | Multiple-fusion-layer based ensemble classifier of stacked autoencoders (MESAE) | 0.7617 | 0.7719 |
| Sharma et al. [19] | Third-order cumulants (ToC) | LSTM | 0.8416 | 0.8521 |
| An et al. [20] | Bandwise DE 2D representation | CNN-SAE | 0.8949 | 0.9076 |
| Islam et al. [21] | Pearson's correlation coefficient | CNN | 0.7822 | 0.7492 |
| Liu et al. [22] | Time-domain signal | Multi-level feature (MLF)-CapsNet | 0.9797 | 0.9831 |
| Sartipi et al. [23] | Graph Fourier transform (GFT) | Spatiotemporal attention neural network (STANN) | 0.948 | 0.961 |
| Yin et al. [24] | DE graph | GCNN + LSTM | 0.9045 | 0.9060 |
| Ding et al. [25] | Time-domain signal | Temporal-spatial Inception (TSception) | 0.6227 | 0.6375 |
| Chao et al. [26] | Multiband feature matrix | CapsNet | 0.6673 | 0.6828 |
| Jia et al. [27] | Heterogeneous graph sequence | Graph transformer network (GTN), graph convolutional network (GCN) | 0.9766 | 0.9730 |
| Ours | Time-domain signal | LSTM | 0.6651 | 0.6508 |
| Ours | Spatiotemporal 3D representation | C3D | 0.9890 | 0.9929 |
| Ours | Spatiotemporal 3D representation | CNN-BN | 0.9910 | 0.9948 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, S.; Kim, T.-S.; Lee, W.H. Accelerating 3D Convolutional Neural Network with Channel Bottleneck Module for EEG-Based Emotion Recognition. Sensors 2022, 22, 6813. https://doi.org/10.3390/s22186813