Singing Voice Detection: A Survey
Abstract
1. Introduction
2. Feature Extraction
- Short-Time Fourier Transform Spectrum
- Mel-spectrogram
- Temporal Features
- Spectral Features
- Division of the audio signal into short frames, usually by applying a windowing function at fixed intervals [35];
- Computing the discrete Fourier transform of each windowed frame to convert it from the time domain to the frequency domain;
- Taking the logarithm of the amplitude spectrum;
- Smoothing the spectrum with a mel-scale filter bank, emphasizing perceptually meaningful frequencies [35];
- Taking the discrete cosine transform (DCT) of the list of mel log powers;
- Generating the cepstrum; the resulting coefficients are the MFCCs.
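The steps above can be sketched end-to-end in plain NumPy. This is a minimal illustration, not the implementation used in any of the surveyed papers; the sample rate, frame length, hop size, and filter count are arbitrary example values.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13):
    # Illustrative parameters: 16 kHz audio, 25 ms frames, 10 ms hop.
    # 1) Frame the signal at fixed intervals and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2) DFT of each windowed frame -> amplitude spectrum.
    mag = np.abs(np.fft.rfft(frames, n_fft))
    # 3)+4) Smooth with the mel filter bank and take the logarithm.
    log_mel = np.log(mag @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # 5) DCT-II of the mel log powers; keep the first n_ceps coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    # 6) The retained coefficients form the cepstrum: one MFCC vector per frame.
    return log_mel @ dct.T

x = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s test tone
print(mfcc(x).shape)  # prints (98, 13)
```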
3. Datasets
4. Traditional Methods
5. Deep Learning Techniques
5.1. Convolutional Neural Networks
5.2. Recurrent Neural Networks
5.3. Long Short-Term Memory
5.4. Bidirectional LSTMs
5.5. GRU-RNN
5.6. ConvLSTM or LRCN
- Input gate: i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
- Forget gate: f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
- Cell state: c_t = f_t ⊙ c_{t−1} + i_t ⊙ tanh(W_c · [h_{t−1}, x_t] + b_c), where in the LRCN x_t is the CNN-extracted feature vector for frame t
- Output gate: o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
- Hidden state: h_t = o_t ⊙ tanh(c_t)
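The gate updates listed above can be condensed into a single NumPy time step. This is a toy sketch of the standard LSTM recurrence with randomly initialized weights, not a trained singing voice detector; the dimensions and sequence length are made up for illustration.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_prev, x] to the four gate pre-activations."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = W @ np.concatenate([h_prev, x]) + b
    i, f, g, o = np.split(z, 4)          # input, forget, candidate, output
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # cell state update
    h = sigmoid(o) * np.tanh(c)                        # hidden state
    return h, c

rng = np.random.default_rng(0)
hidden, inp = 8, 4                       # toy sizes
W = rng.standard_normal((4 * hidden, hidden + inp)) * 0.1
b = np.zeros(4 * hidden)
h = c = np.zeros(hidden)
for x in rng.standard_normal((20, inp)):  # run over a 20-frame feature sequence
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # prints (8,)
```

In an LRCN, each `x` would be the feature vector produced by the convolutional front end for one spectrogram frame, and `h` would feed a final vocal/non-vocal classifier.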
6. Conclusions and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wong, C.H.; Szeto, W.M.; Wong, K.H. Automatic lyrics alignment for Cantonese popular music. Multimed. Syst. 2007, 12, 307–323.
- Fujihara, H.; Goto, M. Lyrics-to-audio alignment and its application. In Dagstuhl Follow-Ups; Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik: Wadern, Germany, 2012; Volume 3.
- Kan, M.Y.; Wang, Y.; Iskandar, D.; Nwe, T.L.; Shenoy, A. LyricAlly: Automatic synchronization of textual lyrics to acoustic music signals. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 338–349.
- Rigaud, F.; Radenen, M. Singing Voice Melody Transcription Using Deep Neural Networks. In Proceedings of the 17th ISMIR Conference, New York, NY, USA, 7–11 August 2016; pp. 737–743.
- Bittner, R.M.; McFee, B.; Salamon, J.; Li, P.; Bello, J.P. Deep Salience Representations for F0 Estimation in Polyphonic Music. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), Suzhou, China, 23–27 October 2017; pp. 63–70.
- Pardo, B.; Rafii, Z.; Duan, Z. Audio source separation in a musical context. In Springer Handbook of Systematic Musicology; Springer: Berlin/Heidelberg, Germany, 2018; pp. 285–298.
- Li, Y.; Wang, D. Separation of singing voice from music accompaniment for monaural recordings. IEEE Trans. Audio Speech Lang. Process. 2007, 15, 1475–1487.
- Jansson, A.; Humphrey, E.; Montecchio, N.; Bittner, R.; Kumar, A.; Weyde, T. Singing voice separation with deep U-Net convolutional networks. In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 23–27 October 2017.
- Rao, V.; Rao, P. Vocal melody extraction in the presence of pitched accompaniment in polyphonic music. IEEE Trans. Audio Speech Lang. Process. 2010, 18, 2145–2154.
- Hosoya, T.; Suzuki, M.; Ito, A.; Makino, S.; Smith, L.A.; Bainbridge, D.; Witten, I.H. Lyrics Recognition from a Singing Voice Based on Finite State Automaton for Music Information Retrieval. In Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, UK, 11–15 September 2005; pp. 532–535.
- McVicar, M.; Ellis, D.P.; Goto, M. Leveraging repetition for improved automatic lyric transcription in popular music. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 3117–3121.
- Zhang, T. Automatic singer identification. In Proceedings of the 2003 International Conference on Multimedia and Expo (ICME '03) (Cat. No. 03TH8698), Baltimore, MD, USA, 6–9 July 2003.
- Berenzweig, A.L.; Ellis, D.P. Locating singing voice segments within music signals. In Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No. 01TH8575), New Paltz, NY, USA, 21–24 October 2001; pp. 119–122.
- Kim, Y.E.; Whitman, B. Singer identification in popular music recordings using voice coding features. In Proceedings of the 3rd International Conference on Music Information Retrieval, Paris, France, 13–17 October 2002; Volume 13, p. 17.
- Dittmar, C.; Lehner, B.; Prätzlich, T.; Müller, M.; Widmer, G. Cross-Version Singing Voice Detection in Classical Opera Recordings. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Malaga, Spain, 26–30 October 2015; pp. 618–624.
- Leglaive, S.; Hennequin, R.; Badeau, R. Singing voice detection with deep recurrent neural networks. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia, 19–24 April 2015; pp. 121–125.
- Schlüter, J.; Grill, T. Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks. In Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, 26–30 October 2015; pp. 121–126.
- You, S.D.; Liu, C.H.; Chen, W.K. Comparative study of singing voice detection based on deep neural networks and ensemble learning. Hum.-Centric Comput. Inf. Sci. 2018, 8, 34.
- Ohishi, Y.; Goto, M.; Itou, K.; Takeda, K. Discrimination between singing and speaking voices. In Proceedings of the Ninth European Conference on Speech Communication and Technology, Lisbon, Portugal, 4–8 September 2005.
- Vijayan, K.; Li, H.; Toda, T. Speech-to-singing voice conversion: The challenges and strategies for improving vocal conversion processes. IEEE Signal Process. Mag. 2018, 36, 95–102.
- Zhang, X.; Yu, Y.; Gao, Y.; Chen, X.; Li, W. Research on Singing Voice Detection Based on a Long-Term Recurrent Convolutional Network with Vocal Separation and Temporal Smoothing. Electronics 2020, 9, 1458.
- Rani, B.; Rani, A.J.; Ravi, T.; Sree, M.D. Basic fundamental recognition of voiced, unvoiced, and silence region of a speech. Int. J. Eng. Adv. Technol. 2014, 4, 83–86.
- Li, T.; Ogihara, M.; Tzanetakis, G. Music Data Mining; CRC Press: Boca Raton, FL, USA, 2011.
- Stables, R.; Enderby, S.; De Man, B.; Fazekas, G.; Reiss, J.D. SAFE: A System for Extraction and Retrieval of Semantic Audio Descriptors. In Electronic Engineering and Computer Science; Queen Mary University of London: London, UK, 2014.
- McKinney, M.; Breebaart, J. Features for audio and music classification. In Proceedings of the ISMIR 2003, Baltimore, MD, USA, 27–30 October 2003.
- Gygi, B.; Kidd, G.R.; Watson, C.S. Similarity and categorization of environmental sounds. Percept. Psychophys. 2007, 69, 839–855.
- Hoffman, M.D.; Cook, P.R. Feature-Based Synthesis: A Tool for Evaluating, Designing, and Interacting with Music IR Systems. In Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), Victoria, BC, Canada, 8–12 October 2006; pp. 361–362.
- Knees, P.; Schedl, M. Music Similarity and Retrieval: An Introduction to Audio- and Web-Based Strategies; The Information Retrieval Series; Springer: Berlin/Heidelberg, Germany, 2016.
- Lee, K.; Choi, K.; Nam, J. Revisiting Singing Voice Detection: A quantitative review and the future outlook. In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018), Paris, France, 23–27 September 2018; pp. 506–513.
- Jeong, I.Y.; Lee, K. Learning Temporal Features Using a Deep Neural Network and its Application to Music Genre Classification. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), New York, NY, USA, 7–11 August 2016; pp. 434–440.
- Gupta, H.; Gupta, D. LPC and LPCC method of feature extraction in Speech Recognition System. In Proceedings of the 2016 6th International Conference-Cloud System and Big Data Engineering (Confluence), Noida, India, 14–15 January 2016; pp. 498–502.
- Rocamora, M.; Herrera, P. Comparing audio descriptors for singing voice detection in music audio files. In Proceedings of the 11th Brazilian Symposium on Computer Music, São Paulo, Brazil, 1–3 September 2007; Volume 26, p. 27.
- Davis, S.; Mermelstein, P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 1980, 28, 357–366.
- Kim, H.G.; Sikora, T. Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, QC, Canada, 17–21 May 2004; Volume 5.
- Logan, B. Mel frequency cepstral coefficients for music modeling. In Proceedings of the International Symposium on Music Information Retrieval, Plymouth, MA, USA, 23–25 October 2000.
- Meseguer-Brocal, G.; Cohen-Hadria, A.; Peeters, G. DALI: A large dataset of synchronized audio, lyrics and notes, automatically created using teacher-student machine learning paradigm. arXiv 2019, arXiv:1906.10606.
- Lehner, B.; Widmer, G.; Böck, S. A low-latency, real-time-capable singing voice detection method with LSTM recurrent neural networks. In Proceedings of the 2015 23rd European Signal Processing Conference (EUSIPCO), Nice, France, 31 August–4 September 2015; pp. 21–25.
- Regnier, L.; Peeters, G. Singing voice detection in music tracks using direct voice vibrato detection. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009; pp. 1685–1688.
- Lehner, B.; Sonnleitner, R.; Widmer, G. Towards Light-Weight, Real-Time-Capable Singing Voice Detection. In Proceedings of the 14th International Conference on Music Information Retrieval (ISMIR 2013), Curitiba, Brazil, 4–8 November 2013.
- Schlüter, J. Learning to Pinpoint Singing Voice from Weakly Labeled Examples. In Proceedings of the 17th ISMIR Conference, New York, NY, USA, 7–11 August 2016; pp. 44–50.
- Chen, Z.; Zhang, X.; Deng, J.; Li, J.; Jiang, Y.; Li, W. A Practical Singing Voice Detection System Based on GRU-RNN. In Proceedings of the 6th Conference on Sound and Music Technology (CSMT); Springer: Singapore, 2019; pp. 15–25.
- Kum, S.; Nam, J. Joint detection and classification of singing voice melody using convolutional recurrent neural networks. Appl. Sci. 2019, 9, 1324.
- Hsu, C.L.; Wang, D.; Jang, J.S.R.; Hu, K. A tandem algorithm for singing pitch extraction and voice separation from music accompaniment. IEEE Trans. Audio Speech Lang. Process. 2012, 20, 1482–1491.
- Song, L.; Li, M.; Yan, Y. Automatic Vocal Segments Detection in Popular Music. In Proceedings of the 2013 Ninth International Conference on Computational Intelligence and Security, Emeishan, China, 14–15 December 2013; pp. 349–352.
- Mauch, M.; Fujihara, H.; Yoshii, K.; Goto, M. Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music. In Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), Miami, FL, USA, 24–28 October 2011; pp. 233–238.
- Chan, T.S.T.; Yang, Y.H. Complex and quaternionic principal component pursuit and its application to audio separation. IEEE Signal Process. Lett. 2016, 23, 287–291.
- Chan, T.S.T.; Yang, Y.H. Informed group-sparse representation for singing voice separation. IEEE Signal Process. Lett. 2017, 24, 156–160.
- Ramona, M.; Richard, G.; David, B. Vocal detection in music with support vector machines. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 1885–1888.
- Goto, M.; Hashiguchi, H.; Nishimura, T.; Oka, R. RWC Music Database: Popular, Classical and Jazz Music Databases. In Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR 2002), Paris, France, 13–17 October 2002; Volume 2, pp. 287–288.
- Bittner, R.M.; Salamon, J.; Tierney, M.; Mauch, M.; Cannam, C.; Bello, J.P. MedleyDB: A multitrack dataset for annotation-intensive MIR research. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, 27–31 October 2014; Volume 14, pp. 155–160.
- Hsu, C.L.; Jang, J.S.R. On the improvement of singing voice separation for monaural recordings using the MIR-1K dataset. IEEE Trans. Audio Speech Lang. Process. 2009, 18, 310–319.
- Chan, T.S.; Yeh, T.C.; Fan, Z.C.; Chen, H.W.; Su, L.; Yang, Y.H.; Jang, R. Vocal activity informed singing voice separation with the iKala dataset. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia, 19–24 April 2015; pp. 718–722.
- Maddage, N.C.; Wan, K.; Xu, C.; Wang, Y. Singing voice detection using twice-iterated composite Fourier transform. In Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No. 04TH8763), Taipei, Taiwan, 27–30 June 2004; Volume 2, pp. 1347–1350.
- Vembu, S.; Baumann, S. Separation of Vocals from Polyphonic Audio Recordings. In Proceedings of the ISMIR 2005, London, UK, 11–15 September 2005; pp. 337–344.
- Lukashevich, H.; Gruhne, M.; Dittmar, C. Effective singing voice detection in popular music using ARMA filtering. In Proceedings of the Workshop on Digital Audio Effects (DAFx'07), Bordeaux, France, 10–15 September 2007.
- Forney, G.D. The Viterbi algorithm. Proc. IEEE 1973, 61, 268–278.
- O'Shea, K.; Nash, R. An introduction to convolutional neural networks. arXiv 2015, arXiv:1511.08458.
- Huang, H.M.; Chen, W.K.; Liu, C.H.; You, S.D. Singing voice detection based on convolutional neural networks. In Proceedings of the 2018 7th International Symposium on Next Generation Electronics (ISNE), Taipei, Taiwan, 7–9 May 2018; pp. 1–4.
- Gui, W.; Li, Y.; Zang, X.; Zhang, J. Exploring Channel Properties to Improve Singing Voice Detection with Convolutional Neural Networks. Appl. Sci. 2021, 11, 11838.
- Krause, M.; Müller, M.; Weiß, C. Singing Voice Detection in Opera Recordings: A Case Study on Robustness and Generalization. Electronics 2021, 10, 1214.
- Vu, T.H.; Wang, J.C. Acoustic scene and event recognition using recurrent neural networks. Detect. Classif. Acoust. Scenes Events 2016, 2016, 1–3.
- Sutskever, I.; Martens, J.; Hinton, G.E. Generating text with recurrent neural networks. In Proceedings of the ICML 2011, Bellevue, WA, USA, 28 June–2 July 2011.
- Vinyals, O.; Ravuri, S.V.; Povey, D. Revisiting recurrent neural networks for robust ASR. In Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, 25–30 March 2012; pp. 4085–4088.
- Hughes, T.; Mierle, K. Recurrent neural networks for voice activity detection. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 7378–7382.
- Olah, C. Understanding LSTM Networks. 2015. Available online: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (accessed on 10 November 2021).
- Eyben, F.; Weninger, F.; Squartini, S.; Schuller, B. Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 483–487.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Ono, N.; Miyamoto, K.; Le Roux, J.; Kameoka, H.; Sagayama, S. Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram. In Proceedings of the 2008 16th European Signal Processing Conference, Lausanne, Switzerland, 25–29 August 2008; pp. 1–4.
- Cho, K.; Van Merriënboer, B.; Bahdanau, D.; Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. arXiv 2014, arXiv:1409.1259.
- Xingjian, S.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. arXiv 2015, arXiv:1506.04214.
- Lehner, B.; Widmer, G.; Sonnleitner, R. On the reduction of false positives in singing voice detection. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 7480–7484.
- Paul, S.; Rao, K.S.; Das, P.P. Knowledge Distillation for Singing Voice Detection. arXiv 2021, arXiv:2011.04297.
Name | Number of Tracks | Duration | Related Papers
---|---|---|---
Jamendo Corpus | 93 | 443 min | [16,17,21,37,38,39,40,41]
MedleyDB | 122 | 437 min | [21,42]
MIR-1K | 1000 | 113 min | [21,43]
RWC Popular Music | 100 | 407 min | [17,21,37,40,41,42,44,45]
iKala | 352 | 176 min | [21,42,46,47]
Method | Author | Year | Accuracy (%) | Precision (%) | Recall (%) | F-Measure (%)
---|---|---|---|---|---|---
SVM | Ramona [48] | 2008 | 82.2 | - | - | 84.3 |
GMM | Regnier et al. [38] | 2009 | - | - | - | 77 |
Random forest | Lehner et al. [39] | 2013 | 84.8 | - | - | 84.6 |
Feature Engineering | Lehner et al. [71] | 2014 | 88.2 | 88 | 86.2 | 87.1 |
LSTM-RNN (1) | Lehner et al. [37] | 2015 | 91.5 | 89.8 | 90.6 | 90.2 |
LSTM-RNN (2) | Zhang et al. [21] | 2020 | 89.5 | 89.5 | 89.6 | 88.8 |
CNN (1) | Schlüter et al. [17] | 2015 | 92.3 | - | 90.3 | - |
CNN (2) | Zhang et al. [21] | 2020 | 90.4 | 90.6 | 90.4 | 90.3 |
CNN (3) | Gui et al. [59] | 2021 | 88.9 | 91.4 | 89.9 | 90.6 |
Bi-LSTMs | Leglaive et al. [16] | 2015 | 91.5 | 89.5 | 92.6 | 91 |
Bootstrapping procedure | Dittmar et al. [15] | 2015 | 88.2 | - | - | 87 |
GRU-RNN (1) | Zhang et al. [21] | 2020 | 91 | 90.8 | 91.2 | 91.4 |
GRU-RNN (2) | Chen et al. [41] | 2019 | 88.2 | 85.39 | 92.78 | 88.93 |
GRU-RNN (3) | Chen et al. [41] | 2019 | 90.8 | 98.2 | 93.3 | 91.2 |
LRCN | Zhang et al. [21] | 2020 | 91.6 | 92.6 | 93.4 | 93 |
Method | Author | Year | Accuracy (%) | Precision (%) | Recall (%) | F-Measure (%)
---|---|---|---|---|---|---
SVM-HMM | Mauch [45] | 2011 | 87.2 | 88.7 | 92.1 | 90.4 |
Random forest | Lehner et al. [39] | 2013 | 86.8 | 87.9 | 90.6 | 89.2 |
Feature Engineering | Lehner et al. [71] | 2014 | 87.5 | 87.5 | 92.6 | 90 |
LSTM-RNN (1) | Lehner et al. [37] | 2015 | 92.3 | 93.8 | 93.4 | 93.6 |
LSTM-RNN (2) | Zhang et al. [21] | 2020 | 93.7 | 94.1 | 93.3 | 92.8
CNN (1) | Schlüter et al. [16] | 2015 | 92.7 | - | 93.5 | - |
CNN (2) | Zhang et al. [21] | 2020 | 94 | 93.6 | 94 | 94.2
CNN (3) | Gui et al. [59] | 2021 | 88.9 | 90.7 | 97.0 | 93.7 |
GRU-RNN (1) | Zhang et al. [21] | 2020 | 95.2 | 95.1 | 95.3 | 95.3
GRU-RNN (2) | Chen et al. [41] | 2019 | 92.1 | 92.7 | 95.4 | 94 |
GRU-RNN (3) | Chen et al. [41] | 2019 | 95.3 | 96.1 | 96.9 | 96.5 |
LRCN | Zhang et al. [21] | 2020 | 97 | 97.1 | 96.8 | 96.3
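The accuracy, precision, recall, and F-measure values reported in the tables above are computed from frame-level vocal/non-vocal decisions. A minimal sketch of that computation, using made-up example labels rather than any real system's output:

```python
def frame_metrics(pred, truth):
    """Frame-level accuracy/precision/recall/F-measure for binary vocal labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # vocal hit
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # false alarm
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # missed vocal
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)  # correct reject
    acc = (tp + tn) / len(truth)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

# Hypothetical per-frame labels (1 = singing voice present).
pred  = [1, 1, 0, 1, 0, 0, 1, 1]
truth = [1, 0, 0, 1, 0, 1, 1, 1]
print(frame_metrics(pred, truth))
```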
Monir, R.; Kostrzewa, D.; Mrozek, D. Singing Voice Detection: A Survey. Entropy 2022, 24, 114. https://doi.org/10.3390/e24010114