Figure 1.
Waveform and spectrogram of pathological voices.
Figure 2.
Architecture of SincNet.
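The core of the architecture above is its first layer of band-pass filters parameterized only by their two cutoff frequencies. As a minimal sketch (not the paper's implementation), a SincNet-style kernel can be built in NumPy as the difference of two windowed sinc low-pass filters; the cutoffs, kernel length, and sampling rate below are illustrative choices.

```python
import numpy as np

def sinc_bandpass_kernel(f1_hz, f2_hz, kernel_len=251, fs=16000):
    """Band-pass FIR kernel defined only by its cutoffs (f1, f2):
    the difference of two windowed ideal low-pass filters.
    kernel_len is assumed odd so the kernel is symmetric."""
    t = np.arange(-(kernel_len // 2), kernel_len // 2 + 1) / fs
    # np.sinc(x) = sin(pi*x)/(pi*x), so 2f*sinc(2f*t) is an ideal low-pass
    low1 = 2 * f1_hz * np.sinc(2 * f1_hz * t)
    low2 = 2 * f2_hz * np.sinc(2 * f2_hz * t)
    band = (low2 - low1) * np.hamming(kernel_len)  # taper pass-band edges
    return band / fs  # normalize so pass-band gain is near unity

# Example: a 300-3000 Hz band-pass kernel
kernel = sinc_bandpass_kernel(300.0, 3000.0)
```

In SincNet the two cutoffs per channel are the only learned parameters of this layer, which is what the figure's architecture exploits to keep the input layer compact and interpretable.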
Figure 3.
Confusion matrices showing the detailed detection performance of (a) CNN(1D) and (b) SincNet on the 2018 FEMH Challenge dataset.
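A confusion matrix tallies, for each true class, how often each class was predicted; rows are true labels and columns are predictions, so the diagonal holds the correct decisions. A minimal sketch for the binary detection case, with made-up labels rather than the paper's data:

```python
# 0 = normal, 1 = pathological; labels are illustrative only.
def confusion_matrix(y_true, y_pred, n_classes=2):
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1  # row = true class, column = predicted class
    return m

y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
cm = confusion_matrix(y_true, y_pred)  # [[TN, FP], [FN, TP]] = [[2, 1], [1, 4]]
```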
Figure 4.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for detecting normal and abnormal voices on the 2018 FEMH Challenge dataset.
Figure 5.
Confusion matrices showing the detailed detection performance of (a) CNN(1D) and (b) SincNet on the FEMH dataset.
Figure 6.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for detecting normal and abnormal voices on the FEMH dataset.
Figure 7.
Confusion matrices showing the detailed performance of (a) CNN(1D) and (b) SincNet in classifying Neo, Pho, and VP on the 2018 FEMH Challenge dataset.
Figure 8.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for classifying Neo, Pho, and VP on the 2018 FEMH Challenge dataset.
Figure 9.
Confusion matrices showing the detailed performance of (a) CNN(1D) and (b) SincNet in classifying Neo, Pho, and VP on the 2019 FEMH Challenge dataset.
Figure 10.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for classifying Neo, Pho, and VP on the 2019 FEMH Challenge dataset.
Figure 11.
Confusion matrices showing the detailed performance of (a) CNN(1D) and (b) SincNet in classifying Neo, Pho, and VP on the FEMH dataset.
Figure 12.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for classifying Neo, Pho, and VP on the FEMH dataset.
Figure 13.
Confusion matrices showing the detailed performance of (a) CNN(1D) and (b) SincNet in classifying FD, Neo, Pho, and VP on the 2019 FEMH Challenge dataset.
Figure 14.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for classifying FD, Neo, Pho, and VP on the 2019 FEMH Challenge dataset.
Figure 15.
Confusion matrices showing the detailed performance of (a) CNN(1D) and (b) SincNet in classifying FD, Neo, Pho, and VP on the FEMH dataset.
Figure 16.
t-SNE scatter plots of (a) CNN(1D) and (b) SincNet for classifying FD, Neo, Pho, and VP on the FEMH dataset.
Figure 17.
Loss curves recorded while training the SincNet and CNN(1D) models on the (a) 2018 FEMH Challenge and (b) FEMH datasets, illustrating their training efficiency.
Figure 18.
Two channels selected from the input layer of the optimized CNN(1D) and SincNet models illustrate the filter properties. The upper row shows the filters in the time domain, while the bottom row depicts their magnitude responses in the frequency domain. (a) CNN filters; (b) Sinc filters.
Figure 19.
Power spectral densities of pathological voices processed by the CNN filters and the Sinc filters. Two utterances, one each from the (a) 2018 FEMH Challenge and (b) FEMH datasets, were used in this test.
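Power spectral densities like those in the figure are commonly estimated by averaging periodograms over overlapping windowed segments (a basic Welch estimate). The NumPy sketch below uses an illustrative test tone, not the paper's utterances, and fixed segment parameters chosen for brevity:

```python
import numpy as np

def welch_psd(x, fs, seg_len=256):
    """Basic Welch PSD: average periodograms of 50%-overlapping
    Hann-windowed segments, with standard density scaling."""
    win = np.hanning(seg_len)
    hop = seg_len // 2
    scale = fs * np.sum(win ** 2)
    psds = []
    for start in range(0, len(x) - seg_len + 1, hop):
        seg = x[start:start + seg_len] * win
        psds.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    freqs = np.fft.rfftfreq(seg_len, d=1 / fs)
    return freqs, np.mean(psds, axis=0)

# Example: one second of a 440 Hz tone at fs = 8000 Hz
fs = 8000
t = np.arange(fs) / fs
freqs, psd = welch_psd(np.sin(2 * np.pi * 440 * t), fs)
```

Averaging over segments trades frequency resolution for a lower-variance estimate, which is why Welch-style PSDs are the usual choice for comparing filtered speech signals.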
Table 1.
Number of voice samples per category in the FEMH dataset.
| | FD | Neo | Pho | VP | Normal |
|---|---|---|---|---|---|
| /a/ sound | 100 | 101 | 718 | 124 | 100 |
Table 2.
Number of voice samples per category in the 2018 FEMH-Challenge dataset.
| | Neo | Pho | VP | Normal |
|---|---|---|---|---|
| /a/ sound | 50 | 50 | 50 | 50 |
Table 3.
Number of voice samples per category in the 2019 FEMH-Challenge dataset.
| | FD | Neo | Pho | VP |
|---|---|---|---|---|
| /a/ sound | 100 | 100 | 100 | 100 |
Table 4.
Detection performance for CNN(1D), CNN(2D), and SincNet on the 2018 FEMH Challenge dataset.
| Model | Sensitivity | Specificity | Accuracy | UAR |
|---|---|---|---|---|
| CNN(1D) | 72.00% ± 7.16 | 65.00% ± 4.90 | 70.83% ± 3.84 | 68.50% ± 2.92 |
| CNN(2D) | 72.88% ± 5.66 | 62.35% ± 3.50 | 67.21% ± 2.52 | 67.61% ± 2.62 |
| SincNet | 80.00% ± 4.97 | 65.00% ± 2.45 | 77.50% ± 2.68 | 72.50% ± 1.87 |
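The four metrics in this table follow directly from a binary confusion matrix [[TN, FP], [FN, TP]]: sensitivity is recall on pathological samples, specificity is recall on normal samples, and UAR is their unweighted mean, which is robust to class imbalance. A minimal sketch with illustrative counts, not the paper's results:

```python
# Detection metrics from a binary confusion matrix [[TN, FP], [FN, TP]].
# The counts below are made up for illustration.
def detection_metrics(tn, fp, fn, tp):
    sensitivity = tp / (tp + fn)           # recall on pathological voices
    specificity = tn / (tn + fp)           # recall on normal voices
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    uar = (sensitivity + specificity) / 2  # unweighted average recall
    return sensitivity, specificity, accuracy, uar

sens, spec, acc, uar = detection_metrics(tn=13, fp=7, fn=4, tp=16)
```

On an imbalanced test set, accuracy and UAR diverge: accuracy weights classes by their sample counts, while UAR gives each class equal weight, which is why both appear in the tables.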
Table 5.
Detection performance for CNN(1D), CNN(2D), and SincNet on the FEMH dataset.
| Model | Sensitivity | Specificity | Accuracy | UAR |
|---|---|---|---|---|
| CNN(1D) | 77.88% ± 3.16 | 60.00% ± 3.90 | 76.32% ± 3.58 | 68.94% ± 2.78 |
| CNN(2D) | 74.68% ± 4.66 | 58.35% ± 3.55 | 73.45% ± 2.37 | 66.52% ± 2.59 |
| SincNet | 84.62% ± 2.97 | 70.00% ± 1.64 | 83.33% ± 2.42 | 77.31% ± 1.75 |
Table 6.
Classification performance of Neo, Pho, and VP for CNN(1D), CNN(2D), and SincNet on the 2018 FEMH Challenge dataset.
| Model | Neo | Pho | VP | Accuracy | UAR |
|---|---|---|---|---|---|
| CNN(1D) | 66.67% ± 2.30 | 67.24% ± 3.56 | 75.00% ± 2.98 | 69.00% ± 3.78 | 69.64% ± 2.65 |
| CNN(2D) | 60.00% ± 1.45 | 63.00% ± 2.66 | 65.00% ± 2.33 | 63.83% ± 2.72 | 62.67% ± 3.63 |
| SincNet | 72.22% ± 1.55 | 70.69% ± 2.69 | 79.17% ± 2.17 | 73.00% ± 2.12 | 74.03% ± 2.57 |
Table 7.
Classification performance of Neo, Pho, and VP for CNN(1D), CNN(2D), and SincNet on the 2019 FEMH Challenge dataset.
2019 FEMH Challenge (FD class removed):

| Model | Neo | Pho | VP | Accuracy | UAR |
|---|---|---|---|---|---|
| CNN(1D) | 60.00% ± 1.53 | 65.00% ± 2.65 | 65.00% ± 1.28 | 63.33% ± 3.42 | 63.33% ± 3.66 |
| CNN(2D) | 59.00% ± 1.69 | 66.45% ± 3.27 | 64.15% ± 2.96 | 63.13% ± 4.44 | 63.20% ± 2.19 |
| SincNet | 75.00% ± 1.55 | 70.00% ± 2.37 | 65.00% ± 1.55 | 70.00% ± 1.63 | 70.00% ± 1.12 |
Table 8.
Classification performance of Neo, Pho, and VP for CNN(1D), CNN(2D), and SincNet on the FEMH dataset.
FEMH (FD class removed):

| Model | Neo | Pho | VP | Accuracy | UAR |
|---|---|---|---|---|---|
| CNN(1D) | 80.00% ± 2.90 | 78.32% ± 3.16 | 75.00% ± 2.68 | 78.07% ± 2.75 | 77.77% ± 2.56 |
| CNN(2D) | 78.00% ± 2.45 | 90.00% ± 3.66 | 50.00% ± 2.55 | 76.63% ± 1.58 | 72.67% ± 2.62 |
| SincNet | 75.00% ± 1.35 | 81.82% ± 2.68 | 83.33% ± 2.43 | 81.28% ± 1.48 | 80.05% ± 1.29 |
Table 9.
Classification performance of FD, Neo, Pho, and VP for CNN(1D), CNN(2D), and SincNet on the 2019 FEMH Challenge dataset.
| Model | FD | Neo | Pho | VP | Accuracy | UAR |
|---|---|---|---|---|---|---|
| CNN(1D) | 75.00% ± 2.30 | 55.00% ± 3.56 | 70.00% ± 2.98 | 50.00% ± 2.65 | 62.50% ± 2.00 | 62.50% ± 3.43 |
| CNN(2D) | 75.00% ± 1.45 | 50.00% ± 2.66 | 65.00% ± 2.33 | 50.00% ± 3.63 | 59.45% ± 2.12 | 60.00% ± 2.33 |
| SincNet | 75.00% ± 1.55 | 65.00% ± 2.69 | 75.00% ± 2.17 | 60.00% ± 2.57 | 68.75% ± 1.32 | 68.75% ± 1.50 |
Table 10.
Classification performance of FD, Neo, Pho, and VP for CNN(1D), CNN(2D), and SincNet on the FEMH dataset.
| Model | FD | Neo | Pho | VP | Accuracy | UAR |
|---|---|---|---|---|---|---|
| CNN(1D) | 60.00% ± 2.00 | 55.00% ± 2.48 | 68.53% ± 3.70 | 54.17% ± 2.66 | 64.73% ± 3.29 | 59.43% ± 2.00 |
| CNN(2D) | 55.00% ± 2.48 | 80.00% ± 2.94 | 40.00% ± 3.83 | 50.00% ± 2.32 | 60.24% ± 2.45 | 56.25% ± 2.03 |
| SincNet | 50.00% ± 2.00 | 65.00% ± 1.48 | 75.52% ± 2.27 | 66.67% ± 2.48 | 71.01% ± 1.17 | 64.30% ± 1.32 |