Supervised Density-Based Metric Learning Based on Bhattacharya Distance for Imbalanced Data Classification Problems
Abstract
1. Introduction
2. Related Works
2.1. Background and Definitions
- a. Pairwise cost-based approaches
- b. Probabilistic approaches
- c. Boost-like methods
- d. Hybrid approaches
- e. Deep metric learning approaches
2.2. Learning Distance Metric in Imbalanced Applications
3. The Proposed Method
3.1. The Proposed Model Construction Method
3.1.1. The First Phase: Estimating the Density of the Classes
- a. Estimating the initial values of GMM parameters
- b. Maximum a posteriori (MAP) parameter re-estimation (a minimal fitting sketch follows)
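A minimal sketch of this phase, assuming scikit-learn's GaussianMixture as the EM implementation; its default k-means initialization stands in for the paper's own initial-value estimation, and the MAP re-estimation step is not reproduced here:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_densities(X, y, n_components=2, seed=0):
    """First phase (sketch): estimate one Gaussian mixture per class.

    k-means initialization stands in for the paper's initial-value
    estimation; the MAP re-estimation step is not reproduced.
    """
    models = {}
    for label in np.unique(y):
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="full",   # each component keeps its own covariance
            init_params="kmeans",
            random_state=seed,
        )
        gmm.fit(X[y == label])        # EM runs on this class's samples only
        models[label] = gmm
    return models
```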
3.1.2. The Second Phase: Calculating the Distance between the Gaussian Components Using the Bhattacharya Distance
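For two multivariate Gaussians N(μ1, Σ1) and N(μ2, Σ2), the Bhattacharya distance [Bhattacharyya, 1943] has the closed form D_B = (1/8)(μ1 − μ2)ᵀ Σ⁻¹ (μ1 − μ2) + (1/2) ln(det Σ / √(det Σ1 det Σ2)), where Σ = (Σ1 + Σ2)/2. A direct NumPy implementation:

```python
import numpy as np

def bhattacharyya_gaussians(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharya distance between N(mu1, cov1) and N(mu2, cov2)."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mean term: (1/8) (mu1 - mu2)^T cov^{-1} (mu1 - mu2)
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance term via log-determinants for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return mean_term + 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
```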
3.1.3. The Third Phase: The Process of Learning the Proposed Distance Metric
3.1.4. The Fourth Phase: The Optimization Process of the Proposed Objective Function
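The paper's exact objective function and optimizer are not reproduced in this outline. As an illustrative stand-in for phases three and four, the sketch below learns a linear map L that enlarges the Bhattacharya distances between components of different classes while shrinking those within a class, optimized with SciPy's general-purpose L-BFGS-B using the tolerance 1 × 10⁻³ listed in the parameter table further down. The names learn_metric and lam, and the specific trade-off form, are our assumptions, not the paper's; bhattacharyya_gaussians is reused from the sketch above.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import minimize

def learn_metric(components, dim, lam=0.1, tol=1e-3, seed=0):
    """Illustrative stand-in for phases three and four.

    `components` is a list of (class_label, mean, covariance) triples,
    e.g. collected from the per-class mixtures fitted in the first
    phase. The map L acts on a component as mean -> L @ mean and
    cov -> L @ cov @ L.T.
    """
    rng = np.random.default_rng(seed)

    def objective(flat_L):
        L = flat_L.reshape(dim, dim)
        between = within = 0.0
        for (ya, ma, ca), (yb, mb, cb) in combinations(components, 2):
            d = bhattacharyya_gaussians(L @ ma, L @ ca @ L.T,
                                        L @ mb, L @ cb @ L.T)
            if ya == yb:
                within += d      # pull same-class components together
            else:
                between += d     # push different-class components apart
        return -(between - lam * within)   # minimize the negated margin

    # Start near the identity; SciPy finite-differences the gradient.
    x0 = np.eye(dim).ravel() + 0.01 * rng.standard_normal(dim * dim)
    result = minimize(objective, x0, method="L-BFGS-B", tol=tol)
    return result.x.reshape(dim, dim)
```

Given the per-class mixtures from the first-phase sketch, the component list can be assembled as `[(c, m, S) for c, g in models.items() for m, S in zip(g.means_, g.covariances_)]`.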
3.2. Evaluation
4. Experiments
4.1. Specifications of the Datasets
4.2. Evaluation Criteria
4.3. Evaluation of the Proposed DMLdbIm Method
- LMNN: ensures that, for each training example, its k nearest neighbors of the same class (the target neighbors) lie closer than examples from other classes.
- ITML: minimizes the differential relative entropy between two multivariate Gaussian distributions, using LogDet regularization to shrink the distances between same-class pairs and enlarge the distances between different-class pairs.
- GMML: minimizes the total distance among similar points.
- DML_eig: maximizes the minimum squared distance between dissimilar pairs while keeping an upper bound on the total squared distance over similar pairs.
- DMLMJ: learns a linear transformation that maximizes the Jeffrey divergence between two distributions derived from local pairwise constraints.
- IML: places samples of the same class at distances below one and samples of different classes at distances above one, reducing the negative effect of class imbalance in the dataset.
- DMBK: learns a linear transformation that maximizes the logarithm of the geometric mean of the normalized Kullback–Leibler divergences between class distributions that share the same covariance Gaussian density.
- DMLdbIm: the proposed method. It fits multivariate Gaussian components with distinct covariances to each class and maximizes the Bhattacharya distance between the Gaussian mixtures of different classes, pushing the external components of different classes apart while pulling the internal components of each class together, so that a wide margin forms between the classes (see the usage sketch after this list).
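All of these baselines, like DMLdbIm itself, yield a Mahalanobis metric M = LᵀL or, equivalently, a linear transformation L, and classification is then typically performed by a k-nearest-neighbor rule in the transformed space. A minimal usage sketch, assuming scikit-learn's KNeighborsClassifier (the function name knn_with_metric is ours, and L could come, for example, from the learn_metric sketch in Section 3.1.4 above):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_with_metric(L, X_train, y_train, X_test, k=3):
    """Classify with kNN under the metric M = L.T @ L.

    Distances under M equal Euclidean distances after the mapping
    x -> L @ x, so a plain kNN in the transformed space suffices.
    """
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train @ L.T, y_train)
    return knn.predict(X_test @ L.T)
```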
4.4. Comparison with Deep Learning
4.5. Computational Complexity Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson Education Limited: Kuala Lumpur, Malaysia, 2016.
- Duin, R.P.; Tax, D.M.J. Statistical pattern recognition. In Handbook of Pattern Recognition and Computer Vision; World Scientific: Hackensack, NJ, USA, 2005; pp. 3–24.
- He, H.; Ma, Y. (Eds.) Imbalanced Learning: Foundations, Algorithms, and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2013.
- He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
- Ali, A.; Shamsuddin, S.M.; Ralescu, A.L. Classification with class imbalance problem: A review. Int. J. Adv. Soft Comput. Appl. 2015, 7, 176–204.
- Nguyen, G.H.; Bouzerdoum, A.; Phung, S.L. Learning pattern classification tasks with imbalanced datasets. In Pattern Recognition; InTech: Houston, TX, USA, 2009.
- Mazurowski, M.A.; Habas, P.A.; Zurada, J.M.; Lo, J.Y.; Baker, J.A.; Tourassi, G.D. Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance. Neural Netw. 2008, 21, 427–436.
- Wei, W.; Li, J.; Cao, L.; Ou, Y.; Chen, J. Effective detection of sophisticated online banking fraud on extremely imbalanced data. World Wide Web 2013, 16, 449–475.
- Li, Y.; Sun, G.; Zhu, Y. Data imbalance problem in text classification. In Proceedings of the Third International Symposium on Information Processing (ISIP 2010), Qingdao, China, 15–17 October 2010; IEEE: Piscataway, NJ, USA; pp. 301–305.
- Zhu, Z.B.; Song, Z.H. Fault diagnosis based on imbalance modified kernel Fisher discriminant analysis. Chem. Eng. Res. Des. 2010, 88, 936–951.
- Tavallaee, M.; Stakhanova, N.; Ghorbani, A.A. Toward credible evaluation of anomaly-based intrusion-detection methods. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2010, 40, 516–524.
- Kotsiantis, S.; Kanellopoulos, D.; Pintelas, P. Handling imbalanced datasets: A review. GESTS Int. Trans. Comput. Sci. Eng. 2006, 30, 25–36.
- Xing, E.P.; Jordan, M.I.; Russell, S.J.; Ng, A.Y. Distance metric learning with application to clustering with side-information. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2003; pp. 521–528.
- Bellet, A.; Habrard, A.; Sebban, M. Metric Learning; Synthesis Lectures on Artificial Intelligence and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2015; Volume 9, pp. 1–151.
- Li, D.; Tian, Y. Survey and experimental study on metric learning methods. Neural Netw. 2018, 105, 447–462.
- Weinberger, K.Q.; Blitzer, J.; Saul, L.K. Distance metric learning for large margin nearest neighbor classification. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2006; pp. 1473–1480.
- Weinberger, K.Q.; Saul, L.K. Distance metric learning for large margin nearest neighbor classification. J. Mach. Learn. Res. 2009, 10, 207–244.
- Zadeh, P.; Hosseini, R.; Sra, S. Geometric mean metric learning. In Proceedings of the 33rd International Conference on Machine Learning (ICML), New York, NY, USA, 19–24 June 2016; pp. 2464–2471.
- Ying, Y.; Li, P. Distance metric learning with eigenvalue optimization. J. Mach. Learn. Res. 2012, 13, 1–26.
- Nguyen, B.; Morell, C.; De Baets, B. Supervised distance metric learning through maximization of the Jeffrey divergence. Pattern Recognit. 2017, 64, 215–225.
- Davis, J.V.; Kulis, B.; Jain, P.; Sra, S.; Dhillon, I.S. Information-theoretic metric learning. In Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, 17–24 June 2007; ACM: New York, NY, USA; pp. 209–216.
- Chang, C.C. A boosting approach for supervised Mahalanobis distance metric learning. Pattern Recognit. 2012, 45, 844–862.
- Zhong, G.; Zheng, Y.; Li, S.; Fu, Y. SLMOML: Online metric learning with global convergence. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 2460–2472.
- Liu, W.; Tsang, I.W. Large margin metric learning for multi-label prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 2800–2806.
- Kaya, M.; Bilge, H.Ş. Deep metric learning: A survey. Symmetry 2019, 11, 1066.
- Suárez, J.L.; García, S.; Herrera, F. A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis, prospects and challenges. Neurocomputing 2021, 425, 300–322.
- Ghojogh, B.; Ghodsi, A.; Karray, F.; Crowley, M. Spectral, probabilistic, and deep metric learning: Tutorial and survey. arXiv 2022, arXiv:2201.09267.
- Cao, X.; Ge, Y.; Li, R.; Zhao, J.; Jiao, L. Hyperspectral imagery classification with deep metric learning. Neurocomputing 2019, 356, 217–227.
- Wang, N.; Zhao, X.; Jiang, Y.; Gao, Y. Iterative metric learning for imbalance data classification. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 13–19 July 2018; pp. 2805–2811.
- Feng, L.; Wang, H.; Jin, B.; Li, H.; Xue, M.; Wang, L. Learning a distance metric by balancing KL-divergence for imbalanced datasets. IEEE Trans. Syst. Man Cybern. Syst. 2018, 49, 2384–2395.
- Gautheron, L.; Habrard, A.; Morvant, E.; Sebban, M. Metric learning from imbalanced data with generalization guarantees. Pattern Recognit. Lett. 2020, 133, 298–304.
- Yan, M.; Li, N. Borderline-margin loss based deep metric learning framework for imbalanced data. Appl. Intell. 2022, 53, 1487–1504.
- Fattahi, M.; Moattar, M.H.; Forghani, Y. Improved cost-sensitive representation of data for solving the imbalanced big data classification problem. J. Big Data 2022, 9, 1–24.
- Wang, K.F.; An, J.; Wei, Z.; Cui, C.; Ma, X.H.; Ma, C.; Bao, H.Q. Deep learning-based imbalanced classification with fuzzy support vector machine. Front. Bioeng. Biotechnol. 2022, 9, 802712.
- UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 22 July 2024).
- Navarro, J.R.D.; Noche, J.R. Classification of mixtures of student grade distributions based on the Gaussian mixture model using the expectation-maximization algorithm. 2003. Available online: https://www.researchgate.net/publication/2922541_Classification_of_Mixtures_of_Student_Grade_Distributions_Based_on_the_Gaussian_Mixture_Model_Using_the_Expectation-Maximization_Algorithm (accessed on 22 July 2024).
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD'96), Portland, OR, USA, 2–4 August 1996; pp. 226–231.
- Bhattacharyya, A. On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc. 1943, 35, 99–109.
# | Dataset | No. of Samples | No. of Features | No. of Minority Samples | Class Distribution (%) | Imbalance Ratio
---|---|---|---|---|---|---
1 | Heart (UCI) | 270 | 13 | 120 | (55.56,44.44) | 1.25
2 | WDBC (UCI) | 569 | 30 | 212 | (62.74,37.26) | 1.68
3 | Pima | 768 | 8 | 268 | (65.16,34.84) | 1.87 |
4 | Glass0 | 214 | 9 | 70 | (67.32,32.68) | 2.06 |
5 | Ecoli1 | 336 | 7 | 81 | (75.9,24.1) | 3.14 |
6 | Ecoli2 | 336 | 7 | 55 | (83.63,16.37) | 5.1 |
7 | newthyroid1 | 215 | 5 | 35 | (83.72,16.28) | 5.14 |
8 | Glass6 | 214 | 9 | 29 | (86.45,13.55) | 6.37 |
9 | Ecoli3 | 336 | 7 | 35 | (89.58,10.42) | 8.6 |
10 | yeast-2_vs_4 | 514 | 8 | 51 | (90.08,9.92) | 9.08 |
11 | yeast-1_vs_7 | 459 | 7 | 30 | (93.46,6.54) | 14.3 |
12 | winequality-red-8_vs_6 | 656 | 11 | 18 | (97.26,2.74) | 35.44 |
13 | winequality-red-8_vs_6-7 | 855 | 11 | 18 | (97.89,2.11) | 46.5 |
14 | winequality-white-3-9_vs_5 | 1482 | 11 | 25 | (98.31,1.69) | 58.28 |
15 | winequality-red-3_vs_5 | 691 | 11 | 10 | (98.55,1.45) | 68.1 |
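The imbalance ratio in the last column is the majority-to-minority sample-count ratio: Heart, for instance, has 270 samples of which 120 are minority, so 150/120 = 1.25. A minimal helper, assuming integer class labels (the name imbalance_ratio is ours):

```python
import numpy as np

def imbalance_ratio(y):
    """Majority-to-minority count ratio, as in the rightmost column above."""
    counts = np.bincount(y)   # assumes integer class labels 0..K-1
    return counts.max() / counts.min()
```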
 | Predicted Positive | Predicted Negative
---|---|---
Actual Positive | TP | FN
Actual Negative | FP | TN
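All of the criteria summarized later (accuracy, precision, recall, and the F1 measure) derive from these four counts. A small helper with guards against empty denominators (the name classification_metrics is ours):

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1 from the four counts above."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1
```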
Name | Value
---|---
 | 0.00001
 | 0.00001
 | 0.1
Tol | 1 × 10⁻³
Dataset | Euclidean | ITML | LMNN | DML-eig | GMML | DMLMJ | DMBK | IML | DMLdbIm |
---|---|---|---|---|---|---|---|---|---|
Heart | 75.15 (6) | 77.33 (2) | 75.09 (7) | 75.06 (8) | 73.14 (9) | 76.65 (3) | 76.08 (4) | 75.83 (5) | 77.41 (1) |
Wdbc | 91.51 (8) | 90.21 (9) | 91.94 (7) | 93.17 (5) | 92.28 (6) | 95.56 (1) | 94.15 (3) | 93.75 (4) | 94.81 (2) |
Pima | 56.34 (4) | 54.25 (9) | 55.55 (7) | 54.75 (8) | 57.77 (2) | 58.07 (1) | 55.98 (6) | 56.3 (5) | 56.55 (3) |
Glass0 | 67.61 (6) | 63.5 (9) | 64.63 (8) | 74.1 (4) | 68.62 (5) | 75.92 (1) | 75.17 (2) | 67.15 (7) | 74.30 (3) |
Ecoli1 | 73.56 (6) | 73.7 (4) | 67.9 (9) | 69.68 (8) | 72.33 (7) | 78.32 (1) | 73.58 (5) | 74.61 (2) | 74.51 (3) |
Ecoli2 | 71.43 (5) | 70.28 (8) | 71.35 (6) | 77.1 (4) | 68.65 (9) | 87.12 (1) | 78.26 (3) | 70.37 (7) | 83.93 (2) |
Newthyroid1 | 87.07 (9) | 90.25 (8) | 96.77 (1) | 93.79 (5) | 93.34 (6) | 92.52 (7) | 94.61 (3) | 93.82 (4) | 95.96 (2) |
Glass6 | 76.06 (8) | 78.53 (5) | 78.16 (7) | 81.99 (3) | 78.33 (6) | 70.73 (9) | 84.45 (2) | 80.32 (4) | 88.72 (1) |
Ecoli3 | 50 (8) | 52.98 (6) | 54.8 (3) | 52.99 (5) | 50.4 (7) | 58.73 (2) | 54.32 (4) | 45.66 (9) | 60.74 (1) |
Yeast-2_vs_4 | 69.39 (9) | 73.79 (5) | 74.45 (4) | 70.77 (8) | 71.17 (7) | 75.54 (2) | 75.08 (3) | 71.66 (6) | 78.27 (1) |
Yeast-1_vs_7 | 26.09 (7) | 23.04 (8) | 33.15 (6) | 36.51 (4) | 37.07 (3) | NA (9) | 39.85 (2) | 34.15 (5) | 47.79 (1)
winequality-red-8_vs_6 | NA | 14.81 (5) | NA | 32.12 (3) | 25.11 (4) | NA | 38.99 (2) | NA | 44.44 (1) |
winequality-red-8_vs_6-7 | NA | 17.25 (3) | NA | 25.00 (2) | 12.45 (5) | NA | NA | 16.19 (4) | 40.00 (1) |
winequality-white-3-9_vs_5 | NA | 22.13 (2) | NA | 20.00 (3) | 11.57 (5) | NA | NA | 17.07 (4) | 33.33 (1) |
winequality-red-3_vs_5 | NA | NA | NA | 25.00 (2) | 10.78 (3) | NA | NA | NA | 40.00 (1) |
Mean | 49.61 | 53.47 | 50.91 | 58.80 | 54.86 | 51.27 | 56.03 | 53.12 | 66.05 |
Average Rank | 6.53 | 5.8 | 5.8 | 4.8 | 5.6 | 3.93 | 3.66 | 5.06 | 1.6
Dataset | Euclidean | ITML | LMNN | DML-eig | GMML | DMLMJ | DMBK | IML | DMLdbIm |
---|---|---|---|---|---|---|---|---|---|
Heart | 72.07 (9) | 73.53 (8) | 74.31 (7) | 77.14 (3) | 75.30 (5) | 75.80 (4) | 78.23 (2) | 74.99 (6) | 79.06 (1) |
Wdbc | 92.61 (7) | 92.27 (9) | 92.64 (6) | 94.19 (5) | 92.58 (8) | 95.37 (2) | 95.31 (3) | 94.99 (4) | 97.67 (1) |
Pima | 58.47 (7) | 58.48 (6) | 59.57 (4) | 58.94 (5) | 58.38 (8) | 57.04 (9) | 60.42 (3) | 60.93 (2) | 63.27 (1) |
Glass0 | 70.8 (8) | 70.77 (9) | 73.02 (5) | 76.12 (1) | 71.55 (7) | 74.73 (3) | 73.01 (6) | 73.25 (4) | 75.67 (2) |
Ecoli1 | 73.68 (6) | 74.33 (5) | 72.28 (8) | 74.62 (4) | 73.11 (7) | 80.38 (3) | 81.71 (2) | 71.31 (9) | 82.76 (1) |
Ecoli2 | 85.60 (4) | 81.68 (6) | 80.41 (7) | 84.95 (5) | 78.28 (9) | 86.27 (3) | 86.68 (2) | 80.24 (8) | 89.88 (1)
Newthyroid1 | 90.51 (9) | 97.15 (1) | 95.95 (4) | 94.56 (5) | 96.37 (3) | 93.80 (7) | 92.73 (8) | 97.14 (2) | 94.42 (6) |
Glass6 | 74.88 (8) | 75.92 (7) | 76.99 (6) | 81.57 (2) | 74.04 (9) | 79.51 (4) | 79.87 (3) | 77.25 (5) | 83.33 (1) |
Ecoli3 | 53.12 (7) | 49.32 (9) | (5) | 51.19 (8) | 53.82 (6) | 59.27 (2) | 58.77 (3) | 58.08 (4) | 60.97 (1) |
Yeast-2_vs_4 | 75.56 (8) | 77.85 (5) | 78.99 (2) | 74.76 (9) | 78.12 (4) | 76.35 (7) | 78.38 (3) | 77.32 (6) | 80.56 (1) |
Yeast-1_vs_7 | 22.22 (8) | 31.79 (5) | 28.49 (6) | 36.61 (4) | 21.76 (9) | 39.21 (2) | 39.04 (3) | 26.87 (7) | 46.38 (1) |
winequality-red-8_vs_6 | NA | 5.88 (4) | NA | 22.22 (2) | 8.00 (3) | NA | NA | NA | 39.39 (1) |
winequality-red-8_vs_6-7 | NA | NA | NA | NA | NA | NA | NA | NA | 28.58 (1) |
winequality-white-3-9_vs_5 | NA | NA | NA | 22.00 (2) | 15.78 (3) | NA | NA | 3.21 (4) | 28.57 (1) |
winequality-red-3_vs_5 | NA | NA | NA | NA | NA | NA | NA | NA | 36.71 (1) |
Mean | 51.30 | 52.59 | 52.52 | 56.59 | 53.13 | 54.51 | 54.94 | 53.03 | 65.80 |
Average Rank | 6.33 | 5.53 | 4.93 | 3.93 | 5.66 | 4 | 3.46 | 4.66 | 1.4 |
Dataset | Euclidean | ITML | LMNN | DML-eig | GMML | DMLMJ | DMBK | IML | DMLdbIm |
---|---|---|---|---|---|---|---|---|---|
Heart | 75.44 (8) | 75.72 (7) | 79.1 (3) | 79.94 (1) | 77.60 (6) | 73.48 (9) | 78.14 (5) | 79.63 (2) | 78.62 (4) |
Wdbc | 95.16 (4) | 92.12 (9) | 94.32 (8) | 94.98 (5) | 95.32 (3) | 94.43 (7) | 94.58 (6) | 95.67 (2) | 96.58 (1) |
Pima | 57.52 (8) | 61.05 (3) | 57.32 (9) | 60.48 (5) | 57.65 (6) | 57.53 (7) | 60.75 (4) | 61.78 (1) | 61.08 (2) |
Glass0 | 71.28 (7) | 66.24 (9) | 71.75 (4) | 75.85 (1) | 70.45 (8) | 71.54 (5) | 73.83 (3) | 71.46 (6) | 74.10 (2) |
Ecoli1 | 69.44 (7) | 69.69 (6) | 65 (8) | 83.12 (1) | 70.52 (5) | 82.27 (2) | 78.72 (4) | 63.62 (9) | 79.99 (3) |
Ecoli2 | 85.96 (7) | 83.01 (9) | 91.34 (1) | 86.30 (6) | 85.21 (8) | 88.45 (3) | 86.36 (5) | 87.02 (4) | 88.74 (2) |
Newthyroid1 | 84.73 (9) | 89.28 (6) | 97.24 (2) | 88.03 (8) | 92.16 (4) | 88.96 (7) | 90.92 (5) | 98.92 (1) | 92.27 (3) |
Glass6 | 74.1 (9) | 82.9 (5) | 86.37 (3) | 85.28 (4) | 89.39 (2) | 79.32 (7) | 78.06 (8) | 91.2 (1) | 79.87 (6) |
Ecoli3 | 55.69 (9) | 57.82 (8) | 60.78 (7) | 63.25 (3) | 62.56 (4) | 60.94 (6) | 62.17 (5) | 64.65 (2) | 65.43 (1) |
Yeast-2_vs_4 | 75 (8) | 69.33 (9) | 76.34 (5) | 78.01 (2) | 75.9 (6) | 75.45 (7) | 77.25 (4) | 78.24 (1) | 77.38 (3) |
Yeast-1_vs_7 | NA | 14.42 (6) | 3.83 (8) | 29.39 (3) | 23.38 (5) | 30.12 (2) | 27.14 (4) | 10.77 (7) | 35.39 (1) |
winequality-red-8_vs_6 | NA | NA | NA | NA | NA | NA | NA | NA | 28.57 (1) |
winequality-red-8_vs_6-7 | NA | NA | NA | NA | NA | NA | NA | NA | 25 (1) |
winequality-white-3-9_vs_5 | NA | NA | NA | 25 (2) | 7.4 (4) | NA | NA | 10.29 (3) | 27.14 (1) |
winequality-red-3_vs_5 | NA | 1.19 (3) | NA | NA | NA | NA | NA | 9.36 (2) | 31.08 (1) |
Mean | 49.62 | 50.85 | 52.22 | 56.64 | 53.83 | 53.49 | 53.86 | 54.84 | 62.74 |
Average Rank | 6.53 | 5.93 | 4.73 | 3.26 | 4.6 | 5 | 4.4 | 3 | 2.13 |
Evaluation Criteria | Euclidean | ITML | LMNN | DML-eig | GMML | DMLMJ | DMBK | IML | DMLdbIm |
---|---|---|---|---|---|---|---|---|---|
Average Accuracy | 91.40 | 91.28 | 91.31 | 92.04 | 90.98 | 92.37 | 92.31 | 91.51 | 92.83 |
Average Precision | 57.57 | 58.87 | 55.67 | 61.56 | 58.57 | 60.44 | 59.23 | 55.53 | 73.25 |
Average Recall | 48.7 | 50.18 | 51.36 | 54.33 | 50.96 | 52.19 | 53.45 | 55.8 | 62.3 |
Average F1 Measure | 51.30 | 52.59 | 52.52 | 56.59 | 53.13 | 54.51 | 54.94 | 53.03 | 65.80 |
Accuracy Average Rank | 5.53 | 6.13 | 6.26 | 4.6 | 6.46 | 3.33 | 3.4 | 5.73 | 3.13 |
Precision Average Rank | 4.4 | 4.8 | 5.8 | 4.53 | 5.26 | 3.01 | 3.53 | 5.53 | 3 |
Recall Average Rank | 6.26 | 5.66 | 4.66 | 4.4 | 5.4 | 4.53 | 3.6 | 3.46 | 2.13 |
F1 Average Rank | 6.33 | 5.53 | 4.93 | 3.93 | 5.66 | 4 | 3.46 | 4.66 | 1.4 |
# | Dataset | No. of Samples | No. of Features | Imbalance Ratio | DFSVM | DMLdbIm
---|---|---|---|---|---|---|
1 | Glass1 | 214 | 9 | 1.82 | 62.9 | 64.52 |
2 | Glass6 | 214 | 9 | 6.38 | 83.1 | 90.91 |
3 | Yeast1vs7 | 459 | 8 | 14.3 | 47.6 | 47.79 |
4 | Yeast3 | 1484 | 8 | 8.1 | 78.1 | 67.74 |
5 | Yeast6 | 1484 | 8 | 41.4 | 50.1 | 83.33 |
6 | Ecoli0147vs2356 | 336 | 7 | 10.56 | 71.6 | 86.15 |
7 | Ecoli01vs235 | 244 | 7 | 9.17 | 80.9 | 90.91 |
8 | Ecoli0267vs35 | 224 | 7 | 9.18 | 76.4 | 80.23 |
9 | Vehicle3 | 846 | 18 | 2.99 | 73.7 | 81.53
10 | Pageblocks0 | 5472 | 10 | 8.79 | 80.5 | 79.21 |
 | Average | | | | 70.49 | 77.23