A Novel Semi-Supervised Fuzzy C-Means Clustering Algorithm Using Multiple Fuzzification Coefficients
Abstract
1. Introduction
2. Preliminaries
2.1. Standard Fuzzy C-Means Clustering (FCM) Algorithm
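- $u_{ik}$ is the membership grade of element $X_i$ in the cluster with center $V_k$, where $1 \le i \le N$ and $1 \le k \le C$;
- $0 \le u_{ik} \le 1$ and $\sum_{k=1}^{C} u_{ik} = 1$, for each $i$;
- The larger $u_{ik}$ is, the more element $X_i$ belongs in cluster $k$.
- Step 1: Initialize a value for $V$, let $l = 0$, and set $\varepsilon > 0$ and $m > 1$.
- Step 2: At the $l$-th loop, update $U$ according to the formula:
  $u_{ik} = \left[ \sum_{j=1}^{C} \left( \frac{\|X_i - V_k\|}{\|X_i - V_j\|} \right)^{2/(m-1)} \right]^{-1}$
- Step 3: Update $V$ for the next step $(l+1)$ according to the formula:
  $V_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} X_i}{\sum_{i=1}^{N} u_{ik}^{m}}$
- Step 4: If $\|V^{(l)} - V^{(l+1)}\| < \varepsilon$, then go to Step 5; otherwise, let $l = l + 1$ and return to Step 2.
- Step 5: End.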
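The five steps above can be sketched in Python with NumPy. This is a minimal illustration rather than the authors' implementation: the function name `fcm`, the arguments `V0`, `eps` and `max_iter`, and the toy data are assumptions introduced here, and the distance is taken to be Euclidean.

```python
import numpy as np

def fcm(X, V0, m=2.0, eps=1e-6, max_iter=200):
    """Standard FCM: alternate membership and center updates until the centers settle."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V0, dtype=float).copy()          # Step 1: initial centers
    for _ in range(max_iter):
        # Squared distances d2[i, k] = ||X_i - V_k||^2, floored to avoid division by zero.
        d2 = np.maximum(((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2), 1e-12)
        # Step 2: u_ik is proportional to d_ik^(-2/(m-1)), normalised over the clusters.
        w = d2 ** (-1.0 / (m - 1.0))
        U = w / w.sum(axis=1, keepdims=True)
        # Step 3: each center is the u^m-weighted mean of the data.
        Um = U ** m
        V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Step 4: stop once the centers move by less than eps.
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < eps:
            break
    return U, V

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]])
U, V = fcm(X, V0=[[0.0, 0.0], [9.0, 9.0]])
print(U.round(3))
print(V.round(3))
```

The stopping test compares consecutive center matrices, as in Step 4; a cap on the number of iterations is added only as a safeguard.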
2.2. Semi-Supervised Standard Fuzzy C-Means Clustering (sSFCM) Algorithms
- Step 1: Initialize a value for $V$, let $l = 0$, and set $\varepsilon > 0$ and $m > 1$.
- Step 2: At the $l$-th loop, update $U$ according to the formula:
- Step 3: Update $V$ for the next step $(l+1)$ according to the formula:
- Step 4: If $\|V^{(l)} - V^{(l+1)}\| < \varepsilon$, then go to Step 5; otherwise, let $l = l + 1$ and return to Step 2.
- Step 5: End.
- Case 1: unsupervised, , for all (standard FCM algorithm).
- Case 2: semi-supervised, attempting to place data points 9 and 10 into cluster 1, , otherwise, .
- Case 3: semi-supervised, attempting to place data points 9 and 10 into cluster 1, , otherwise, .
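The sSFCM loop and the three cases can be illustrated with a short Python sketch. It assumes the commonly used sSFCM objective $\sum_i \sum_k |u_{ik} - \bar{u}_{ik}|^m \|X_i - V_k\|^2$, where $\bar{u}_{ik}$ is a supervised membership grade supplied in advance; under that assumption the membership update is $u_{ik} = \bar{u}_{ik} + (1 - \sum_j \bar{u}_{ij})$ times the usual FCM share, and the centers are weighted by $|u_{ik} - \bar{u}_{ik}|^m$. The function name `ssfcm`, the initial centers, and the supervision value 0.3 are hypothetical choices made for this example and are not the settings of the cases above; the data are the 20 points of Table 1.

```python
import numpy as np

# Data set from Table 1 (row i-1 holds data point i).
left = [(x, y) for x in (0, 1.75, 3.5, 5.25, 7) for y in (4.5, 5.5)]
right = [(x, y) for x in (9, 10) for y in (0, 2.5, 5, 7.5, 10)]
X = np.array(left + right, dtype=float)

def ssfcm(X, U_bar, V0, m=2.0, eps=1e-6, max_iter=500):
    """Semi-supervised FCM under the assumed objective sum |u - u_bar|^m * d^2."""
    V = np.asarray(V0, dtype=float).copy()
    # Membership mass that is still free to distribute for each element.
    slack = 1.0 - U_bar.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        d2 = np.maximum(((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2), 1e-12)
        w = d2 ** (-1.0 / (m - 1.0))
        # Supervised part plus an FCM-style share of the remaining mass.
        U = U_bar + slack * w / w.sum(axis=1, keepdims=True)
        # Centers are weighted by |u - u_bar|^m instead of u^m.
        E = np.abs(U - U_bar) ** m
        V_new = (E.T @ X) / E.sum(axis=0)[:, None]
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < eps:
            break
    return U, V

V0 = [[0.0, 5.0], [10.0, 5.0]]            # hypothetical initial centers
U_bar = np.zeros((20, 2))
U_free, _ = ssfcm(X, U_bar, V0)           # no supervision: reduces to standard FCM
U_bar[[8, 9], 0] = 0.3                    # hypothetical supervision for points 9 and 10
U_sup, _ = ssfcm(X, U_bar, V0)
print(U_free[8].round(2), U_sup[8].round(2))
```

With all supervised grades set to zero the update coincides with standard FCM, which is why the unsupervised case can be run through the same function.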
3. Semi-Supervised Fuzzy C-Means Clustering Algorithm with Multiple Fuzzification Coefficients (sSMC-FCM)
3.1. Derivation of the Proposed sSMC-FCM Algorithm
- for all , for unsupervised elements i.
- , and for all , for supervised elements i to belong to cluster k.
- Step 1: Initialize a value for $V$, let $l = 0$, and set $\varepsilon > 0$.
- Step 2: At the $l$-th loop, update $U$ according to Equation (15) for unsupervised elements, or according to Equations (17)–(20) for supervised elements.
- Step 3: Update $V$ for the next step $(l+1)$ according to Equation (10), with the required term calculated using Equation (8).
- Step 4: If $\|V^{(l)} - V^{(l+1)}\| < \varepsilon$, then go to Step 5; otherwise, let $l = l + 1$ and return to Step 2.
- Step 5: End.
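To give a feel for how a per-cluster fuzzification coefficient changes a membership update, the sketch below solves the update for a single element numerically. It is not a reproduction of Equations (15) or (17)–(20): it assumes an objective of the form $\sum_k u_{ik}^{m_{ik}} d_{ik}^2$ with $\sum_k u_{ik} = 1$ and finds the Lagrange multiplier by bisection. The function name `mc_memberships`, the distances, and the coefficient values are hypothetical.

```python
import numpy as np

def mc_memberships(d2, m, tol=1e-12):
    """Memberships of one element minimising sum_k u_k**m_k * d2_k with sum_k u_k = 1."""
    d2 = np.maximum(np.asarray(d2, dtype=float), 1e-12)
    m = np.asarray(m, dtype=float)

    def u_of(lam):
        # Stationarity condition: m_k * u_k**(m_k - 1) * d2_k = lam for every cluster k.
        return (lam / (m * d2)) ** (1.0 / (m - 1.0))

    # The membership total grows with lam: it is 0 at lo and at least 1 at hi.
    lo, hi = 0.0, float(np.max(m * d2))
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if u_of(mid).sum() < 1.0:
            lo = mid
        else:
            hi = mid
    u = u_of(hi)
    return u / u.sum()                     # remove the residual bisection error

d2 = [4.0, 1.0]                            # squared distances: element is closer to cluster 2
u_fcm = mc_memberships(d2, [2.0, 2.0])     # equal coefficients: the standard FCM update
u_sup = mc_memberships(d2, [4.0, 2.0])     # hypothetical larger coefficient for cluster 1
print(u_fcm.round(3), u_sup.round(3))
```

For this particular input, raising the coefficient attached to cluster 1 raises the element's membership in cluster 1 while the distances stay fixed; equal coefficients recover the closed-form FCM memberships.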
3.2. Determination of the Fuzzification Coefficients for Supervised Elements
4. Numerical Examples
- Case 1: unsupervised, (standard FCM algorithm).
- Case 2: semi-supervised, attempting to place data points 9 and 10 into cluster 1, , .
- Case 3: semi-supervised, attempting to place data points 9 and 10 into cluster 1, , .
- As increases, the membership grades of the supervised elements increase. Initially, with no supervision, data points 9 and 10 were placed into cluster 2. With supervision and , their membership grades increased but not enough to move into cluster 1, while with , these two data points were successfully placed into cluster 1;
- In general, the sSMC-FCM algorithm converges after a similar number of iterations to the standard FCM algorithm. For instance, with , it required about 10 iterations;
- When is changed, the cluster centers change to suit the supervised elements. For instance, case 1 has cluster centers , , case 2 has cluster centers , , and case 3 has cluster centers , . We can observe that, from case 1 to 2, the cluster center moves further away from data points 9 and 10;
- Compared to the sSFCM algorithm, we can see that the matrices in the case of , using the sSMC-FCM algorithm shown in Table 3 are similar to the corresponding matrices in the case of using the sSFCM algorithm shown in Table 2. Both cases were able to increase the membership grade of the points in cluster 1 but were not successful in moving the points into cluster 1.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
i | (Xi1, Xi2) | i | (Xi1, Xi2) | i | (Xi1, Xi2) | i | (Xi1, Xi2) |
---|---|---|---|---|---|---|---|
1 | (0, 4.5) | 6 | (3.5, 5.5) | 11 | (9, 0) | 16 | (10, 0) |
2 | (0, 5.5) | 7 | (5.25, 4.5) | 12 | (9, 2.5) | 17 | (10, 2.5) |
3 | (1.75, 4.5) | 8 | (5.25, 5.5) | 13 | (9, 5) | 18 | (10, 5) |
4 | (1.75, 5.5) | 9 | (7, 4.5) | 14 | (9, 7.5) | 19 | (10, 7.5) |
5 | (3.5, 4.5) | 10 | (7, 5.5) | 15 | (9, 10) | 20 | (10, 10) |
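Table 2. Membership grades (cluster 1, cluster 2) of each data point i using the sSFCM algorithm.

i | Case 1 (unsupervised) | i | Case 2 | i | Case 3 |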
---|---|---|---|---|---|
1 | (0.91, 0.09) | 1 | (0.92, 0.08) | 1 | (0.92, 0.08) |
2 | (0.91, 0.09) | 2 | (0.92, 0.08) | 2 | (0.92, 0.08) |
3 | (0.98, 0.02) | 3 | (0.98, 0.02) | 3 | (0.98, 0.02) |
4 | (0.98, 0.02) | 4 | (0.98, 0.02) | 4 | (0.98, 0.02) |
5 | (0.97, 0.03) | 5 | (0.97, 0.03) | 5 | (0.97, 0.03) |
6 | (0.97, 0.03) | 6 | (0.97, 0.03) | 6 | (0.97, 0.03) |
7 | (0.68, 0.32) | 7 | (0.70, 0.30) | 7 | (0.72, 0.28) |
8 | (0.68, 0.32) | 8 | (0.70, 0.30) | 8 | (0.72, 0.28) |
9 | (0.19, 0.81) | 9 | (0.45, 0.55) | 9 | (0.69, 0.31) |
10 | (0.19, 0.81) | 10 | (0.45, 0.55) | 10 | (0.69, 0.31) |
11 | (0.28, 0.72) | 11 | (0.28, 0.72) | 11 | (0.28, 0.72) |
12 | (0.12, 0.88) | 12 | (0.12, 0.88) | 12 | (0.12, 0.88) |
13 | (0.00, 1.00) | 13 | (0.00, 1.00) | 13 | (0.00, 1.00) |
14 | (0.12, 0.88) | 14 | (0.12, 0.88) | 14 | (0.12, 0.88) |
15 | (0.28, 0.72) | 15 | (0.28, 0.72) | 15 | (0.28, 0.72) |
16 | (0.25, 0.75) | 16 | (0.25, 0.75) | 16 | (0.25, 0.75) |
17 | (0.11, 0.89) | 17 | (0.10, 0.90) | 17 | (0.10, 0.90) |
18 | (0.02, 0.98) | 18 | (0.01, 0.99) | 18 | (0.01, 0.99) |
19 | (0.11, 0.89) | 19 | (0.10, 0.90) | 19 | (0.10, 0.90) |
20 | (0.25, 0.75) | 20 | (0.25, 0.75) | 20 | (0.25, 0.75) |
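Table 3. Membership grades (cluster 1, cluster 2) of each data point i using the sSMC-FCM algorithm.

i | Case 1 (unsupervised) | i | Case 2 | i | Case 3 |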
---|---|---|---|---|---|
1 | (0.91, 0.09) | 1 | (0.92, 0.08) | 1 | (0.92, 0.08) |
2 | (0.91, 0.09) | 2 | (0.92, 0.08) | 2 | (0.92, 0.08) |
3 | (0.98, 0.02) | 3 | (0.98, 0.02) | 3 | (0.98, 0.02) |
4 | (0.98, 0.02) | 4 | (0.98, 0.02) | 4 | (0.98, 0.02) |
5 | (0.97, 0.03) | 5 | (0.98, 0.02) | 5 | (0.98, 0.02) |
6 | (0.97, 0.03) | 6 | (0.98, 0.02) | 6 | (0.98, 0.02) |
7 | (0.68, 0.32) | 7 | (0.71, 0.29) | 7 | (0.71, 0.29) |
8 | (0.68, 0.32) | 8 | (0.71, 0.29) | 8 | (0.71, 0.29) |
9 | (0.19, 0.81) | 9 | (0.43, 0.57) | 9 | (0.60, 0.40) |
10 | (0.19, 0.81) | 10 | (0.43, 0.57) | 10 | (0.60, 0.40) |
11 | (0.28, 0.72) | 11 | (0.28, 0.72) | 11 | (0.28, 0.72) |
12 | (0.12, 0.88) | 12 | (0.12, 0.88) | 12 | (0.12, 0.88) |
13 | (0.00, 1.00) | 13 | (0.00, 1.00) | 13 | (0.00, 1.00) |
14 | (0.12, 0.88) | 14 | (0.12, 0.88) | 14 | (0.12, 0.88) |
15 | (0.28, 0.72) | 15 | (0.28, 0.72) | 15 | (0.28, 0.72) |
16 | (0.25, 0.75) | 16 | (0.25, 0.75) | 16 | (0.25, 0.75) |
17 | (0.11, 0.89) | 17 | (0.11, 0.89) | 17 | (0.10, 0.90) |
18 | (0.02, 0.98) | 18 | (0.01, 0.99) | 18 | (0.01, 0.99) |
19 | (0.11, 0.89) | 19 | (0.11, 0.89) | 19 | (0.10, 0.90) |
20 | (0.25, 0.75) | 20 | (0.25, 0.75) | 20 | (0.25, 0.75) |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Khang, T.D.; Tran, M.-K.; Fowler, M. A Novel Semi-Supervised Fuzzy C-Means Clustering Algorithm Using Multiple Fuzzification Coefficients. Algorithms 2021, 14, 258. https://doi.org/10.3390/a14090258