NebFuzz: A New Semi-Supervised Clustering Algorithm Based on Fuzzy C-Means

  • Valmir Macário UFPE
  • Francisco de A. T. de Carvalho UFPE

Abstract


Semi-supervised clustering combines unlabeled data with labeled data to improve algorithm performance. This paper presents a new semi-supervised clustering algorithm based on the Fuzzy C-Means algorithm. The new algorithm was evaluated and compared against two other semi-supervised clustering algorithms in the context of learning from partially labeled data. The behavior of the proposed algorithm is discussed, and the results are validated using the accuracy rate, the corrected Rand index, and a 95% confidence interval. The results indicate that the new algorithm achieves better accuracy when only a few labeled data are available.
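
As an illustration of one of the validation measures named in the abstract, the corrected Rand index can be sketched as below. This is a minimal sketch of the standard Hubert and Arabie formulation, not code from the paper. The function name `corrected_rand_index` is hypothetical, and the sketch assumes the fuzzy partition has already been converted to hard labels (for example, by assigning each object to its highest-membership cluster).

```python
from collections import Counter
from math import comb


def corrected_rand_index(labels_true, labels_pred):
    """Corrected (adjusted) Rand index between two hard partitions.

    Assumption: both inputs are equal-length sequences of hard labels;
    a fuzzy partition must be hardened before calling this function.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same objects")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two objects: no pairs to compare, treat as agreement.
        return 1.0

    # Contingency counts: objects shared by each (class, cluster) pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    class_counts = Counter(labels_true)
    cluster_counts = Counter(labels_pred)

    # Number of object pairs placed together in cells, classes, and clusters.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_classes = sum(comb(c, 2) for c in class_counts.values())
    sum_clusters = sum(comb(c, 2) for c in cluster_counts.values())

    expected = sum_classes * sum_clusters / total_pairs
    maximum = (sum_classes + sum_clusters) / 2
    if maximum == expected:
        # Degenerate case (e.g. both partitions are trivial).
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    # Same grouping with swapped cluster names: perfect agreement.
    print(corrected_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The index is invariant to how the clusters are named, equals 1 for identical partitions, and is close to 0 (possibly negative) when the agreement is no better than chance.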

Published
2011-07-19
MACÁRIO, Valmir; CARVALHO, Francisco de A. T. de. NebFuzz: A New Semi-Supervised Clustering Algorithm Based on Fuzzy C-Means. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 8., 2011, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 240-250. ISSN 2763-9061.