Analysis of the Threshold Variation of the FlexCon-C Algorithm for Semi-supervised Learning

  • Arthur C. Gorgônio UFRN
  • Cainan T. Alves UFRN
  • Amarildo J. F. Lucena UFRN
  • Flavius L. Gorgônio UFRN
  • Karliane M. O. Vale UFRN
  • Anne M. P. Canuto anne@dimap.ufrn.br

Abstract


Semi-supervised learning algorithms train classifiers from a small portion of initially labeled objects. The reliability of the classification process depends on several factors, including the type of classifier used and the parameters that configure it. One of the most important factors is the confidence threshold that determines which instances are included in the labeled set at each iteration, so that only instances labeled with high confidence are added. This article analyzes different values for the variation factor of the FlexCon-C algorithm and measures the impact of this change on its accuracy. The experiments cover thirty different datasets, four classifiers and five different percentages of initially labeled data.
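The threshold-based selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the FlexCon-C update rule, which the abstract does not specify: the function names `select_confident` and `step_threshold`, the parameter `change_rate`, and the rule of lowering the threshold when no instance qualifies are all assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def select_confident(
    probs: Sequence[Sequence[float]], threshold: float
) -> List[Tuple[int, int]]:
    """Return (instance index, predicted label) pairs whose highest
    class probability is at least `threshold`."""
    selected = []
    for i, row in enumerate(probs):
        # Predicted label is the class with the highest probability.
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            selected.append((i, label))
    return selected


def step_threshold(threshold: float, change_rate: float, n_selected: int) -> float:
    """Hypothetical adjustment rule (an assumption, not FlexCon-C's):
    lower the threshold by `change_rate` when no instance was selected,
    otherwise keep it unchanged. The result never drops below 0."""
    if n_selected == 0:
        return max(0.0, threshold - change_rate)
    return threshold


# Class probabilities predicted for three unlabeled instances (made-up values).
probs = [[0.97, 0.03], [0.55, 0.45], [0.08, 0.92]]

threshold = 0.95
picked = select_confident(probs, threshold)  # only instance 0 qualifies

# Instances 1 and 2 stay unlabeled; at the same threshold neither qualifies,
# so the hypothetical rule relaxes the threshold for the next iteration.
remaining = [probs[1], probs[2]]
picked_next = select_confident(remaining, threshold)
threshold = step_threshold(threshold, change_rate=0.05, n_selected=len(picked_next))
picked_relaxed = select_confident(remaining, threshold)
```

In this sketch the size of `change_rate` controls how quickly the threshold relaxes, which is the kind of parameter whose effect on accuracy the article evaluates.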

Published
22/10/2018
GORGÔNIO, Arthur C.; ALVES, Cainan T.; LUCENA, Amarildo J. F.; GORGÔNIO, Flavius L.; VALE, Karliane M. O.; CANUTO, Anne M. P. Analysis of the Threshold Variation of the FlexCon-C Algorithm for Semi-supervised Learning. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 15., 2018, São Paulo. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 775-786. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2018.4466.