Abstract
Semi-supervised learning is characterized by a small number of labeled instances and a large number of unlabeled instances. FlexCon-C (Flexible Confidence Classifier) is a well-known semi-supervised method that uses the self-training algorithm as the basis for generating prediction models. The main difference between self-training and FlexCon-C is that the former uses a fixed threshold to select unlabeled instances, while the latter uses a dynamically adjusted confidence threshold. FlexCon-C applies a confidence adjustment equation based on classifier performance; in this sense, classifier performance drives the selection and labeling of unlabeled instances. In Machine Learning, it is well known that classifier performance can be further improved through the use of classifier ensembles. Therefore, this study proposes the use of classifier ensembles in the FlexCon-C confidence adjustment equation, aiming to provide a more efficient measure for selecting and labeling unlabeled instances. To assess the viability of the proposed method (FlexCon-CE), an empirical analysis was conducted using 20 datasets, three different classification algorithms, and five different configurations of initially unlabeled data. The results indicate that the proposed method outperformed the traditional method, making it promising for automatic data selection and labeling in the semi-supervised context.
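The idea of an ensemble-driven, dynamically adjusted confidence threshold can be illustrated with a minimal Python sketch. This is not the confidence adjustment equation from the paper, which the abstract does not state. The update rule, the function names (`adjust_confidence`, `ensemble_predict`, `select_instances`), and the parameters `target`, `tolerance`, and `step` are all assumptions made for illustration, and soft voting is one possible way to combine the ensemble members.

```python
from statistics import mean


def adjust_confidence(conf, ensemble_accuracy, target=0.9, tolerance=0.01, step=0.05):
    """Hypothetical update rule (an assumption, not the paper's equation).

    Lower the threshold when the ensemble performs above the target,
    raise it when the ensemble performs below the target, and leave it
    unchanged inside the tolerance band. The result is clamped to [0, 1].
    """
    if ensemble_accuracy >= target + tolerance:
        conf -= step
    elif ensemble_accuracy <= target - tolerance:
        conf += step
    return min(1.0, max(0.0, conf))


def ensemble_predict(member_probs):
    """Combine the members by soft voting (averaged class probabilities).

    member_probs holds one probability vector per ensemble member for a
    single instance. Returns (predicted_label, ensemble_confidence).
    """
    n_classes = len(member_probs[0])
    avg = [mean(p[c] for p in member_probs) for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg[label]


def select_instances(unlabeled_probs, conf):
    """Return (index, label) pairs whose ensemble confidence reaches conf."""
    selected = []
    for i, member_probs in enumerate(unlabeled_probs):
        label, confidence = ensemble_predict(member_probs)
        if confidence >= conf:
            selected.append((i, label))
    return selected


if __name__ == "__main__":
    # Three unlabeled instances, each scored by a three-member ensemble
    # on a two-class problem (made-up probabilities for illustration).
    probs = [
        [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]],
        [[0.6, 0.4], [0.4, 0.6], [0.5, 0.5]],
        [[0.1, 0.9], [0.0, 1.0], [0.05, 0.95]],
    ]
    conf = 0.95
    # Suppose the ensemble reached 97% accuracy on the labeled set.
    conf = adjust_confidence(conf, ensemble_accuracy=0.97)
    print(conf, select_instances(probs, conf))
```

In a full self-training loop, the selected instances would be moved to the labeled set with their predicted labels, the ensemble would be retrained, and the threshold would be adjusted again on each iteration. A fixed-threshold self-training baseline corresponds to skipping the call to `adjust_confidence`.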
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Medeiros, A., Gorgônio, A.C., Vale, K.M.O., Gorgônio, F.L., Canuto, A.M.d.P. (2023). FlexCon-CE: A Semi-supervised Method with an Ensemble-Based Adaptive Confidence. In: Naldi, M.C., Bianchi, R.A.C. (eds) Intelligent Systems. BRACIS 2023. Lecture Notes in Computer Science, vol 14197. Springer, Cham. https://doi.org/10.1007/978-3-031-45392-2_7
DOI: https://doi.org/10.1007/978-3-031-45392-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45391-5
Online ISBN: 978-3-031-45392-2
eBook Packages: Computer Science, Computer Science (R0)