Augmented Chains to Ensemble of Classifier Chains
Abstract
Multi-label classification (MLC) problems, in which each instance is associated with multiple labels, arise in many everyday applications. Among the several approaches to solving MLC problems, the ensemble of classifier chains (ECC) serves as the basis of this article. ECC uses one binary classifier per label and links these classifiers into a chain in a specific sequence. The method is, however, sensitive to the order of the chain and to the number of labels, and many studies try to find the best chain order or to reduce the number of labels in order to improve results. This article evaluates whether inserting meta-labels, created from combinations of the original labels, can improve ECC prediction results. The approach creates combinations of labels through similarity correlation, selects the most relevant combinations based on these correlations, incorporates them into the dataset, and then evaluates the resulting model and its predictions. Experiments on 19 well-known multi-label datasets, evaluated with 12 different measures, show that the proposed approach improves Micro-Precision, Precision, Hamming-Loss, and Subset-Accuracy.
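The meta-label creation step described above can be illustrated with a minimal sketch. It is not the article's implementation: it assumes Jaccard similarity as the similarity measure, assumes a meta-label is the logical AND of a pair of original labels, and uses hypothetical names (`jaccard`, `add_meta_labels`, `top_k`) chosen only for this example.

```python
from itertools import combinations


def jaccard(col_a, col_b):
    """Jaccard similarity between two binary label columns."""
    both = sum(1 for a, b in zip(col_a, col_b) if a and b)
    either = sum(1 for a, b in zip(col_a, col_b) if a or b)
    return both / either if either else 0.0


def add_meta_labels(Y, top_k=1):
    """Append top_k meta-labels to the label matrix Y (list of rows).

    Assumption: a meta-label is the AND of the two labels in a pair,
    and pairs are ranked by Jaccard similarity (highest first).
    """
    n_labels = len(Y[0])
    columns = [[row[j] for row in Y] for j in range(n_labels)]
    # Score every unordered pair of original labels.
    scored = sorted(
        ((jaccard(columns[i], columns[j]), i, j)
         for i, j in combinations(range(n_labels), 2)),
        key=lambda t: (-t[0], t[1], t[2]),  # ties broken by label index
    )
    chosen = [(i, j) for _, i, j in scored[:top_k]]
    # Each row gains one extra column per selected pair.
    augmented = [row + [int(row[i] and row[j]) for i, j in chosen] for row in Y]
    return augmented, chosen


# Toy label matrix: 4 instances, 3 labels.
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
Y_aug, pairs = add_meta_labels(Y, top_k=1)
```

In this toy example, labels 0 and 1 form the most similar pair, so a fourth column marking their co-occurrence is appended. The augmented matrix could then be used as the training target of any classifier chain ensemble, with the meta-label columns discarded before the predictions are scored.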