Associative Classification Using Rule Selection and Rule Construction: A Comparative Study
Abstract
Associative classification is a hybrid approach that has proven quite competitive with other symbolic classifiers. In this approach, association rules whose consequent is the class attribute are used as the classifier. One limitation is the large number of rules generated, many of which are redundant. To overcome this limitation, two rule selection methods based on ROC analysis have been proposed: ROCCER, which searches ROC space; and GARSS, which applies a genetic algorithm to select a subset of rules that maximizes the AUC measure. In this work, we present MORLEA, which uses a multi-objective genetic algorithm to improve the rules instead of selecting them. Experimental results show that MORLEA can induce a classifier with fewer rules than the classifier made up of all class association rules, while achieving accuracy similar to that of the C4.5 algorithm.
References
Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the International Conference on Management of Data, SIGMOD, pages 207–216.
Batista, G. E. A. P. A., Milaré, C. R., Prati, R. C., and Monard, M. C. (2006). A comparison of methods for rule subset selection applied to associative classification. Inteligencia Artificial, (32):29–35.
Blake, C. L. and Merz, C. J. (1998). UCI repository of machine learning databases. [link].
Borgelt, C. and Kruse, R. (2002). Induction of association rules: A priori implementation. In 15th Conf. on Computational Statistics, pages 395–400. Physica-Verlag.
Clark, P. and Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Proc. 5th European Conf. on Machine Learning, volume 482 of LNAI, pages 151–163. Springer-Verlag.
Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7:1–30.
Domingos, P. (1996). Unifying instance-based and rule-based induction. Machine Learning, 24(2):141–168.
Freitas, A. (2002). Data Mining and Knowledge Discovery with Evolutionary Algorithms. Springer-Verlag.
Fürnkranz, J. and Flach, P. (2005). ROC’n’rule learning – toward a better understanding of rule covering algorithms. Machine Learning, 58(1):39–77.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley.
Jovanoski, V. and Lavrač, N. (2001). Classification rule learning with Apriori-C. In Proc. 10th Portuguese Conf. on Artificial Intelligence, volume 2258 of LNAI, pages 44–52, Porto, Portugal. Springer-Verlag.
Kavšek, B. and Lavrač, N. (2006). Apriori-SD: Adapting association rule learning to subgroup discovery. Applied Artificial Intelligence, 20(7):543–583.
Lavrač, N., Flach, P. A., and Zupan, B. (1999). Rule evaluation measures: A unifying view. In International Workshop on Inductive Logic Programming, pages 174–185.
Lavrač, N., Kavšek, B., Flach, P. A., and Todorovski, L. (2004). Subgroup discovery with CN2-SD. Journal of Machine Learning Research, 5:153–188.
Li, W., Han, J., and Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In Proceedings of the 2001 IEEE International Conference on Data Mining, pages 369–376. IEEE Computer Society.
Liu, B., Hsu, W., and Ma, Y. (1998). Integrating classification and association rule mining. In Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, pages 80–86, New York, USA.
Milaré, C. R., Batista, G. E. A. P. A., Carvalho, A. C. P. L. F., and Monard, M. C. (2004). Applying genetic and symbolic learning algorithms to extract rules from artificial neural networks. In Proc. Mexican International Conference on Artificial Intelligence, volume 2972 of LNAI, pages 833–843. Springer-Verlag.
Pila, A. D. (2007). Computação Evolutiva para a Construção de Regras de Conhecimento com Propriedades Específicas. PhD thesis, ICMC-USP.
Pila, A. D., Giusti, R., Prati, R. C., and Monard, M. C. (2006). A multi-objective evolutionary algorithm to build knowledge classification rules with specific properties. In 6th International Conference on Hybrid Intelligent Systems (HIS 2006), Auckland, New Zealand. IEEE Computer Society. Published on CD-ROM.
Prati, R. C. and Flach, P. A. (2005). ROCCER: An algorithm for rule learning based on ROC analysis. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI’2005), pages 823–828, Edinburgh, Scotland, UK.
Provost, F. and Fawcett, T. (2001). Robust classification for imprecise environments. Machine Learning, 42(3):203–231.
Quinlan, J. R. (1993). C4.5 Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
Veloso, A. and Meira Jr., W. (2005). Rule generation and rule selection techniques for cost-sensitive associative classification. In 20º Simpósio Brasileiro de Bancos de Dados, pages 295–309.
Yin, X. and Han, J. (2003). CPAR: Classification based on predictive association rules. In Proc. of the 3rd SIAM Int. Conf. on Data Mining, San Francisco, CA. SIAM.
Published
30/06/2007
How to Cite
BATISTA, Gustavo E. A. P. A.; PRATI, Ronaldo C.; MONARD, Maria Carolina; GIUSTI, Rafael; MILARÉ, Claudia R. Classificação Associativa Utilizando Seleção e Construção de Regras: um Estudo Comparativo. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 6., 2007, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2007. p. 1321-1330. ISSN 2763-9061.
