Finding Association Rules without Specifying Minimum Support and Minimum Confidence
Abstract
Extracting information and knowledge from databases plays an increasingly relevant role in supporting decision making. One of the main research areas in this field is association rule mining, which makes it possible to capture the relationships among the attributes present in a database. Most algorithms for extracting association rules use support and confidence as parameters. Support is the proportion of records in the database in which a given rule occurs, and confidence measures the validity of the rule, that is, how often the consequent holds when the antecedent is present. Professionals responsible for data analysis therefore need to define thresholds for both measures (minimum support and minimum confidence, respectively) to obtain association rules. In certain contexts, however, it is difficult to identify threshold values that yield the desired rules, and it may be necessary to run several queries with different values of support and confidence until those rules are obtained. The purpose of this research is to examine association rule mining techniques and algorithms capable of obtaining association rules without the need to specify support and confidence, to propose new algorithms, and to analyze these algorithms in terms of performance and the quality of the rules obtained.
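To make the two measures and the threshold problem concrete, the following is a minimal sketch in Python. It is not one of the algorithms examined or proposed in this research: the toy transactions, the function names (`support`, `confidence`, `mine_rules`), and the restriction to single-item rules are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(antecedent, consequent, db):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, db) / support(antecedent, db)

def mine_rules(db, min_sup, min_conf):
    """Brute-force search over single-item rules a -> b (illustration only)."""
    items = sorted(set().union(*db))
    rules = []
    for a, b in permutations(items, 2):
        if support({a, b}, db) >= min_sup and confidence({a}, {b}, db) >= min_conf:
            rules.append((a, b))
    return rules

# The same data yields very different rule sets depending on the thresholds,
# which is why analysts often have to re-run queries by trial and error.
for min_sup, min_conf in [(0.5, 0.9), (0.5, 0.6), (0.6, 0.6)]:
    print(min_sup, min_conf, mine_rules(transactions, min_sup, min_conf))
```

With these toy transactions, thresholds of (0.5, 0.9) keep a single rule, (0.5, 0.6) keep four, and (0.6, 0.6) keep none, which illustrates how sensitive the result is to values the analyst must guess in advance.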