Efficient Data Mining for Frequent Itemsets in Dynamic and Distributed Databases

  • Adriano Veloso UFMG
  • Wagner Meira Jr UFMG

Abstract


Data mining is one of the central activities associated with understanding and exploiting the world of digital data. It is the mechanized process of modeling large databases by discovering useful patterns. A frequent itemset is a pattern describing a relevant subset of the data, and a collection of frequent itemsets is particularly useful because it is an extremely compact model of the database. Discovering frequent itemsets in large databases is usually a hard computational task, and it becomes even harder when the data is dynamic and distributed. Applying traditional algorithms to such data results in high communication overhead, excessive waste of CPU and I/O resources, and privacy violations, and it often fails to meet the stringent response-time requirements of what is essentially an interactive process of exploiting the data. Hence, there is an urgent need for non-trivial algorithms that can effectively mine frequent itemsets in dynamic and distributed databases. Such algorithms are presented in this master's thesis.
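To make the notion of a frequent itemset concrete, the following is a minimal sketch of level-wise (Apriori-style) mining over a small in-memory, static database. It is not one of the algorithms of the thesis, which target dynamic and distributed data; the function name `frequent_itemsets`, the `min_support` parameter (an absolute count), and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets that occur in
    at least `min_support` transactions (illustrative sketch only)."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count in how many transactions each candidate is contained.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy database of four transactions.
db = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(db, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Each pass rescans the whole database, which is the kind of cost that becomes prohibitive when transactions keep arriving or are spread across sites.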

Published
31/07/2004
VELOSO, Adriano; MEIRA JR, Wagner. Efficient Data Mining for Frequent Itemsets in Dynamic and Distributed Databases. In: CONCURSO DE TESES E DISSERTAÇÕES (CTD), 17., 2004, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2004. p. 84-88. ISSN 2763-8820.