Anthill: a scalable run-time environment for data mining applications
Abstract
Data mining techniques are becoming increasingly popular as a reasonable means of collecting summaries from the rapidly growing datasets in many areas. However, as the size of the raw data increases, parallel data mining algorithms are becoming a necessity. In this paper, we present a run-time support system designed to allow the efficient implementation of data mining algorithms in heterogeneous distributed environments. We believe that the run-time framework is suitable for a broader class of applications, beyond data mining. We also present a parallelization strategy that is supported by the run-time system. We show scalability results for three different data mining algorithms that were parallelized using our approach and our run-time support. All applications scale almost linearly up to a large number of nodes.
Keywords:
Runtime environment, Data mining, Application software, Clustering algorithms, Algorithm design and analysis, Scalability, Computer science, Costs, Memory, Data analysis
Published
24/10/2005
How to Cite
FERREIRA, R. A. et al. Anthill: a scalable run-time environment for data mining applications. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 17., 2005, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2005. p. 159-166.
