ParTriCluster: A Scalable Parallel Algorithm for Gene Expression Analysis
Abstract
Analyzing gene expression patterns is becoming a highly relevant task in bioinformatics. This analysis makes it possible to determine how genes behave under various conditions, which is fundamental information for treating diseases, among other applications. An advance in this area is the tricluster algorithm, the first algorithm capable of determining 3D clusters, that is, sets of genes that behave similarly across a set of samples and a set of time stamps. However, biological experiments collect an increasing amount of data to be analyzed and correlated, and the triclustering problem is NP-complete, so parallelization seems to be an essential step towards obtaining feasible solutions. In this paper we propose and evaluate a parallel version of the tricluster algorithm, implemented using the filter-labeled-stream paradigm supported by the Anthill parallel programming environment. The results show that our parallelization scales linearly with the data size. Further, the parallelization strategy is applicable to any depth-first search.
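The claim that the strategy carries over to any depth-first search can be illustrated with a minimal Python sketch. It is not the ParTriCluster implementation and does not use Anthill or the filter-labeled-stream model. It only shows the general idea of treating each top-level branch of a depth-first search as an independent task. The toy data (`ACTIVE`), the function names, and the `min_genes` threshold are all hypothetical, and a thread pool stands in for a real distributed runtime (in CPython, threads show the task decomposition but do not give a CPU speedup).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy input: for each sample, the set of genes considered "active".
ACTIVE = {
    "s0": {"g1", "g2", "g3"},
    "s1": {"g1", "g2", "g4"},
    "s2": {"g1", "g2", "g3", "g4"},
    "s3": {"g5"},
}
SAMPLES = sorted(ACTIVE)


def dfs(samples, genes, start, min_genes, out):
    """Record the current sample set, then extend it depth-first."""
    out.append((tuple(samples), frozenset(genes)))
    for i in range(start, len(SAMPLES)):
        shared = genes & ACTIVE[SAMPLES[i]]
        # Pruning is safe: adding more samples can only shrink the shared gene set.
        if len(shared) >= min_genes:
            dfs(samples + [SAMPLES[i]], shared, i + 1, min_genes, out)


def explore_branch(i, min_genes):
    """Explore the subtree rooted at sample i; it shares no state with other subtrees."""
    out = []
    genes = ACTIVE[SAMPLES[i]]
    if len(genes) >= min_genes:
        dfs([SAMPLES[i]], genes, i + 1, min_genes, out)
    return out


def sequential_search(min_genes):
    """Baseline: explore the top-level branches one after another."""
    out = []
    for i in range(len(SAMPLES)):
        out.extend(explore_branch(i, min_genes))
    return out


def parallel_search(min_genes, workers=4):
    """Submit each top-level branch as a separate task and merge the results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda i: explore_branch(i, min_genes), range(len(SAMPLES)))
        return [cluster for part in parts for cluster in part]
```

Calling `parallel_search(2)` returns the same list of (sample set, shared gene set) pairs as `sequential_search(2)`, because the subtrees are independent and `pool.map` preserves the branch order. Subtrees can differ greatly in size, so a practical system would also need some form of load balancing, which this sketch leaves out.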
Keywords:
Parallel algorithms, Gene expression, Algorithm design and analysis, Pattern analysis, Clustering algorithms, Informatics, Information analysis, Diseases, Data analysis, Parallel programming
Published
18/10/2006
How to Cite
TRIELLI, Guilherme; ORAIR, Gustavo; MEIRA, Wagner; FERREIRA, Renato; GUEDES, Dorgival. ParTriCluster: A Scalable Parallel Algorithm for Gene Expression Analysis. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 18., 2006, Ouro Preto/MG. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2006. p. 3-10.
