Clustering Multivariate Data Streams by Correlating Attributes using Fractal Dimension


  • Christian C. Bones, No affiliation declared
  • Luciana A. S. Romani, Embrapa Agricultural Informatics
  • Elaine P. M. de Sousa, ICMC - USP



Data Mining, Data Streams, Clustering, Fractal Dimension


A data stream is a flow of data produced continuously over time.
Storing and analyzing such data becomes challenging because of the exponential growth in the volume of data collected.
Several algorithms have recently been proposed to cluster data streams as a whole, but only a few of them handle multivariate data streams.
Even those merely aggregate the attributes without considering the correlations among them.
To overcome this limitation, we propose a new framework that clusters multivariate data streams based on their evolving behavior over time and explores the correlations among their attributes by computing the fractal dimension.
To evaluate our framework, we used real multisource and multidimensional climate data streams.
Our results show that cluster quality and compactness improve compared with the competing methods, which suggests that correlations among attributes cannot be ignored.
In fact, cluster compactness is 14 to 25 times better with our method.
Our framework was also 3 to 20 times faster than the competing methods.
It also proves to be a useful tool for helping meteorologists understand climate behavior over a period of time.
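
To illustrate how a fractal dimension can reflect correlation among attributes, the following is a minimal sketch of a box-counting estimate of the correlation fractal dimension D2. It is not the authors' framework: the function name, the num_scales parameter, and the choice of grid sizes are assumptions made for illustration only. The estimate is the slope of log S(r) against log r, where S(r) is the sum of squared cell occupancies on a grid with cell side r.

```python
import numpy as np


def correlation_fractal_dimension(points, num_scales=5):
    """Box-counting estimate of D2 (hypothetical helper, illustration only)."""
    pts = np.asarray(points, dtype=float)

    # Rescale every attribute to [0, 1] so a single grid covers all of them.
    low = pts.min(axis=0)
    span = pts.max(axis=0) - low
    span[span == 0] = 1.0  # constant attributes would otherwise divide by zero
    pts = (pts - low) / span

    log_r, log_s = [], []
    for level in range(1, num_scales + 1):
        r = 0.5 ** level  # cell side at this scale
        # Assign each point to a grid cell; clip so the value 1.0 stays in the last cell.
        cells = np.minimum((pts / r).astype(int), 2 ** level - 1)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        log_r.append(np.log(r))
        log_s.append(np.log(np.sum(counts.astype(float) ** 2)))

    # D2 is approximated by the slope of the log-log plot.
    slope, _ = np.polyfit(log_r, log_s, 1)
    return slope
```

Under these assumptions, two attributes that are perfectly correlated (all points on a line) should give an estimate close to 1, while two independent uniform attributes should give an estimate close to 2. Applying such an estimate to successive windows of a stream is one way to track how attribute correlation evolves over time.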








How to Cite

Bones, C. C., Romani, L. A. S., & de Sousa, E. P. M. (2017). Clustering Multivariate Data Streams by Correlating Attributes using Fractal Dimension. Journal of Information and Data Management, 7(3), 249.



SBBD 2015