Profiling for Confidence: Debugging Relationships among Urban Spatio-Temporal Datasets

  • Laís M. A. Rocha UFMG
  • Mirella M. Moro UFMG
  • Juliana Freire NYU

Abstract


We aim to help users identify potential issues in spatio-temporal data and thus gain trust in the results they derive from such data, a crucial benefit in the era of data science and big data. We propose a framework for profiling spatio-temporal relationships that automatically identifies data slices that deviate from what is expected; these slices can then be analyzed for quality issues and/or potential effects on analysis results. We describe the profiling methodology and present case studies using real urban datasets, emphasizing the need for spatio-temporal profiling to build trust in data analysis results.


 
Keywords: Urban Data, Complex Relationships, Profiling

Published
30/06/2020
ROCHA, Laís M. A.; MORO, Mirella M.; FREIRE, Juliana. Profiling for Confidence: Debugging Relationships among Urban Spatio-Temporal Datasets. In: CONCURSO DE TESES E DISSERTAÇÕES (CTD), 33., 2020, Cuiabá. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 91-96. ISSN 2763-8820. DOI: https://doi.org/10.5753/ctd.2020.11375.