Differentially Private Group-by Data Releasing Algorithm

  • Iago Chaves (UFC)
  • Javam Machado (UFC)


Privacy concerns are growing rapidly as data protection regulations are adopted around the world. Many works have proposed private algorithms that prevent sensitive information from leaking through data publication. Differential privacy, which rests on a formal definition, offers a strong guarantee of individual privacy and is the state of the art for designing private algorithms. This work proposes a differentially private group-by algorithm for data publication based on the exponential mechanism. Our method publishes groups of data according to a specified attribute while maintaining the desired privacy level and preserving the utility of the results.
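As a rough illustration of the exponential mechanism mentioned above, the following minimal sketch selects one group with probability proportional to exp(epsilon * utility / (2 * sensitivity)). It is not the algorithm proposed in the paper: the function names, the toy `ages` data, and the choice of group size as the utility function (with sensitivity 1, assuming neighboring datasets differ by one record) are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def selection_probabilities(candidates, utility, epsilon, sensitivity=1.0):
    """Return the exponential-mechanism probability of each candidate."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scores = [epsilon * utility(c) / (2.0 * sensitivity) for c in candidates]
    # Subtract the maximum score before exponentiating for numerical
    # stability; the normalized probabilities are unchanged.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, utility, epsilon, sensitivity=1.0, rng=None):
    """Sample one candidate according to the exponential mechanism."""
    candidates = list(candidates)
    probs = selection_probabilities(candidates, utility, epsilon, sensitivity)
    rng = rng or random.Random()
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical toy data: one grouping attribute value per record.
ages = ["20-29", "30-39", "30-39", "40-49", "30-39", "20-29"]
group_sizes = Counter(ages)   # utility of a group = number of records in it
groups = sorted(group_sizes)  # candidate outputs

# Assumed sensitivity of 1: adding or removing one record changes
# any group size by at most 1.
probs = selection_probabilities(groups, group_sizes.get, epsilon=1.0)
chosen = exponential_mechanism(groups, group_sizes.get, epsilon=1.0,
                               rng=random.Random(0))
```

Larger groups are more likely to be selected, and a smaller `epsilon` flattens the distribution toward uniform, which is the usual trade-off between privacy and utility.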

Keywords: differential privacy, privacy, group-by


CHAVES, Iago; MACHADO, Javam. Differentially Private Group-by Data Releasing Algorithm. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 34., 2019, Fortaleza. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 271-276. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2019.8835.