Local Dampening: Differential Privacy for Non-numeric Queries via Local Sensitivity

  • Victor Aguiar Evangelista de Farias Universidade Federal do Ceará (UFC)
  • Javam de Castro Machado Universidade Federal do Ceará (UFC)

Abstract


Differential privacy is the state-of-the-art formal definition for data release under strong privacy guarantees. We present the local dampening mechanism, a differentially private mechanism for non-numeric queries. Our approach is the first to leverage the notion of local sensitivity to reduce the noise injected into the output. We develop a theoretical accuracy analysis that shows the conditions under which our approach performs accurately, and we conduct an experimental evaluation against competitors on diverse problems. These contributions were published at the VLDB conference and in the special issue of the VLDB Journal. Unrelated contributions were published in SBBD, SBRC, CLOSER and FGCS. This work was carried out in cooperation with AT&T Labs Research - USA.

Keywords: Differential privacy, Data anonymization, Graph analysis, Decision trees

Published
25/09/2023
How to Cite

DE FARIAS, Victor Aguiar Evangelista; MACHADO, Javam de Castro. Local Dampening: Differential Privacy for Non-numeric Queries via Local Sensitivity. In: CONCURSO DE TESES E DISSERTAÇÕES (CTDBD) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 38., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 285-299. DOI: https://doi.org/10.5753/sbbd_estendido.2023.232504.