The Impact of Privacy Regulations on DB Systems
Keywords: Databases, GDPR, Purpose, Data Access
The collection and use of personal data once grew without restriction. In the physical world, however, several laws guarantee people rights over their privacy and the use of their information. In recent years, legislators have passed many laws, regulations, and acts to extend these rights to the digital world. As a result, new constraints, rights, and duties apply to every component of the data collection and usage workflow. In this paper, we discuss the implications of this legislation, identify the impacts these regulations have on current DBMSs, and survey recent work that addresses the resulting problems, highlighting research opportunities and indicating how solutions can be achieved.