Achieving GDPR Compliance through Provenance: An Extended Model

Abstract


The approval of the General Data Protection Regulation (GDPR) brought a revolution in the way we treat data produced in digital media. The GDPR increases individuals’ participation in the treatment of their data, and it also introduces technical challenges; failing to meet them can lead to a fine of up to 4% of the organization’s annual revenue. Among the many approaches that address the challenges introduced by the GDPR, one research branch promotes data provenance as a means of making the increasingly complex workflows of systems transparent. However, existing provenance models are not fully compliant with the GDPR. In this paper, we aim to contribute to the evolution of the GDPR data provenance model proposed by Ujcich et al. We suggest eleven changes that make the model clearer and more compatible with the GDPR text. We also present two design patterns to guide the use of these changes in real contexts.

Keywords: GDPR, Provenance

References

Aldeco Perez, R. and Moreau, L. (2008). Provenance-based auditing of private data use. In BCS International Academic Conference.

Bartolini, C., Muthuri, R., and Santos, C. (2015). Using ontologies to model data protection requirements in workflows. In JSAI International Symposium on Artificial Intelligence, pages 233–248. Springer.

Basin, D., Debois, S., and Hildebrandt, T. (2018). On purpose and by necessity: compliance under the GDPR. In International Conference on Financial Cryptography and Data Security, pages 20–37. Springer.

Bates, A., Tian, D. J., Butler, K. R., and Moyer, T. (2015). Trustworthy whole-system provenance for the Linux kernel. In 24th USENIX Security Symposium (USENIX Security 15), pages 319–334.

Bier, C. (2013). How usage control and provenance tracking get together - a data protection perspective. In 2013 IEEE Security and Privacy Workshops, pages 13–17. IEEE.

Bonatti, P., Kirrane, S., Polleres, A., and Wenning, R. (2017). Transparent personal data processing: The road ahead. In International Conference on Computer Safety, Reliability, and Security, pages 337–349. Springer.

Council of European Union (2016). Council regulation (EU) no 2016/679. https://eur-lex.europa.eu/eli/reg/2016/679/oj.

Freire, J., Koop, D., Santos, E., and Silva, C. T. (2008). Provenance for computational tasks: A survey. Computing in Science & Engineering, 10(3):11–21.

Garijo, D. and Gil, Y. (2013). P-Plan: The P-Plan ontology. W3C recommendation, W3C. https://www.opmw.org/model/p-plan17092013/.

GDPR.EU (2019). 2019 GDPR Small Business Survey: Insights from European small business leaders one year into the General Data Protection Regulation. https://gdpr.eu/wp-content/uploads/2019/05/2019-GDPR.EU-Small-Business-Survey.pdf.

Gjermundrød, H., Dionysiou, I., and Costa, K. (2016). privacytracker: a privacy-by-design GDPR-compliant framework with verifiable data traceability controls. In International Conference on Web Engineering, pages 3–15. Springer.

Kuner, C. (2012). The European Commission’s proposed data protection regulation: A Copernican revolution in European data protection law. Bloomberg BNA Privacy and Security Law Report (2012) February, 6(2012):1–15.

Martin, A. P., Lyle, J., and Namiluko, C. (2012). Provenance as a security control. In TaPP.

Moreau, L. and Missier, P. (2013). PROV-DM: The PROV data model. W3C recommendation, W3C. http://www.w3.org/TR/2013/REC-prov-dm-20130430/.

Ozsoyoglu, G. and Snodgrass, R. T. (1995). Temporal and real-time databases: A survey. IEEE Transactions on Knowledge and Data Engineering, 7(4):513–532.

Pandit, H. J. and Lewis, D. (2017). Modelling provenance for GDPR compliance using linked open data vocabularies. In PrivOn@ISWC.

Pandit, H. J., O’Sullivan, D., and Lewis, D. (2019). Test-driven approach towards GDPR compliance. In Acosta, M., Cudré-Mauroux, P., Maleshkova, M., Pellegrini, T., Sack, H., and Sure-Vetter, Y., editors, Semantic Systems. The Power of AI and Knowledge Graphs, pages 19–33, Cham. Springer International Publishing.

Pasquier, T. F.-M., Singh, J., Eyers, D., and Bacon, J. (2015). CamFlow: Managed data-sharing for cloud services. IEEE Transactions on Cloud Computing, 5(3):472–484.

Pohly, D. J., McLaughlin, S., McDaniel, P., and Butler, K. (2012). Hi-Fi: collecting high-fidelity whole-system provenance. In Proceedings of the 28th Annual Computer Security Applications Conference, pages 259–268.

Shastri, S., Banakar, V., Wasserman, M., Kumar, A., and Chidambaram, V. (2019). Understanding and benchmarking the impact of GDPR on database systems. arXiv preprint arXiv:1910.00728.

Tankard, C. (2016). What the GDPR means for businesses. Network Security, 2016(6):5–8.

Ujcich, B. E., Bates, A., and Sanders, W. H. (2018). A provenance model for the European Union General Data Protection Regulation. In International Provenance and Annotation Workshop, pages 45–57. Springer.

Wang, L., Near, J. P., Somani, N., Gao, P., Low, A., Dao, D., and Song, D. (2019). Data capsule: A new paradigm for automatic compliance with data privacy regulations. In Heterogeneous Data Management, Polystores, and Analytics for Healthcare, pages 3–23. Springer.
Published
2020-09-28
CAMPAGNA, Daniel Prett; DA SILVA, Altigran Soares; BRAGANHOLO, Vanessa. Achieving GDPR Compliance through Provenance: An Extended Model. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 35., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 13-24. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2020.13621.