Systematic Mapping of Data Quality in the Public Sector: Dimensions, Metrics and Practices
Abstract
Data quality plays a critical role in ensuring the reliability and effectiveness of information systems in public sector organizations. However, fragmented data ecosystems, legacy systems, and a lack of standardized assessment practices challenge the implementation of consistent data quality strategies. This study presents a comprehensive systematic mapping of data quality dimensions, metrics, practices, and challenges across public institutions. A total of 53 peer-reviewed studies were analyzed, revealing the most frequently used data quality dimensions and associated metrics. The research highlights a wide spectrum of techniques, ranging from traditional validation rules to emerging applications of machine learning for anomaly detection and data imputation. In addition, it identifies key barriers — including limited automation, lack of governance, and scarce resources — and proposes actionable recommendations for public sector entities. The findings serve as a conceptual and practical foundation for improving data quality management in complex government environments, with the ultimate goal of supporting the implementation of data quality assessment strategies in a state-level public finance institution in Brazil, the Ceará State Treasury Department (Sefaz-CE).
