Theoretical Analysis of the Impact of Missing Data in Sensitive Attributes on the p%-rule Fairness Metric

  • Dimas Cassimiro Nascimento, Federal University of Agreste of Pernambuco
  • Daliton da Silva, Federal University of Agreste of Pernambuco
  • Luis Filipe Alves Pereira, Federal University of Agreste of Pernambuco

Abstract


Algorithmic fairness assessment has become a central topic in the analysis of automated decision-making systems. In this paper, we investigate how the use of imputation techniques to fill in missing sensitive data can affect the value of the p%-rule metric, depending on the degree of imputation error. We propose a formal mathematical model to quantify this impact, considering symmetric and bidirectional imputation errors. Additionally, we analyze the minimum number of manual corrections required to achieve a desired improvement in the fairness metric. The results provide a quantitative foundation for understanding the trade-offs between the cost of manual intervention and the quality of imputed data in algorithmic fairness analysis.
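
As an illustration of the metric discussed above, the following minimal sketch computes the p%-rule under its common definition: the ratio between the lower and the higher positive-prediction rate of the two sensitive groups, expressed as a percentage. It then compares the value obtained with the true sensitive attribute against the value obtained with an imputed one. The function name `p_percent_rule` and the toy data are hypothetical and chosen only for this example. The sketch does not reproduce the formal model proposed in the paper.

```python
from typing import Sequence


def p_percent_rule(y_pred: Sequence[int], z: Sequence[int]) -> float:
    """Return the p%-rule value (0 to 100) for binary predictions y_pred
    and a binary sensitive attribute z (hypothetical helper)."""
    if len(y_pred) != len(z):
        raise ValueError("y_pred and z must have the same length")
    rates = []
    for group in (0, 1):
        # Predictions received by the members of this sensitive group.
        members = [y for y, g in zip(y_pred, z) if g == group]
        if not members:
            raise ValueError(f"group {group} is empty")
        # Positive-prediction rate within the group.
        rates.append(sum(members) / len(members))
    low, high = min(rates), max(rates)
    if high == 0:
        # No positive predictions at all: both rates are equal.
        return 100.0
    return 100.0 * low / high


# Toy data (assumed for illustration): 8 individuals, binary predictions.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
z_true = [1, 1, 1, 1, 0, 0, 0, 0]      # true sensitive attribute
# Imputed attribute with two errors, one in each direction:
# index 2 flipped from 1 to 0, index 5 flipped from 0 to 1.
z_imputed = [1, 1, 0, 1, 0, 1, 0, 0]

p_true = p_percent_rule(y_pred, z_true)        # rates 0.75 vs 0.25
p_imputed = p_percent_rule(y_pred, z_imputed)  # rates 0.50 vs 0.50
print(f"true: {p_true:.1f}%  imputed: {p_imputed:.1f}%")
```

In this toy case, two imputation errors are enough to move the measured value from about 33% to 100%, which shows how errors in an imputed sensitive attribute can mask a disparity that the true attribute would reveal.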

Keywords: fairness, missing data, sensitive attributes

Published
2025-09-29
NASCIMENTO, Dimas Cassimiro; SILVA, Daliton da; PEREIRA, Luis Filipe Alves. Theoretical Analysis of the Impact of Missing Data in Sensitive Attributes on the p%-rule Fairness Metric. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 767-773. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247565.