Fairness-Oriented Entity Resolution Tool for Streaming Data
Abstract
Entity Resolution (ER) plays a crucial role in data integration, facilitating the integration of knowledge bases by identifying similar entities across different sources. In this work, we address three challenges in ER: streaming data, incremental processing, and fairness. Fairness, understood here as the absence of discrimination or bias, has received little attention in ER studies. In this context, we present TREATS, a fairness-aware ER tool that handles streaming and incremental data and goes beyond matching based on similarity scores alone by also enforcing fairness constraints. Overall, our contributions aim to advance the field of ER by offering a matching tool that accounts for both technical challenges and ethical considerations.
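To make the idea of fairness-aware matching over increments more concrete, the following is a minimal sketch and not the TREATS implementation. It assumes that each candidate pair arrives with a precomputed similarity score and a flag saying whether it involves a protected group, and it selects the top-k matches by alternating between a protected and a non-protected priority queue, in the spirit of fairness-constrained ER. The class name `FairStreamingMatcher`, its methods, the threshold, and the sample data are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Candidate:
    # Similarity is stored negated so that heapq (a min-heap) pops the best pair first.
    neg_score: float
    pair: tuple = field(compare=False)
    protected: bool = field(compare=False)


class FairStreamingMatcher:
    """Hypothetical sketch: incremental, fairness-aware top-k match selection."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        # One priority queue per group: True = protected, False = non-protected.
        self.queues = {True: [], False: []}

    def add_increment(self, scored_pairs):
        """Consume one increment of (left_id, right_id, score, protected) tuples."""
        for left, right, score, protected in scored_pairs:
            if score >= self.threshold:
                heapq.heappush(
                    self.queues[protected],
                    Candidate(-score, (left, right), protected),
                )

    def top_k(self, k):
        """Return up to k matches, alternating between the two groups."""
        # Work on copies so the accumulated state survives for later increments.
        queues = {group: list(q) for group, q in self.queues.items()}
        result = []
        turn = True  # assumption: the protected group is served first
        while len(result) < k and (queues[True] or queues[False]):
            # Fall back to the other group when the preferred queue is empty.
            group = turn if queues[turn] else not turn
            best = heapq.heappop(queues[group])
            result.append((best.pair, -best.neg_score, best.protected))
            turn = not turn
        return result


# Usage with made-up data: two increments arrive, then the current top-3 is requested.
matcher = FairStreamingMatcher(threshold=0.5)
matcher.add_increment([
    ("a1", "b1", 0.95, False),
    ("a2", "b2", 0.90, False),
    ("a3", "b3", 0.60, True),
])
matcher.add_increment([
    ("a4", "b4", 0.85, False),
    ("a5", "b5", 0.40, True),  # below the threshold, so it is discarded
])
top = matcher.top_k(3)
```

In this example a purely score-based top-3 would contain only non-protected pairs, whereas the alternating selection also returns the protected pair ("a3", "b3"). Strict alternation is only one possible fairness constraint; other criteria could be substituted without changing the incremental structure.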
Keywords:
Entity Resolution, Machine Learning, Fairness, Data Quality
Published
14/10/2024
How to Cite
ARAÚJO, Tiago Brasileiro; EFTHYMIOU, Vasilis; STEFANIDIS, Kostas; GUERRA, Rafael de Souza. Fairness-Oriented Entity Resolution Tool for Streaming Data. In: DEMONSTRAÇÕES E APLICAÇÕES - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 119-124. DOI: https://doi.org/10.5753/sbbd_estendido.2024.242809.