MER: Multimodal Entity Resolution

  • Paulo Henrique Santos Lima, Federal University of Goiás (UFG)
  • Leonardo Andrade Ribeiro, Federal University of Goiás (UFG)

Abstract


The task of Entity Resolution (ER) consists of identifying records that refer to the same real-world entity. While traditional approaches focus on textual data, the growth of multimodal sources demands solutions capable of handling this diversity. This paper proposes an approach for Multimodal Entity Resolution (MER), in which each record is composed of both textual and visual information. The architecture is based on the CLIP model, extended with multimodal fusion mechanisms and loss functions inspired by strategies from Multimodal Entity Linking (MEL). Preliminary results on two public datasets delivered F1-scores of up to 93.6%, indicating that the proposed approach is promising.
Keywords: Data cleaning, information filtering, and publishing; Information integration and interoperability; Machine Learning, AI, data management and data systems
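
As a rough illustration of the task described in the abstract, the sketch below decides whether two records, each with a text embedding and an image embedding, refer to the same entity. It is not the architecture proposed in the paper, which extends CLIP with fusion mechanisms and loss functions that are not detailed here. It assumes the embeddings have already been produced by a CLIP-style encoder that maps both modalities into one shared space, and it swaps in a simple fixed-weight late fusion with a cosine-similarity threshold. The names `fuse`, `similarity`, and `is_match` and the parameters `alpha` and `threshold` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(text_emb, image_emb, alpha=0.5):
    """Fixed-weight late fusion (an assumption, not the paper's mechanism).

    Both embeddings must live in the same space and have the same length,
    as is the case for CLIP-style text and image projections.
    """
    if len(text_emb) != len(image_emb):
        raise ValueError("text and image embeddings must have equal length")
    t = l2_normalize(text_emb)
    i = l2_normalize(image_emb)
    mixed = [alpha * a + (1.0 - alpha) * b for a, b in zip(t, i)]
    return l2_normalize(mixed)


def similarity(rec_a, rec_b, alpha=0.5):
    """Cosine similarity between the fused representations of two records."""
    u = fuse(rec_a["text"], rec_a["image"], alpha)
    v = fuse(rec_b["text"], rec_b["image"], alpha)
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(a * b for a, b in zip(u, v))


def is_match(rec_a, rec_b, alpha=0.5, threshold=0.8):
    """Declare a match when the fused similarity reaches the threshold."""
    return similarity(rec_a, rec_b, alpha) >= threshold
```

Each record here is a dictionary with `"text"` and `"image"` keys holding equal-length lists of floats. In practice the threshold and the fusion weight would be tuned on labeled record pairs, and a learned fusion would replace the fixed weight.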

Published
2025-09-29
SANTOS LIMA, Paulo Henrique; ANDRADE RIBEIRO, Leonardo. MER: Multimodal Entity Resolution. In: WORKSHOP ON THESIS AND DISSERTATION (WTDBD) - BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 196-202. DOI: https://doi.org/10.5753/sbbd_estendido.2025.247723.