A Survey on the State of Practice in the Adoption of Schema Matching in Brazilian Companies

  • Ricardo H. D. Borges UFG
  • Leonardo Andrade Ribeiro UFG
  • Valdemar V. Graciano Neto UFG

Abstract


Context: Data integration is a fundamental activity for achieving full interoperability between Information Systems (IS). One of the most critical steps in this process is schema matching, which involves identifying correspondences between schema elements to ensure that equivalent information is correctly associated. Problem: Schema matching can be extremely time-consuming, expensive, repetitive, labor-intensive, and error-prone, potentially leading to significant losses. Companies can employ tools to support or to partially or fully automate this activity. However, the current state of such practices in Brazil remains unclear. Solution: Understanding the state of practice in Brazil can help guide research and development efforts by revealing how data integration is currently conducted in companies. This provides evidence of practical approaches and informs the structuring of training programs and the development of technologies to support such initiatives. IS Theory: We rely on socio-technical systems theory, since we consider social and technical aspects as interdependent parts of a complex system. Method: We conducted a survey. Summary of Results: Thirty-five data professionals from 12 states and the Federal District participated in the survey. The results show that, despite familiarity with the concept, practical adoption faces challenges such as schema complexity, a lack of tools, and the absence of specific training. Nevertheless, those who apply the technique report clear benefits: (i) more efficient data integration, (ii) error reduction, and (iii) increased agility. Contributions and Impact on the IS Area: The results provide evidence that can help guide research efforts and technological advancements to support schema matching in Brazil. These efforts aim to enable full interoperability among information systems in the future, addressing Brazil’s Grand Challenges in Information Systems, such as Systems-of-Systems, Smart Cities, and Full Interoperability, the last of which serves as an enabler for the other two.

Published
19/05/2025
BORGES, Ricardo H. D.; RIBEIRO, Leonardo Andrade; GRACIANO NETO, Valdemar V. A Survey on the State of Practice in the Adoption of Schema Matching in Brazilian Companies. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 21., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 673-682. DOI: https://doi.org/10.5753/sbsi.2025.246615.
