Combining Semantic Graph Features and a Common Data Model to Exploit the Interoperability of Patient Databases

Abstract


Given a set of Electronic Health Records (EHRs), how can we semantically model the available concepts and provide tools for data analysis? EHRs that follow a common data model (CDM) usually bring meaningful organization and vocabulary to health-related databases, promoting data interoperability. However, hidden relationships among attributes within the CDM create a need for CDM-tailored analysis tools for exploratory tasks. We propose GraFOCAL for analyzing CDM-based databases using semantic graph features. GraFOCAL combines pairs of attributes with semantic descriptions in graph edges and node features. Preliminary results show the usefulness of GraFOCAL’s features and visual tools in spotting findings in a real-world dataset. In future work, we aim to extend the proposed approach with automatic knowledge inference for the semantic linkage between variables.
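The idea of combining attribute pairs with semantic descriptions in graph edges and node features can be illustrated with a minimal sketch. This is not GraFOCAL's implementation: the record layout, the column names (`condition`, `drug`), the edge description text, and the two node features (degree and weighted degree) are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows loosely shaped like CDM records; the column names and
# values are illustrative assumptions, not taken from the paper.
records = [
    {"person_id": 1, "condition": "hypertension", "drug": "losartan"},
    {"person_id": 2, "condition": "hypertension", "drug": "losartan"},
    {"person_id": 3, "condition": "hypertension", "drug": "amlodipine"},
    {"person_id": 4, "condition": "diabetes", "drug": "metformin"},
]


def build_edges(rows, source_attr, target_attr, description):
    """Count co-occurrences of an attribute pair and attach a semantic label."""
    counts = Counter((r[source_attr], r[target_attr]) for r in rows)
    return [
        {"source": s, "target": t, "weight": w, "description": description}
        for (s, t), w in counts.items()
    ]


def node_features(edges):
    """Per node: degree (distinct neighbors) and weighted degree (sum of edge weights)."""
    neighbors = defaultdict(set)
    weighted = Counter()
    for e in edges:
        # Treat each edge as undirected so both endpoints receive features.
        for a, b in ((e["source"], e["target"]), (e["target"], e["source"])):
            neighbors[a].add(b)
            weighted[a] += e["weight"]
    return {
        n: {"degree": len(nb), "weighted_degree": weighted[n]}
        for n, nb in neighbors.items()
    }


# The description string is a hypothetical semantic label for this pair.
edges = build_edges(records, "condition", "drug", "condition treated with drug")
features = node_features(edges)
print(features["hypertension"])
```

In this toy setting, a node whose weighted degree is much larger than its degree points to a few heavily repeated pairings, which is the kind of pattern an exploratory tool might surface for closer inspection.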

Keywords: Data mining, knowledge graph, electronic health records, common data model, interoperability

Published
14/10/2024
CONRADO, Rafael C. G.; GUTIERREZ, Marco A.; TRAINA JR., Caetano; TRAINA, Agma J. M.; CAZZOLATO, Mirela T. Combining Semantic Graph Features and a Common Data Model to Exploit the Interoperability of Patient Databases. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 701-707. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2024.243153.