ARTERIAL: A Natural Language Processing Model for Prevention of Information Leakage from Electronic Health Records
Abstract
Over the past decade, health security breaches have increased steadily. Healthcare organizations must therefore protect sensitive information such as test results, diagnoses, prescriptions, research, and customer personal information. A leak of sensitive data can cause significant economic loss and damage the organization's image. Data Leakage Prevention (DLP) systems can help to identify, monitor, and protect sensitive data and to reduce the risk of leaking it. However, state-of-the-art DLP solutions rely only on signature comparisons and static comparisons. We therefore propose ARTERIAL, a model based on Natural Language Processing (NLP), Named Entity Recognition (NER), and Artificial Neural Networks (ANN) that aims to be more accurate in extracting information and recognizing entities from Electronic Health Records (EHR). Unlike the current literature, ARTERIAL considers the semantic features present in the EHR. Three approaches were implemented and tested: two based on ANNs and a third based on machine learning algorithms. The approach implemented with a machine learning algorithm reached 98.0% precision, 86.0% recall, and a 91.0% F1-score.
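As a reading aid for the reported metrics, the following minimal sketch shows how precision, recall, and F1-score are conventionally derived from counts of true positives, false positives, and false negatives. The function name `precision_recall_f1` and the counts used are hypothetical illustrations, not values or code from the paper, and the paper may average its scores differently (for example, per entity class).

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw detection counts."""
    # Precision: fraction of predicted entities that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true entities that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, chosen only for illustration (not taken from the paper).
p, r, f1 = precision_recall_f1(tp=86, fp=2, fn=14)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

A detector with high precision but lower recall, as in this illustration, rarely flags non-sensitive text but can still miss some sensitive entities, which is the more costly error in a leakage-prevention setting.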
Published
21/11/2023
How to Cite
GOLDSCHMIDT, Guilherme; ZEISER, Felipe André; RIGHI, Rodrigo Da Rosa; COSTA, Cristiano André Da. ARTERIAL: A Natural Language Processing Model for Prevention of Information Leakage from Electronic Health Records. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SISTEMAS COMPUTACIONAIS (SBESC), 13., 2023, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 97-102. ISSN 2237-5430.