Abstract
Advances in text classification can improve the quality of existing clinical systems. Our research experimentally explored text classification methods applied to non-synthetic corpora of oncology clinical notes. The experiments were performed on a dataset of 3,308 medical notes and evaluated the following machine learning and deep learning classification methods: Multilayer Perceptron neural network, Logistic Regression, Decision Tree classifier, Random Forest classifier, K-nearest neighbors classifier, and Long Short-Term Memory. One experiment evaluated the influence of the corpus preprocessing step on the results, showing that the classifier’s mean accuracy rose from 26.1% to 86.7% with the per-clinical-event corpus and 93.9% with the per-patient corpus. The best-performing classifier was the Multilayer Perceptron, which achieved 93.90% accuracy, a Macro F1 score of 93.61%, and a Weighted F1 score of 93.99%.
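The three figures reported for the best classifier (accuracy, Macro F1, Weighted F1) summarize the same predictions in different ways, which may matter when classes are imbalanced. The sketch below is only an illustration of how these metrics are conventionally computed; it is not the authors' evaluation code, and the labels "A", "B", "C" and the toy predictions are hypothetical values invented for the example.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of notes whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true frequency."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical toy labels (not data from the paper).
y_true = ["A", "A", "A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "C", "A"]

print(f"accuracy    = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1    = {macro_f1(y_true, y_pred):.4f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

Macro F1 treats rare and frequent classes alike, while Weighted F1 follows the class distribution, so a gap between the two can hint at uneven performance across classes.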
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
da Silva, D.P., Fröhlich, W.d.R., Schwertner, M.A., Rigo, S.J. (2023). Clinical Oncology Textual Notes Analysis Using Machine Learning and Deep Learning. In: Naldi, M.C., Bianchi, R.A.C. (eds) Intelligent Systems. BRACIS 2023. Lecture Notes in Computer Science, vol 14196. Springer, Cham. https://doi.org/10.1007/978-3-031-45389-2_10
DOI: https://doi.org/10.1007/978-3-031-45389-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45388-5
Online ISBN: 978-3-031-45389-2
eBook Packages: Computer Science, Computer Science (R0)