Automating Risk of Bias Inference in Clinical Studies

Abel C. Dias; Viviane P. Moreira; João Luiz D. Comba

doi:10.5753/sbcas_estendido.2025.7404

Abel C. Dias UFRGS
Viviane P. Moreira UFRGS
João Luiz D. Comba UFRGS

Abstract

One of the best quality indicators in the clinical domain is the risk of bias (RoB). Bias refers to any systematic error in results that may lead to misinterpretation. In the systematic review process, human reviewers assess the RoB manually. Existing work attempts to automate this process using support vector machines (SVMs), convolutional neural networks (CNNs), or logistic regression. To the best of our knowledge, no previous work has explored Transformer-based models for RoB assessment in clinical studies. In this work, we propose RoBIn (Risk of Bias Inference), a novel model for RoB inference based on the Transformer architecture. We employ a machine reading comprehension (MRC) approach to extract evidence, which is then classified with a RoB label. Furthermore, we use distant supervision to annotate a dataset for MRC and RoB inference. As a final contribution, we built a large language model (LLM) application that receives clinical trials as input and assesses their RoB. The proposed model outperforms state-of-the-art approaches and other LLMs in many settings, with strong discriminative performance (AUC-ROC = 0.83) for different bias types.
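To make the reported metric concrete, the following is a minimal sketch of how an AUC-ROC value can be computed for a binary RoB classifier. It is not the authors' evaluation code. The function name `auc_roc`, the label convention (1 = low risk, 0 = high or unclear risk), and the toy labels and scores are all assumptions made for illustration. The sketch uses the pairwise definition of AUC-ROC: the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def auc_roc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = low risk of bias, 0 = high/unclear risk of bias.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.7]
print(auc_roc(labels, scores))
```

In this toy example, 7 of the 9 positive-negative pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the reported 0.83 should be read.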
