Domain-Specific Fine-Tuning of Large Language Models for Pharmacological Question Answering

Felipe Verol; Andre Gomes Regino; Fernando Rezende Zagatti; Ferrucio de Franco Rosa; Julio Cesar Dos Reis; Rodrigo Bonacin

doi:10.5753/sbcas.2026.21567

Felipe Verol (UNICAMP / CTI Renato Archer)
Andre Gomes Regino (CTI Renato Archer / UniFAJ / UniMAX)
Fernando Rezende Zagatti (CTI Renato Archer / UniFAJ / UniMAX / UFSCar)
Ferrucio de Franco Rosa (CTI Renato Archer / UniFAJ / UniMAX)
Julio Cesar Dos Reis (UNICAMP)
Rodrigo Bonacin (CTI Renato Archer / UniFAJ / UniMAX)

Abstract

Large Language Models (LLMs) perform well on general NLP tasks but face challenges in specialized domains such as pharmacology. This study investigates whether fine-tuning on DrugBank data improves response reliability. We construct a question–answer dataset from the absorption and metabolism sections of DrugBank and fine-tune a LLaMA 3.1 8B model using efficient adaptation techniques. We evaluate the fine-tuned model against the original model using ROUGE-L, BLEU, and Exact Match, complemented by a qualitative analysis. The fine-tuned model shows improvements over the original and produces more domain-specific responses, indicating that fine-tuning can effectively adapt LLMs to pharmacological tasks.
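
To make two of the reported metrics concrete, the following is a minimal sketch of how Exact Match and ROUGE-L could be computed for a single generated answer. It is not the authors' evaluation code. The normalization step (lowercasing and punctuation stripping), the use of a balanced F1 for ROUGE-L, and the function names are all assumptions made for illustration, and the example strings are invented rather than taken from the dataset.

```python
import string


def normalize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 if the normalized token sequences are identical, else 0."""
    return int(normalize(prediction) == normalize(reference))


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L as a balanced F1 over LCS-based precision and recall (assumed beta = 1)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical strings, for illustration only.
    reference = "Rapidly absorbed after oral administration."
    prediction = "It is rapidly absorbed after oral administration"
    print(exact_match(prediction, reference))
    print(round(rouge_l_f1(prediction, reference), 4))
```

In this toy case the prediction contains every reference token in order but adds two extra words, so Exact Match is 0 while ROUGE-L stays high. That gap is one reason a strict and an overlap-based metric are useful together when comparing a fine-tuned model with its base version.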
