Comparing LLMs in business rule-following
Abstract
Large Language Models (LLMs) have shown strong capabilities in language understanding and instruction following. However, to the authors' knowledge, no prior work has evaluated their performance in real-world internal industry rule-following scenarios in comparison with human workers. The present R&D project analyzes the applicability of LLMs for improving efficiency in task analysis and scheduling through automatic team assignment guided by a set of internal business rules. The study was funded by SUFRAMA and is a collaboration between INDT and Motorola Mobility. The experimental results show that lightweight open LLMs have, on average, lower accuracy than the average human worker (57.5% vs. 86.25%) and a higher divergence rate (90% vs. 45%).
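The abstract does not spell out how accuracy and divergence rate are computed. Purely as an illustration, and not the paper's actual method, the sketch below assumes accuracy is the fraction of team assignments that match a reference answer key and divergence rate is the fraction of tasks where the model and a human worker choose different teams; the team labels, function names, and data are hypothetical.

# Hypothetical sketch (not the paper's code): accuracy against a reference
# answer key and a divergence rate between model and human team assignments,
# assuming each task has exactly one correct team label.

from typing import List

def accuracy(predicted: List[str], reference: List[str]) -> float:
    """Fraction of team assignments that match the reference answer key."""
    assert len(predicted) == len(reference)
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)

def divergence_rate(model: List[str], human: List[str]) -> float:
    """Fraction of tasks where the model and the human worker disagree."""
    assert len(model) == len(human)
    differing = sum(m != h for m, h in zip(model, human))
    return differing / len(model)

# Toy usage with made-up team labels:
reference = ["backend", "frontend", "qa", "backend"]
model_out = ["backend", "qa", "qa", "frontend"]
human_out = ["backend", "frontend", "qa", "frontend"]

print(f"model accuracy:  {accuracy(model_out, reference):.2%}")
print(f"human accuracy:  {accuracy(human_out, reference):.2%}")
print(f"divergence rate: {divergence_rate(model_out, human_out):.2%}")

Under these assumptions, a higher divergence rate simply means the model's choices depart more often from the human baseline, independently of which of the two is correct on a given task.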
Keywords:
LLMs, rule-following, business rules, open-source models
Published
2025-05-12
How to Cite
FERREIRA, Nikson Bernardes Fernandes; FREITAS, William; MELO, Hallyson; CARVALHO, Andre; BORGES, Thiago; MARQUES, Rodrigo.
Comparing LLMs in business rule-following. In: IBERO-AMERICAN CONFERENCE ON SOFTWARE ENGINEERING (CIBSE), 28., 2025, Ciudad Real, Spain.
Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 368-371.
DOI: https://doi.org/10.5753/cibse.2025.35327.
