Calibrating Recommender Systems with LLMs: Prompt Optimization to Balance Accuracy, Diversity, and Fairness

  • Gabriel Prenassi UFSJ
  • Rodrigo Souza USP
  • Henrique Sekido USP
  • Guilherme Fonseca UFMG
  • Marcelo G. Manzato USP
  • Leonardo Rocha UFSJ

Abstract


Recommender Systems (RSs) play a central role in digital platforms, aiming to deliver relevant and personalized content. However, issues such as popularity bias persist, limiting diversity and fairness. Calibration techniques seek to align recommendations with user preferences, often through post-processing adjustments. With the advent of Large Language Models (LLMs) such as GPT and LLaMA, new opportunities have emerged to personalize recommendations through prompt engineering. While recent prompt optimization approaches have improved ranking accuracy, they often neglect other key aspects such as diversity, coverage, and fairness. This study investigates the use of LLMs for recommendation calibration, comparing their performance against traditional methods. We also evaluate the impact of different prompt optimization strategies across multiple metrics, including MAP, NDCG@10, MRMC, LTC, and GAP. Additionally, we employ a multicriteria utility function based on Multi-Attribute Utility Theory (MAUT) to analyze trade-offs between accuracy and diversity. Our results highlight the potential of LLMs and prompt engineering to enhance both personalization and fairness in RSs.
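To make the trade-off analysis mentioned above more concrete, the following is a minimal sketch of a weighted additive utility, one common form of multi-attribute utility. It is not necessarily the formulation used in the paper: the function name `additive_utility`, the criterion names, the equal weights, and the metric values are all hypothetical and chosen only for illustration, and each criterion is assumed to be already scaled to [0, 1] with higher meaning better.

```python
from typing import Dict


def additive_utility(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted additive utility over criteria assumed to be scaled to [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize by the total weight so the utility also stays in [0, 1].
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Hypothetical, made-up metric values for two prompt strategies
# (not results from the paper).
candidates = {
    "prompt_a": {"accuracy": 0.80, "diversity": 0.40},
    "prompt_b": {"accuracy": 0.70, "diversity": 0.70},
}

# Assumed equal importance for both criteria.
weights = {"accuracy": 0.5, "diversity": 0.5}

# Rank strategies from highest to lowest utility.
ranked = sorted(
    candidates,
    key=lambda name: additive_utility(candidates[name], weights),
    reverse=True,
)
print(ranked)
```

Under these assumed weights, a strategy that gives up some accuracy for a larger gain in diversity can rank higher. Changing the weights shifts that balance, which is the kind of trade-off a multicriteria utility is meant to expose.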

Keywords: Recommendation, LLM, Prompt Engineering, Popularity Bias

Published
10 November 2025
PRENASSI, Gabriel; SOUZA, Rodrigo; SEKIDO, Henrique; FONSECA, Guilherme; MANZATO, Marcelo G.; ROCHA, Leonardo. Calibração de Sistemas de Recomendação com LLMs: Otimização de Prompts para Balancear Precisão, Diversidade e Justiça. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 31., 2025, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 103-111. DOI: https://doi.org/10.5753/webmedia.2025.15613.
