Boosting not so Large Language Models by using Knowledge Graphs and Reinforcement Learning

Abstract


Ensuring the viability of large language models (LLMs) in situations that require data privacy and offer only limited on-premise resources is a significant current challenge. This work investigates how knowledge graphs (KGs) and reinforcement learning (RL) can enhance smaller LLMs by reducing non-factual responses and response gaps. We evaluated variations of GPT (4o, 4, and 3.5), Llama2 (7b, 13b, and 70b), and Llama3 (8b and 70b) on multi-label classification and information extraction, with and without KG and RL, and also fine-tuned a BERT model. Llama3 8b combined with KG and RL outperformed all the other LLMs as well as the fine-tuned BERT model.

Keywords: Large Language Models, Knowledge Graphs, Reinforcement Learning

Published
17/11/2024
BECKHAUSER, William Jones; FILETO, Renato. Boosting not so Large Language Models by using Knowledge Graphs and Reinforcement Learning. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 15., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 165-175. DOI: https://doi.org/10.5753/stil.2024.245396.