Boosting not so Large Language Models by using Knowledge Graphs and Reinforcement Learning
Abstract
Ensuring the viability of large language models (LLMs) in settings that require data privacy and offer only limited on-premise resources is a significant current challenge. This work investigates how to tackle this challenge by using knowledge graphs (KGs) and reinforcement learning (RL) to enhance smaller LLMs, reducing non-factual responses and response gaps. We evaluated variations of GPT (4o, 4, and 3.5), Llama2 (7B, 13B, and 70B), and Llama3 (8B and 70B) on multi-label classification and information extraction, with and without KG and RL, and also fine-tuned a BERT model. Llama3 8B combined with KG and RL outperformed all the other evaluated LLMs as well as the fine-tuned BERT model.