Sectum: The Information Security ChatBot

  • Mateus Fernandes dos Santos (UNICAMP)

Abstract

This paper presents the development of Sectum, an information security chatbot in Portuguese built by fine-tuning Llama. To that end, it applies the QLoRA methodology to adjust the model weights, retraining them on a dataset of questions and answers related to information security. The resulting model outperformed the Llama-7B baseline on Portuguese-language tasks overall, standing out in the Semantic Similarity and Textual Entailment tasks. The model is available at https://github.com/MateusFernandes25/Sectrum and https://huggingface.co/MatNLP/Sectrum.
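As a rough illustration of the fine-tuning recipe described above, the sketch below shows a minimal QLoRA setup using the HuggingFace transformers, peft, and bitsandbytes libraries. The base checkpoint name and all hyperparameters (rank, alpha, dropout, target modules) are illustrative assumptions, not the configuration actually used for Sectum.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

    # Load the base model quantized to 4-bit NF4, as prescribed by QLoRA.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, quantization_config=bnb_config, device_map="auto"
    )
    model = prepare_model_for_kbit_training(model)

    # Attach low-rank adapters to the attention projections; only these
    # small adapter matrices receive gradients, the 4-bit base stays frozen.
    lora_config = LoraConfig(
        r=16,  # illustrative rank, not the paper's setting
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()
    # Training on the security Q&A pairs would then proceed with a
    # standard causal-LM Trainer over the tokenized dataset.

Because only the adapter weights are updated while the base model remains in 4-bit precision, this setup lets a 7B-parameter model be fine-tuned on a single commodity GPU, which is what makes QLoRA attractive for domain adaptation of this kind.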

Published
16/09/2024
SANTOS, Mateus Fernandes dos. Sectum: O ChatBot de Segurança da Informação. In: SALÃO DE FERRAMENTAS - SIMPÓSIO BRASILEIRO DE SEGURANÇA DA INFORMAÇÃO E DE SISTEMAS COMPUTACIONAIS (SBSEG), 24., 2024, São José dos Campos/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 161-168. DOI: https://doi.org/10.5753/sbseg_estendido.2024.243394.