Knowledge Distillation in Compact Models: An Approach Applied to Text Processing for Public Security
Abstract
This article develops a model for summarizing police reports in the context of public security, with a focus on execution on limited hardware. The approach combines hybrid distillation (over logits and intermediate representations) with supervised fine-tuning using LoRA, applied to a corpus of 19,286 police reports. The evaluation was conducted with an automatic metric (BERTScore F1) and qualitative analysis by specialists. The results show that the distilled model generates clear, coherent, and semantically appropriate summaries, comparable to those of the larger model, while offering superior computational performance, including in hardware-constrained environments.
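To make the described approach concrete, the sketch below shows one way a hybrid distillation objective of this kind can be expressed in PyTorch: a KL term over softened teacher and student logits, an MSE term over intermediate hidden states, and the supervised cross-entropy on the reference summaries. This is a minimal illustration, not the authors' implementation; the function name, the projection layer `proj`, and the weights `alpha`, `beta`, and temperature `T` are assumptions.

```python
# Illustrative sketch of a hybrid distillation loss (logits + intermediate
# representations + supervised term). Weights and layer pairing are assumed.
import torch
import torch.nn.functional as F


def hybrid_distillation_loss(student_logits, teacher_logits,
                             student_hidden, teacher_hidden,
                             labels, proj, alpha=0.5, beta=0.5, T=2.0):
    # Response-based distillation: KL between softened output distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Intermediate-representation distillation: MSE after projecting the
    # student's hidden states into the teacher's hidden dimension.
    feat = F.mse_loss(proj(student_hidden), teacher_hidden)

    # Supervised term on the reference summaries (padding labels set to -100).
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )

    return ce + alpha * kd + beta * feat
```

In such a setup, `proj` would typically be a `torch.nn.Linear` mapping the student's hidden size to the teacher's, and the weighting terms would be tuned on validation data.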
Published
2025-09-29
How to Cite
BARCELAR, Ricardo Rodrigues; GARCIA, Leonardo Arruda Vilela; HELENO, Alan Papafanurakis; VENTURA, Thiago Meirelles; OLIVEIRA, Allan Gonçalves de. Knowledge Distillation in Compact Models: An Approach Applied to Text Processing for Public Security. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 16., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 42-51. DOI: https://doi.org/10.5753/stil.2025.37812.
