Resource Block Allocation in Wireless Networks Considering Network Slicing and D2D Communication Using Deep Reinforcement Learning

Hudson H. de Souza Lopes; Flávio H. Teles Vieira

doi:10.5753/sbrc.2026.19785

Hudson H. de Souza Lopes UFG
Flávio H. Teles Vieira UFG

Abstract

In this paper, we consider a wireless network scenario with a single Base Station (BS) that hosts Network Slice Instances (NSIs) and Device-to-Device (D2D) communication. To address the complex problem of allocating Resource Blocks (RBs) while taking into account the Service Level Agreements (SLAs) of each NSI, we propose an approach called DDPG-KRP. It is based on the Deep Deterministic Policy Gradient (DDPG) algorithm, combined with the K-Nearest Neighbors (KNN) algorithm and the Reward Penalization (RP) method. Simulation results show that the proposed algorithm significantly outperforms other methods based on Deep Reinforcement Learning (DRL).
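The abstract does not say how DDPG, KNN, and RP are combined. The sketch below is not the authors' implementation; it shows one common way such pieces can fit together. A continuous "proto-action" (standing in for a DDPG actor output) is mapped to its k nearest discrete RB allocations, a critic-like score picks among them, and the reward is reduced for each slice whose SLA is violated. All function names, the toy action set, the per-slice minimum-rate SLA, and the fixed penalty value are assumptions made for illustration.

```python
import math
from itertools import product


def enumerate_allocations(num_rbs, num_slices):
    """Toy discrete action set: every way to split num_rbs RBs among slices."""
    return [a for a in product(range(num_rbs + 1), repeat=num_slices)
            if sum(a) == num_rbs]


def k_nearest_actions(proto_action, actions, k):
    """Return the k discrete actions closest (Euclidean) to the proto-action."""
    return sorted(actions, key=lambda a: math.dist(a, proto_action))[:k]


def select_action(proto_action, actions, k, q_value):
    """Among the k nearest candidates, keep the one the critic scores highest.

    q_value is a hypothetical stand-in for a trained critic network.
    """
    candidates = k_nearest_actions(proto_action, actions, k)
    return max(candidates, key=q_value)


def penalized_reward(rates, sla_min_rates, penalty=1.0):
    """Sum rate minus a fixed penalty per slice below its assumed SLA minimum."""
    violations = sum(1 for r, m in zip(rates, sla_min_rates) if r < m)
    return sum(rates) - penalty * violations


if __name__ == "__main__":
    actions = enumerate_allocations(num_rbs=4, num_slices=2)
    proto = (2.2, 1.8)  # assumed continuous actor output
    chosen = select_action(proto, actions, k=2, q_value=lambda a: a[0])
    print("chosen allocation:", chosen)
    print("reward:", penalized_reward([3.0, 1.0], [2.0, 2.0], penalty=5.0))
```

Restricting the critic to the k nearest candidates avoids scoring every possible allocation, which matters because the number of allocations grows quickly with the number of RBs and slices. The penalty term is one simple way to steer the agent away from allocations that violate an SLA.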
