Viewport Prediction in 360° Video Streaming with Deep Learning: Modeling and Evaluation
Abstract
Immersive 360° video streaming requires predicting future viewports from the angular motion of the HMD (Head-Mounted Display), which enables prefetch requests and mitigates playback interruptions (stalls) caused by the user's immersive navigation and by network instability. Despite advances in Deep Learning models, evaluation remains largely restricted to metrics from the learning domain, namely regression errors, which are insufficient to reflect the impact and applicability of these predictors in real 360° streaming systems. In this context, this work proposes Viewport-P, a viewport prediction model aligned with the operations and metrics of immersive systems. Based on this model, a complete prediction pipeline is implemented, integrating widely established Deep Learning models such as pure CNNs and CNN hybrids with GRU and LSTM. The evaluation combines angular error metrics with system-oriented metrics, in particular spatial buffer efficiency measured through tile miss events. The results show that spatial metrics capture the effect of the prediction horizon and of user dynamics more faithfully, revealing consistent gains for the hybrid CNN models in scenarios relevant to QoE.
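The two families of metrics contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the Viewport-P implementation, which the abstract does not detail. The function names, the 4x6 equirectangular tile grid, the 100° field of view, and the sampled yaw/pitch rectangle used to approximate the viewport footprint are all assumptions made for illustration.

```python
import math

def angular_error_deg(pred, true):
    """Great-circle angle, in degrees, between two (yaw, pitch) directions."""
    yaw_p, pitch_p = (math.radians(v) for v in pred)
    yaw_t, pitch_t = (math.radians(v) for v in true)
    cos_angle = (math.sin(pitch_p) * math.sin(pitch_t)
                 + math.cos(pitch_p) * math.cos(pitch_t) * math.cos(yaw_p - yaw_t))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def viewport_tiles(center, fov=(100.0, 100.0), grid=(4, 6), samples=9):
    """Tiles (row, col) touched by a viewport on an equirectangular grid.

    Assumption: the viewport is approximated by a yaw/pitch rectangle
    sampled on a regular lattice, not by an exact spherical projection.
    """
    yaw_c, pitch_c = center
    rows, cols = grid
    tiles = set()
    for i in range(samples):
        for j in range(samples):
            yaw = yaw_c + fov[0] * (i / (samples - 1) - 0.5)
            pitch = pitch_c + fov[1] * (j / (samples - 1) - 0.5)
            pitch = max(-90.0, min(90.0, pitch))
            # Yaw wraps around at +/-180 degrees; pitch does not wrap.
            col = int(((yaw + 180.0) % 360.0) / 360.0 * cols) % cols
            row = min(rows - 1, int((90.0 - pitch) / 180.0 * rows))
            tiles.add((row, col))
    return tiles

def tile_miss_ratio(pred, true, **kwargs):
    """Fraction of tiles the user needs that the prediction did not cover."""
    needed = viewport_tiles(true, **kwargs)
    fetched = viewport_tiles(pred, **kwargs)
    return len(needed - fetched) / len(needed)

if __name__ == "__main__":
    predicted, actual = (10.0, 0.0), (40.0, 5.0)
    print("angular error (deg):", angular_error_deg(predicted, actual))
    print("tile miss ratio:", tile_miss_ratio(predicted, actual))
```

Under these assumptions, the same angular error can produce different tile miss ratios depending on where the viewport sits relative to tile boundaries, which is one way a system-oriented metric can diverge from a plain regression error.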
Published
25/05/2026
How to Cite
ROSA, Felipe; FERLIN, Simone; COSTA, Joahannes B. D. da; KIMURA, Bruno. Predição de Viewports em Transmissão de Vídeo 360° com Aprendizado Profundo: Modelagem e Avaliação. In: WORKSHOP DE INTELIGÊNCIA ARTIFICIAL PARA REDES DE COMPUTADORES (WIARC), 1., 2026, Praia do Forte/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 141-154. DOI: https://doi.org/10.5753/wiarc.2026.23997.
