Advancing Network Monitoring and Operation with In-band Network Telemetry and Data Plane Programmability

  • Jonatas A. Marques UFRGS
  • Luciano Paschoal Gaspary UFRGS

Abstract


Modern communication networks operate under high expectations for performance and resilience, mainly due to the continuous proliferation of nonelastic, highly distributed applications. In this context, closely monitoring the state, behavior, and performance of networking devices and their traffic, as well as quickly troubleshooting problems as they arise, is essential for the operation of network infrastructures. In this thesis, we make several contributions, based on in-band network telemetry and data plane programmability, that advance the discipline of network monitoring and operation. We formalize telemetry orchestration problems, prove their NP-completeness, and propose polynomial-time heuristics to efficiently solve real instances of these problems. We also design a system that combines in-band telemetry and in-network computation to enable highly accurate and fine-grained detection and diagnosis of service-level objective violations. Finally, we introduce an approach that recovers from network link and node failures at data-plane timescales via policy-optimal paths. We also discuss opportunities and challenges for adapting this approach to other time-sensitive network management tasks.

Published
2023-05-22
MARQUES, Jonatas A.; GASPARY, Luciano Paschoal. Advancing Network Monitoring and Operation with In-band Network Telemetry and Data Plane Programmability. In: DISSERTATION DIGEST - BRAZILIAN SYMPOSIUM ON COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS (SBRC), 41., 2023, Brasília/DF. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 112-119. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc_estendido.2023.677.