Sliced WANs for Data-Intensive Science: Deployment Experiences and Performance Analysis

  • Edgard C. Pontes UFES
  • Vitor Zanotelli UFES
  • Magnos Martinello UFES
  • Jordi Ros-Giralt Qualcomm Europe
  • Everson S. Borges UFES
  • Moisés R. N. Ribeiro UFES
  • Harvey Newman Caltech

Abstract


Transferring the massive data sets produced by Data-Intensive Science (DIS) systems, such as those generated by high-energy experiments at CERN in Switzerland and France and at the Sirius synchrotron light source in Brazil, often relies on physical WAN infrastructure provided by various National Research and Education Networks (NRENs), including ESnet, Géant, Internet2, and RNP, among others. Sliced WANs bring a new infrastructure paradigm yet to be exploited by DIS, but a realistic study of these systems poses a significant challenge due to their complexity, scale, and the number of factors affecting data transport. In this paper, we address some of these challenges by deploying and evaluating a virtual infrastructure for data transport within a representative national-scale WAN. Our approach encompasses two main aspects: i) evaluating the performance of TCP congestion control algorithms (BBR versus Cubic) when only a single path is available for the data transfer; and ii) assessing flow completion times (related to the management of bandwidth allocation) for sets of interdependent transfers in an environment provided by a network slice.

Published
2024-05-20
PONTES, Edgard C.; ZANOTELLI, Vitor; MARTINELLO, Magnos; ROS-GIRALT, Jordi; BORGES, Everson S.; RIBEIRO, Moisés R. N.; NEWMAN, Harvey. Sliced WANs for Data-Intensive Science: Deployment Experiences and Performance Analysis. In: BRAZILIAN SYMPOSIUM ON COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS (SBRC), 42., 2024, Niterói/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 461-474. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc.2024.1425.
