Reproducibility of Computer Network Experiments through the RNP Data Catalog

Vitor Fontana Zanotelli; Nilson Luís Damasceno; Arthur Almeida Vianna; Gustavo Araujo; Michael Prieto Hernandez; Giovanni Comarela; Magnos Martinello; Antonio A. de A. Rocha

doi:10.5753/wgrs.2024.3264

Vitor Fontana Zanotelli UFES
Nilson Luís Damasceno UFF
Arthur Almeida Vianna UFF
Gustavo Araujo RNP
Michael Prieto Hernandez RNP
Giovanni Comarela UFES
Magnos Martinello UFES
Antonio A. de A. Rocha UFF

Abstract

The importance of reproducing experiments is widely discussed in the scientific community, and the literature reports both challenges and proposed solutions. A common obstacle is data availability. RNP collects and stores data related to its services, and the Catálogo de Dados (Data Catalog) project was created to make those data easier to access. This work presents the Catalog in two stages: (i) a description of the project and its databases, and (ii) a replication of a study from the literature on the use of machine learning models for RTT prediction. The replication uses recurrent neural networks (RNNs, GRUs, and LSTMs) and achieves results close to the original ones.
