A set of data bases to support intelligent agents in Internet Infrastructure routing domains

Julião Braga; Joao Silva; Nizam Omar

doi:10.5753/wpietf.2019.6579

Julião Braga INESC-ID
Joao Silva INESC-ID
Nizam Omar MACKENZIE

Abstract

This paper presents the Internet Infrastructure Data Base (IIDB), a set of three data bases (iidb.rfc, iidb.person, and iidb.acronym) that are key pieces to support the development of machine learning techniques by the intelligent elements of the Autonomous Architecture Over Restricted Domains (A2RD). The data in iidb.rfc and iidb.person were created by processing the contents of the RFC Index web page, while the data in iidb.acronym were created by processing the files in the Request for Comments (RFC) repository, produced and maintained by the RFC Editor. IIDB data are stored in JavaScript Object Notation (JSON), and the templates are available on the same site where the data bases are deposited, making them accessible from any programming language.

Keywords: IETF, IRTF, RFC, acronym, Internet Infrastructure, agents.
