A Distributed System for SearchOnMath Based on the Microsoft BizSpark Program

  • Ricardo M. Oliveira, Universidade Federal de Alfenas
  • Flavio B. Gonzaga, Universidade Federal de Alfenas
  • Valmir C. Barbosa, Universidade Federal do Rio de Janeiro
  • Geraldo B. Xexéo, Universidade Federal do Rio de Janeiro

Abstract


Mathematical information retrieval is a relatively new area: the first search tools capable of retrieving mathematical formulas appeared only a few years ago. Most of the proposals made public so far search internal university databases, small sets of scientific papers, or the English-language Wikipedia, and therefore require only modest computing power. In this context, SearchOnMath is a pioneering tool in that it indexes several different databases and is compatible with several mathematical representation languages. Because it handles a significantly greater number of formulas, it needs a distributed system to support it. The present study is based on the Microsoft BizSpark program. It compares 38 different distributed-system scenarios in order to pinpoint the one that affords the best response times when the SearchOnMath databases are searched for a collection of 120 formulas.
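
To illustrate the kind of comparison the abstract describes, the sketch below picks, from a set of scenarios, the one with the lowest mean response time over a batch of formula queries. It is only a minimal sketch under stated assumptions: the function name `best_scenario`, the scenario labels, the choice of the mean as the ranking criterion, and the sample timings are all hypothetical and are not taken from the study.

```python
from statistics import mean


def best_scenario(timings):
    """Return (scenario, mean_seconds) for the lowest mean response time.

    `timings` maps a scenario name to the list of response times, in
    seconds, measured for each query formula under that scenario.
    Ranking by the mean is an assumption made for this sketch.
    """
    if not timings:
        raise ValueError("no scenarios to compare")
    # Mean response time per scenario.
    means = {name: mean(samples) for name, samples in timings.items()}
    # The scenario whose mean is smallest wins.
    winner = min(means, key=means.get)
    return winner, means[winner]


# Hypothetical measurements for illustration only, not results from the study.
example = {
    "scenario_a": [0.5, 1.5, 1.0],
    "scenario_b": [0.25, 0.75, 0.5],
}
print(best_scenario(example))
```

In a real evaluation each list would hold one measurement per query formula (120 in the study) and there would be one entry per scenario (38 in the study); a median or a percentile could be used in place of the mean if outliers are a concern.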

Published
August 25, 2018
How to Cite

OLIVEIRA, Ricardo M.; GONZAGA, Flavio B.; BARBOSA, Valmir C.; XEXÉO, Geraldo B. A Distributed System for SearchOnMath Based on the Microsoft BizSpark Program. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 33., 2018, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 289-294. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2018.22245.