A Fast and Scalable Manycore Implementation for an On-Demand Learning to Rank Method

Mateus F. e Freitas; Daniel de Sousa; Wellington Martins; Thierson Rosa; Rodrigo Silva; Marcos Gonçalves

doi:10.5753/wscad.2016.14254

Mateus F. e Freitas UFG
Daniel de Sousa UFMG
Wellington Martins UFG
Thierson Rosa UFG
Rodrigo Silva UFMG
Marcos Gonçalves UFMG

DOI: https://doi.org/10.5753/wscad.2016.14254

Resumo

Learning to rank (L2R) works by constructing a ranking model from training data so that, given unseen data (query), a somewhat similar ranking is produced. Almost all work in L2R focuses on ranking accuracy leaving performance and scalability overlooked. In this work we present a fast and scalable manycore (GPU) implementation for an on-demand L2R technique that builds ranking models on the ﬂy. Our experiments show that we are able to process a query (build a model and rank) in only a few milliseconds, achieving a speedup of 508x over a serial baseline and 4x over a parallel baseline for the best case. We extend the implementation to work with multiple GPUs, further increasing the speedup over the parallel baseline to approximately x16 when using 4 GPUs.

Referências

Agrawal, R., Imieliínski, T., and Swami, A. (1993). Mining associ ation rules between sets of items in large databases. ACM SIGMOD Record.

Atreas, N. and Karanikas, C. (2007). A fast pattern matching algorithm based on prime numbers and hashing approximation. NATO Security through Science Series -D: Information and Communication Security.

Buckles, B. P. and Lybanon, M. (1977). Algorithm 515: Generation of a vector from the lexicographical index [g6]. TOMS.

Chapelle, O., Chang, Y., and Liu, T.-Y. (2011). Future directions in learning to rank. In Yahoo! Learning to Rank Challenge.

De Sousa, D. X., Rosa, T. C., Martins, W. S., Silva, R., and Gonçalves, M. A. (2012). Improving on-demand learning to rank through parallelism. In WISE.

Duh, K. and Kirchhoff, K. (2008). Learning to rank with partially-labeled data. In SIGIR. ACM.

Geng, X., Liu, T.-Y., Qin, T., Arnold, A., Li, H., and Shum, H.-Y. (2008). Query dependent ranking using k-nearest neighbor. In SIGIR. ACM.

Harris, M. et al. (2007). Optimizing parallel reduction in cuda.

Hastie, T., Tibshirani, R., and Friedman, R. (2009). The Elements of NVIDIA Developer Technology. Statistical Learning.

Henneges, C., Hinselmann, G., Jung, S., Madlung, J., Schütz, W., Nordheim, A., and Zell, A. (2011). Ranking methods for the prediction of frequent top scoring peptides from proteomics data. Journal of Proteomics & Bioinformatics. [Jin et al. 2015] Jin, J., Cai, X., Lai, G., and Lin, X. (2015). Gpu-accelerated parallel algorithms for linear ranksvm. The Journal of Supercomputing.

Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval.

Luitjens, J. (2013). Cuda pro tip: Increase performance with vectorized https://devblogs.nvidia.com/parallelforall/memory access. cuda-pro-tip-increase-performance-with-vectorizedmemory-access/, acessed in june 2016.

Lv, Y., Moon, T., Kolari, P., Zheng, Z., Wang, X., and Chang, Y. (2011). Learning to model relatedness for news recommendation. In WWW. ACM.

McCaffrey, J. (2004). Generating the mth Lexicographical Element of a Mathematical Combination. http://msdn.microsoft.com/en-us/ library/aa289166(v=vs.71).aspx, acessed in june 2016.

Page, L., Brin, S., Motwani, R., and Winograd, T. (1999). The pager ank citation ranking: bringing order to the web.

Qin, T., Liu, T.-Y., Xu, J., and Li, H. (2010). Letor: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval. [Shukla et al. 2012] Shukla, S., Lease, M., and Tewari, A. (2012). Parallelizing listnet training using spark. In SIGIR. ACM.

Silva, R., Gonçalves, M. A., and Veloso, A. (2011). Rule-based active sampling for learning to rank. In ECML PKDD. Springer.

Upadhyaya, S. R. (2013). Parallel approaches to machine learning—a comprehensive survey. JPDC.

Veloso, A. A., Almeida, H. M., Gonçalves, M. A., and Meira Jr, W. (2008). Learning to rank at query-time using association rules. In SIGIR. ACM.

Wang, S., Wu, Y., Gao, B. J., Wang, K., Lauw, H. W., and Ma, J. (2015). A cooperative coevolution framework for parallel learning to rank. TKDE. [Yue et al. 2007] Yue, Y., Finley, T., Radlinski, F., and Joachims, T. (2007). A support vector method for optimizing average precision. In SIGIR. ACM.