Evolutionary Risk-Sensitive Feature Selection for Learning to Rank
Abstract
Learning to Rank (L2R) is one of the main research lines in Information Retrieval. Risk-sensitive L2R is a sub-area of L2R that aims to learn models that are effective on average while reducing the risk of performing poorly on a few but important queries (e.g., medical or legal queries). One way of reducing risk in learned models is to select and remove features that are noisy, redundant, or that favor some queries to the detriment of others. This risk is exacerbated by learning methods that usually maximize an average metric, such as mean average precision (MAP) or Normalized Discounted Cumulative Gain (NDCG). Historically, however, feature selection (FS) methods have focused only on effectiveness and feature reduction as their main objectives. Accordingly, in this work we propose to evaluate FS for L2R with an additional objective in mind, namely risk-sensitiveness. We present novel single- and multi-objective criteria to optimize feature reduction, effectiveness, and risk-sensitiveness, all at the same time. We also introduce a new methodology to explore the search space, proposing effective and efficient extensions of a well-known Evolutionary Algorithm (SPEA2) for FS applied to L2R. Our experiments show that explicitly including risk as an objective criterion is crucial to achieving more effective and risk-sensitive performance. We also provide a thorough analysis of our methodology and experimental results.
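To make the optimization target concrete, the sketch below illustrates in Python how the three objectives could be computed for one candidate feature subset (a binary mask) inside a SPEA2-style loop: feature reduction, mean effectiveness (e.g., NDCG), and a URisk-style risk-sensitiveness measure relative to a full-feature baseline. This is a minimal illustrative sketch under stated assumptions, not the thesis implementation; the names urisk, evaluate_individual, and per_query_ndcg_fn, as well as the toy data, are hypothetical and used only for illustration.

    import random
    from typing import Callable, List, Sequence, Tuple

    def urisk(candidate: Sequence[float], baseline: Sequence[float], alpha: float = 1.0) -> float:
        # URisk-style trade-off: per-query wins over the baseline minus (1 + alpha)
        # times the per-query losses, averaged over all queries. Larger values are safer.
        wins = sum(max(c - b, 0.0) for c, b in zip(candidate, baseline))
        losses = sum(max(b - c, 0.0) for c, b in zip(candidate, baseline))
        return (wins - (1.0 + alpha) * losses) / len(baseline)

    def evaluate_individual(mask: List[int],
                            per_query_ndcg_fn: Callable[[List[int]], Sequence[float]],
                            baseline_ndcg: Sequence[float]) -> Tuple[float, float, float]:
        # Three objectives for one SPEA2 individual (a binary feature mask):
        # feature reduction, mean effectiveness, and risk-sensitiveness.
        per_query_ndcg = per_query_ndcg_fn(mask)          # ranker trained/evaluated on the subset
        reduction = 1.0 - sum(mask) / len(mask)           # larger = fewer features kept
        effectiveness = sum(per_query_ndcg) / len(per_query_ndcg)  # e.g., mean NDCG over queries
        risk = urisk(per_query_ndcg, baseline_ndcg)       # relative to the full-feature baseline
        return reduction, effectiveness, risk

    # Toy usage with random stand-ins for the learned ranker's per-query NDCG (hypothetical data).
    if __name__ == "__main__":
        random.seed(0)
        n_features, n_queries = 10, 50
        baseline = [random.random() for _ in range(n_queries)]
        toy_eval = lambda mask: [min(1.0, b + random.uniform(-0.1, 0.1)) for b in baseline]
        mask = [random.randint(0, 1) for _ in range(n_features)]
        print(evaluate_individual(mask, toy_eval, baseline))

In a Pareto-based algorithm such as SPEA2, these three values would feed the dominance and fitness-assignment steps, so no single weighting of effectiveness against risk has to be fixed in advance.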
References
Dinçer, B. T., Macdonald, C., and Ounis, I. (2016). Risk-Sensitive Evaluation and Learning to Rank using Multiple Baselines. In SIGIR, pages 483–492.
Freitas, M., Sousa, D., Martins, W., Couto, T., Silva, R., and Gonçalves, M. (2016). A Fast and Scalable Manycore Implementation for an On-Demand Learning to Rank Method. In WSCAD, 1:1–12.
Freitas, M., Sousa, D., Martins, W., Couto, T., Silva, R., and Gonçalves, M. (2018). Parallel rule-based selective sampling and on-demand learning to rank. In CCPE, pages 1–12.
Laporte, L., Flamary, R., Canu, S., Déjean, S., and Mothe, J. (2014). Nonconvex regularizations for feature selection in ranking with sparse SVM. In IEEE TNNLS, 25:1118–1130.
Li, B., Li, J., and Tang, K. (2015). Many-Objective Evolutionary Algorithms: A Survey. In CSUR, 48:1–35.
Sousa, D., Canuto, S., Couto, T., Martins, W., and Gonçalves, M. (2016). Incorporating Risk-Sensitiveness into Feature Selection for Learning to Rank. In CIKM, pages 257–266.
Sousa, D., Canuto, S., Gonçalves, M. A., Couto, T., and Martins, W. (2019). Risk-sensitive learning to rank with evolutionary multi-objective feature selection. In TOIS, 37:24:1–24:34.
Sousa, D., Couto, T., Martins, W., Silva, R., and Gonçalves, M. (2012). Improving on-demand learning to rank through parallelism. In WISE, pages 526–537.
Zhang, P., Hao, L., Song, D., Wang, J., Hou, Y., and Hu, B. (2014). Generalized Bias-Variance Evaluation of TREC Participated Systems. In CIKM, pages 3–6.
Zitzler, E., Laumanns, M., and Thiele, L. (2001). SPEA2: Improving the Strength Pareto Evolutionary Algorithm. In EUROGEN, pages 12–19.