Uma Busca Ordenada Branch-and-Bound para solução do Problema de Inferência Transdutiva usando Máquinas de Vetores Suporte

Hygor Xavier Araújo; Raul Fonseca Neto; Saulo Moraes Villela

doi:10.5753/kdmile.2018.27392

Hygor Xavier Araújo UFJF
Raul Fonseca Neto UFJF
Saulo Moraes Villela UFJF

DOI: https://doi.org/10.5753/kdmile.2018.27392

Resumo

Nesse artigo é apresentado um novo método para resolver o problema de inferência transdutiva cujo objetivo é predizer os rótulos binários de um subconjunto de pontos de interesse de uma função de decisão desconhecida. É utilizada a Máquina de Vetores Suporte para tentar encontrar um limite de decisão. Para obter a hipótese de margem máxima sobre as amostras rotuladas e não rotuladas, é empregada uma busca ordenada (best-first) admissível com base nos valores de margem. Evidências empíricas sugerem que esta solução globalmente ótima pode obter excelentes resultados no problema de transdução. Devido à estratégia de seleção usada, o algoritmo de busca explora apenas uma pequena fração de amostras não rotuladas, tornando-a eficiente para bases de dados de tamanho médio. Os resultados obtidos foram comparados com os resultados da Transductive Support Vector Machine, demonstrando melhores resultados em valores de margem.

Palavras-chave: inferência transdutiva, aprendizado semissupervisionado, busca ordenada admissível, máquina de vetores suporte, separação de baixa densidade

Referências

Bennett, K. P. and Demiriz, A. Semi-supervised support vector machines. In Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II. MIT Press, Cambridge, MA, USA, pp. 368–374, 1999.

Boser, B. E., Guyon, I. M., and Vapnik, V. N. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory. ACM, New York, NY, USA, pp. 144–152, 1992.

Chapelle, O., Schölkopf, B., and Zien, A., editors. Semi-Supervised Learning. MIT Press, Cambridge, MA, 2006.

Chapelle, O., Sindhwani, V., and Keerthi, S. S. Branch and bound for semi-supervised support vector machines. In Advances in Neural Information Processing Systems 19, B. Schölkopf, J. C. Platt, and T. Hoffman (Eds.). MIT Press, Hyatt Regency Vancouver, in Vancouver, B.C., Canada, pp. 217–224, 2007.

Gammerman, A., Vovk, V., and Vapnik, V. Learning by transduction. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 148–155, 1998.

Graepel, T., Herbrich, R., and Obermayer, K. Bayesian transduction. In Proceedings of the 12th International Conference on Neural Information Processing Systems. MIT Press, Cambridge, MA, USA, pp. 456–462, 1999.

Joachims, T. Transductive inference for text classification using support vector machines. In Proceedings of the Sixteenth International Conference on Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 200–209, 1999.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research vol. 12, pp. 2825–2830, 2011.

Platt, J. C. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods, B. Schölkopf, C. J. C. Burges, and A. J. Smola (Eds.). MIT Press, Cambridge, MA, USA, pp. 185–208, 1999.

Vapnik, V. N. The Nature of Statistical Learning Theory. Springer-Verlag, Berlin, Heidelberg, 1995.

Villela, S. M., Leite, S. C., and Fonseca Neto, R. Feature selection from microarray data via an ordered search with projected margin. In Proceedings of the 24th International Conference on Artificial Intelligence. AAAI Press, Buenos Aires, Argentina, pp. 3874–3881, 2015.

Villela, S. M., Leite, S. C., and Fonseca Neto, R. Incremental p-margin algorithm for classification with arbitrary norm. Pattern Recognition vol. 55, pp. 261–272, 2016.