On the impact of missing value imputation methods for multiple kernel learning on bipartite graphs

  • Victor Vidal, Universidade Federal Rural de Pernambuco
  • Tássia Bastos, Universidade Federal Rural de Pernambuco
  • Rafael Ferreira Mello, Universidade Federal Rural de Pernambuco / Cesar School
  • Péricles Miranda, Universidade Federal Rural de Pernambuco
  • André C. A. Nascimento, Universidade Federal Rural de Pernambuco / Cesar School


Over the last decade, the study of pharmacological networks has received considerable attention because of its relevance to the drug discovery process. Many approaches for predicting biological interactions have been proposed, especially in the area of multiple kernel learning (MKL). These methods are integrative approaches that can handle heterogeneous data sources in the form of kernels, but they can suffer from missing data. Missing values in the base kernel matrices are usually handled with simple techniques, such as imputing zeros or the mean or median of the kernel matrix. In this work, we evaluated techniques for handling missing values in the context of bipartite networks. Our analyses showed that, depending on the amount of missing data, k-NN and Singular Value Decomposition (SVD) imputation performed much better than the other techniques, yielding encouraging results, while zero-fill performed worst among all evaluated methods.

Keywords: machine learning, kernel methods, bioinformatics
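
As an illustration of the imputation strategies named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a base kernel is stored as a square NumPy array with missing entries marked as NaN. The function names `impute_kernel` and `impute_kernel_svd`, the `rank` and `n_iter` parameters, and the toy matrix `K` are hypothetical. The SVD variant shown is a generic iterative low-rank imputation and may differ from the one evaluated in the paper.

```python
import numpy as np


def impute_kernel(K, strategy="mean"):
    """Fill NaN entries with a constant derived from the observed entries."""
    K = np.array(K, dtype=float)  # work on a copy
    missing = np.isnan(K)
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        fill = np.nanmean(K)
    elif strategy == "median":
        fill = np.nanmedian(K)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    K[missing] = fill
    return K


def impute_kernel_svd(K, rank=2, n_iter=50):
    """Generic iterative SVD imputation (an assumption, not the paper's code).

    Start from a mean-filled matrix, then repeatedly overwrite only the
    missing entries with their values in a rank-`rank` reconstruction.
    """
    K = np.array(K, dtype=float)
    missing = np.isnan(K)
    filled = impute_kernel(K, "mean")
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled[missing] = low_rank[missing]
    return filled


# Toy symmetric kernel with one missing pair of entries (hypothetical data).
K = np.array([
    [1.0, 0.8, np.nan],
    [0.8, 1.0, 0.5],
    [np.nan, 0.5, 1.0],
])

K_zero = impute_kernel(K, "zero")
K_mean = impute_kernel(K, "mean")
K_median = impute_kernel(K, "median")
K_svd = impute_kernel_svd(K, rank=2)
```

Constant fills ignore the structure of the kernel, whereas the low-rank variant uses the observed entries to estimate the missing ones. None of these fills guarantees that the completed matrix is positive semidefinite, so a practical pipeline may need an additional correction step before the kernel is passed to an MKL method.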


