Mutual Kernel Matrix Completion

Rachelle RIVERO  Richard LEMENCE  Tsuyoshi KATO  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.8   pp.1844-1851
Publication Date: 2017/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7059
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: kernel matrix completion, EM algorithm, Kullback-Leibler divergence, support vector machine (SVM), data fusion

Summary: 
With the large influx of data today, extracting knowledge from it has become an interesting but tedious task for data scientists, particularly when the data are heterogeneous and have missing information. Many data completion techniques have been introduced, especially since the advent of kernel methods, which allow heterogeneous data sets to be represented in a single form: as kernel matrices. However, among the many data completion techniques available in the literature, the problem of mutually completing several incomplete kernel matrices has received little attention. In this paper, we present a new method, the Mutual Kernel Matrix Completion (MKMC) algorithm, which tackles the problem of mutually inferring the missing entries of multiple kernel matrices by combining the notions of data fusion and kernel matrix completion. We apply it to biological data sets for use in a classification task. We first introduce an objective function that is minimized using the EM algorithm, which in turn yields estimates of the missing entries of the kernel matrices involved. The completed kernel matrices are then combined to produce a model matrix that can be used to further improve the obtained estimates. An interesting result of our study is that the E-step and the M-step are given in closed form, which makes our algorithm efficient in terms of time and memory. After completion, the completed kernel matrices are used to train an SVM classifier to test how well the relationships among the entries are preserved. Our empirical results show that the proposed algorithm outperforms traditional completion techniques in preserving the relationships among the data points and in accurately recovering the missing kernel matrix entries. MKMC thus offers a promising solution to the problem of mutually estimating a number of relevant incomplete kernel matrices.
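
The summary does not state the update rules, so the following is only a minimal Python sketch of the alternating scheme it describes, not the paper's implementation. It rests on several assumptions: missing entries occur as whole rows and columns (a sample absent from one data source), the model matrix is the plain average of the current completed matrices, and each matrix is completed with the standard closed-form completion that minimizes the Kullback-Leibler divergence between zero-mean Gaussians. The names `complete_with_model`, `mutual_completion`, `n_iter`, and `ridge` are hypothetical.

```python
import numpy as np


def complete_with_model(K, observed, M):
    """Fill the missing rows/columns of K using model matrix M.

    Assumed rule (KL-minimizing Gaussian completion), with v = observed
    indices and h = missing indices:
        Q_vh = K_vv M_vv^{-1} M_vh
        Q_hh = M_hh - M_hv M_vv^{-1} M_vh + M_hv M_vv^{-1} K_vv M_vv^{-1} M_vh
    """
    observed = np.asarray(observed, dtype=bool)
    v = np.flatnonzero(observed)
    h = np.flatnonzero(~observed)
    Q = np.array(K, dtype=float, copy=True)
    if h.size == 0:
        return Q  # nothing to complete
    K_vv = Q[np.ix_(v, v)]
    M_vh = M[np.ix_(v, h)]
    W = np.linalg.solve(M[np.ix_(v, v)], M_vh)  # M_vv^{-1} M_vh
    Q_vh = K_vv @ W
    Q_hh = M[np.ix_(h, h)] - M_vh.T @ W + W.T @ K_vv @ W
    Q[np.ix_(v, h)] = Q_vh
    Q[np.ix_(h, v)] = Q_vh.T
    Q[np.ix_(h, h)] = (Q_hh + Q_hh.T) / 2.0  # remove round-off asymmetry
    return Q


def mutual_completion(kernels, masks, n_iter=20, ridge=1e-8):
    """Alternate between a model update and a completion update.

    kernels: list of (n, n) arrays; entries in missing rows/columns may be NaN.
    masks:   list of length-n boolean arrays, True where the sample is observed.
    Assumes each kernel has at least one observed sample.
    """
    n = kernels[0].shape[0]
    masks = [np.asarray(m, dtype=bool) for m in masks]

    # Initial guess: zeros for missing entries, ones on the missing diagonal.
    completed = []
    for K, m in zip(kernels, masks):
        Q = np.where(np.outer(m, m), K, 0.0)
        idx = np.flatnonzero(~m)
        Q[idx, idx] = 1.0
        completed.append(Q)

    for _ in range(n_iter):
        # Model update: average the current completed matrices.
        # The small ridge keeps the observed block of M invertible.
        M = sum(completed) / len(completed) + ridge * np.eye(n)
        # Completion update: re-estimate the missing entries of every matrix.
        completed = [complete_with_model(K, m, M) for K, m in zip(kernels, masks)]

    return completed, sum(completed) / len(completed)
```

Under these assumptions the observed block of each matrix is left untouched and only the missing rows and columns are re-estimated in each pass. The completed matrices could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.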