Theoretical Analysis of Density Ratio Estimation

Takafumi KANAMORI  Taiji SUZUKI  Masashi SUGIYAMA  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E93-A, No.4, pp.787-798
Publication Date: 2010/04/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E93.A.787
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Algorithms and Data Structures
Keyword: density ratio estimation, density estimation, logistic regression, asymptotic analysis, Gaussian assumption

Density ratio estimation has attracted a great deal of attention recently since it can be used for various data processing tasks. In this paper, we consider three methods of density ratio estimation: (A) the numerator and denominator densities are estimated separately and the ratio of the estimated densities is then computed; (B) a logistic regression classifier that discriminates denominator samples from numerator samples is learned and the ratio of the posterior probabilities is then computed; and (C) the density ratio function is modeled directly and learned by minimizing the empirical Kullback-Leibler divergence. We first prove that when the numerator and denominator densities are known to be members of the exponential family, (A) is better than (B) and (B) is better than (C). We then show that once the model assumption is violated, (C) is better than (A) and (B). Thus, in practical situations where no exact model is available, (C) would be the most promising approach to density ratio estimation.
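
As an illustration only, the three approaches (A), (B), and (C) can be sketched for one-dimensional data as follows. Everything specific here is an assumption made for the example and is not taken from the paper: the two Gaussian sampling distributions, the sample size, the quadratic features (x, x^2), the Newton and gradient-ascent solvers, and all function names. Method (C) is written in the style of a log-linear model fitted by maximizing the empirical log-ratio over numerator samples under a normalization constraint on the denominator samples. The sketch does not reproduce the paper's asymptotic analysis; it only shows what each estimator computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x_nu = rng.normal(0.0, 1.0, n)  # numerator samples, assumed p_nu = N(0, 1)
x_de = rng.normal(0.5, 1.5, n)  # denominator samples, assumed p_de = N(0.5, 1.5^2)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def true_ratio(x):
    return gauss_pdf(x, 0.0, 1.0) / gauss_pdf(x, 0.5, 1.5)

def features(x, bias):
    """Quadratic features; the log-ratio of two Gaussians is quadratic in x."""
    cols = [x, x ** 2]
    if bias:
        cols.insert(0, np.ones_like(x))
    return np.stack(cols, axis=1)

# (A) Estimate each density separately (Gaussian maximum likelihood), then divide.
def ratio_a(x):
    num = gauss_pdf(x, x_nu.mean(), x_nu.std())
    den = gauss_pdf(x, x_de.mean(), x_de.std())
    return num / den

# (B) Logistic regression: label numerator samples 1 and denominator samples 0.
def fit_logistic(n_iter=30):
    X = np.vstack([features(x_nu, True), features(x_de, True)])
    y = np.concatenate([np.ones(len(x_nu)), np.zeros(len(x_de))])
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):  # Newton's method on the log-likelihood
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        grad = X.T @ (y - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        w = w + np.linalg.solve(hess, grad)
    return w

w_b = fit_logistic()

def ratio_b(x):
    # Bayes' rule: r(x) = [P(nu | x) / P(de | x)] * (n_de / n_nu)
    return (len(x_de) / len(x_nu)) * np.exp(features(x, True) @ w_b)

# (C) Model the ratio directly as r(x) = exp(theta . phi(x)) / Z, with Z chosen so
# that the mean of r over denominator samples is 1, and maximize the mean of
# log r over numerator samples (equivalently, minimize the empirical KL divergence).
def fit_kl(step=0.02, n_iter=3000):
    phi_nu, phi_de = features(x_nu, False), features(x_de, False)
    theta = np.zeros(phi_nu.shape[1])
    for _ in range(n_iter):  # gradient ascent on a concave objective
        s = phi_de @ theta
        wts = np.exp(s - s.max())
        wts /= wts.sum()
        theta = theta + step * (phi_nu.mean(axis=0) - wts @ phi_de)
    s = phi_de @ theta
    log_z = s.max() + np.log(np.mean(np.exp(s - s.max())))
    return theta, log_z

theta_c, log_z_c = fit_kl()

def ratio_c(x):
    return np.exp(features(x, False) @ theta_c - log_z_c)

grid = np.linspace(-1.5, 1.5, 7)
for name, est in [("true", true_ratio), ("A", ratio_a), ("B", ratio_b), ("C", ratio_c)]:
    print(name, np.round(est(grid), 3))
```

In this toy setting all three models are correctly specified, so the estimates should agree closely with the true ratio. To probe the misspecified case discussed in the abstract, one could, for example, draw the samples from non-Gaussian distributions while keeping the estimators unchanged.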