Constrained Least-Squares Density-Difference Estimation

Tuan Duong NGUYEN
Marthinus Christoffel DU PLESSIS

IEICE TRANSACTIONS on Information and Systems, Vol.E97-D, No.7, pp.1822-1829
Publication Date: 2014/07/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.1822
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keywords: density difference, asymptotic variance, L2-distance, bias, class-balance change, two-sample homogeneity test

We address the problem of estimating the difference between two probability densities. A naive approach is a two-step procedure that first estimates the two densities separately and then computes their difference. However, such a procedure does not necessarily work well: the first step is performed without regard to the second, so a small estimation error in the first step can cause a large error in the second. Recently, a single-shot method called the least-squares density-difference (LSDD) estimator was proposed. LSDD estimates the density difference directly, without estimating the two densities separately, and was demonstrated to outperform the two-step approach. In this paper, we propose a variant of LSDD called the constrained least-squares density-difference (CLSDD) estimator and prove theoretically that CLSDD improves the accuracy of density-difference estimation for correctly specified parametric models. The usefulness of the proposed method is also demonstrated experimentally.
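To make the single-shot idea concrete, the following is a minimal sketch of an LSDD-style estimator. It is not taken from this paper. It assumes the commonly published formulation with a Gaussian kernel model and a ridge penalty, under which the coefficients have a closed-form solution. The function names (`gaussian_kernel`, `lsdd_fit`, `lsdd_predict`) and the choices of `sigma`, `lam`, and kernel centers are illustrative assumptions. The constraint that distinguishes CLSDD is not stated in the abstract, so it is not implemented here.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian basis functions: x is (n, d), centers is (b, d); returns (n, b)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def lsdd_fit(x_p, x_q, centers, sigma=0.5, lam=0.1):
    """Fit coefficients theta of g(x) = sum_l theta_l * k(x, c_l) ~ p(x) - q(x).

    Assumed objective: integral of g^2 minus 2 * theta^T h plus lam * ||theta||^2,
    which is minimized by theta = (H + lam * I)^{-1} h.
    """
    d = centers.shape[1]
    # H[l, l'] is the integral of k(x, c_l) * k(x, c_l') over x,
    # which has a closed form for Gaussian kernels.
    sq_dist = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    H = (np.pi * sigma ** 2) ** (d / 2.0) * np.exp(-sq_dist / (4.0 * sigma ** 2))
    # h is the difference of the empirical kernel means of the two samples.
    h = (gaussian_kernel(x_p, centers, sigma).mean(axis=0)
         - gaussian_kernel(x_q, centers, sigma).mean(axis=0))
    return np.linalg.solve(H + lam * np.eye(centers.shape[0]), h)


def lsdd_predict(x, centers, theta, sigma=0.5):
    """Evaluate the estimated density difference at the points x, shape (n, d)."""
    return gaussian_kernel(x, centers, sigma) @ theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_p = rng.normal(0.0, 1.0, size=(500, 1))  # samples from p = N(0, 1)
    x_q = rng.normal(1.0, 1.0, size=(500, 1))  # samples from q = N(1, 1)
    centers = np.linspace(-3.0, 4.0, 21).reshape(-1, 1)  # hypothetical center grid
    theta = lsdd_fit(x_p, x_q, centers)
    grid = np.array([[-1.0], [0.5], [2.0]])
    print(lsdd_predict(grid, centers, theta))
```

Neither density is estimated on its own in this sketch: the two samples enter only through the difference of their kernel means in `h`, which is what separates the single-shot approach from the two-step procedure. In practice, `sigma` and `lam` would be chosen by cross-validation rather than fixed as they are here.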