
Nonparametric Regression Method Based on Orthogonalization and Thresholding
Katsuyuki HAGIWARA
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E94-D
No.8
pp.1610-1619
Publication Date: 2011/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.1610
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: nonparametric regression, orthogonalization, hard thresholding, model selection
Summary:
In this paper, we consider a nonparametric regression problem using a learning machine defined by a weighted sum of fixed basis functions, where the number of basis functions, or equivalently, the number of weights, is equal to the number of training data. For this learning machine, we propose a training scheme based on orthogonalization and thresholding. In this scheme, vectors of basis function outputs are orthogonalized, and coefficients of the orthogonalized vectors are estimated instead of weights. A coefficient is set to zero if it is less than a predetermined threshold level assigned componentwise to each coefficient. We then obtain the resulting weight vector by transforming the thresholded coefficients. For this training scheme, we propose asymptotically reasonable threshold levels to distinguish contributing components from unnecessary ones. To see how this works in a simple case, we derive an upper bound for the generalization error of the training scheme with the given threshold levels. The bound shows that the increase in the generalization error is of O(log n/n) when there is a sparse representation of a target function in an orthogonal domain. In implementing the training scheme, eigendecomposition or the Gram–Schmidt procedure is employed for orthogonalization, and the corresponding training methods are referred to as OHTED and OHTGS. Furthermore, modified versions of OHTED and OHTGS, called OHTED2 and OHTGS2 respectively, are proposed for reduced estimation bias. On real benchmark datasets, OHTED2 and OHTGS2 are found to exhibit relatively good generalization performance. In addition, OHTGS2 is found to obtain a sparse representation of a target function in terms of the basis functions.
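
The pipeline described in the summary (orthogonalize the basis output vectors, estimate coefficients in the orthogonal domain, hard-threshold them componentwise, then transform back to weights) can be illustrated with a minimal sketch of the Gram–Schmidt variant. This is not the paper's implementation: the Gaussian basis, its width, the function names, the use of the coefficient magnitude in the comparison, and the single threshold value shared by all components are assumptions made for illustration. The summary does not state the threshold levels proposed in the paper, so none are reproduced here.

```python
import math


def gaussian_basis(xs, width):
    """Column j holds the outputs of a Gaussian basis function centred at
    xs[j], evaluated at every input (assumed basis; n functions for n data)."""
    return [[math.exp(-((xi - xj) ** 2) / (2.0 * width ** 2)) for xi in xs]
            for xj in xs]


def gram_schmidt(columns):
    """Modified Gram-Schmidt. Returns (Q, R) with Q a list of orthonormal
    vectors and R upper triangular, so columns[j] = sum_k R[k][j] * Q[k].
    Assumes the columns are linearly independent."""
    m = len(columns)
    Q = []
    R = [[0.0] * m for _ in range(m)]
    for j, col in enumerate(columns):
        v = list(col)
        for k, q in enumerate(Q):
            R[k][j] = sum(a * b for a, b in zip(q, v))
            v = [a - R[k][j] * b for a, b in zip(v, q)]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        Q.append([a / R[j][j] for a in v])
    return Q, R


def hard_threshold(coeffs, levels):
    """Zero every coefficient whose magnitude does not exceed its own level."""
    return [c if abs(c) > t else 0.0 for c, t in zip(coeffs, levels)]


def back_substitute(R, c):
    """Solve the upper-triangular system R w = c for the weight vector w."""
    m = len(c)
    w = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(R[i][j] * w[j] for j in range(i + 1, m))
        w[i] = (c[i] - tail) / R[i][i]
    return w


def fit_thresholded(columns, y, levels):
    """Orthogonalize, estimate coefficients, threshold, map back to weights."""
    Q, R = gram_schmidt(columns)
    coeffs = [sum(a * b for a, b in zip(q, y)) for q in Q]
    kept = hard_threshold(coeffs, levels)
    return back_substitute(R, kept), kept


# Toy usage: four inputs, four basis functions, one illustrative level.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.3, 2.0, 0.25, -0.1]
columns = gaussian_basis(xs, width=0.5)
weights, kept = fit_thresholded(columns, ys, levels=[0.5] * len(xs))
print("thresholded coefficients:", kept)
print("weights:", weights)
```

With all levels set to zero the sketch interpolates the training targets, since there are as many basis functions as data points; raising the levels removes components in the orthogonal domain before the weights are recovered.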

