Training Data Selection Method for Generalization by Multilayer Neural Networks

Kazuyuki HARA, Kenji NAKAYAMA

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E81-A, No.3, pp.374-381
Publication Date: 1998/03/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section of Selected Papers from the 10th Karuizawa Workshop on Circuits and Systems)
Keywords: training data selection, generalization performance, multilayer neural networks

A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small subset of the training data that guarantees both generalization and fast training when MLNNs are applied to pattern classification. Generalization can be achieved using data located close to the boundaries between pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in this paper. In the proposed method, therefore, the MLNN is first trained using a number of randomly selected data (Step 1). The data for which the output error is relatively large are then selected, and each is paired with the nearest datum belonging to a different class. The newly selected data are in turn paired with their nearest data. In this way, pairs of data located close to the boundary are found. The MLNN is then trained further using these pairs (Step 2). Since Steps 1 and 2 can be combined in several ways, the proposed method can be applied to both off-line and on-line training. The proposed method reduces the number of training data and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
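
The pairing stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the fixed error threshold `err_threshold`, the `max_rounds` cap, and the reading that every newly selected datum is paired with its nearest datum of a different class are all assumptions made for illustration. The per-sample output errors are taken as given, as if produced by the Step 1 network, and the data are assumed to contain at least two classes.

```python
import numpy as np

def nearest_other_class(X, y, i):
    """Index of the sample nearest to X[i] (Euclidean) whose label differs from y[i]."""
    other = np.flatnonzero(y != y[i])          # candidates from the other classes
    d = np.linalg.norm(X[other] - X[i], axis=1)
    return int(other[np.argmin(d)])

def select_boundary_pairs(X, y, errors, err_threshold, max_rounds=10):
    """Hypothetical sketch: collect index pairs that straddle the class boundary.

    errors        -- per-sample output error after preliminary training (Step 1)
    err_threshold -- assumed cutoff for "relatively large" error
    """
    # Start from the samples whose output error is relatively large.
    frontier = [int(i) for i in np.flatnonzero(errors > err_threshold)]
    seen, pairs = set(), set()
    for _ in range(max_rounds):
        new_frontier = []
        for i in frontier:
            if i in seen:
                continue
            seen.add(i)
            j = nearest_other_class(X, y, i)   # partner from a different class
            pairs.add((min(i, j), max(i, j)))
            if j not in seen:                  # the partner is paired in the next round
                new_frontier.append(j)
        if not new_frontier:                   # no newly selected data: stop
            break
        frontier = new_frontier
    return sorted(pairs)
```

For six points on a line at 0 to 5, with the first three in one class and the last three in another, starting from a large error at point 0 yields the pairs (0, 3) and (2, 3). The second pair straddles the boundary, which shows how repeated pairing moves the selection toward it. Whether intermediate pairs such as (0, 3) are kept for Step 2 training is not stated in the abstract; this sketch keeps them all.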