Group-Linking Method: A Unified Benchmark for Machine Learning with Recurrent Neural Network

Tsungnan LIN, C. Lee GILES

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E90-A, No.12, pp.2916-2929
Publication Date: 2007/12/01
Online ISSN: 1745-1337
DOI: 10.1093/ietfec/e90-a.12.2916
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Neural Networks and Bioengineering
Keyword: 
recurrent neural networks, finite state machines, grammatical inference, NARX neural networks

Summary: 
This paper proposes the Group-Linking Method, which controls the complexity of the sequential function in order to construct Finite Memory Machines (FMMs) of minimal order, that is, machines with the largest number of states attainable for their memory taps. Finding a machine with the maximum number of states is a nontrivial problem because the total number of machines with memory order k is $(256)^{2^{k-2}}$, a very large number. An analysis of the Group-Linking Method shows that the data necessary to reconstruct an FMM is the set of strings no longer than the depth of the machine plus one, which is significantly less than traditional greedy machine learning algorithms require. The Group-Linking Method provides a systematic way of generating unified benchmarks for evaluating the capability of machine learning techniques. One example is testing the learning capability of recurrent neural networks. The problem of encoding finite state machines with recurrent neural networks has been extensively explored. However, the great representational power of those networks does not guarantee that a solution can be found by learning. Previous learning benchmarks are shown to be structurally not rich enough in terms of solutions in weight space. The proposed set of benchmarks, with its great expressive power, can serve as a convenient framework in which to study the learning and computation capabilities of various network models. A fundamental understanding of the capabilities of these networks will allow users to select the most appropriate model for a given application.
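
To give a sense of the two quantities mentioned in the summary, the following is a minimal Python sketch and not the authors' method. It assumes the machine count is read as 256 raised to the power 2^(k-2) (an interpretation of the flattened exponent), a binary alphabet, and non-empty strings. The names `num_machines`, `strings_up_to`, and the example `depth` value are hypothetical and chosen only for illustration.

```python
from itertools import product


def num_machines(k: int) -> int:
    """Machine count for memory order k, assuming the reading 256^(2^(k-2))."""
    if k < 2:
        raise ValueError("this reading of the formula assumes k >= 2")
    return 256 ** (2 ** (k - 2))


def strings_up_to(max_len: int, alphabet: str = "01") -> list[str]:
    """All non-empty strings over `alphabet` with at most `max_len` symbols."""
    return [
        "".join(symbols)
        for n in range(1, max_len + 1)
        for symbols in product(alphabet, repeat=n)
    ]


# The count grows very quickly with the memory order k.
for k in range(2, 5):
    print(k, num_machines(k))

# Hypothetical machine depth; the summary states that strings no longer
# than depth + 1 suffice to reconstruct the machine.
depth = 3
training_set = strings_up_to(depth + 1)
print(len(training_set))
```

Under these assumptions the candidate data set stays small (a few dozen strings for a depth of 3) while the number of candidate machines already exceeds four billion at k = 4, which illustrates why exhaustive search over machines is impractical.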