Recurrent Neural Network Compression Based on Low-Rank Tensor Representation

Andros TJANDRA  Sakriani SAKTI  Satoshi NAKAMURA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.2   pp.435-449
Publication Date: 2020/02/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2019EDP7040
Type of Manuscript: PAPER
Category: Music Information Processing
Keyword: 
recurrent neural network, model compression, tensor decomposition, deep learning

Summary: 
Recurrent Neural Networks (RNNs) have achieved state-of-the-art performance on many complex tasks involving temporal and sequential data. However, most RNNs require substantial computational power and a large number of parameters in both the training and inference stages. In this work, we apply several tensor decomposition methods, namely CANDECOMP/PARAFAC (CP) decomposition, Tucker decomposition, and Tensor Train (TT) decomposition, to re-parameterize the Gated Recurrent Unit (GRU) RNN. First, we evaluate all tensor-based RNNs on sequence modeling tasks across a range of parameter budgets. Our experimental results show that TT-GRU achieves the best performance at every parameter budget compared to the other decomposition methods. We then evaluate the proposed TT-GRU on a speech recognition task by compressing the bidirectional GRU layers inside the DeepSpeech2 architecture. The results show that the proposed TT-format GRU preserves recognition performance while significantly reducing the number of GRU parameters compared to the uncompressed GRU.
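
The idea behind a TT-format layer is to replace a dense weight matrix with a chain of small TT-cores, so the full matrix is never stored explicitly and the parameter count drops sharply. The sketch below is a minimal NumPy illustration, assuming illustrative mode shapes and TT-ranks chosen here for the example (not the paper's exact configuration or code); it reconstructs a dense matrix from TT-cores and compares parameter counts.

```python
import numpy as np

def tt_to_matrix(cores, in_modes, out_modes):
    """Reconstruct a dense weight matrix of shape (prod(in_modes), prod(out_modes))
    from TT-matrix cores, where core k has shape (r_{k-1}, m_k, n_k, r_k)."""
    d = len(cores)
    full = cores[0]                                  # (1, m_1, n_1, r_1)
    for k in range(1, d):
        # Contract the shared TT-rank index with the next core.
        full = np.tensordot(full, cores[k], axes=([-1], [0]))
    # full has shape (1, m_1, n_1, ..., m_d, n_d, 1); drop the boundary ranks.
    full = np.squeeze(full, axis=(0, -1))
    # Group all input modes first, then all output modes, and flatten.
    perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    full = np.transpose(full, perm)
    return full.reshape(int(np.prod(in_modes)), int(np.prod(out_modes)))

# Illustrative example: a 256x1024 weight matrix factorized with
# input modes (4, 8, 8), output modes (8, 8, 16), and TT-ranks (1, 3, 3, 1).
in_modes, out_modes, ranks = (4, 8, 8), (8, 8, 16), (1, 3, 3, 1)
cores = [0.1 * np.random.randn(ranks[k], in_modes[k], out_modes[k], ranks[k + 1])
         for k in range(len(in_modes))]
W = tt_to_matrix(cores, in_modes, out_modes)
tt_params = sum(c.size for c in cores)
print(W.shape, "TT parameters:", tt_params, "dense parameters:", W.size)
```

With these assumed modes and ranks, the TT representation stores roughly 1,000 parameters instead of the 262,144 of the dense matrix; applying the same re-parameterization to each GRU gate's input-to-hidden and hidden-to-hidden matrices is what yields the compression reported in the paper.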
Recurrent Neural Network (RNN) has achieved many state-of-the-art performances on various complex tasks related to the temporal and sequential data. But most of these RNNs require much computational power and a huge number of parameters for both training and inference stage. Several tensor decomposition methods are included such as CANDECOMP/PARAFAC (CP), Tucker decomposition and Tensor Train (TT) to re-parameterize the Gated Recurrent Unit (GRU) RNN. First, we evaluate all tensor-based RNNs performance on sequence modeling tasks with a various number of parameters. Based on our experiment results, TT-GRU achieved the best results in a various number of parameters compared to other decomposition methods. Later, we evaluate our proposed TT-GRU with speech recognition task. We compressed the bidirectional GRU layers inside DeepSpeech2 architecture. Based on our experiment result, our proposed TT-format GRU are able to preserve the performance while reducing the number of GRU parameters significantly compared to the uncompressed GRU.