A Variable-to-Fixed Length Lossless Source Code Attaining Better Performance than Tunstall Code in Several Criterions

Mitsuharu ARIMURA  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E101-A   No.1   pp.249-258
Publication Date: 2018/01/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E101.A.249
Type of Manuscript: PAPER
Category: Information Theory
lossless data compression,  source coding,  variable-to-fixed length code,  Tunstall code,  average pointwise coding rate,  worst-case coding rate,  

Full Text: PDF>>
Buy this Article

Tunstall code is known as an optimal variable-to-fixed length (VF) lossless source code under the criterion of average coding rate, which is defined as the codeword length divided by the average phrase length. In this paper we define the average coding rate of a VF code as the expectation of the pointwise coding rate defined by the codeword length divided by the phrase length. We call this type of average coding rate the average pointwise coding rate. In this paper, a new VF code is proposed. An incremental parsing tree construction algorithm like the one that builds Tunstall parsing tree is presented. It is proved that this code is optimal under the criterion of the average pointwise coding rate, and that the average pointwise coding rate of this code converges asymptotically to the entropy of the stationary memoryless source emitting the data to be encoded. Moreover, it is proved that the proposed code attains better worst-case coding rate than Tunstall code.