Data Compression by Context Sorting

Hidetoshi YOKOO  Masaharu TAKAHASHI  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E79-A   No.5   pp.681-686
Publication Date: 1996/05/25
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Information Theory and Coding Theory
Keywords: data compression, context sorting, universal coding, rank coding

This paper proposes a new lossless data compression method based on a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts the set of all previous contexts to find the one most similar to the current context. It then predicts the next symbol by sorting the previous symbol-context pairs in order of context similarity. The codeword for the next symbol represents the rank of that symbol in the sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character and uses no probability distribution to make a prediction, its compression performance is comparable to that of the best known data compression utilities.
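
The rank-coding idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that context similarity is measured by the length of the common suffix of two contexts, that ties are broken by recency, and that a symbol not yet seen is sent as an escape rank followed by the literal. It also compares contexts by brute force instead of maintaining a sorted structure. The function names are hypothetical.

```python
def common_suffix_len(data, i, j):
    """Length of the longest common suffix of data[:i] and data[:j]."""
    n = 0
    while n < i and n < j and data[i - 1 - n] == data[j - 1 - n]:
        n += 1
    return n


def ranked_candidates(data, i):
    """Distinct symbols seen before position i, most similar context first."""
    # Assumption: similarity = common suffix length; ties go to the later position.
    order = sorted(range(i), key=lambda j: (-common_suffix_len(data, i, j), -j))
    seen = []
    for j in order:
        if data[j] not in seen:
            seen.append(data[j])
    return seen


def encode(data):
    """Replace each symbol by its rank among the context-sorted candidates."""
    out = []
    for i, sym in enumerate(data):
        cands = ranked_candidates(data, i)
        if sym in cands:
            out.append(cands.index(sym))
        else:
            # Escape for a new symbol: a rank past the list, plus the literal.
            out.append((len(cands), sym))
    return out


def decode(codes):
    """Rebuild the text by repeating the encoder's ranking at each step."""
    data = []
    for code in codes:
        if isinstance(code, tuple):
            data.append(code[1])
        else:
            data.append(ranked_candidates(data, len(data))[code])
    return "".join(data)
```

In this sketch, encoding "abababab" yields rank 0 for every symbol from the fourth onward, because the most similar earlier context is always followed by the symbol that actually occurs. The resulting ranks would still need an integer code to produce a bit stream; that step is omitted here.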