Huffman Codes versus Augmented Non-Prefix-Free Codes

14th International Symposium on Experimental Algorithms (SEA), Paris, France, 29 June - 01 July 2015, vol.9125, pp.315-326

Publication Type: Conference Paper / Full Text Paper
Volume: 9125
DOI: 10.1007/978-3-319-20086-6_24
City of Publication: Paris
Country of Publication: France
Pages: pp.315-326
Affiliated with Istanbul Technical University: Yes

Abstract

Non-prefix-free (NPF) codes are not uniquely decodable and have therefore received very little attention, since unique decodability is the most essential feature required of any coding scheme. Augmenting NPF codes with compressed data structures was proposed at ISIT 2013 [8] to overcome this limitation. It was shown there that such an augmentation not only makes NPF codes uniquely decodable, but also provides efficient random access. In this study, we extend this approach and compare augmented NPF codes with 0th-order Huffman codes in terms of compression ratio and random access time. We benchmark four coding schemes: NPF codes augmented with wavelet trees (NPF-WT), NPF codes augmented with rank/select (R/S) dictionaries (NPF-RS), Huffman codes, and sampled Huffman codes. Since Huffman coding does not inherently support random access, sampling is commonly used in practice to speed up access to arbitrary symbols in the encoded stream. We implement sampling by maintaining an additional bit array that marks the beginnings of codewords at intervals of the sampling ratio, and we keep that sparse bit array compressed with an R/S dictionary data structure. The experiments show that augmented NPF codes achieve compression very close to that of Huffman coding, with the additional advantage of random access. Compared to sampled Huffman coding, the NPF schemes are superior in both compression ratio and random access performance.
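
The sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the hand-picked prefix-free code (standing in for a real Huffman code), and the sampling step are all assumptions made for illustration, and a plain list of bit offsets is used in place of the compressed R/S dictionary over a sparse bit array.

```python
def encode_with_samples(symbols, code, step):
    """Concatenate codewords and record the bit offset of every `step`-th codeword.

    `sample_offsets` is a plain list here; it stands in for the sparse bit
    array kept in a compressed rank/select dictionary (an assumption of
    this sketch, chosen for brevity).
    """
    pieces = []
    sample_offsets = []
    pos = 0
    for i, s in enumerate(symbols):
        if i % step == 0:
            sample_offsets.append(pos)  # codeword i starts at bit `pos`
        pieces.append(code[s])
        pos += len(code[s])
    return "".join(pieces), sample_offsets


def access(bitstream, sample_offsets, decode_table, step, i):
    """Return the i-th symbol: jump to the nearest preceding sample, then decode forward."""
    pos = sample_offsets[i // step]
    to_skip = i % step          # codewords to decode and discard before the target
    current = ""
    while pos < len(bitstream):
        current += bitstream[pos]
        pos += 1
        if current in decode_table:  # a complete codeword (the code is prefix-free)
            if to_skip == 0:
                return decode_table[current]
            to_skip -= 1
            current = ""
    raise IndexError("symbol index out of range")


# Hypothetical prefix-free code, not derived from any data set in the paper.
CODE = {"a": "0", "b": "10", "r": "110", "c": "1110", "d": "1111"}
DECODE = {bits: sym for sym, bits in CODE.items()}

if __name__ == "__main__":
    text = "abracadabra"
    stream, samples = encode_with_samples(text, CODE, step=4)
    print(access(stream, samples, DECODE, 4, 6))
```

A larger step stores fewer offsets but requires decoding more codewords per access, which is the space/time trade-off that sampling introduces.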