Radu RĂDESCU
Transform Methods Used in Lossless Compression of Text Files
Abstract. This paper presents a study of transform methods used in
lossless text compression in order to preprocess the text by exploiting the
inner redundancy of the source file. The transform methods are the Burrows-Wheeler
Transform (BWT, also known as Block Sorting), the Star Transform and the Length-Index
Preserving Transform (LIPT). BWT converts the original blocks of data into a
format that is extremely well suited for compression. The chosen range of
block lengths for different text file types is presented and evaluated in terms
of compression ratio and compression time. The Star Transform and LIPT are
applied to a set of test files selected from the classical corpora of both
English and Romanian texts, and their positive effects are highlighted.
Experimental results and comparisons with universal lossless compressors are
presented, and a set of conclusions and recommendations is drawn from them.
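
As a rough illustration of the block-sorting idea mentioned in the abstract, the sketch below applies a naive BWT to a single block and inverts it. It is only a minimal example under stated assumptions: the function names `bwt` and `inverse_bwt`, the NUL sentinel character, and the rotation-sorting approach are all illustrative choices, not the implementation evaluated in the paper.

```python
def bwt(block: str, eos: str = "\0") -> str:
    """Naive Burrows-Wheeler Transform of one block (illustrative sketch).

    Assumption: `eos` is a sentinel that does not occur in the block and
    sorts before every other character.
    """
    if eos in block:
        raise ValueError("block must not contain the sentinel character")
    s = block + eos
    # Sort all cyclic rotations of the block, then keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, eos: str = "\0") -> str:
    """Rebuild the original block from the last column (quadratic, for clarity)."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row and re-sort; after n rounds
        # the table holds all sorted rotations of the original string.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The rotation that ends with the sentinel is the original block.
    original = next(row for row in table if row.endswith(eos))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))         # identical letters end up next to each other
    print(inverse_bwt(transformed))  # recovers "banana"
```

The transform groups identical characters together, which is what makes the output easier for a later compression stage to encode. Practical implementations usually rely on suffix sorting rather than materialising every rotation.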