15-853: Algorithms in the Real World (Guy Blelloch)
Compression
[ Topics |
Readings |
Text Books |
Links ]
Introduction to Compression
(ps.gz,
pdf).
A draft
of the data compression chapter I'm writing for an eventual book.
Suggested
David Salomon.
Data Compression: The Complete Reference.
Springer Verlag, 1998.
Goes through a wide variety of topics and a huge number of specific
"real world" algorithms. The theory is not as strong as Sayood's book
(below), and the algorithms are sometimes not described in enough
depth to implement them, but the number of algorithms covered is impressive,
including Burrows-Wheeler, ABC, and about a dozen variants of Lempel-Ziv.
Khalid Sayood.
Introduction
to Data Compression, Second Edition.
Morgan Kaufmann, 2000.
Looks at both Theoretical and practical
aspects of data compression.
Discusses a reasonably wide range of lossless and
lossy compression methods, including
fractals, wavelets, and subband coding.
The coverage of the most recent best algorithms for text compression is
not as good as Salomon's book (above).
For example it does not cover PPM, Burrows-Wheeler, ACB, and some
of the variants of LZ77 and LZ78 (e.g. gzip).
Source code for many of the algorithms is supposed to be
here,
but the connection timed out last time I tried.
Others (Alphabetical order)
Timothy C. Bell, John G. Cleary, and Ian H. Witten.
Text compression.
Prentice Hall, 1990.
Darrel Hankersson, Greg A. Harris, and Peter D. Johnson Jr..
Introduction to Information Theory and Data Compression.
CRC Press, 1997.
Jerry Gibson, Toby Berger, Tom Lookabaugh, Rich Baker and David Lindbergh.
Digital Compression for Multimedia: Principles & Standards.
Morgan Kaufmann, 1998.
Gilbert Held and Thomas R. Marshall.
Data and Image Compression: Tools and Techniques.
Wiley 1996 (4th ed.)
This book is quite basic and does not cover many important
topics.
On the other hand, it includes source code and a detailed
description of most of the basic algorithms.
Mark Nelson.
The
Data Compression Book.
M&T Books, 1995.
James A. Storer (Editor).
Image and Text Compression.
Kluwer, 1992.
This a collection of articles on a broad set of topics.
Ian H. Witten, Alistair Moffat, and Timothy C. Bell.
Managing Gigabytes: Compressing and Indexing Documents and
Images (Second Edition).
Van Nostrand Reinhold, 1999.
Benchmarks
The Archive Comparison Test (ACT) is
an excellent collection of up-to-date comparisons of many compression
algorithms with both compression ratios, and run times.
The overall winning implementations tend to be based on the
the Burrows-Wheeler
block sorting algorithm (e.g. BZIP),
variants of PPM (e.g. BOA), or ACB.
A comparison
of various compression codes for text and black and white images.
Shows progression of the state-of-the-art over the years. The data
is somewhat out of date (e.g. the best bpc for the Calgary Corpus is
now around 2).
Overviews and Surveys
Introduction to data compression by Peter Gutmann.
Overview of Huffman, Shannon-Fano, Arithmetic, LZ78, LZW, and LZ77
coding techniques.
Data
Compression by Debra Lelewer and Daniel S. Hirshberg.
This is a more detained overview. Originally appeared in
Computing Surveys 19,3 (1987) 261-297.
Image Compression - from DCT to Wavelets : A Review by Subhasis Saha.
This has little technical content, but gives a broad background
of the various techniques.
General links
comp.compression faq. This is surely the best source
of information on compression available on the web.
The
Data Compression Library. A nice reference on various topics on
compression with many pointers to textbooks, course pages and other
links on the topics.
Compression Pointers (comprehensive, but badly organized).
Mitsuharu Arimura's
Bookmarks on Source Coding/Data Compression
Entropy and Information Theory
Entropy
on the World Wide Web
LZ78 and LZW
LZW and GIF explained by Steve Blackstock.
A nice 200 line description of Lempel-Ziv Welch compression along
with the particular implementation used in the GIF standard.
Terry A. Welch.
A Technique for High Performance Data Compression.
IEEE Computer, Vol. 17, No. 6, 1984, pp. 8-19.
The GIF format, which uses LZW, is described in
GIF87(5) - GIF 87 and
GIF89a(5) - GIF 89a standards.
The GIF
Licensing Controversy. The GIF format is patented by ComputServe
and the LZW algorithm used by the format is patented by Unisys.
LZ77 (sliding window compression)
E. Fiala and D. Greene.
Data compression with finite windows
(abstract).
A.D. Wyner and J. Ziv.
The sliding-window Lempel-Ziv algorithm is asymptotically optimal.
(abstract).
The PPM algorithm.
A. Moffat.
Implementing the PPM data compression scheme.
(abstract).
Bill Teahan's home page
has pointers to several papers on PPM.
The Burrows-Wheeler algorithm
Data Compression with the Burrows-Wheeler Transform by Mark Nelson. A nice quick overview.
The Burrows-Wheeler block sorting algorithm off of the
compression faq.
BZIP, an implementation
that uses Burrows-Wheeler.
JPEG and MPEG.
The JPEG2000 page. This
is the new JPEG standard, which uses Wavelet transforms instead of the Cosine
transform.
JPEG 2000 Still Image Coding System: An Overview.
An good paper on the topic.
Video Compression. A nice description of the
JPEG and MPEG compression algorithms.
The JPEG
and MPEG
FAQS from faqs.org.
MPEG
Starting Points and FAQs.
Another nice MPEG FAQ.
Audio Compression
MPEG Audio Layer-3
overview,
FAQ,
and
general resources
Another FAQ for MPEG Audio
Audio Compression References
Information on lossless audio compression.
Fractal compression
Introduction to fractal compression off of the
compression faq.
What is the state of fractal compression?.
Links to various resources organized by Yuval Fisher. He also has a book on the topic.
Wavelet compression
Amara's Wavelet
Page. This page is comprehensive including an online Introduction to Wavelets.
Summus. A company that
sells products based on wavelet compression and has a nice short overview
of Wavelet theory off of their home page.
Luratech. Another company that
sells products based on wavelet compression, with some nice demos.
Also includes a paper (published at the PICS conference) describing the
techniques used at a high level.
A nice example of the degradation of images as compression
is increased.
Some cute wavelet applets
An excellent collection of papers on wavelets from MathSoft.
Back to the Algorithms in the Real World page.
Guy Blelloch,
guyb@cs.cmu.edu.