What is JPEG2000?
JPEG2000 is a new image compression standard being developed by the
Joint Photographic Experts Group (JPEG), part of the International
Organization for Standardization (ISO). It reached "Committee Draft"
(CD) status in December 1999. It is designed for different types of
still images (bi-level, gray-level, color, multicomponent) and for
different imaging models (client/server, real-time transmission,
image library archival, limited buffer and bandwidth resources, etc.)
within a unified system.
JPEG2000 is intended to provide low bit rate operation with rate-distortion
and subjective image quality performance superior to existing
standards, without sacrificing performance at other points in the
rate-distortion spectrum.
It has been decided to register the file extension ".j2k" for both
the testing and final versions of JPEG2000.
JPEG2000 addresses areas where current standards fail to produce the
best quality or performance, such as:

- Low bit rate compression performance (rates below 0.25 bpp for
  highly detailed gray-level images)
- Lossless and lossy compression in a single code stream
- Seamless quality and resolution scalability, without having to
  download the entire file; the major benefit is the conservation of
  bandwidth
- Large images: JPEG is restricted to 64k x 64k images (without
  tiling), while JPEG2000 will handle image sizes up to (2^32 - 1)
- Single decompression architecture
- Error resilience for transmission in noisy environments, such as
  wireless channels and the Internet
- Computer-generated imagery
- Compound documents
- Region of Interest coding
- Improved compression techniques to accommodate richer content and
  higher resolutions
- Metadata mechanisms for incorporating additional non-image data as
  part of the file
JPEG2000 will be able to handle up to 256 channels of information,
whereas JPEG is limited to only RGB data. Thus, JPEG2000 will be
capable of describing complete alternate color models, such as CMYK,
and full ICC (International Color Consortium) color profiles.
Compression Efficiency
Early results show a 20% compression efficiency improvement over
JPEG, and a 40% improvement over Flashpix.
Important factors taken into account for achieving high compression
efficiency:
- Embedded lossy to lossless coding
- Multiple-component images
- Static and dynamic Region of Interest
- Error resilience
- Spatial and quality scalability
- Rate control
JPEG2000 has two coding modes:
- DCT-based coding mode: currently baseline JPEG
- Wavelet-based coding mode: includes non-reversible and reversible
  transforms
For a complete definition of the existing JPEG2000 compression
system, please refer to the Verification Model (see "References"
below). The discussion below is adapted from the JPEG2000 VM4.0.
Overview
The coder is essentially a bit-plane coder, using the same Layered
Zero Coding (LZC) techniques that have been employed in a number of
embedded wavelet coders and were originally proposed by Taubman and
Zakhor (IEEE Trans. Image Processing, September 1994; see
"References" below). In fact, many of the ideas presented in VM4.0,
including the use of separate code blocks and post-compression
rate-distortion optimization, are taken directly from that work and
Dr. Taubman's doctoral dissertation (UC Berkeley, 1994). The key
additions are:
- The use of fractional bit-planes, in which the quantization symbols
  for any given quantization layer (or bit-plane) are coded in a
  succession of separate passes, rather than just one pass
- A simple embedded quad-tree algorithm used to identify whether each
  of a collection of "sub-blocks" contains any non-zero (significant)
  samples at each quantization layer, so that the encoding and
  decoding algorithms need only visit those samples that lie within
  sub-blocks known to have significant samples (a small sketch of
  this significance test follows the list)
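
As a rough illustration of the second addition, the sketch below
(Python, assuming the quantized samples of a code block are held in a
NumPy array) shows the kind of per-sub-block significance test that
the embedded quad-tree signals: at a given bit-plane, a sub-block can
be skipped entirely if none of its samples reaches that magnitude.
The function name, the 16x16 sub-block size and the flat scan (rather
than a true quad-tree traversal) are illustrative choices, not the
VM's exact algorithm.

    import numpy as np

    def subblock_significance(block, bitplane, subblock=16):
        """Flag which sub-blocks of a code block contain at least one
        sample that is significant at the given bit-plane, i.e. whose
        magnitude is >= 2**bitplane.  The coder then only needs to
        visit samples inside the flagged sub-blocks."""
        threshold = 1 << bitplane
        h, w = block.shape
        rows = (h + subblock - 1) // subblock
        cols = (w + subblock - 1) // subblock
        significant = np.zeros((rows, cols), dtype=bool)
        for r in range(rows):
            for c in range(cols):
                sub = block[r * subblock:(r + 1) * subblock,
                            c * subblock:(c + 1) * subblock]
                significant[r, c] = bool(np.any(np.abs(sub) >= threshold))
        return significant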
EBCOT: The Basic Idea
In VM4, the coding algorithm used is known as "Embedded Block Coding
with Optimized Truncation" (EBCOT). The coding subsystem in JPEG2000
is responsible for both the low level entropy coding operations
associated with the representation of subband sample values, and
organizing and packing the resulting code words into the bit stream.
The basic idea in EBCOT is to divide each subband into blocks of
samples which are coded independently. For each block, a separate
bitstream is generated without using any information from the other
blocks. The bit stream has the property that it can be truncated to
a variety of discrete lengths.
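
A minimal sketch of the block partitioning step, assuming each
subband is a 2-D NumPy-style array of samples; the 64x64 block size
and the function name are illustrative, and the entropy coding of
each block is left out.

    def split_into_code_blocks(subband, block_size=64):
        """Tile a 2-D array of subband samples into independent code
        blocks.  EBCOT codes each block on its own, producing an
        embedded bit stream that can be truncated at a set of
        discrete lengths."""
        h, w = subband.shape
        blocks = []
        for y in range(0, h, block_size):
            for x in range(0, w, block_size):
                blocks.append(subband[y:y + block_size, x:x + block_size])
        return blocks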
Once the entire image has been compressed, a postprocessing
operation passes over all the compressed blocks and determines the
extent to which each block's embedded bit stream should be truncated
in order to achieve a particular target bit rate, distortion bound
or other quality metric. More generally, the final bit stream is
composed from a collection of so-called "layers", where each layer
has an interpretation in terms of overall image quality.
The first, lowest quality layer, is formed from the optimally
truncated block bit streams in the manner described above. Each
subsequent layer is formed by optimally truncating the block bit
streams to achieve successively higher target bit rates, distortion
bounds or other quality metrics, as appropriate, and including the
additional code words required to augment the information
represented in previous layers to the new truncation points.
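
The sketch below illustrates this post-compression selection of
truncation points under some simplifying assumptions: each block is
described by a list of (cumulative rate in bytes, remaining
distortion) candidates that starts at (0, D0) and already lies on its
convex hull, so that marginal rate-distortion slopes decrease along
the list. A bisection search over a common slope threshold then finds
the set of truncation points that just fits a byte budget. The
function names and the bisection loop are illustrative, not the VM's
exact procedure.

    def truncate_for_slope(cands, lam):
        """cands = [(rate_bytes, distortion), ...] for one block,
        starting at (0, D0) and reduced to its convex hull (slopes
        decreasing).  Return the index of the last candidate whose
        marginal slope, i.e. distortion saved per extra byte, is at
        least lam."""
        k = 0
        for i in range(1, len(cands)):
            dr = cands[i][0] - cands[i - 1][0]
            dd = cands[i - 1][1] - cands[i][1]
            if dr > 0 and dd / dr >= lam:
                k = i
            else:
                break
        return k

    def pcrd_truncate(blocks, target_bytes, iters=50):
        """Pick one truncation point per block so that the total rate
        just fits target_bytes, by bisecting on the common slope
        threshold."""
        max_slope = max(
            ((c[i - 1][1] - c[i][1]) / max(c[i][0] - c[i - 1][0], 1e-9)
             for c in blocks for i in range(1, len(c))),
            default=1.0,
        )
        lo, hi = 0.0, max_slope
        for _ in range(iters):
            lam = 0.5 * (lo + hi)
            total = sum(c[truncate_for_slope(c, lam)][0] for c in blocks)
            if total > target_bytes:
                lo = lam   # over budget: demand a steeper slope
            else:
                hi = lam   # fits: try a gentler threshold next
        return [truncate_for_slope(c, hi) for c in blocks]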
An important aspect of the EBCOT algorithm is the manner by which it
forms a final bit stream from the independent embedded bit streams
generated for every block. The bit stream formation problem is very
much simplified when the coder operates on entire subbands at a
time, since the additional spatial organization imposed by
independent blocks does not exist.
The fact that blocks are encoded independently enables the "random
access" feature. Suppose, however, that the bit stream must also
possess the "SNR progressive" feature. These two features appear to
work against each other since the random access feature requires
that individual blocks be separately decodable, while the SNR
progressive feature requires that the embedded bit streams for these
blocks be distributed throughout the bit stream so that more
important information always precedes less important information,
regardless of the spatial location associated with this information.
It would seem that the amount of overhead required to identify the
individual blocks within this distributed representation would be
quite considerable.
As the image becomes larger, the increased overhead required to
identify the exact sequence of a larger number of block segments is
largely wasted because many of these blocks will have almost
identical rate-distortion slopes so that the order in which they
appear is largely immaterial. It makes sense, therefore, to identify
the block truncation points which are very similar and include the
relevant code bytes for each of these blocks in a pre-defined order.
This is essentially the bit stream layering idea.
Basically, the bit stream is organized as a succession of layers,
where each layer contains the additional contributions from each
code block (some contributions may be empty). The block truncation
points associated with each layer are optimal in the rate-distortion
sense, which means that the bit stream obtained by discarding a
whole number of least important layers will always be
rate-distortion optimal. If the bit stream is truncated part way
through a layer then it will not be strictly optimal, but the
departure from optimality can be small if the number of layers is
large.
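
A sketch of the layer assembly idea, assuming each block's full
embedded bit stream is available as a byte string and that a
truncation length per block has already been chosen for every layer
(for example by repeating the rate-distortion optimization above at
successively larger byte budgets). Names and data layout are
illustrative.

    def build_layers(block_streams, truncation_schedule):
        """block_streams[b] is the full embedded byte string of code
        block b; truncation_schedule[l][b] is the length (in bytes) to
        which block b is truncated at quality layer l, non-decreasing
        across layers.  Layer l carries, for every block, only the
        bytes between the block's previous truncation point and its
        new one (possibly nothing)."""
        layers = []
        previous = [0] * len(block_streams)
        for cuts in truncation_schedule:
            contributions = [stream[previous[b]:cuts[b]]
                             for b, stream in enumerate(block_streams)]
            layers.append(contributions)
            previous = list(cuts)
        return layers

Decoding a bit stream truncated after layer l then amounts to
concatenating, for each block, its contributions from layers 0
through l.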
As the number of layers is increased so that the number of code
bytes in each layer is decreased, the rate-distortion slopes
associated with all block truncation points in the layer will become
increasingly similar; however, the number of code blocks which do
not contribute to the layer will also increase so that the overhead
associated with identifying the code blocks which do contribute to
the layer will increase. In practice, we find that optimal
compression performance for SNR progressive applications is achieved
when the number of layers is approximately twice as large as the
number of sub-bit-plane passes made by the entropy coder (that is,
the bit stream contains twice as much granularity as that provided
by previous verification models).
The boundaries of the sub-bit-plane passes are also the truncation
points for each block's embedded bit stream. Consequently, on
average each layer contains contributions from approximately half
the code blocks so that the cost of identifying whether or not a
block contributes to any given layer (about 2 bits per block) is
much less than the cost of identifying a strict order on the block
contributions. Moreover, the relative contribution of this overhead
to the overall bit rate is independent of the size of the image.
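
As a back-of-envelope illustration of why this overhead stays
manageable, the snippet below compares the cost of a per-layer
inclusion flag for every block with the cost of encoding an explicit
order over all block contributions (roughly log2(N!) bits for N
contributions). The figures are illustrative only, not a statement of
the actual VM syntax.

    import math

    def layer_flag_cost(num_blocks, num_layers, bits_per_block=2):
        """Approximate cost, in bits, of flagging for every layer
        whether each code block contributes to it."""
        return bits_per_block * num_blocks * num_layers

    def explicit_order_cost(num_contributions):
        """Approximate cost, in bits, of specifying a strict order
        over all block contributions: log2(N!)."""
        return sum(math.log2(k) for k in range(2, num_contributions + 1))

    # Example: 4096 code blocks and 20 layers.
    flags = layer_flag_cost(4096, 20)        # about 164 kbits, fixed per block
    order = explicit_order_cost(4096 * 20)   # about 1.2 Mbits, growing ~ N log N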
The DIG2000 File Format Proposal
The goal of the DIG2000 Initiative is to create a digital file
format that embodies a tightly-integrated set of essential features
for storing images, and provides the needed mechanisms for images to
be used effectively.
Some of the most important features are:
- Flexible metadata architecture
- Unambiguous specification of color (default sRGB)
- Resolution-independent coordinate system
- Asymmetric storage and delivery: the server holds the original
  image and all metadata, and the client can select a subset to be
  delivered
- Protection of intellectual property (requires encryption,
  watermarking, etc.)
- Improved quality and rendition: consistent colour, print capability
- Some backwards compatibility with original JPEG compression
- Object-oriented functionalities (coding, information embedding, etc.)
- File format
Interesting links
- From the official JPEG web site:
  - JPEG links
  - JPEG2000 links, including a PDF version of the first Committee
    Draft
  - Public Relations
- A set of JPEG2000 tools for coding and decoding, JP2 file parsing
  and validation, and static ROI setting and displaying
- A non-technical article about JPEG2000 from WebReview.com
- Organization of the JPEG2000 committee, from the SPEAR project web
  page
- Watermarking on JPEG2000
- The Digital Imaging Group (DIG) web site; the DIG2000 working group
  proposed a file format for use with JPEG2000
References
- JPEG2000 Verification Model 4.0, ISO/IEC JTC 1/SC 29/WG 1,
  Charilaos Christopoulos (Ericsson, Sweden), Editor, April 22, 1999.
  (Note: the latest version of the VM is 5.0.)
- Requirements Ad Hoc Group, "JPEG2000 requirements and profiles
  version 6.0," WG1 Vancouver Meeting, July 1999.
- D. Taubman, "High Performance Scalable Image Compression with
  EBCOT," to appear in IEEE Transactions on Image Processing;
  submitted March 1999, revised August 1999. Available in PDF format.
- D. Taubman and A. Zakhor, "Multirate 3-D Subband Coding of Video,"
  IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 572-588,
  September 1994.
- D. Taubman, "Directionality and Scalability in Image and Video
  Compression," Ph.D. Thesis, Department of Electrical Engineering
  and Computer Sciences, University of California at Berkeley,
  December 1994.
- JPEG2000 Committee Draft Version 1.0, December 1999. Available in
  PDF format.
- "Tutorial on JPEG2000," by Dr. Charilaos Christopoulos, presented
  at ICIP '99.
- Video Technology Branch, Media Technologies Laboratory, DSP
  Solutions R&D Center, Texas Instruments.
- Digital Imaging Group, "DIG2000 file format proposal overview,"
  DIG2000 Working Group, October 30, 1998.
- The Digital Imaging Group's DIG2000 Initiative, "An Overview of
  JPEG2000 Technology and Benefits."
- JPEG Public Relations press releases.