JPEG

Source: meddic

Wikipedia preview

Source (authority): Wikipedia, the free encyclopedia, "2015/10/02 21:43:40" (JST)

0&-2&-4&1&1&0&0&0\\
-3&1&5&-1&-1&0&0&0\\
-3&1&2&-1&0&0&0&0\\
1&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0\end{array}}\right].}</annotation>

 </semantics>

</math></span></span></dd> </dl>

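For example, using −415 (the DC coefficient) and rounding to the nearest integer gives round(−415 / 16) = round(−25.94) = −26, where 16 is the corresponding entry of the quantization matrix.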

Notice that most of the higher-frequency elements of the sub-block (i.e., those with an x or y spatial frequency greater than 4) are compressed into zero values.

Entropy coding

Zigzag ordering of JPEG image components

Entropy coding is a special form of lossless data compression. It involves arranging the image components in a "zigzag" order that groups similar frequencies together, applying a run-length encoding (RLE) algorithm that codes runs of zeros by their length, and then using Huffman coding on what is left.

The JPEG standard also allows, but does not require, decoders to support the use of arithmetic coding, which is mathematically superior to Huffman coding. However, this feature has rarely been used, as it was historically covered by patents requiring royalty-bearing licenses, and because it is slower to encode and decode compared to Huffman coding. Arithmetic coding typically makes files about 5–7% smaller.

The previous quantized DC coefficient is used to predict the current quantized DC coefficient. The difference between the two is encoded rather than the actual value. The encoding of the 63 quantized AC coefficients does not use such prediction differencing.
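The DC prediction described above can be sketched in a few lines of Python. This is a minimal illustration rather than a codec: the function names and the sample DC values are hypothetical, and the predictor is assumed to start at zero for the first block.

```python
def dc_differences(dc_values):
    """Return the differences that would be entropy-coded for a run of DC values."""
    previous = 0  # assumption: the predictor starts at zero for the first block
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)  # encode the change, not the value itself
        previous = dc
    return diffs


def dc_reconstruct(diffs):
    """Invert dc_differences by accumulating the differences."""
    previous = 0
    values = []
    for d in diffs:
        previous += d
        values.append(previous)
    return values


# Hypothetical quantized DC coefficients of five consecutive blocks.
example_dc = [-26, -24, -25, -25, -20]
print(dc_differences(example_dc))
print(dc_reconstruct(dc_differences(example_dc)))
```

Because neighbouring blocks tend to have similar average brightness, the differences are usually much smaller in magnitude than the DC values themselves.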

The zigzag sequence for the above quantized coefficients is shown below. (The format shown is just for ease of understanding/viewing.)

−26
−3 0
−3 −2 −6
2 −4 1 −3
1 1 5 1 2
−1 1 −1 2 0 0
0 0 0 −1 −1 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0
0 0 0 0
0 0 0
0 0
0

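If the i-th block is represented by B_i and positions within each block are represented by (p, q), where p = 0, 1, ..., 7 and q = 0, 1, ..., 7, then any coefficient in the DCT image can be represented as B_i(p, q). Thus, in the above scheme, the order of encoding coefficients (for the i-th block) is B_i(0, 0), B_i(0, 1), B_i(1, 0), B_i(2, 0), B_i(1, 1), B_i(0, 2), B_i(0, 3), B_i(1, 2), and so on.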
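The zigzag traversal can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below is one way to do this in Python; `zigzag_order` is an illustrative name, and the matrix `B` is the quantized block implied by the zigzag sequence listed above.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked with the row index decreasing,
        # odd diagonals with the row index increasing.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


# Quantized coefficient block corresponding to the sequence shown above.
B = [
    [-26, -3, -6,  2,  2, -1, 0, 0],
    [  0, -2, -4,  1,  1,  0, 0, 0],
    [ -3,  1,  5, -1, -1,  0, 0, 0],
    [ -3,  1,  2, -1,  0,  0, 0, 0],
    [  1,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
    [  0,  0,  0,  0,  0,  0, 0, 0],
]

sequence = [B[r][c] for r, c in zigzag_order()]
print(zigzag_order()[:8])
print(sequence[:26])
```

The first eight positions printed are (0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2), matching the order given in the text.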

Baseline sequential JPEG encoding and decoding processes

This encoding mode is called baseline sequential encoding. JPEG also supports progressive encoding. While sequential encoding encodes the coefficients of a single block at a time (in a zigzag manner), progressive encoding encodes a similarly positioned batch of coefficients of all blocks in one go (called a scan), followed by the next batch of coefficients of all blocks, and so on. For example, if the image is divided into N 8×8 blocks B_0, B_1, B_2, ..., B_{N−1}, then a 3-scan progressive encoding encodes the DC component, B_i(0, 0), for all blocks, i.e., for all i = 0, 1, 2, ..., N−1, in the first scan. This is followed by the second scan, which encodes a few more components of all blocks (assuming four more components, they are B_i(0, 1) to B_i(1, 1), still in a zigzag manner), so the sequence is B_0(0, 1), B_0(1, 0), B_0(2, 0), B_0(1, 1), B_1(0, 1), B_1(1, 0), ..., B_{N−1}(2, 0), B_{N−1}(1, 1). The last scan encodes all the remaining coefficients of all blocks.

Once all similarly positioned coefficients have been encoded, the next position to be encoded is the one occurring next in the zigzag traversal, as indicated in the figure above. Progressive JPEG encoding usually gives better compression than baseline sequential JPEG because it can use different Huffman tables (see below) tailored to the different frequencies in each "scan" or "pass" (which includes similarly positioned coefficients), though the difference is not large.
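To make the 3-scan example concrete, the following Python sketch lists which coefficients each scan would visit. It is a simplified model, not an encoder: coefficients are identified by block index i and zigzag index k (k = 0 is the DC term, k = 1 to 4 are B_i(0, 1) to B_i(1, 1)), and the band boundaries are the hypothetical ones from the example above.

```python
def progressive_scan_order(num_blocks, bands=((0, 0), (1, 4), (5, 63))):
    """Return one list per scan of (block index, zigzag index) pairs.

    `bands` holds inclusive (first, last) zigzag index ranges, one per scan.
    The default split (DC, next four coefficients, the rest) is illustrative.
    """
    scans = []
    for first, last in bands:
        # Within a scan, every block contributes the same band of coefficients.
        scan = [(i, k) for i in range(num_blocks) for k in range(first, last + 1)]
        scans.append(scan)
    return scans


scans = progressive_scan_order(num_blocks=2)
print(scans[0])       # DC terms of all blocks
print(scans[1])       # four low-frequency AC terms of each block
print(len(scans[2]))  # all remaining coefficients
```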

In the rest of the article, it is assumed that the coefficient pattern generated is due to sequential mode.

In order to encode the above generated coefficient pattern, JPEG uses Huffman encoding. The JPEG standard provides general-purpose Huffman tables; encoders may also choose to generate Huffman tables optimized for the actual frequency distributions in images being encoded.

The process of encoding the zig-zag quantized data begins with a run-length encoding explained below, where:

  • x is the non-zero, quantized AC coefficient.
  • RUNLENGTH is the number of zeroes that came before this non-zero AC coefficient.
  • SIZE is the number of bits required to represent x.
  • AMPLITUDE is the bit-representation of x.
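As a minimal sketch of how SIZE and AMPLITUDE relate to x, the Python below assumes the convention used with JPEG's Huffman coding, in which a positive value is written directly in SIZE bits and a negative value is written as x + 2^SIZE − 1. The function names are illustrative and do not come from any library.

```python
def size_of(x):
    """Number of bits needed for the magnitude of x (0 when x is 0)."""
    return abs(x).bit_length()


def amplitude_bits(x):
    """Bit string for x in SIZE bits; negatives are stored as x + 2**SIZE - 1."""
    size = size_of(x)
    if size == 0:
        return ""  # a zero value carries no amplitude bits
    value = x if x > 0 else x + (1 << size) - 1
    return format(value, "0{}b".format(size))


for x in (-6, -3, -1, 1, 2, 5):
    print(x, size_of(x), amplitude_bits(x))
```

With this convention the leading bit of AMPLITUDE distinguishes the sign: it is 1 for positive values and 0 for negative ones.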

The run-length encoding works by examining each non-zero AC coefficient x and determining how many zeroes came before it since the previous non-zero AC coefficient. With this information, two symbols are created:

Symbol 1             Symbol 2
(RUNLENGTH, SIZE)    (AMPLITUDE)

Both RUNLENGTH and SIZE share the same byte, meaning that each holds only four bits of information. The higher four bits give the number of zeroes, while the lower four bits give the number of bits necessary to encode the value of x.

This has the immediate implication that Symbol 1 can only store information about the first 15 zeroes preceding the non-zero AC coefficient. However, JPEG defines two special Huffman code words. One ends the sequence prematurely when the remaining coefficients are all zero (called "End-of-Block" or "EOB"), and the other is used when the run of zeroes goes beyond 15 before reaching a non-zero AC coefficient. In such a case, where 16 zeroes are encountered before a given non-zero AC coefficient, Symbol 1 is encoded "specially" as: (15, 0)(0).
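The byte layout of Symbol 1 can be illustrated with two small helper functions. These are a sketch with hypothetical names (`pack_symbol1`, `unpack_symbol1`); they only show how the two four-bit fields fit into one byte, including the two special values mentioned above.

```python
def pack_symbol1(runlength, size):
    """Combine RUNLENGTH (high four bits) and SIZE (low four bits) into one byte."""
    if not 0 <= runlength <= 15 or not 0 <= size <= 15:
        raise ValueError("RUNLENGTH and SIZE must each fit in four bits")
    return (runlength << 4) | size


def unpack_symbol1(byte):
    """Split a Symbol 1 byte back into (RUNLENGTH, SIZE)."""
    return byte >> 4, byte & 0x0F


EOB = pack_symbol1(0, 0)         # end of block: all remaining coefficients are zero
ZERO_RUN = pack_symbol1(15, 0)   # sixteen zeroes with no coefficient attached

print(hex(pack_symbol1(5, 1)), hex(EOB), hex(ZERO_RUN))
print(unpack_symbol1(0x51))
```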

The overall process continues until "EOB" – denoted by (0, 0) – is reached.

With this in mind, the sequence from earlier becomes:

(0, 2)(−3); (1, 2)(−3); (0, 2)(−2); (0, 3)(−6); (0, 2)(2); (0, 3)(−4); (0, 1)(1); (0, 2)(−3); (0, 1)(1);
(0, 1)(1); (0, 3)(5); (0, 1)(1); (0, 2)(2); (0, 1)(−1); (0, 1)(1); (0, 1)(−1); (0, 2)(2); (5, 1)(−1);
(0, 1)(−1); (0, 0).

(The first value in the matrix, −26, is the DC coefficient; it is not encoded the same way. See above.)
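The run-length step can be reproduced with a short Python sketch. It assumes the 63 AC coefficients are already in zigzag order; the function name and the tuple layout `((RUNLENGTH, SIZE), AMPLITUDE)` are choices made for this illustration, with `None` standing in for the missing amplitude of EOB.

```python
def run_length_encode(ac):
    """Encode zigzag-ordered AC coefficients as ((RUNLENGTH, SIZE), AMPLITUDE) pairs."""
    # Trailing zeroes are not coded individually; EOB covers them.
    last = len(ac)
    while last > 0 and ac[last - 1] == 0:
        last -= 1

    symbols = []
    run = 0
    for x in ac[:last]:
        if x == 0:
            run += 1
            if run == 16:
                symbols.append(((15, 0), 0))  # special symbol for sixteen zeroes
                run = 0
            continue
        symbols.append(((run, abs(x).bit_length()), x))
        run = 0

    if last < len(ac):
        symbols.append(((0, 0), None))  # EOB
    return symbols


# AC coefficients of the example block (everything after the DC value -26).
ac = [-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2,
      -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1] + [0] * 38

for symbol1, amplitude in run_length_encode(ac):
    print(symbol1, amplitude)
```

Running this on the example block yields the same twenty symbols as the sequence listed above, ending with (0, 0).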

From here, frequency calculations are made based on occurrences of the coefficients. In our example block, most of the quantized coefficients are small numbers that are not preceded immediately by a zero coefficient. These more-frequent cases will be represented by shorter code words.
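The effect of symbol frequency on code word length can be seen with a plain Huffman construction over the Symbol 1 values of the example block. This is a sketch under simplifying assumptions: it uses the textbook algorithm and ignores JPEG-specific constraints on code lengths, and a real encoder would gather statistics over the whole image rather than one block.

```python
import heapq
from collections import Counter


def huffman_code_lengths(frequencies):
    """Return {symbol: code length} for an unconstrained Huffman code."""
    if len(frequencies) == 1:
        return {next(iter(frequencies)): 1}
    # Heap entries are (weight, tie breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# (RUNLENGTH, SIZE) symbols of the example block, in order of appearance.
symbol1_values = [(0, 2), (1, 2), (0, 2), (0, 3), (0, 2), (0, 3), (0, 1), (0, 2),
                  (0, 1), (0, 1), (0, 3), (0, 1), (0, 2), (0, 1), (0, 1), (0, 1),
                  (0, 2), (5, 1), (0, 1), (0, 0)]

frequencies = Counter(symbol1_values)
lengths = huffman_code_lengths(frequencies)
for symbol, count in frequencies.most_common():
    print(symbol, count, lengths[symbol])
```

The most frequent symbol, (0, 1), receives the shortest code word, while the symbols that occur only once receive the longest ones.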

Compression ratio and artifacts

This image shows the pixels that are different between a non-compressed image and the same image JPEG compressed with a quality setting of 50. Darker means a larger difference. Note especially the changes occurring near sharp edges and having a block-like shape.
The original image
The compressed 8×8 squares are visible in the scaled-up picture, together with other visual artifacts of the lossy compression

The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase. Ten-to-one compression usually results in an image that cannot be distinguished by eye from the original. A compression ratio of 100:1 is usually possible, but the result will look distinctly artifacted compared to the original. The appropriate level of compression depends on the use to which the image will be put.

External image
Illustration of edge busyness[22]

Those who use the World Wide Web may be familiar with the irregularities known as compression artifacts that appear in JPEG images, which may take the form of noise around contrasting edges (especially curves and corners), or "blocky" images. These are due to the quantization step of the JPEG algorithm. They are especially noticeable around sharp corners between contrasting colors (text is a good example, as it contains many such corners). The analogous artifacts in MPEG video are referred to as mosquito noise, as the resulting "edge busyness" and spurious dots, which change over time, resemble mosquitoes swarming around the object.[22][23]

These artifacts can be reduced by choosing a lower level of compression; they may be completely avoided by saving an image using a lossless file format, though this will result in a larger file size. The images created with ray-tracing programs have noticeable blocky shapes on the terrain. Certain low-intensity compression artifacts might be acceptable when simply viewing the images, but can be emphasized if the image is subsequently processed, usually resulting in unacceptable quality. Consider the example below, demonstrating the effect of lossy compression on an edge detection processing step.

[Table of images: the original image and the output of a Canny edge detector applied to it, each shown for lossless compression and for lossy compression]

Some programs allow the user to vary the amount by which individual blocks are compressed. Stronger compression is applied to areas of the image that show fewer artifacts. This way it is possible to manually reduce JPEG file size with less loss of quality.

Since the quantization stage always results in a loss of information, the JPEG standard is always a lossy compression codec. (Information is lost both in quantizing and in rounding of the floating-point numbers.) Even if the quantization matrix is a matrix of ones, information will still be lost in the rounding step.
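The loss can be seen on a single coefficient. In the Python sketch below, the unquantized value −415.38 and the divisors are hypothetical values chosen for illustration; note also that Python's built-in `round` resolves exact .5 ties to the nearest even integer, which may differ from the rounding rule of a particular encoder.

```python
def quantize(coefficient, divisor):
    """Divide by the quantization divisor and round to the nearest integer."""
    return round(coefficient / divisor)


def dequantize(level, divisor):
    """Approximate the original coefficient from its quantized level."""
    return level * divisor


coefficient = -415.38  # hypothetical unquantized DCT coefficient
for divisor in (1, 16, 64):
    level = quantize(coefficient, divisor)
    restored = dequantize(level, divisor)
    print(divisor, level, restored, abs(coefficient - restored))
```

Even with a divisor of 1 the restored value is an integer, so the fractional part of the coefficient is lost; larger divisors give smaller levels to encode at the cost of a larger reconstruction error.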

Decoding

Decoding to display the image consists of doing all the above in reverse.

Taking the DCT coefficient matrix (after adding the difference of the DC coefficient back in)