In signal processing, data compression, source coding,[1] or bit-rate reduction involves encoding information using fewer bits than the original representation.[2] Compression can be either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information.[3]
The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding done at the source of the data before it is stored or transmitted.[4] Source coding should not be confused with channel coding, which provides error detection and correction, or with line coding, the means for mapping data onto a signal.
Compression is useful because it reduces resources required to store and transmit data. Computational resources are consumed in the compression process and, usually, in the reversal of the process (decompression). Data compression is subject to a space–time complexity trade-off. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed, and the option to decompress the video in full before watching it may be inconvenient or require additional storage. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (when using lossy data compression), and the computational resources required to compress and decompress the data.
Data compression
Data compression is the science (and art) of representing information in a compact form. Once the domain of a relatively small group of engineers and scientists, it is now ubiquitous. It has been one of the critical enabling technologies of the ongoing digital multimedia revolution for decades. Without compression techniques, the ever-growing Internet, digital TV, mobile communication and increasing video communication would not have been practical developments. Data compression is an active research area in computer science. By ‘compressing data’, we actually mean deriving techniques or, more specifically, designing efficient algorithms to:
- represent data in a less redundant fashion
- remove the redundancy in data
- implement coding, including both encoding and decoding.
The key approaches of data compression can be summarised as modelling + coding. Modelling is a process of constructing a knowledge system for performing compression. Coding includes the design of the code and the production of the compact data form.
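The modelling + coding split can be illustrated with a small sketch. It assumes the simplest possible model (a count of how often each character occurs) and uses Huffman coding as the coding step; both choices are illustrative rather than anything the text prescribes, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def build_model(text):
    """Modelling step (assumed here to be plain symbol counts)."""
    return Counter(text)


def huffman_code(freqs):
    """Coding step: derive a prefix code from the symbol counts."""
    # Each heap entry is (total count, unique tie-breaker, {symbol: codeword}).
    heap = [(count, i, {sym: ""})
            for i, (sym, count) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A source with one distinct symbol still needs a one-bit codeword.
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        # Merging two subtrees prepends one more bit to every codeword in them.
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return "".join(out)


source = "abracadabra"
model = build_model(source)
code = huffman_code(model)
bits = encode(source, code)
print(len(bits), "bits instead of", 8 * len(source))  # 23 bits instead of 88
```

Frequent symbols such as "a" receive short codewords and rare ones receive longer codewords, which is where the saving comes from. A real compressor would also have to store or transmit the model so that the decoder can rebuild the same code.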
Importance of data compression
Data compression techniques are motivated mainly by the need to improve the efficiency of information processing.
This includes improving the following main aspects in the digital domain:
- storage efficiency
- efficient usage of transmission bandwidth
- reduction of transmission time.
Although the cost of storage and transmission bandwidth for digital data has dropped dramatically, the demand for more capacity in many applications has been growing rapidly. There are cases in which extra storage or extra bandwidth is difficult, if not impossible, to obtain. Data compression can make much more efficient use of existing resources at less cost. Active research on data compression can lead to innovative new products and help provide better services.
Brief history
Data compression can still be viewed today as the art of creating shorthand representations for data, and this process started as early as 1000 BC.
The short list below gives a brief survey of the historical milestones:
- 1000 BC, Shorthand
- 1829, Braille code
- 1843, Morse code
- 1930 onwards, Analog compression
- 1950, Huffman codes
- 1975, Arithmetic coding
- 1977, Dictionary-based compression
- 1980s
  - early 80s, FAX
  - mid-80s, Video conferencing, still images (JPEG), improved FAX standard (JBIG)
  - late 80s onward, Motion video compression (MPEG)
- 1990s
  - early 90s, Disk compression (Stacker)
  - mid-90s, Satellite TV
  - late 90s, Digital TV (HDTV), DVD, MP3
Source data
In this subject guide, the word data includes any digital information that can be processed in a computer, which includes text, voice, video, still images, audio and movies. The data before any compression (i.e. encoding) process is called the source data, or the source for short.
Three common types of source data in the computer are text, (digital) images and sound.
- Text data is usually represented by ASCII code (or EBCDIC).
- Image data is often represented by a two-dimensional array of pixels in which each pixel is associated with its color code.
- Sound data is represented by a wave (periodic) function.
In the application world, the source data to be compressed is likely to be so-called multimedia and can be a mixture of text, image and sound.
Lossless and lossy data compression
Data compression is simply a means for efficient digital representation of a source of data such as text, images and sound. The goal of data compression is to represent a source in digital form with as few bits as possible while meeting the minimum requirement of reconstruction. This goal is achieved by removing any redundancy present in the source. There are two major families of compression techniques in terms of the possibility of reconstructing the original source. They are called lossless and lossy compression.
Lossless compression
A compression approach is lossless only if it is possible to exactly reconstruct the original data from the compressed version. There is no loss of any information during the compression process. (Compression, when used as a general term, actually includes both the compression and decompression processes.)
Lossless compression techniques are mostly applied to symbolic data such as character text, numeric data, computer source code and executable graphics and icons. Lossless compression techniques are also used when the original data of a source are so important that we cannot afford to lose any details. Examples include medical images, text and images preserved for legal reasons, and some computer executable files.
Lossy compression
A compression method is lossy if it is not possible to reconstruct the original exactly from the compressed version. Some insignificant details may get lost during the process of compression. Approximate reconstruction may be very good in terms of the compression ratio, but it often requires a trade-off between visual quality and computational complexity (i.e. speed). Data such as multimedia images, video and audio are more easily compressed by lossy compression techniques.
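The difference between the two families can be seen in a short sketch. The lossless half uses Python's standard zlib module; the lossy half uses a made-up scheme (rounding each sample to the nearest multiple of a hypothetical STEP) chosen only to show that the original cannot be recovered exactly. The sample values are invented for illustration.

```python
import zlib

samples = [12, 13, 13, 14, 200, 201, 199, 200, 50, 51]
raw = bytes(samples)

# Lossless: decompression returns exactly the bytes that went in.
restored = zlib.decompress(zlib.compress(raw))

# Lossy (hypothetical scheme): keep only the nearest multiple of STEP.
STEP = 8
quantised = [(s + STEP // 2) // STEP for s in samples]  # smaller numbers to store
approx = [q * STEP for q in quantised]                  # best possible reconstruction
max_error = max(abs(a - s) for a, s in zip(approx, samples))

print(restored == raw)   # True
print(approx)            # close to, but not equal to, the samples
print(max_error)         # never more than STEP // 2
```

The quantised values fall in a much smaller range than the originals, so they need fewer bits, but the discarded detail is gone for good.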
Main compression techniques
Data compression is often called coding due to the fact that its aim is to find a specific short (or shorter) way of representing data. Encoding and decoding are used to mean compression and decompression respectively.
We outline some major compression algorithms below:
- Run-length coding
- Quantisation
- Statistical coding
- Dictionary-based coding
- Transform-based coding
- Motion prediction.
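As an illustration of the first technique in the list, the following is a minimal sketch of run-length coding, assuming the data is a character string and that each run is stored as a (symbol, count) pair. The function names are hypothetical.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of identical symbols with a (symbol, run length) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(data)]


def rle_decode(pairs):
    """Expand each (symbol, run length) pair back into the original run."""
    return "".join(sym * count for sym, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Run-length coding only pays off when the source contains long runs; on data with few repeats the pairs can take more space than the original.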
LOSSY COMPRESSION
Our eyes and ears cannot distinguish subtle changes. In such cases, we can use a lossy data compression method. These methods are cheaper: they take less time and space when it comes to sending millions of bits per second for images and video. Several methods have been developed using lossy compression techniques. JPEG (Joint Photographic Experts Group) encoding is used to compress pictures and graphics, MPEG (Moving Picture Experts Group) encoding is used to compress video, and MP3 (MPEG audio layer 3) encoding is used to compress audio.
Image compression
As discussed in Chapter 2, an image can be represented by a two-dimensional array (table) of picture elements (pixels).
A grayscale picture of 307,200 pixels is represented by 2,457,600 bits (8 bits per pixel), and a color picture is represented by 7,372,800 bits (24 bits per pixel).
In JPEG, a grayscale picture is divided into blocks of 8 × 8 pixels to decrease the number of calculations because, as we will see shortly, the number of mathematical operations for each picture is the square of the number of units.
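The figures above can be checked with a few lines of arithmetic. This sketch assumes the 640 × 480 picture of Figure 15.10, 8 bits per grayscale pixel and 24 bits per color pixel (the depths implied by the bit counts), and takes "units" to mean pixels, so that the operation count is the square of the number of pixels processed together.

```python
WIDTH, HEIGHT = 640, 480          # assumed picture size (Figure 15.10)
BLOCK = 8                         # JPEG block edge in pixels

pixels = WIDTH * HEIGHT           # 307,200
gray_bits = pixels * 8            # 2,457,600
color_bits = pixels * 24          # 7,372,800

# Cost model from the text: operations = (number of units) squared.
ops_whole_picture = pixels ** 2                      # 94,371,840,000
blocks = (WIDTH // BLOCK) * (HEIGHT // BLOCK)        # 4,800 blocks
ops_per_block = (BLOCK * BLOCK) ** 2                 # 4,096
ops_blocked = blocks * ops_per_block                 # 19,660,800

print(ops_whole_picture // ops_blocked)  # 4800 times fewer operations
```

Under this cost model, working block by block reduces the operation count by a factor equal to the number of blocks.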
Figure 15.10 JPEG grayscale example, 640 × 480 pixels
The whole idea of JPEG is to change the picture into a linear (vector) set of numbers that reveals the redundancies. The redundancies (lack of changes) can then be removed using one of the lossless compression methods we studied previously. A simplified version of the process is shown in Figure 15.11.
Figure 15.11 The JPEG compression process
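How a transform can reveal redundancy is easiest to see on a block with no changes at all. The sketch below assumes the transform is the two-dimensional discrete cosine transform (DCT) used by baseline JPEG and applies it to a hypothetical uniform 8 × 8 block of value 20. It skips JPEG's quantization step and flattens the result row by row rather than in JPEG's zigzag order, so it is a simplification and not the full pipeline.

```python
import math

N = 8  # JPEG block edge


def dct_2d(block):
    """Plain (slow) 2-D DCT-II of an N x N block, written for clarity."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for m in range(N):
        for n in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * m * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * n * math.pi / (2 * N)))
            out[m][n] = (2 / N) * c(m) * c(n) * total
    return out


uniform = [[20] * N for _ in range(N)]      # a block with no changes
coeffs = dct_2d(uniform)
vector = [round(v) for row in coeffs for v in row]  # linear set of numbers

print(vector[0])                 # 160, the only non-zero value
print(sum(v != 0 for v in vector))  # 1
```

The 64 pixel values collapse into one non-zero number followed by 63 zeros, which a lossless method such as run-length coding can then store very compactly.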
Video compression
The Moving Picture Experts Group (MPEG) method is used to compress video. In principle, a motion picture is a rapid sequence of a set of frames in which each frame is a picture. In other words, a frame is a spatial combination of pixels, and a video is a temporal combination of frames that are sent one after another. Compressing video, then, means spatially compressing each frame and temporally compressing a set of frames.
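The temporal half of this idea can be sketched by storing, for each new frame, only the pixels that differ from the previous frame. This is a deliberately simplified stand-in: MPEG itself relies on motion-compensated prediction between frames, which is more elaborate. The tiny frames and function names below are hypothetical.

```python
def frame_delta(prev, curr):
    """List (row, col, new_value) for every pixel that changed."""
    return [(r, c, curr[r][c])
            for r in range(len(curr))
            for c in range(len(curr[r]))
            if curr[r][c] != prev[r][c]]


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = [row[:] for row in prev]
    for r, c, value in delta:
        frame[r][c] = value
    return frame


# A bright two-pixel object moves one pixel to the right.
frame1 = [[0, 0, 0, 0],
          [0, 9, 9, 0],
          [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0],
          [0, 0, 9, 9],
          [0, 0, 0, 0]]

delta = frame_delta(frame1, frame2)
print(delta)                                  # [(1, 1, 0), (1, 3, 9)]
print(apply_delta(frame1, delta) == frame2)   # True
```

Only 2 of the 12 pixels need to be sent for the second frame, because most of the picture is unchanged from one frame to the next.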
MPEG Overview
- MPEG-1 : a standard for storage and retrieval of moving picture and audio on storage media
- MPEG-2 : a standard for digital television
- MPEG-4 : a standard for multimedia applications
- MPEG-7 : a content representation standard for information search
- MPEG-21: offers metadata information for audio and video files
MPEG-1
- First standard to be published by the MPEG organization (in 1992)
- A standard for storage and retrieval of moving pictures and audio on storage media
- Example formats: VideoCD (VCD), MP3, MP2
MPEG-2
- Video and audio standards for broadcast-quality television. Used for digital satellite TV services like DirecTV, digital cable television signals, and (with slight modifications) for DVD video discs.
MPEG-4
- Expands MPEG-1 to support video/audio “objects”, 3D content, low-bitrate encoding and Digital Rights Management.
MPEG-7
- Another ISO/IEC standard being developed by MPEG
- Content representation standard for information search
- Makes searching the Web for multimedia content as easy as searching for text-only files
- Operates in both real-time and non-real-time environments
MPEG-21
- “Multimedia framework”
- Based on two essential concepts:
  - Digital Item
  - Concept of Users interacting with Digital Items
- More universal framework for digital content protection
- Most of MPEG-21’s elements are set for completion in 2003 and 2004.
Audio compression
Audio data compression, not to be confused with dynamic range compression, has the potential to reduce the transmission bandwidth and storage requirements of audio data. Audio compression algorithms are implemented in software as audio codecs. Lossy audio compression algorithms provide higher compression at the cost of fidelity and are used in numerous audio applications. Almost all of these algorithms rely on psychoacoustics to eliminate or reduce the fidelity of less audible sounds, thereby reducing the space required to store or transmit them.
Lossless audio compression produces a representation of digital data that decompresses to an exact digital duplicate of the original audio stream, unlike playback from lossy compression techniques such as Vorbis and MP3. Compressed files are around 50–60% of the original size,[21] which is similar to the results of generic lossless data compression. Lossless compression is unable to attain high compression ratios due to the complexity of waveforms and the rapid changes in sound forms. Codecs like FLAC, Shorten, and TTA use linear prediction to estimate the spectrum of the signal. Many of these algorithms use convolution with the filter [-1 1] to slightly whiten or flatten the spectrum, thereby allowing traditional lossless compression to work more efficiently. The process is reversed upon decompression.
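The [-1 1] filter and its reversal can be sketched as follows. The sketch assumes integer samples, takes the sign convention output[n] = samples[n] - samples[n-1] with an implied zero before the first sample, and uses a synthetic slow sine wave as a stand-in for real audio. None of this is taken from a particular codec.

```python
import math


def whiten(samples):
    """Convolve with [-1, 1]: each output is the change since the last sample."""
    out, prev = [], 0
    for s in samples:
        out.append(s - prev)
        prev = s
    return out


def unwhiten(residual):
    """Reverse the filter with a running sum."""
    out, total = [], 0
    for r in residual:
        total += r
        out.append(total)
    return out


# Hypothetical test signal: a slowly varying sine wave with 16-bit-scale amplitude.
samples = [round(20000 * math.sin(0.01 * n)) for n in range(1000)]
residual = whiten(samples)

print(max(abs(s) for s in samples))    # large amplitudes
print(max(abs(r) for r in residual))   # much smaller differences
print(unwhiten(residual) == samples)   # True: the step is fully reversible
```

Because neighbouring samples are similar, the differences span a much narrower range than the samples themselves, which gives a general-purpose lossless coder less to encode while the running sum restores the original exactly.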
When audio files are to be processed, either by further compression or for editing, it is desirable to work from an unchanged original (uncompressed or losslessly compressed). Processing of a lossily compressed file for some purpose usually produces a final result inferior to the creation of the same compressed file from an