Introduction to Steganography and Data Hiding Methods

Steganography is a technique for hiding information. The word steganography comes from Greek roots, literally meaning "covered writing". The goal of steganography is to conceal the existence of a hidden message or a piece of information.

Steganography and cryptography, although closely related, differ in a fundamental way. Cryptography conceals the contents of a message, whereas steganography tries to hide the very existence of the message. The two techniques can nevertheless be combined effectively by first encrypting the secret message and then hiding it.

A variety of steganographic methods have been used throughout history. The ancient Greeks, for example, would shave the head of a slave messenger, tattoo the message on his bare scalp, and let the hair grow back. The recipient would shave off the hair and recover the message. Steganographic techniques were also used extensively during World Wars I and II, among them invisible inks (chemicals that react only with other specific chemicals to reveal the invisible writing), microdots (photographs the size of a printed period) and null-cipher messages (unencrypted, innocent-sounding messages that camouflage the real ones).

Today, with most communication occurring electronically, digital multimedia signals are widely used as cover signals. There are many possible applications that utilize various data hiding schemes:

Dispatch of hidden messages (hidden communication)

In-band captioning (e.g. movie subtitles embedded directly into the video stream)

Revision tracking (annotation, storing the revision information directly in the image)

Tamper-proofing (indication of unauthorized modification)

Digital watermarking (copyright, indication of ownership)

Traitor-tracing schemes (identifying the source of illegal distribution)

We will concentrate on applications using digital images as the cover signals. Descriptions of other applications (audio, text) can be found in [5].

Data hiding schemes can be characterized using the following parameters [6]:

Capacity (the maximum number of bits that can be hidden)

Resistance to removal (robustness)

Imperceptibility (invisibility)

From an information-theoretic perspective, the steganographic channel can be viewed as one in which the cover image acts as high-power noise and the embedded message is the signal. A low signal-to-noise ratio (the steganographic SNR) is therefore desired in order to satisfy the imperceptibility requirement.
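As a rough illustration of this channel view, the sketch below computes a steganographic SNR in decibels for a toy list of 8-bit samples. The function name `stego_snr_db`, the choice of mean squared sample value as the power measure, and the sample values are all assumptions made for this example; they are not definitions taken from [6].

```python
import math


def stego_snr_db(cover, stego):
    """Steganographic SNR in dB: power of the embedded changes over cover power.

    Assumption: the "signal" is the per-sample difference stego - cover,
    and the "noise" is the cover itself.
    """
    if len(cover) != len(stego):
        raise ValueError("cover and stego must have the same length")
    n = len(cover)
    signal_power = sum((s - c) ** 2 for c, s in zip(cover, stego)) / n
    noise_power = sum(c ** 2 for c in cover) / n
    if noise_power == 0:
        raise ValueError("cover has zero power; SNR is undefined")
    if signal_power == 0:
        return float("-inf")  # nothing was embedded
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical 8-bit samples; the stego version differs by at most 1 per sample.
cover = [120, 121, 119, 200, 201, 50, 52, 51]
stego = [121, 121, 118, 200, 200, 51, 52, 50]
print(stego_snr_db(cover, stego))
```

With changes of at most one gray level per sample, the result is strongly negative in dB, which matches the requirement that the embedded signal stay far below the power of the cover.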

It is not possible to simultaneously achieve high robustness and high capacity while maintaining a low SNR. Trade-offs must therefore be made among these parameters, as required by the specific application. For example, in an information-hiding application, invisibility and capacity are more important than resistance to removal; a watermarking application, on the other hand, favors high resistance to removal and does not need high capacity or, in some cases, even invisibility.

Additionally, data hiding schemes can be classified as follows:

Cover escrow schemes, where the original image (cover signal) is needed to extract the hidden information. This is impractical for most real applications (except for watermarking).

Blind schemes, on the other hand, allow direct extraction of the message from the modified image without knowledge of the original cover image.
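The difference between the two classes can be sketched with two toy schemes operating on a list of 8-bit samples. Neither scheme comes from the text above: least-significant-bit (LSB) replacement is used here as a familiar example of a blind scheme, and the additive scheme is a deliberately simple stand-in for cover escrow. All function names and sample values are hypothetical.

```python
def _check(cover, bits):
    """Shared validation: one bit per sample, bits restricted to 0 and 1."""
    if len(bits) > len(cover):
        raise ValueError("message exceeds capacity of one bit per sample")
    if any(bit not in (0, 1) for bit in bits):
        raise ValueError("bits must be 0 or 1")


def embed_lsb(cover, bits):
    """Toy blind scheme: write each message bit into one sample's LSB."""
    _check(cover, bits)
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it
    return stego


def extract_blind(stego, n_bits):
    """Blind extraction: only the stego samples are read."""
    return [sample & 1 for sample in stego[:n_bits]]


def embed_additive(cover, bits):
    """Toy cover-escrow scheme: shift a sample by 1 where the bit is 1."""
    _check(cover, bits)
    stego = list(cover)
    for i, bit in enumerate(bits):
        # Step down instead of up at 255 so the value stays within 0..255.
        stego[i] += bit if stego[i] < 255 else -bit
    return stego


def extract_with_cover(stego, cover, n_bits):
    """Cover-escrow extraction: a bit is 1 where stego differs from cover."""
    return [abs(s - c) for s, c in zip(stego[:n_bits], cover[:n_bits])]


# Hypothetical samples and message bits.
cover = [120, 121, 119, 200, 255, 50, 52, 51]
bits = [1, 0, 1, 1, 1, 0, 1, 0]

lsb_stego = embed_lsb(cover, bits)
add_stego = embed_additive(cover, bits)

assert extract_blind(lsb_stego, len(bits)) == bits
assert extract_with_cover(add_stego, cover, len(bits)) == bits
```

In the additive scheme the message is recoverable only by comparing against the original samples, which is why the receiver must hold the cover. The LSB scheme needs nothing but the modified samples, at the cost of being trivially destroyed by any change to the low-order bits.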