
S.ID: 30232565

A literature review of common techniques for hiding data, and the methods that are used to find them

1. Introduction:

In today's world, large volumes of data are stored and transmitted electronically, so it is no surprise that various methods of protecting or hiding such data have evolved. One lesser-known but rapidly growing method is steganography, the art and science of hiding information so that it does not even appear to exist. In an ideal world we would all be able to openly send encrypted email or files to each other with no fear of reprisals (Rabah, 2004). In addition, other techniques have been proposed for hiding data at rest, for example hiding data in slack space and in digital warrens. Although these techniques differ from each other, their main purpose is to provide privacy, integrity and security by hiding the data in some way.

Internet users frequently need to store, send, or receive private information. The most common way to do this is to transform the data into a different form. The resulting data can be understood only by those who know how to return it to its original form. This method of protecting information is known as encryption. A major drawback of encryption is that the existence of the data is not hidden. Data that has been encrypted, although unreadable, still exists as data. If given enough time, someone could eventually decrypt the data. A solution to this problem is steganography (Artz, 2001). However, it is not necessary to hide all information that is transmitted over the Internet, so cryptographic techniques remain useful where protecting the data is the priority.

Slack space offers another way to hide information, this time in unused areas of a disk: the space between the logical end of a file and the end of its associated cluster. Hiding data in slack space has some advantages: the host or carrier file is unaffected, and the hidden data is transparent to the host OS and file manager. There is a disadvantage as well, because the hidden message can easily be recovered with basic data recovery tools.

All of these hiding techniques build on techniques used in ancient times to hide secret information while communicating. The word cryptography comes from ancient Greek and combines two words, kryptos and graphein, meaning hidden writing. There are various historical incidents in which cryptographic techniques were used to make communication secure and untraceable. The earliest forms of cryptography were found in the civilisations of Egypt, Greece and Rome. In early days the Greeks implemented cryptography by wrapping a tape around a stick and then writing the message on the wound tape. The receiver used the same type of stick to decipher the message. Similarly, the Roman approach to cryptography was the Caesar shift cipher, which shifts each letter by an agreed number of positions. Like cryptography, today's steganography also developed from ancient techniques of information hiding. Greece is believed to be the birthplace of steganography, as ancient records found there describe the technique. One example is the practice of melting the wax off wax tablets used for writing messages and then inscribing a message in the underlying wood. The wax was then reapplied to the wood, giving the appearance of a new, unused tablet (Artz, 2001).


The important concept from this history lesson is that communication does not have to occur over standard open channels using well-known methods. The Internet, in its massive, protocol-laden glory, is a playground for the modern steganographer. For example, think of an IP packet as the wax tablet previously mentioned. The packet's data field is equivalent to the writing in the wax. The headers serve as the wood in this analogy: who ever looks at an IP packet's headers, much less the data alignment padding? Most every protocol, language, and data format on the Internet has room for rent (Artz, 2001).

To distinguish between these common techniques for hiding data, the rest of this literature review is divided into two main sections. The first section reviews techniques for hiding data while communicating and the methods used to find that hidden data. The second section reviews the slack space technique for hiding data at rest and the methods used to find data hidden in that space. Finally, conclusions are drawn in the last section.

2. Techniques for hiding data while communicating:

2.1. Steganography: Steganography is the art and science of hiding communication; a steganographic system thus embeds hidden content in unremarkable cover media so as not to arouse an eavesdropper's suspicion. In the past, people used hidden tattoos or invisible ink to convey steganographic content. Today, computer and network technologies provide easy-to-use communication channels for steganography. Essentially, the information-hiding process in a steganographic system starts by identifying a cover medium's redundant bits (those that can be modified without destroying that medium's integrity). The embedding process creates a stego medium by replacing these redundant bits with data from the hidden message (Honeyman & Provos, 2003).

The embedding process needs an encoder, which takes the secret message and the cover message as input. The encoder uses one or several protocols to embed the secret information in the cover message. The encoder produces a stego object, that is, a cover object such as a picture or text file that now carries the hidden message, and this stego object can be sent over a communication medium such as email. At the receiver's end, the recipient must decode the stego object to view the secret information. Decoding is simply the reverse of the encoding process. The following flow chart shows the process of steganographic encoding for an image.

Cover image (X), Message (M), Stego-key (K) -> F(X, M, K) -> Stego-image (Z)

Figure 1. Steganographic encoding, from (J. Delp & T. Lin, 2001)

Steganographic techniques

Over the past few years, various steganographic techniques have been developed for embedding secret information in multimedia objects. Some of the common techniques are the following:


1. Least significant bit insertion (LSB): Least significant bit (LSB) insertion is a simple approach for embedding information in an image file. In this technique the rightmost bit in the binary representation of a value is replaced with a bit from the embedded message. Changing the rightmost, or least significant, bit has the least impact on the binary data or image.

2. Masking and filtering: Masking and filtering techniques, usually restricted to 24-bit and greyscale images, hide information by marking an image, in a manner similar to paper watermarks. These techniques analyse the image and then embed the information in significant areas, so that the hidden message is more integral to the cover image than if it were simply hidden in the noise level.

3. Transform techniques: When a file is compressed at the time of transmission, a transformed space is generated. This transformed space is used for hiding the information. Three transform techniques used for embedding the message are the Discrete Cosine Transform (DCT), which is used in JPEG compression, the Discrete Fourier Transform (DFT) and the Wavelet Transform. These methods hide messages in significant areas of the cover image, which makes them more robust and gives high resistance to signal processing attacks on steganography.

4. Compression algorithm technique: The idea is to integrate the data embedding with an image-compression algorithm (such as JPEG). For example, the steganographic tool Jpeg-Jsteg takes a lossless cover image and the message to be hidden and generates a JPEG stego-image. In the coding process, DCT coefficients are rounded up or down according to the individual bits to be embedded. Such techniques are attractive because JPEG images are popular on the Internet. Other transforms (such as the DFT and the wavelet transform) can also be used (Wang & Wang, 2004).

5. Spread-spectrum techniques: The hidden data is spread throughout the cover image using spread-spectrum techniques such as frequency hopping. A stego-key is used for encryption and to randomly select the frequency channels. White Noise Storm is a popular tool that uses this technique (Wang & Wang, 2004).
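Least significant bit insertion (technique 1 above) can be illustrated with a minimal sketch. It is not taken from any of the reviewed tools: the function names embed_lsb and extract_lsb are hypothetical, the cover is assumed to be a plain sequence of 8-bit sample values rather than a real image file, and no stego-key is used.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide the bits of message in the least significant bit of each cover byte."""
    # Unpack the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the rightmost bit of the cover value, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, message_length: int) -> bytes:
    """Rebuild message_length bytes from the least significant bits of stego."""
    out = bytearray()
    for i in range(message_length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (stego[i * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)


# Illustrative data only: 80 made-up 8-bit sample values acting as the cover.
cover = bytes(range(100, 180))
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
# Each cover value changes by at most 1, which is why the change is hard to see.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

In this sketch the receiver has to know the message length in advance; a practical tool would need some agreed way of signalling it.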

Steganography Detection: Steganalysis

Modern steganography's goal is to keep its mere presence undetectable, but steganographic systems always leave behind detectable traces in the cover medium because of their invasive nature. Even if the secret content is not revealed, its existence is: modifying the cover medium changes its statistical properties, so eavesdroppers can detect the distortions in the resulting stego medium's statistical properties. The process of finding these distortions is called statistical steganalysis (Honeyman & Provos, 2003). Similar to cryptanalysis, steganalysis attempts to defeat the goal of steganography. It is the art of detecting the existence of hidden information (Ping & Zhang, 2003).

Steganalysis depends on the type of cover medium used for embedding the secret message. For example, steganalysis of images differs from steganalysis of audio, depending on the algorithm used to hide the message inside them. Generally, steganalysis can be done in two major ways: visual analysis and statistical (algorithmic) analysis. Visual analysis tries to reveal the presence of secret communication through inspection, either with the naked eye or with the assistance of a computer. The computer can, for example, help decompose an image into bit-planes. Any unusual appearance in the display of the LSB-plane would be expected to indicate the existence of secret information (Wang & Wang, 2004). Statistical analysis is more powerful since it reveals tiny alterations in an image's statistical behavior caused by steganographic embedding. As there is a range of approaches to embedding, each modifying the image in a different way, unified techniques for detecting hidden information in all types of stego-images are difficult to find. The nominally universal methods developed to detect embedded stego-data are generally less effective than the steganalytic methods aimed at specific types of embedding (Wang & Wang, 2004).

As mentioned above, there are different ways of doing steganography; similarly, there are various ways in which it can be detected. It is not possible to mention all the techniques and tools, but the most popular steganalytic methods are listed below.

Popular steganalysis methods:

1. RS steganalysis

2. Palette checking

3. Universal blind detection

4. RQP method

There is one more recent method of steganalysis, based on statistical analysis of the difference image histogram, which is aimed at LSB steganography. A measure of the weak correlation between successive bit planes is used to construct a classifier to discriminate stego-images from cover images (Ping & Zhang, 2003).

Various tools are available for steganalysis, such as EnCase by Guidance Software Inc. and ILook Investigator by the Electronic Crimes Program. Stegdetect, provided by Niels Provos, is a popular automated tool for detecting steganographic content in images. Provos is also the author of the steganography program OutGuess.
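The bit-plane decomposition used in visual analysis can be sketched in a few lines. This is only an illustration under stated assumptions: the image is taken to be a small grid of made-up 8-bit greyscale values, the helper names bit_plane and render are hypothetical, and none of the tools named above is claimed to work this way.

```python
def bit_plane(pixels, plane):
    """Return the chosen bit (0 = least significant) of every pixel value."""
    return [[(value >> plane) & 1 for value in row] for row in pixels]


def render(plane_bits):
    """Draw a bit-plane as text so that patterns can be inspected by eye."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in plane_bits)


# Made-up 8-bit greyscale values, used only for illustration.
image = [
    [200, 201, 198, 199],
    [120, 121, 123, 122],
]
lsb_plane = bit_plane(image, 0)   # the plane that LSB insertion modifies
msb_plane = bit_plane(image, 7)   # the plane that carries the coarse image shape
picture = render(lsb_plane)
```

For a real image, an analyst would display the LSB-plane and look for regions that appear unusually noisy or unusually regular compared with the rest of the plane.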

2.2. Cryptography: Today, secure communication is necessary on the Internet, and this requires a technique that can maintain privacy and security during communication. Cryptography addresses this need. With cryptography, a sender can encrypt a message using a small piece of information (a key) and then send the encrypted message to the receiver. The receiver decrypts the encrypted message with the shared key and recovers the original message. Steganography must not be confused with cryptography, because classical cryptography is about concealing the content of messages while steganography is about concealing their existence (A. P. Petitcolas & J. Anderson, 1998). Moreover, encryption and steganography achieve separate goals. Encryption encodes the data such that an unintended recipient cannot determine its intended meaning. Steganography, in contrast, does not alter data to make it unusable to an unintended recipient. Instead, the steganographer attempts to prevent an unintended recipient from suspecting that the data is there (Artz, 2001).

Categories of cryptographic system:

1. Symmetric key ciphers: In this approach the sender and receiver use the same secret key for encryption and decryption of messages.

2. Public key ciphers: Also known as asymmetric algorithms. In this approach the key used for encryption is different from the key used for decryption.
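The symmetric idea, one shared key for both directions, can be shown with a toy example. The sketch below uses a repeating-key XOR, which is chosen only because it is short; it is not secure and is not one of the ciphers discussed in the reviewed literature. The function name xor_cipher and the sample key are hypothetical.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR every byte with the repeating key.

    Applying the function twice with the same key returns the original
    data, so the same call serves for encryption and decryption.
    This is for illustration only and offers no real security.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


shared_key = b"k3y"                         # known to sender and receiver
ciphertext = xor_cipher(b"meet at noon", shared_key)
plaintext = xor_cipher(ciphertext, shared_key)
```

A public key cipher differs in that the decrypting key is not the one used to encrypt, so the encryption key can be published without revealing how to decrypt.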

Purpose:

1. Confidentiality: The main purpose of cryptography is to maintain confidentiality by making the data unreadable to unauthorised persons.

2. Authentication and integrity: Today's cryptography is more than just encryption and decryption. Authentication, or the art of making sure one is talking to whom he thinks he is talking to, is as fundamental as privacy. It is used almost on a daily basis, for example when a student signs an exam or when someone signs for a credit card. As the world moves more and more towards electronic communications, the need for electronic techniques that provide authentication arises. Cryptography provides mechanisms for such procedures (M. Salois & Paquin, 2007).

3. Non-repudiation: This property ensures that the sender cannot deny having sent the message. Digital signatures are used to implement this functionality, because a signature binds a message to a unique private key and so also protects the integrity of the message.
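Authentication and integrity (purpose 2 above) can be illustrated with a keyed message authentication code from the Python standard library. This is a sketch under stated assumptions: HMAC with SHA-256 is our choice for the example and is not named in the reviewed sources, the key and messages are made up, and the helper names make_tag and verify are hypothetical. Because both parties hold the same key, this gives authentication and integrity but not non-repudiation, which needs a digital signature.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag that only holders of key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; False means altered or forged."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"example shared key"          # made-up key for illustration
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

genuine = verify(shared_key, message, tag)
tampered = verify(shared_key, b"transfer 900 to account 42", tag)
```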

Cryptanalysis: This is the technique of breaking a cipher and obtaining the secret message from the encrypted information without access to the secret key normally required for decryption. There are various methods of cryptanalysis, and these methods depend on the cryptographic algorithm being attacked. Some methods of cryptanalysis for symmetric algorithms are, for example:

1. Brute force attack: The most basic method of retrieving plaintext from any cipher is brute force, in which every possible key is tried in turn. The length of the key determines the number of possible keys for decryption. This process is very time consuming and is not feasible against every cryptographic method.

2. Differential cryptanalysis: This method applies to certain types of cryptography and is efficient when the cryptanalyst can choose plaintexts and obtain the corresponding ciphertexts. The method searches for plaintext and ciphertext pairs whose difference is constant and investigates the differential behaviour of the cryptosystem. The exclusive OR (XOR) operation is normally used to define the difference. This technique can successfully cryptanalyze DES with an effort on the order of 2^47 chosen plaintexts.

3. Integral cryptanalysis: This type of cryptanalysis is particularly applicable to block ciphers. Integral cryptanalysis exploits the simultaneous relationship between many encryptions, in contrast to differential cryptanalysis where one considers only pairs of encryptions (Knudsen & Wagner, p. 114).

Other cryptanalysis techniques are also available, such as linear cryptanalysis, mod-n cryptanalysis, the related-key attack, the sandwich attack and the slide attack.
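Two of the ideas above can be made concrete with a short sketch. The first part brute-forces the Caesar shift cipher from the introduction, whose key space is only 26 shifts, and shows how the number of keys grows with key length. The second part shows the XOR difference used in differential cryptanalysis: XORing both plaintexts with the same key leaves their difference unchanged. All function names, the sample text and the known word used to recognise the plaintext are assumptions made for illustration; this is not an attack on any real cipher.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, shift: int) -> str:
    """Shift every upper-case letter by shift positions; leave other characters alone."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def brute_force_caesar(ciphertext: str, known_word: str):
    """Try all 26 shifts and keep the candidates that contain known_word."""
    candidates = []
    for shift in range(26):
        plaintext = caesar_shift(ciphertext, -shift)
        if known_word in plaintext:
            candidates.append((shift, plaintext))
    return candidates


def key_space(key_bits: int) -> int:
    """Number of keys a brute force attack may have to try for a key of key_bits bits."""
    return 2 ** key_bits


def xor_difference(a: int, b: int) -> int:
    """The 'difference' of two values as used in differential cryptanalysis."""
    return a ^ b


ciphertext = caesar_shift("ATTACK AT DAWN", 3)
found = brute_force_caesar(ciphertext, "ATTACK")

# XORing two plaintexts with the same key does not change their difference.
p1, p2, key = 0b10110010, 0b01100111, 0b11010101
same_difference = xor_difference(p1, p2) == xor_difference(p1 ^ key, p2 ^ key)
```

The jump from 26 keys to key_space(56) keys for a 56-bit key such as the one used by DES is the reason brute force quickly becomes impractical as keys get longer.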

3. Techniques for hiding the data at rest:

3.1. Slack space: All the methods discussed above are used for hiding secret communication or making it more secure. If we want to hide data at rest, for example on a hard drive or another physical storage device, there is a method known as slack space. Slack space is unused or hidden space in a disk cluster. Slack space relates to all the areas on the disk surface which cannot be utilised by the file system because of the discrete nature of space allocation (Huebner, Bem, & Wee, 2006). In addition, Berghel (2007) mentions that hiding data in the area between the logical end of the file and the end of the associated cluster in which the file was placed is called file slack or slack space. It is generally used for hiding data on both the NTFS and Ext2fs file systems. Related techniques have also been developed, such as alternate data streams. Hiding data in slack space has some advantages: the host or carrier file is unaffected, and the hidden data is transparent to the host OS and file manager.

Volume slack: Volume slack is the unused space between the end of the file system and the end of the partition where the file system resides. The size of the hidden data in volume slack is limited only by the space on the hard disk available for a partition, as the size of the partition can be changed in relation to the size of the volume to hide more data, or the size of the volume can be changed in relation to the size of the partition (Huebner, Bem, & Wee, 2006).

File system slack: File system slack is the unused space at the end of the file system that is not allocated to any cluster. This happens because the partition size may not be a multiple of the cluster size (Huebner, Bem, & Wee, 2006).

Today various automated tools are also used for hiding data in slack space. For example, Slacker hides data within the slack space of an NTFS or FAT file system, and FragFS hides data within the NTFS master file table.
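The sizes of file slack and file system slack follow directly from the definitions above and can be worked out with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the file size, cluster size and partition size are made-up numbers rather than values from the reviewed papers.

```python
def file_slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes between the logical end of a file and the end of its last cluster."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    # Python's modulo of a negative number gives the distance up to the
    # next multiple of cluster_size, which is exactly the slack.
    return (-file_size) % cluster_size


def file_system_slack_sectors(partition_sectors: int, sectors_per_cluster: int) -> int:
    """Sectors at the end of the partition that cannot form a whole cluster."""
    if partition_sectors < 0 or sectors_per_cluster <= 0:
        raise ValueError("invalid partition or cluster size")
    return partition_sectors % sectors_per_cluster


# Made-up example: a 5000-byte file stored in 4096-byte clusters occupies
# two clusters (8192 bytes), so 3192 bytes of file slack remain.
slack = file_slack_bytes(5000, 4096)

# Made-up example: a partition of 1,000,003 sectors with 8 sectors per
# cluster leaves 3 sectors that belong to no cluster.
leftover = file_system_slack_sectors(1_000_003, 8)
```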


3.2. Hiding data in the disk drives and digital warrens: A formatted hard drive may be thought of as a logical structure mapped onto a physical medium. The logical structure consists of partitions, file systems, files, records, fields, and so forth. The physical structure consists of disks, cylinders, tracks, clusters, and sectors. The absence of 1:1 mappings between the logical and the physical realms creates the digital warrens for concealed data (Berghel, 2007). Software typically interfaces with the logical structure, so if data were hidden on a disk, the typical user would never know it. Moreover, most computer forensics tools are also unable to detect the digital warrens, because these tools focus mainly on the disk areas that are typically used for concealing data.

Detection of slack space: Hidden data in slack space can be detected with the help of forensic analysis. For example, to detect hidden data in the volume slack, the chkdsk command is first used to check the integrity of the file system. In the next step, the Sleuth Kit mmls command is used to check the number of sectors allocated to the partition. Then the number of sectors allocated to the NTFS file system in that partition is checked with the fsstat command. Because there are no cluster numbers in the volume slack, dd is used, rather than the Sleuth Kit dcat command, to extract the volume slack, and a hex editor can then be used to view the data. Figure 2 shows the analysis flowchart for the detection of hidden data in the volume slack. Similar to the detection of volume slack, file slack can also be detected with similar tools and commands.


1. Run the chkdsk command in Windows.

2. Check the number of sectors allocated to the partition (A).

3. Check the number of sectors allocated to the NTFS file system (B).

4. Compare the backup copy of the boot sector with the boot sector.

5. Is A = B + 1, and do the boot sector copies match?
   Yes: there is no hidden data in the volume slack.
   No: extract the sectors found in the volume slack.

Figure 2. Analysis of hidden data in the volume slack (Huebner, Bem, & Wee, 2006).
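The decision in Figure 2 can be sketched as code. This is a minimal illustration under stated assumptions and not the procedure of any forensic tool: the function names are hypothetical, the disk image is a small made-up byte string, and we assume, as our reading of the A = B + 1 rule, that one extra sector (the backup boot sector) directly follows the file system so that anything beyond it is volume slack. In practice the values A and B would come from tools such as mmls and fsstat.

```python
def volume_slack_sectors(partition_sectors: int, ntfs_sectors: int) -> int:
    """Sectors left in the partition after the file system and one extra sector."""
    return partition_sectors - (ntfs_sectors + 1)


def volume_slack_suspected(partition_sectors: int, ntfs_sectors: int,
                           boot_copies_match: bool) -> bool:
    """Apply the Figure 2 test: no hidden data only if A = B + 1 and the copies match."""
    return not (partition_sectors == ntfs_sectors + 1 and boot_copies_match)


def extract_volume_slack(image: bytes, partition_start: int, partition_sectors: int,
                         ntfs_sectors: int, sector_size: int = 512) -> bytes:
    """Return the raw bytes of the volume slack from a disk image held in memory."""
    count = volume_slack_sectors(partition_sectors, ntfs_sectors)
    if count <= 0:
        return b""
    # Assumed layout: file system, then one backup boot sector, then slack.
    begin = (partition_start + ntfs_sectors + 1) * sector_size
    return image[begin:begin + count * sector_size]


# Made-up image: 12 tiny 4-byte "sectors", each filled with its own sector number.
SECTOR = 4
image = b"".join(bytes([n]) * SECTOR for n in range(12))

# Partition starts at sector 2 with A = 10 sectors; the file system reports B = 7.
suspected = volume_slack_suspected(10, 7, boot_copies_match=True)
slack_bytes = extract_volume_slack(image, 2, 10, 7, sector_size=SECTOR)
```

The extracted bytes would then be examined in a hex editor, as described in the detection steps.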

4. Conclusion:

The battle between data hiding techniques and the methods of detecting them is a form of cyber warfare with a deep influence on information security. The two sides of the battle are the attempt to transmit secret information under cover of unobjectionable multimedia and the efforts to detect such hidden communication. This literature review explored some data hiding techniques and the methods of detecting the hidden messages. Some of the data hiding techniques presented here are very simple and, because of their limitations, can be detected easily, while others are very sophisticated and difficult to detect. New and advanced techniques of concealing data enable users to hide information in complex ways, but their detection is also possible, and this battle is likely to continue.

References:
(n.d.). Retrieved from http://www.logicalsecurity.com/resources/whitepapers/Cryptography.pdf
(n.d.). Retrieved July 5, 2011, from http://whereismydata.wordpress.com/2009/04/25/forensicsram-slack-and-file-slack/
A. P. Petitcolas, F., & J. Anderson, R. (1998, May). On the Limits of Steganography. IEEE Journal on Selected Areas in Communications, 16(4), 474-481.
Artz, D. (2001). Digital steganography: hiding data within data. IEEE Internet Computing, 75-80.
Berghel, H. (2007, April). Hiding data, forensics and anti-forensics. Digital Village, 50(4), 15-20.
Cryptanalysis. (n.d.). Retrieved July 4, 2001, from Wikipedia: http://en.wikipedia.org/wiki/Cryptanalysis
Damico, T. M. (2009, November 10). A Brief History of Cryptography. Retrieved July 4, 2011, from Student Pulse: http://www.studentpulse.com/articles/41/a-brief-history-of-cryptography#
Differential cryptanalysis. (n.d.). Retrieved July 6, 2011, from Wikipedia: http://en.wikipedia.org/wiki/Differential_cryptanalysis
Enrique Cauich, R. G. (n.d.). Data Hiding in Identification and Offset IP fields. USA, California, Irvine, CA 92717: University of California, USA.
Huebner, E., Bem, D., & Wee, C. K. (2006). Data hiding in the NTFS file system. Digital Investigation, 211-266.
Garfinkel, S. (2007). Anti-Forensics: Techniques, Detection and Countermeasures. 2nd International Conference on i-Warfare and Security (pp. 77-84). CA, USA: ACM.
Honeyman, P., & Provos, N. (2003). Hide and seek: An introduction to steganography. IEEE Security and Privacy, 32-44.
Integral cryptanalysis. (n.d.). Retrieved July 6, 2011, from Wikipedia: http://en.wikipedia.org/wiki/Integral_cryptanalysis
J. Delp, E., & T. Lin, E. (2001). A Review of Data Hiding in Digital Images. West Lafayette, Indiana: CERIAS (Center for Education and Research in Information Assurance and Security).
Kessler, G. C. (n.d.). An Overview of Steganography for the Computer Forensics Examiner. 1-29.
Knudsen, L., & Wagner, D. (n.d.). Integral Cryptanalysis. 144-129.
Koc, C. K. (n.d.). Differential Cryptanalysis. Oregon State University.
M. Salois, & Paquin, F. (2007, February). Introduction to cryptography. 1-24. Canada: Defence Research and Development Canada.
Ping, X., & Zhang, T. (2003). A new approach to reliable detection of LSB steganography in natural images. Signal Processing, 2085-2093.
Rabah, K. (2004). Steganography-The Art of Hiding Data. Information Technology Journal, 3(3), 245-269.
Steganalysis. (n.d.). Retrieved July 2, 2011, from http://www.infosyssec.com/infosyssec/Steganography/steganalysis.htm
Wang, S., & Wang, H. (2004). Cyber Warfare: Steganography vs. Steganalysis. Communications of the ACM, 76-82.
