Steganography Attack via PDF File

What is Steganography?

The goal of steganography is to hide a message in such a way that no one apart from the intended recipient even knows that a message has been sent. This is achieved by concealing the information inside a seemingly harmless carrier, also called a cover.


Most of us use the PDF format for reading, copying and editing information. We also tend to download PDFs (such as research papers, publications and notes) from the internet to help with our work. But few of us know whether those files are safe to download.

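One of the most discussed weaknesses of PDFs today is that they are routinely delivered as email attachments and through webpages, and can be configured to run active content when opened, even when a secure browser and antivirus software are in place. Most of us have seen a PDF viewer crash at some point. A crash does not prove that a file is malicious, but it can be a sign that the PDF is malformed or has been tampered with.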

PDF documents can contain scripts written in JavaScript and structured data in XML. They also use a PostScript-derived language to describe page layout and graphics, together with data compression for their content.

A PDF file has gaps within its data where extra content, including malicious code, can be embedded. One example of this approach is the StegoPDf software, which embeds data in those gaps, although not very efficiently: there is only a limited amount of empty space to use, so capacity is an issue.

Are Steganography and Cryptography the Same?

Steganography differs from cryptography, the art of secret writing. Cryptography is intended to make a message unreadable to a third party, but it does not hide the existence of the secret communication. Steganography hides the fact that a message exists at all. In a PDF attack, the malicious payload is hidden inside content that the file carries, such as an added image or a digital watermark.

PDF files aren't just docs....

It is important to know that the PDF file format is not just text and images. It has many features, and these have increased the number of ways a PDF can be used in an attack.

1) JavaScript: PDFs can contain JavaScript code that modifies the PDF’s contents or manipulates the PDF viewer’s features. The JavaScript support provides objects, methods, and properties that let you manipulate PDF files, produce database-driven PDF files, and modify the appearance of PDF files.

2) Embedded Flash: PDFs can contain embedded Flash content. You need Flash Player to view Flash content in PDFs, PDF Portfolios, and other features. 

3) Launch Actions: PDF files have the ability to launch a command after popping up a confirmation window. In older versions of Adobe Reader, a PDF file could attempt to launch a dangerous command as long as the user clicked OK.

4) GoToE: PDF files can contain embedded PDF files. When a user loads the main PDF file, it could immediately load its embedded PDF file. This allows attackers to hide malicious PDF files inside other PDF files, fooling antivirus scanners by preventing them from examining the hidden PDF file.
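The features above all appear in a PDF as named entries, so a first-pass check can simply count those names in the raw bytes. The sketch below is a minimal illustration, not a full scanner: the list of names is an assumed, non-exhaustive selection, the function name is hypothetical, and the check will miss names that are hex-escaped or stored inside compressed streams.

```python
import re

# Assumed, non-exhaustive list of PDF names tied to active content.
SUSPICIOUS_NAMES = [
    b"/JavaScript",    # JavaScript actions
    b"/JS",            # the script body of a JavaScript action
    b"/RichMedia",     # rich media annotations (used for embedded Flash)
    b"/Launch",        # launch actions
    b"/GoToE",         # go-to-embedded actions
    b"/EmbeddedFile",  # embedded file streams
    b"/OpenAction",    # action that runs when the document is opened
]


def count_suspicious_names(pdf_bytes):
    """Count how often each suspicious name occurs in the raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require that the name is not the prefix of a longer name.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, pdf_bytes))
    return counts
```

To try it on a file, read the file in binary mode and pass the bytes to count_suspicious_names. A non-zero count is not proof of an attack, since many legitimate PDFs use JavaScript or open actions, but it tells you where to look first.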

Prevention and Detection:

The most popular steganography technique is the LSB (Least Significant Bit) method, which stores hidden data in the least significant bit of each colour component. Changing that bit alters the intensity of an 8-bit colour component by at most one level out of 256, which the naked eye cannot detect.
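To make the LSB idea concrete, here is a minimal sketch that hides a short message in a sequence of 8-bit values and reads it back. It assumes the carrier is raw, uncompressed component data (for example, pixel values already decoded from an image); the function names are hypothetical and chosen only for this illustration.

```python
def embed_lsb(carrier, message):
    """Hide message bytes in the least significant bits of carrier bytes."""
    # Turn the message into a flat list of bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for this message")
    stego = bytearray(carrier)
    for index, bit in enumerate(bits):
        # Clear the lowest bit of the carrier byte, then set it to the message bit.
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego, length):
    """Recover `length` message bytes from the least significant bits."""
    message = bytearray()
    for i in range(length):
        value = 0
        for carrier_byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier_byte & 1)
        message.append(value)
    return bytes(message)
```

Each carrier value changes by at most 1, which is why the altered image looks identical to the original. The receiver needs to know the message length (or agree on a terminator) to extract it.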

For example, consider two JPEG images of a cat, shown together with their file contents.

A typical JPEG file (left image) starts with the bytes FF D8 and ends with FF D9. The file on the right is the same image with a PDF hidden inside it. As you can see, it does not end in FF D9.
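The marker check described above is easy to automate. The following sketch reports whether a byte string starts with FF D8, whether it ends with FF D9, and how many bytes follow the last FF D9 it can find. The function name and the returned keys are hypothetical. Treat the result as a hint only: some legitimate files carry a few trailing bytes, and a hidden payload could itself contain FF D9.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def check_jpeg_markers(data):
    """Report the start/end markers and any bytes after the last end marker."""
    last_eoi = data.rfind(JPEG_EOI)
    if last_eoi == -1:
        trailing = None  # no end marker found at all
    else:
        trailing = len(data) - (last_eoi + len(JPEG_EOI))
    return {
        "starts_with_soi": data.startswith(JPEG_SOI),
        "ends_with_eoi": data.endswith(JPEG_EOI),
        "bytes_after_last_eoi": trailing,
    }
```

A file that starts with FF D8 but reports a large number of bytes after the last FF D9 deserves a closer look, for instance by checking whether the trailing bytes begin with a known signature such as %PDF.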

The following methods can be used for prevention or detection of malicious code in PDF files:

a) Metadata Analysis

Metadata is data that gives information about other data. It records details about the history of an electronic file, such as its dates of creation and modification. Metadata can be visible or hidden. Visible metadata is general information that can be found easily. Hidden metadata is present in every file and may include the file's security settings or information about its storage. For an email, for example, it may include details about the origin of the message and the systems that relayed it.
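As a rough illustration of metadata analysis on a PDF, the sketch below pulls a few common document information fields out of the raw bytes. It is a simplification under stated assumptions: it only handles plain, unescaped literal strings in uncompressed parts of the file, the list of keys is a selection made for this example, and the function name is hypothetical. A real analysis would use a proper PDF parser.

```python
import re

# Selected document information keys (an assumed, non-exhaustive list).
INFO_KEYS = rb"Title|Author|Subject|Creator|Producer|CreationDate|ModDate"

# Matches "/Key (value)" where the value has no nested parentheses or escapes.
INFO_PATTERN = re.compile(rb"/(" + INFO_KEYS + rb")\s*\(([^()\\]*)\)")


def extract_info_fields(pdf_bytes):
    """Return the simple information fields found in the raw PDF bytes."""
    fields = {}
    for key, value in INFO_PATTERN.findall(pdf_bytes):
        fields[key.decode("ascii")] = value.decode("latin-1")
    return fields
```

Fields worth comparing include the creation and modification dates and the producing software, since a mismatch with what you expect from the sender can be a reason to inspect the file more closely.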

b) XOR substitution algorithm:

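This method encrypts data by combining each byte of the data with a byte of a key using the XOR operation. When the key is truly random, at least as long as the data, and never reused, the result cannot be cracked with a brute-force approach: without the key it is impossible to decrypt the data, because you cannot tell the inputs from the XOR of two unknown values. A short or repeated key is much weaker, since known parts of the data reveal the key.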
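A minimal sketch of XOR substitution follows. The function name is hypothetical; the key is repeated when it is shorter than the data, which is convenient for a demonstration but is exactly the case that weakens the scheme.

```python
import secrets
from itertools import cycle


def xor_bytes(data, key):
    """XOR every data byte with the matching key byte, repeating the key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


# Applying the same key twice restores the original data.
message = b"hidden payload"
key = secrets.token_bytes(len(message))  # random key as long as the message
ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)
```

Because XOR is its own inverse, the same function both encrypts and decrypts. It also shows why a short key is risky: XOR-ing the ciphertext with a known piece of plaintext, such as a file header, gives back the key bytes used at that position.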

Mitigation Mechanism for PDF attacks:

a) Change your preferences. In Adobe Acrobat Reader DC, for example, you can disable JavaScript in the preferences and manage access to URLs. (This applies if your PDF viewer offers some form of JavaScript control.)

b) Prevent users from opening PDFs attached to spam or unexpected emails. This greatly reduces the risk of infection.

c) Implement advanced email security. Some internet security companies provide cloud, hosted and on-premises email security solutions. These use advanced security controls to examine files, senders, domains and URLs for malicious activity.

d) Use endpoint protection. Advanced endpoint security constantly monitors the behaviour of a system to look for malicious activity.


Steganography (the art of hiding data) continues to play a role in modern attacks in several forms. Do you feel like you want to learn more about this? Are you ready for a hands-on session on implementing a stego-attack? To know more, stay tuned and follow our blog posts.


Siddarth Singaravel