CyberDefenders: GetPDF

PDF is the de facto standard for exchanging documents online. That popularity has also attracted cyber criminals, who use it to spread malware to unsuspecting users. The ability to generate malicious PDF files is built into many exploit kits. Because users are less cautious about opening PDF files, the malicious PDF has become quite a successful attack vector. The network traffic captured in lala.pcap relates to a typical malicious PDF attack, in which an unsuspecting user opens a compromised web page that redirects the user’s web browser to the URL of a malicious PDF file. As the browser’s PDF plug-in opens the file, the unpatched version of Adobe Acrobat Reader is exploited and, as a result, downloads and silently installs malware on the user’s machine.

Information

Category Name: GetPDF

Files: c31-Malicious-Portable.zip 30 KB
— lala.pcap 41K

My Recommendations

This is my personal preference: I like being organized and deleting a folder when I’m done with it.

mkdir -p Documents/CyberDefenders/getpdf && cd Documents/CyberDefenders/getpdf

Download it from CyberDefenders and verify it with:

sha1sum /path/to/c31-Malicious-Portable.zip

SHA1SUM: 81b99e0094edde5de6cec7d9f5cd391d9eca3eb2
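
If sha1sum is not available, the same check can be sketched in Python with the standard library. The helper names `file_digest` and `matches_expected` are illustrative choices of mine, not part of the challenge.

```python
import hashlib

# SHA-1 value published for c31-Malicious-Portable.zip (taken from the line above).
EXPECTED_SHA1 = "81b99e0094edde5de6cec7d9f5cd391d9eca3eb2"


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA1):
    """Compare the file's SHA-1 with the expected value, ignoring case."""
    return file_digest(path, "sha1") == expected.lower()
```

Calling `matches_expected("/path/to/c31-Malicious-Portable.zip")` should return True for an intact download. Passing `"md5"` as the algorithm to `file_digest` gives the same kind of value as md5sum.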

Walkthrough

1. How many URL path(s) are involved in this incident?

In Wireshark, select ‘Statistics’, ‘HTTP’, ‘Requests’.

All requests go to the same domain, and the list shows 6 distinct URL paths.

Answer: 6
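
To double-check the count outside Wireshark, a list of request URLs can be deduplicated by path. This is a minimal sketch: the `sample` list contains only the URLs named in this post (with one repeat), not the full set of six, and `distinct_paths` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def distinct_paths(urls):
    """Return the sorted distinct URL paths, ignoring query strings."""
    return sorted({urlsplit(url).path or "/" for url in urls})


# Illustrative input only: a partial list with one repeated request.
sample = [
    "http://blog.honeynet.org.my/forensic_challenge/",
    "http://blog.honeynet.org.my/forensic_challenge/getpdf.php",
    "http://blog.honeynet.org.my/forensic_challenge/getpdf.php",
    "http://blog.honeynet.org.my/forensic_challenge/the_real_malware.exe",
]

for path in distinct_paths(sample):
    print(path)
```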

2. What is the URL which contains the JS code?

Frame 12, which is a request to http://blog.honeynet.org.my/forensic_challenge/, contains JS code in the ‘Line-based text data: text/html’ portion.

 

Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/

3. What is the URL hidden in the JS code?

My preference is to extract the files and inspect them myself. To do that, choose ‘File’, ‘Export Objects’, ‘HTTP’, select packet 12 and save it to the working directory.

Then select the JS Code and save it to a file:

				
cat forensics_challenge
echo -n '#javascriptcode' > malicious.js
				
			

First, we can copy the contents of the script into an online beautifier, which will pretty-print it. Then, I like to paste the code into a JavaScript sandbox. The last variable to be called is ZeJexn, which I test first by appending ‘console.log("test:" + ZeJexn);’ at the end of the code.

The output shows that the URL being called is http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/getpdf[.]php.

Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/getpdf[.]php

4. What is the MD5 hash of the PDF file contained in the packet?

The PDF file is in packet 46. We can extract it with the same method used before.

				
md5sum fcexploit.pdf
# returns 659cf4c6baa87b082227540047538c2a
				
			

Answer: 659cf4c6baa87b082227540047538c2a

5. How many object(s) are contained inside the PDF file?

Using pdfid:

				
pdfid.py fcexploit.pdf
				
			

Answer: 19
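
For a rough cross-check of that number, object headers of the form "N G obj" can be counted directly in the raw bytes. This is a naive sketch and not how pdfid is implemented: it can miscount if stream data happens to contain a similar byte pattern, and it cannot see objects packed inside compressed object streams. `count_objects` is a name I made up.

```python
import re

# Matches indirect object headers such as "12 0 obj".
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def count_objects(pdf_bytes):
    """Count 'N G obj' headers found in the raw PDF bytes."""
    return len(OBJ_HEADER.findall(pdf_bytes))


def count_objects_in_file(path):
    """Convenience wrapper that reads the PDF from disk first."""
    with open(path, "rb") as handle:
        return count_objects(handle.read())
```

Running `count_objects_in_file("fcexploit.pdf")` should agree with the pdfid output if the file contains no such edge cases.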

6. How many filtering schemes are used for the object streams?

Using pdf-parser:

				
pdf-parser.py fcexploit.pdf | grep Filter
				
			

The object streams use the following filters: /FlateDecode, /ASCII85Decode, /LZWDecode and /RunLengthDecode.

Answer: 4
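
The grep above leaves the deduplication to the eye. As a hedged sketch, the distinct filter names can be collected in Python from the raw bytes. It assumes the names are written literally (no `#xx` hex escapes inside the names), and `distinct_filters` is an illustrative helper, not a pdf-parser feature.

```python
import re

# A /Filter entry is either a single name or an array of names.
FILTER_ENTRY = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME = re.compile(rb"/([A-Za-z0-9]+)")


def distinct_filters(pdf_bytes):
    """Return the sorted set of filter names used in /Filter entries."""
    names = set()
    for entry in FILTER_ENTRY.findall(pdf_bytes):
        for name in NAME.findall(entry):
            names.add(name.decode("ascii"))
    return sorted(names)
```

The length of the returned list is the number of filtering schemes.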

7. What is the number of the 'object stream' that might contain malicious JS code?

Using pdf-parser:

				
pdf-parser.py fcexploit.pdf | grep -i java -B 10 -A 10
				
			

The only JavaScript detected by pdf-parser is in Object 4. Object 4 references Object 5, which is where the actual JavaScript is.

Answer: 5
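
The jump from Object 4 to Object 5 can also be followed programmatically. The sketch below assumes the action dictionary points to its script with an indirect reference of the form `/JS 5 0 R`, which is what the relationship described above suggests. The sample bytes in the usage note are a hypothetical reconstruction, not a dump of the real object.

```python
import re

# Matches an indirect reference after the /JS key, e.g. "/JS 5 0 R".
JS_REF = re.compile(rb"/JS\s+(\d+)\s+(\d+)\s+R")


def js_reference_targets(pdf_bytes):
    """Return the object numbers that /JS entries point to."""
    return sorted({int(obj_num) for obj_num, _gen in JS_REF.findall(pdf_bytes)})
```

For example, `js_reference_targets(b"<< /S /JavaScript /JS 5 0 R >>")` yields a list containing only object number 5.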

8. Analyzing the PDF file. What 'object-streams' contain the JS code responsible for executing the shellcodes?

The JS code is divided into two streams. Format: two numbers separated with ‘,’. Put the numbers in ascending order
 

To save time, we can dump all objects that have filters. pdf-parser fails to decompress ASCII85 in Python 3, but it works in Python 2:

				
python2 /usr/local/bin/pdf-parser.py --raw -o 7 -f fcexploit.pdf -d obj7
python2 /usr/local/bin/pdf-parser.py --raw -o 9 -f fcexploit.pdf -d obj9
python2 /usr/local/bin/pdf-parser.py --raw -o 10 -f fcexploit.pdf -d obj10
				
			

Each dumped file has a specific format.
Object 7 is a long hex string. Object 9 is a long line formatted as ‘X_170987743**’.
Object 10 is a long line formatted as ‘U_155bf62c9aU_7917ab39**’.

To figure out how the code operates we can look at the JavaScript code in Object 5:

				
python2 /usr/local/bin/pdf-parser.py --raw -o 5 -f fcexploit.pdf -d obj5
				
			

Here, I pasted the contents of obj5 into CyberChef with the recipe ‘JavaScript Beautify’.

The variable SSS looks for annotations at ‘Page’. Object 3 is a Page, and it is suspicious because it references three objects.

Object 3’s Annots entry references Object 6 and Object 8.

The ‘Subject’ entries of these two annotations are Object 7 and Object 9.

 

In the second part of the code, the variable arr splits the string using the same format found in Object 10. We can do the same for Object 10:

				
sed 's/U_155bf62c9aU_7917ab39//g' obj10 | xxd -r -p > obj10.out
				
			

The result is JavaScript code.
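
The sed and xxd pipeline used for Object 10 can also be sketched in Python, which helps when xxd is not installed. `decode_delimited_hex` is a hypothetical helper. It removes the delimiter first (the delimiter itself contains hex digits), then drops any remaining non-hex characters such as newlines before decoding.

```python
import re


def decode_delimited_hex(text, delimiter):
    """Strip the delimiter, drop non-hex characters, and hex-decode the rest."""
    cleaned = text.replace(delimiter, "")
    cleaned = re.sub(r"[^0-9A-Fa-f]", "", cleaned)
    # bytes.fromhex raises ValueError if an odd number of digits remains.
    return bytes.fromhex(cleaned)


def decode_file(path, delimiter):
    """Read a dumped object from disk and decode it."""
    with open(path, "r", errors="replace") as handle:
        return decode_delimited_hex(handle.read(), delimiter)
```

With the delimiter from the command above, `decode_file("obj10", "U_155bf62c9aU_7917ab39")` should produce the same bytes as obj10.out.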


The ‘replace’ calls match the formatting in Object 9 and Object 7. We can reformat both using the same method as before:

				
sed 's/X_17844743X_170987743/%/g' obj9 | xxd -r -p > obj9.out
sed 's/89af50d/%/g' obj7 | xxd -r -p > obj7.out
				
			

The result is JavaScript code, split across the two objects, that executes the shellcodes.


Given the way the code operates, the objects that execute shellcodes are Object 7 and Object 9.

Answer: 7,9

 

9. The JS code responsible for executing the exploit contains shellcodes that drop malicious executable files.

What is the full path of malicious executable files after being dropped by the malware on the victim machine?
 

I wrote a script to help automate the whole thing. It relies on the library ‘pylibemu’. The code in Object 10 merges the code from Object 7 and Object 9, so we concatenate the two decoded files before running the script:

				
cat obj7.out obj9.out > fullobj.out
python3 ShellCodeExtract.py -f fullobj.out
				
			

The execution log shows that the file path would have been ‘C:\Windows\system32\a.exe’.

Answer: C:\Windows\system32\a.exe
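
ShellCodeExtract.py itself is not reproduced in this post. As a hedged sketch of one preprocessing step such a script typically needs, the snippet below converts JavaScript `%uXXXX` escape sequences into raw bytes. It assumes the shellcode is stored as `unescape("%u....")` strings, which is common in PDF exploits but not confirmed above, and `unescape_to_bytes` is a hypothetical name. Each `%uXXXX` is one 16-bit value that sits in memory in little-endian order on x86, so the two bytes are swapped.

```python
import re

# One JavaScript-style 16-bit escape, e.g. "%u9090".
U_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})")


def unescape_to_bytes(escaped):
    """Convert a run of %uXXXX escapes to little-endian bytes; other text is ignored."""
    out = bytearray()
    for word in U_ESCAPE.findall(escaped):
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)
```

The resulting byte string is what an emulator such as the one behind pylibemu would be given to trace API calls like the file drop.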

10. The PDF file contains another exploit related to CVE-2010-0188.

What is the URL of the malicious executable that the shellcode associated with this exploit drop?
 

Back in the pcap file, the next request is for favicon.ico, followed by the_real_malware.exe.

Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/the_real_malware[.]exe

11. How many CVEs are included in the PDF file?

Searching Google for some of the function code is enough to identify the exploits:

function updateE -> Adobe – ‘Collab.getIcon()’ Local Buffer Overflow (Metasploit) – CVE-2009-0927
function gX -> Adobe – ‘Collab.collectEmailInfo()’ Local Buffer Overflow – CVE-2007-5659
function cN -> Adobe Reader – ‘util.printf()’ JavaScript Function Stack Overflow – CVE-2008-2992
function cG -> Adobe – ‘Doc.media.newPlayer’ Use-After-Free – CVE-2009-4324

Finally, looking at the full list of objects again, Object 11 contains an embedded file. The TIFF image is base64-encoded, which looks suspicious.

				
python2 /usr/local/bin/pdf-parser.py --raw -o 11 -f fcexploit.pdf -d obj11
cat obj11  # copy the base64 string from the output
echo -n '#base64string' | base64 -d > obj11.out
md5sum obj11.out
# returns 80ed6b2d0c26cabcc7b869a524690d05
				
			

Looking up the hash on VirusTotal flags the file as an exploit for CVE-2010-0188.
Object 11 – TIFF – Adobe Acrobat – Bundled LibTIFF Integer Overflow – CVE-2010-0188
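
The copy, decode and hash steps for Object 11 can be combined in a small Python sketch. `decode_and_hash` is an illustrative helper, and the base64 string still has to be copied out of obj11 by hand, as in the commands above.

```python
import base64
import hashlib


def decode_and_hash(b64_text):
    """Decode a base64 string (whitespace tolerated) and return (bytes, md5 hex)."""
    raw = base64.b64decode("".join(b64_text.split()))
    return raw, hashlib.md5(raw).hexdigest()
```

The returned MD5 value is the one to look up, and the returned bytes can be written to obj11.out for further inspection.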

Answer: 5

TLDR

– A network capture containing various malicious files.
– The main tools I used to solve the challenge were pdfid.py and pdf-parser.py (with python2 to extract the JavaScript).
– I wrote my own script, ShellCodeExtract.py, to automate shellcode extraction.
