
Information
Category Name: GetPDF
Files: c31-Malicious-Portable.zip 30 KB
— lala.pcap 41K
My Recommendations
This is my personal preference. I like being organized and deleting a folder when I’m done with it.
mkdir Documents/CyberDefenders/getpdf && cd Documents/CyberDefenders/getpdf
Download it from CyberDefenders and verify it with:
sha1sum /path/to/c31-Malicious-Portable.zip
SHA1SUM: 81b99e0094edde5de6cec7d9f5cd391d9eca3eb2
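If sha1sum isn't available, the same check can be sketched in Python. This is a minimal example using only hashlib; the function names are my own, and the expected value is the SHA1SUM quoted above.

```python
import hashlib

# SHA-1 published alongside the challenge archive (quoted in the text above).
EXPECTED_SHA1 = "81b99e0094edde5de6cec7d9f5cd391d9eca3eb2"


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex, algorithm="sha1"):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Calling matches("/path/to/c31-Malicious-Portable.zip", EXPECTED_SHA1) should return True for an intact download. Passing "md5" as the algorithm gives the equivalent of md5sum.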
Walkthrough
1. How many URL path(s) are involved in this incident?
In Wireshark, select ‘Statistics’, ‘HTTP’, ‘Requests’.
The list shows 6 URL paths, all under the same domain:
Answer: 6
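The same count can be cross-checked outside the Wireshark GUI. One option is to export the request URLs, for example with tshark's http.request.full_uri field, and group them by host. The sketch below only shows the grouping step; the function name is my own and the URLs in the usage note are placeholders, not values from the capture.

```python
from urllib.parse import urlsplit


def unique_paths(urls):
    """Group request URLs by host and collect the distinct paths seen for each."""
    by_host = {}
    for url in urls:
        parts = urlsplit(url)
        # The query string is dropped so that only the path itself is compared.
        by_host.setdefault(parts.netloc, set()).add(parts.path or "/")
    return by_host
```

For example, unique_paths(["http://example.test/a/", "http://example.test/a/b.php?x=1"]) returns one host with two paths. Taking len() of each set gives the number of paths per domain.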
2. What is the URL which contains the JS code?
Frame 12, which is a request to http://blog.honeynet.org.my/forensic_challenge/, contains JS code in the ‘Line-based text data: text/html’ portion:
Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/
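The answers in this walkthrough are written in defanged form so that nobody clicks them by accident. A small helper for converting in both directions could look like this; it is a sketch that follows the bracket style used in the answer above, and the function names are my own.

```python
def defang(url):
    """Rewrite a URL so it is no longer clickable: '://' and '.' get bracketed."""
    return url.replace("://", "[://]").replace(".", "[.]")


def refang(url):
    """Undo defang() to recover the original URL."""
    return url.replace("[://]", "://").replace("[.]", ".")
```

defang("http://blog.honeynet.org.my/forensic_challenge/") gives the string used in the answer above, and refang() restores the original.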
3. What is the URL hidden in the JS code?
My preference is to extract the files and inspect them myself. To do that, choose ‘File’, ‘Export Objects’, ‘HTTP’, select Packet 12 and save it to the working directory.
Then select the JS Code and save it to a file:
cat forensics_challenge
echo -n '#javascriptcode' > malicious.js
First, we can paste the contents of the script into this website, which will pretty-print it. Then I like to paste the code into a JavaScript sandbox. The last variable to be called is ZeJexn, which I test first by appending ‘console.log("test:" + ZeJexn);’ to the end of the code:
The output shows that the URL being called is http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/getpdf[.]php.
Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/getpdf[.]php
4. What is the MD5 hash of the PDF file contained in the packet?
The PDF file is in packet 46; we can extract it with the same method as before.
md5sum fcexploit.pdf
#returns 659cf4c6baa87b082227540047538c2a
Answer: 659cf4c6baa87b082227540047538c2a
5. How many object(s) are contained inside the PDF file?
Using pdfid:
pdfid.py fcexploit.pdf
Answer: 19
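As a rough cross-check of pdfid's object count, the object headers can be counted with a regular expression. This is a naive sketch and not how pdfid works internally: it can be fooled by obfuscation or by header-like byte sequences inside streams. The function name is my own.

```python
import re

# An indirect object starts with "<object number> <generation> obj".
OBJ_HEADER = re.compile(rb"(?<![0-9])\d+\s+\d+\s+obj\b")


def count_objects(pdf_bytes):
    """Count indirect object headers in raw PDF bytes."""
    return len(OBJ_HEADER.findall(pdf_bytes))
```

Typical use would be count_objects(open("fcexploit.pdf", "rb").read()); if the number disagrees with pdfid, trust pdfid.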
6. How many filtering schemes are used for the object streams?
Using pdf-parser:
pdf-parser.py fcexploit.pdf | grep Filter
The object streams use the following filters: ‘/FlateDecode /ASCII85Decode /LZWDecode /RunLengthDecode’.
Answer: 4
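Instead of reading the grep output by eye, the distinct filter names can be collected programmatically. The sketch below scans raw PDF bytes for /Filter entries, which may hold a single name or an array of names. It assumes the names are written plainly, without #xx hex escapes, and the function name is my own.

```python
import re

# /Filter is followed either by one name (/FlateDecode) or by an array ([/A /B]).
FILTER_ENTRY = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME = re.compile(rb"/([A-Za-z0-9]+)")


def filter_names(pdf_bytes):
    """Return the set of distinct filter names used in /Filter entries."""
    names = set()
    for entry in FILTER_ENTRY.findall(pdf_bytes):
        names.update(name.decode("ascii") for name in NAME.findall(entry))
    return names
```

len(filter_names(data)) then gives the number of distinct filtering schemes.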
7. What is the number of the 'object stream' that might contain malicious JS code?
Using pdf-parser:
pdf-parser.py fcexploit.pdf | grep -i java -B 10 -A 10
The only JavaScript detected by pdf-parser is in Object 4. Object 4 references Object 5, which is where the actual JavaScript is.
Answer: 5
8. Analyzing the PDF file. What 'object-streams' contain the JS code responsible for executing the shellcodes?
The JS code is divided into two streams. Format: two numbers separated with ‘,’. Put the numbers in ascending order.
To save time, we can dump all objects that have filters. pdf-parser fails to decompress ASCII85 in Python 3, but it works in Python 2:
python2 /usr/local/bin/pdf-parser.py --raw -o 7 -f fcexploit.pdf -d obj7
python2 /usr/local/bin/pdf-parser.py --raw -o 9 -f fcexploit.pdf -d obj9
python2 /usr/local/bin/pdf-parser.py --raw -o 10 -f fcexploit.pdf -d obj10
The dumped files have a specific format.
Object 7 is a long hex string. Object 9 is a long line formatted as ‘X_170987743**’.
Object 10 is a long line formatted as ‘U_155bf62c9aU_7917ab39**’.
To figure out how the code operates we can look at the JavaScript code in Object 5:
python2 /usr/local/bin/pdf-parser.py --raw -o 5 -f fcexploit.pdf -d obj5
Here, I pasted the contents of obj5 into CyberChef with the recipe ‘JavaScript Beautify’:
Var SSS looks for Annotations at ‘Page’. Object 3 is a Page, and is suspicious because it references three objects:
Object 3’s Annots reference Object 6 and 8.
The ‘Subject’ entries of these two Annotations point to Object 7 and Object 9:
In the second part of the code, the variable arr splits the string using the same format found in Object 10. We can do the same for Object 10:
sed 's/U_155bf62c9aU_7917ab39//g' obj10 | xxd -r -p > obj10.out
The result is JavaScript code:
The ‘replace’ commands match the formatting in Object 9 and Object 7. We can reformat using the same method as before:
sed 's/X_17844743X_170987743/%/g' obj9 | xxd -r -p > obj9.out
sed 's/89af50d/%/g' obj7 | xxd -r -p > obj7.out
The result is JavaScript code, split in two, that executes the shellcodes. For example, in Object 7:
Given the way the code operates, the objects that execute shellcodes are Object 7 and Object 9.
Answer: 7,9
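The sed and xxd steps used for this question can also be reproduced in Python, which avoids depending on how xxd treats stray characters. This is a minimal sketch: the marker strings are the ones from the commands above, and the function names are my own. It assumes each piece between markers in Object 10 is one hexadecimal character code, and that Objects 7 and 9 become ordinary percent-encoded text once the marker is replaced with ‘%’.

```python
from urllib.parse import unquote

# Marker strings taken from the sed commands used in this walkthrough.
OBJ10_MARKER = "U_155bf62c9aU_7917ab39"
OBJ9_MARKER = "X_17844743X_170987743"
OBJ7_MARKER = "89af50d"


def decode_split_hex(text, marker):
    """Split on the marker and turn each hex piece into one character."""
    pieces = (piece.strip() for piece in text.split(marker))
    return "".join(chr(int(piece, 16)) for piece in pieces if piece)


def decode_marker_percent(text, marker):
    """Replace the marker with '%' and percent-decode the result.

    Sequences such as %uXXXX are not valid percent escapes, so unquote()
    leaves them in place for a later step.
    """
    return unquote(text.strip().replace(marker, "%"), encoding="latin-1")
```

For instance, decode_split_hex(open("obj10").read(), OBJ10_MARKER) should give the same text as obj10.out, and decode_marker_percent() with OBJ9_MARKER or OBJ7_MARKER plays the role of the two ‘replace’ calls.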
9. The JS code responsible for executing the exploit contains shellcodes that drop malicious executable files.
What is the full path of malicious executable files after being dropped by the malware on the victim machine?
I wrote a script to help automate the whole thing. It relies on the library ‘pylibemu’. The code in Object 10 merges the code from Object 7 and Object 9, so we concatenate the two outputs before running the script:
cat obj7.out obj9.out > fullobj.out
python3 ShellCodeExtract.py -f fullobj.out
This is the log of the execution. It shows that the file path would have been ‘C:\Windows\system32\a.exe’.
Answer: C:\Windows\system32\a.exe
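Before shellcode can be handed to an emulator such as pylibemu, it has to be turned from JavaScript text into raw bytes. The sketch below shows one common way to do that, assuming the shellcode is stored as %uXXXX sequences inside unescape() calls, as is typical for Acrobat JavaScript exploits. It is not taken from ShellCodeExtract.py, and the function name is my own.

```python
import re

UNICODE_ESCAPE = re.compile(r"%u([0-9a-fA-F]{4})")


def unescape_to_bytes(js_string):
    """Convert %uXXXX sequences to raw bytes.

    Each %uXXXX is a 16-bit code unit; on x86 it sits in memory in
    little-endian order, so %u4142 becomes the bytes 0x42 0x41.
    """
    out = bytearray()
    for match in UNICODE_ESCAPE.finditer(js_string):
        out += int(match.group(1), 16).to_bytes(2, "little")
    return bytes(out)
```

The resulting bytes are what an emulator or disassembler would work on; anything in the string that is not a %uXXXX sequence is ignored.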
10. The PDF file contains another exploit related to CVE-2010-0188.
What is the URL of the malicious executable that the shellcode associated with this exploit drops?
Back in the pcap file, the next request is for favicon.ico, followed by the_real_malware.exe:
Answer: http[://]blog[.]honeynet[.]org[.]my/forensic_challenge/the_real_malware[.]exe
11. How many CVEs are included in the PDF file?
Searching Google for some of the code is enough:
function updateE -> Adobe – ‘Collab.getIcon()’ Local Buffer Overflow (Metasploit) – CVE-2009-0927
function gX -> Adobe – ‘Collab.collectEmailInfo()’ Local Buffer Overflow – CVE-2007-5659
function cN -> Adobe Reader – ‘util.printf()’ JavaScript Function Stack Overflow – CVE-2008-2992
function cG -> Adobe – ‘Doc.media.newPlayer’ Use-After-Free – CVE-2009-4324
Finally, if you look at the full objects again, object 11 contains an Embedded file. The TIFF image is base64 encoded in a suspicious way.
python2 /usr/local/bin/pdf-parser.py --raw -o 11 -f fcexploit.pdf -d obj11
cat obj11 #copy the base64
echo -n '#base64string' | base64 -d > obj11.out
md5sum obj11.out
#returns 80ed6b2d0c26cabcc7b869a524690d05
Looking up the hash on VirusTotal flags the file as an exploit for CVE-2010-0188.
Object11 – Tiff – Adobe Acrobat – Bundled LibTIFF Integer Overflow – CVE-2010-0188
Answer: 5
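The base64 and md5sum steps for Object 11 can be combined so the long string never has to be pasted into a shell. This is a minimal sketch using only the standard library; the function name is my own, and the input is whatever base64 text was copied out of obj11.

```python
import base64
import hashlib


def decode_and_hash(b64_text):
    """Decode base64 text (whitespace tolerated) and return (raw bytes, MD5 hex)."""
    raw = base64.b64decode("".join(b64_text.split()))
    return raw, hashlib.md5(raw).hexdigest()
```

The returned bytes can be written to obj11.out, and the hex digest is the value to look up.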
TLDR
– A Network Capture containing various malicious files.
– The main tools I used to solve the challenge were pdfid.py and pdf-parser.py (with python2 to extract JavaScript).
– I wrote my own script ShellCodeExtract.py to automate shell code extraction.
5 thoughts on “CyberDefenders: GetPDF”
hey bro
question 9
can you share your python script with me?
thx.
it’s on my github https://github.com/forensicskween/CyberDefenders/blob/main/GetPDF/ShellCodeExtract.py
thx!
respect
hi
i don’t find python2’s pdf-parser
is it on github?
No idea, I just executed it with python2, but it was already installed and set up on my VM… maybe you can try downloading it and running it with python2?