Analyzing CVE-2010-0188 exploits: Context aware malware (Part 2)

...Continued from Part 1.

Advancing shellcode

I was able to extract the shellcode from the attacks in June and July. I opened the x86 code in IDA Pro and was baffled; I knew it would take time and research to figure out what the attacker was trying to accomplish, though I could easily see that part of the code was trying to bootstrap a malicious binary. Since then I've read a paper titled "Understanding Windows Shellcode" (by skape), which explained what our friend Pat Casey was doing, almost line by line.

Step 1: Since we are running code that is not linked, we need to resolve function addresses manually. How? We locate kernel32.dll using knowledge of the Process Environment Block (PEB) data structure. In this case we are using the PEB belonging to the Adobe process, whose layout is standard for all 32-bit Windows PE processes. In it, kernel32.dll is always the second module to be initialized, so we can walk the list of initialized modules to the second entry.

Shellcode found in the June PDF Exploits: find_kernel32

Step 2: Resolve the locations of the needed functions. This is a really cool technique in my opinion, though it may be old. Once the author knows the location of kernel32, they can use knowledge of the PE header structure to locate its export table. The shellcode writer precomputes a hash of each required function name; the shellcode then enumerates the functions exported by kernel32, hashes each name, and compares the result against the stored hash.

Shellcode found in the June PDF Exploits: find_function
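The hash comparison in Step 2 can be sketched in a few lines of Python. This is a minimal sketch, assuming the shellcode uses the rotate-right-by-13 additive hash described in skape's paper; the article itself does not name the algorithm, although the WinExec value quoted in this article (0E8AFE98h) is consistent with it. The function names and the tiny export list are hypothetical stand-ins, not recovered code.

```python
# Sketch of hash-based export resolution.
# Assumption: ROR-13 additive hash, as in skape's paper.

def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an export name: rotate the running hash right by 13, add the byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def resolve_hash(target: int, export_names):
    """Return the first export name whose hash equals `target`, else None."""
    for name in export_names:
        if ror13_hash(name) == target:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical, tiny stand-in for a DLL's export name table.
    exports = ["CreateFileA", "WinExec", "ExitProcess"]
    print(resolve_hash(0x0E8AFE98, exports))
```

The shellcode does the same thing in reverse: it carries only the 32-bit constants, so the function names never appear as strings in the payload.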

Finally the shellcode makes a call to GetTempPathA [5B8ACA33h], then loads the urlmon library using LoadLibraryA [0EC0E4E8Eh], downloads a file to (you guessed it) the Temp folder using URLDownloadToFile [702F1A36h], then executes the file with WinExec [0E8AFE98h] and kills the running process. You can resolve any hash value found in shellcode using a script from FireEye. Awesome, but now for some bad news. The shellcode from the August exploits is now packed. That's bad news not because it is hard to analyze, but because it shows the attack is evolving. Now let's take a look at another anti-reversing technique used by Pat Casey.

Context is key

This is the most important part of the article, as it adds another dimension of difficulty to malware analysis. (As if manual unpacking wasn't frustrating enough.) This dimension is context. Being new to the reverse engineering / malware analysis scene, I'm not sure if this is entirely new, but I haven't seen or heard it addressed. Embedding context into an attack vector lets the attack succeed against real victims while crippling analysis. The reason is that malware samples are not stored with context information. Sites like Clean-MX will deliver malware properties, but the context in which the malware was obtained or first observed is long gone. Sites that promote analyst contribution typically begin with a set of hashes, a date, and a file. The entry is then populated with properties, but no matter how much analyst time is spent, context cannot be recovered.

The set of CVE-2010-0188 exploits took advantage of this lack of context to deter would-be analysts from understanding the malware's intent. Rewinding a bit from the malicious PDF brings us to the delivery. As I mentioned before, malvertising redirects unsuspecting victims to a malicious host. The redirection puts the victim at the mercy of a JavaScript probe for Internet Explorer, followed by an attack on the HCP protocol. It will then deliver PDFs, Flash, or Jars where applicable. The goal of the malware author is to have one of these attacks succeed. Enter the malware analyst.

Figure 1: A URL can be seen at the bottom of the shellcode.

The analyst receives either the JavaScript, PDF, Flash, or Jar and begins their work. (I worked on the PDF.) After learning a bit about memory views in Olly :) the analyst defeats some obfuscation and observes an HTTP request. Running the PDF in a vulnerable version of Acrobat Reader will cause the application to crash and the shellcode to download and execute more code from a specific URI. Previously I was racing to extract this URI from newly registered domains (containing fresh URIs) and download the executable to analyse. Every time I tried the download, it would return simple HTML that used a "meta-refresh" to redirect to Google. Eventually I observed a download in a controlled environment and was sure to mimic every detail of the HTTP GET request. I rushed to my sacrificial honeypot machine to start the download... still no malware, even with a 20 minute delta.

The solution lay in context.

It popped into my mind that perhaps the availability of the executable was triggered by the download of a PDF. Wrong, but close. After a few frustrating minutes I remembered the exploits from June using an HTTP POST to tell the server what vulnerability to try. Now in August the probing JavaScript contained the values of each malicious file. (I assume the author wanted a smaller footprint by eliminating POST traffic.) What if the POST served as a 'registration' of a victim? Once the server sees a POST, it knows it has an infection candidate. Here in August the same 'registration' was taking place by recording a secondary request (for the PDF/Jar/Flash object). Bingo! The executable download triggered by the shellcode only works if the host (IP, let's say) first visits the probing page and downloads a respective malicious object.
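The gating behavior inferred above can be modeled roughly as follows. This is a guess at the server's logic based only on what was observed from the outside, not recovered server code. The class, its method names, the returned strings, and the choice to key on the client IP are all assumptions made for illustration.

```python
# Hypothetical model of the inferred server-side gating.
# Nothing here is recovered code; all names are illustrative assumptions.

DECOY = "meta-refresh to Google"
PAYLOAD = "executable"


class GatedPayloadServer:
    """Serve the executable only to clients that 'registered' first."""

    def __init__(self):
        self.saw_probe = set()    # IPs that loaded the probing JavaScript page
        self.registered = set()   # IPs that then fetched a PDF/Jar/Flash object

    def get_probe_page(self, ip: str) -> str:
        self.saw_probe.add(ip)
        return "probing JavaScript"

    def get_exploit_object(self, ip: str) -> str:
        # Assumption: the object request only counts after the probe page.
        if ip not in self.saw_probe:
            return DECOY
        self.registered.add(ip)
        return "PDF/Jar/Flash object"

    def get_executable(self, ip: str) -> str:
        return PAYLOAD if ip in self.registered else DECOY


server = GatedPayloadServer()

# An analyst who replays only the shellcode's GET request gets the decoy.
print(server.get_executable("203.0.113.7"))

# A victim who walks the whole chain gets the payload.
server.get_probe_page("198.51.100.9")
server.get_exploit_object("198.51.100.9")
print(server.get_executable("198.51.100.9"))
```

Under this model, a perfectly replayed GET request still fails, because the state that unlocks the download lives on the server and not in the request.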

Being a Newbie Analyst is Frustrating

Figure 2: Download File in IDA Pro

My vigor sank when I opened the download and saw it was packed. I was expecting it, since the shellcode also changed this month to include obfuscation. Still, I had wanted to try decompiling, since I like C much more than x86 assembly. Too bad. PEiD also gave a 'Packed' guess based on high entropy. The entry point in the PE header is 1111, which contains a bit of code (Figure 2) but immediately trickles into User32.DLL by calling EnumDisplayMonitors at 1147, which in turn calls 114E. This is non-obvious in IDA, but a few steps in Olly 1.10 do the trick. This moves 7FFE0300 into EDX (ntdll.KiFastSystemCall) with a stack of:

  1. 12FEEC) Return to malware.<EP>+35 from 116C
  2. 12FEF0) 0
  3. 12FEF4) 0
  4. 12FEF8) malware.114E
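As an aside on the 'Packed' guess mentioned earlier: an entropy check of this kind can be sketched as below. This is a minimal sketch of the general idea, not PEiD's actual heuristic, which is not shown in this article. The 7.0 bits-per-byte threshold and the function names are rule-of-thumb assumptions.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Flag data as possibly packed. The threshold is an assumed rule of thumb."""
    return shannon_entropy(data) >= threshold


if __name__ == "__main__":
    uniform = bytes(range(256)) * 4   # every byte value equally likely
    padding = b"\x00" * 1024          # a single repeated byte
    print(shannon_entropy(uniform), looks_packed(uniform))
    print(shannon_entropy(padding), looks_packed(padding))
```

Compressed or encrypted sections sit close to 8 bits per byte, while ordinary code and data usually score noticeably lower, which is why high entropy reads as a hint of packing.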

Stepping into ntdll.KiFastSystemCall leads to a SYSENTER followed shortly by an INT 2B at USER32.7E4194A8. INT 2B leaves ring 3, and thus Olly cannot continue. Instead I added a breakpoint before ntdll.KiFastSystemCall returned. This allowed me to play the application and observe the stack after each system call. I could also watch memory allocations. I noticed DLL contents loaded into the memory map (rpcss.dll, wbemprox.dll, wbemsvc.dll, etc). I also saw a meaningful registry Open and Write to HKCU\Software\Microsoft\Windows\CurrentVersion\Run, with a value whose data was set to the path of a copy of the application stored in C:\Documents and Settings\<Account Name>\Local Settings\Application Data\<random folder name>\<random file name>.exe. Finally it terminates and launches itself from the new location. Observe it in all its majesty:

Then what happens? Well, it waits, and waits, and waits. I have not investigated the trigger completely, as I am still playing around with memory captures (using the SysInternals suite). But it seems that browsing the net will cause a DNS lookup for hxxp:// ( then a GET request to /check?pgid=5, followed by a GET request to /percer.php?login=NTcuNQ== (perhaps a Base64 encoding of 57.5). Searching Clean-MX for AS2588 (which belongs to) shows similar hostnames such as antispycraft with similar logins, NzUuNDI= being 75.42. Is this another step in context correlation? Either way, as soon as this happens the madness begins.
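The Base64 guess about the login parameter is easy to check. A minimal sketch using Python's standard base64 module (the helper name decode_login is made up for this example):

```python
import base64


def decode_login(value: str) -> str:
    """Decode a Base64 'login' query value to ASCII text."""
    return base64.b64decode(value).decode("ascii")


for login in ("NTcuNQ==", "NzUuNDI="):
    print(login, "->", decode_login(login))
```

Both values decode to short decimal-looking strings (57.5 and 75.42). What those numbers mean to the server is still unknown.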

Figure 3: Snapshot of network traffic after unknown trigger.

It seems that this PDF exploit was programmed to infect the host with Koobface. The PE that it downloads is none other than a packed rogue antivirus called "Security Suite". After the trigger, the malware begins alerting the user that the system is infected and prevents new executables from launching.

In the end, much fun was had and much more learning is in store. I'm sure there are more malware authors out there like the one behind this Pat Casey kit; over the last month I've been searching for them. If I find anything interesting I'll be sure to report. :)