You can read part one of this series here. The last post, “Mutex Analysis: The Canary in the Coal Mine,” started off showing how you can use mutexes to discover malware that is difficult to locate using more traditional methods and tools. We used a live compromised system for the example and the post came to a relatively abrupt end when it seemed that we stumbled onto a new/unknown type of malware – or at least one that does not seem to have any public exposure or analysis. This post will be “part 2” of our analysis.
Some Logic Behind Memory Crash Dump Analysis
In the last post we forced a crash dump of the process (services.exe) that seemed to contain injected malware, based on the mutex analysis steps proposed. I mentioned examining the memory heaps as a first step in scenarios like this. Before picking up exactly where the first post left off, let’s discuss this point a bit.
The first of the three main types of memory segments is the “code” (aka: “text”) segment. Simplistically, this will be the program itself and its associated DLLs, taken from the file system and realigned for memory addressing. In normal cases, these segments are relatively static. At the other end of the spectrum are “stack” segments. These segments are extremely volatile since they temporarily hold variables being manipulated inside specific functions as a program executes. Between those two types of segments (from a volatility perspective) are heap segments. The heap is an internal memory pool from which memory is allocated dynamically as needed. Heap blocks are allocated and freed in an arbitrary order, and the pattern of allocation and the size of the blocks are not known until run time. A program typically uses the heap for many different purposes, since it functions as shared memory that is modified at runtime.
Based on the above [over-simplified] background information, it stands to reason that if something is maliciously injected or otherwise hidden in another process’s memory space, it (or traces of it) can probably be found in a heap block.
Again, the two tools we’ll use in this article for memory and dump analysis will be:
WinDbg – http://msdn.microsoft.com/en-us/windows/hardware/gg463009
PEBrowse Crash-Dump Analyzer – http://www.smidgeonsoft.com/download/PEBrowseDmp.zip
While using PEBrowse, we’ll typically have many sub-windows open in the right-side pane, as the following screenshot illustrates.
However, all PEBrowse screenshots in this article (after this one) will only show the window parts most relevant to our discussion.
On the other hand, WinDbg is a hybrid CLI/GUI application, as shown in the following screenshot. However, rather than show screenshots, I will likely be pasting text output into this article to make formatting easier.
Simple Initial “Scanning” of Our Crash Dump Using PEBrowse
Loading the dump of the process we suspect of containing injected malware (based on the mutex analysis we did in the previous post) into PEBrowse shows it contains 11 heap segments.
The reason I like PEBrowse is that it makes it very easy to quickly scan the heaps visually for almost anything suspicious that catches your eye. Suspicious things tend to include URLs and IP addresses, names of other executables, and things like that. Much worse things can be found in the heap, as I have a feeling we’ll see in this example.
Note: Like the previous post, I’m doing this analysis (for the first time) concurrently while writing this. I really don’t know what to expect at this point!
As I mentioned in the previous post, we hit something very suspicious in the first heap segment I dumped. It started off relatively normal (gibberish), as you can see below:
However, casually scanning the contents of this heap brings us to the following:
The above screenshot is a perfect example of the types of things to consider suspicious when scanning these segments of memory – especially taking into consideration the legitimate purpose and job of the process you’re examining (which in this case, probably should not contain strings like that). Scanning a little further, we finally encountered the following:
Dumping this section of memory (I’m sure we’ll get into this later when we find injected malware, and I want to save the discussion of carving until then) and formatting it gives us the following:
At this point, we’d normally take a few field names like “injects_begin” or “block_by_crc_begin” and use Google to discover what type of malware this is. However, in this case, Google returned no results for these names (or for many others I tried later), indicating that we might have stumbled across a new family of malware. Our task at this point is to discover if this really is “new” (by “new” we mean not previously exposed or publicly reversed, although even if “new,” this has likely been in the wild for quite some time). However, before we get to that point we need to find the malware first!
Scanning other heap segments shows similar configuration-type information, but there’s another interesting tactic to use when visually skimming memory dumps this way: start at the “top-most” segments shown in PEBrowse that aren’t resolved automatically. Those segments are shown in red below:
In the first segment (0x00010000) we have a very interesting find. At the start of this segment is another configuration file that seems malware-related:
Then, shortly after that configuration file ends, yet still within the same segment, is the following:
Ruh roh Shaggy, this can’t be good!
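The visual skimming described in this section can be partly automated. What follows is only a minimal sketch in Python, not something PEBrowse does for you: it assumes you have already saved a heap segment's raw bytes to a buffer, and the function name, the indicator patterns, and the sample data are all hypothetical choices for illustration.

```python
import re

# Runs of at least 6 printable ASCII characters, in the spirit of the classic "strings" tool.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")

# Hypothetical indicator patterns; adjust them to whatever looks out of place for the process.
INDICATORS = {
    "url": re.compile(rb"https?://[^\s\"'<>]+", re.I),
    "ipv4": re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "executable": re.compile(rb"[\w\-.]+\.(?:exe|dll|sys)\b", re.I),
}

def suspicious_strings(data, base=0):
    """Yield (address, kind, text) for every indicator match inside a printable run."""
    for run in ASCII_RUN.finditer(data):
        for kind, pattern in INDICATORS.items():
            for hit in pattern.finditer(run.group()):
                yield base + run.start() + hit.start(), kind, hit.group().decode("ascii")

# Made-up bytes standing in for a dumped heap segment.
sample = (b"\x00\x01garbage\x00http://example.invalid/gate.php\x00"
          b"\xff10.1.2.3:8080\x00svchost.exe\x00")
for addr, kind, text in suspicious_strings(sample, base=0x00150000):
    print(f"{addr:08x}  {kind:<10} {text}")
```

A filter like this only narrows down where to look. Whether a string is actually suspicious still depends on the legitimate job of the process you are examining.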
Switching to WinDbg
Ok, so at this point we know the following:
Based on network traffic, we know the host was definitely compromised and exhibiting a range of C&C type of activity.
Scanning the system with a couple of manual and automated tools showed nothing obviously wrong with the system.
The process (services.exe) was running on the live system pointing to a very bizarre mutex that couldn’t be legitimized, usually an indication of process injection of some type.
We forced a dump of that process using Process Explorer and found configuration files that look malware related and actually match some of the “tells” seen in network activity (based on host names seen).
We found memory segments containing both malware configuration files and what appear to be executable files – a strong indication of injection.
Now we’ll use WinDbg to find likely bad PE files inside this memory dump. In this case, “bad” means “injected.” The trick is to find the injected ones. Of course there will be quite a few legitimate PE files in this dump associated with the main process. They can be enumerated in WinDbg using the “lm” command, as shown here:
Everything listed above is a PE file WinDbg knows about, but we want to find the ones that WinDbg doesn’t know about. First, we simply search the dump for what appears to be the start of PE files. We do so using the “s” (search) command and passing it the first 4 bytes of a PE file as the search pattern. Keep in mind this pattern is the first 4 bytes of most PE files, but not all – and especially not all malware PE files – but it’s generally good. The full command is shown below:
Let’s look at the first result as an example to dissect:
00018000 – The starting offset at which the pattern was found.
00905a4d – The first four bytes in little-endian order (our search pattern, also given in little-endian order).
00000003 00000004 0000ffff – The 12 bytes following the pattern we searched for.
So as an example, if we want to enumerate some more information about one of those PE files, we can use the command “!lmi” with the starting address of the PE image. The third PE found in the list above starts at 0x00230000, so the command and results for that one would look like this:
But how about for that first one found in the list?
Hrmmm… Interesting… WinDbg knows nothing about this one. If you remember though, we already know something about this one. It’s the same one we found sharing a segment with that malware-looking configuration file (shown again below, this time with the address highlighted):
Well I think we found our first carving example! Ok… Here’s where things get a little thorny. 🙂 So we know we need to carve a file from this dump that starts at offset 0x00018000, but how do we find the end? PE files contain a field called SizeOfImage in the Optional Header that gives the size of the PE in memory. THIS APPLIES TO AN IN-MEMORY FILE ONLY! (The process of calculating its size on disk, in network traffic, or anywhere else is such a convoluted process that Kevin Douglas needs to explain it to me every other month or so.) 😉
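As a rough illustration of the two ideas so far (searching for the 4-byte pattern and reading SizeOfImage), here is a minimal Python sketch. It is not what WinDbg or CFF Explorer do internally. The function names are mine, and the sketch assumes the relevant memory is available as one flat byte buffer whose first byte corresponds to address `base`, which is a simplification of a real dump file.

```python
import struct

# 'M', 'Z', 0x90, 0x00 as they sit in memory; read as a little-endian dword this is 0x00905a4d.
MZ_PATTERN = bytes.fromhex("4d5a9000")

def find_mz_candidates(dump, base=0):
    """Return every address at which the 4-byte pattern occurs."""
    hits, pos = [], dump.find(MZ_PATTERN)
    while pos != -1:
        hits.append(base + pos)
        pos = dump.find(MZ_PATTERN, pos + 1)
    return hits

def size_of_image(image):
    """Return SizeOfImage from the Optional Header, or None if the headers look invalid."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)  # offset of the "PE\0\0" signature
    # Signature (4 bytes) + COFF file header (20 bytes) + 56 bytes into the Optional Header.
    field = e_lfanew + 4 + 20 + 56
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00" or len(image) < field + 4:
        return None
    return struct.unpack_from("<I", image, field)[0]

def carve_range(dump, base, start):
    """Return (start, start + SizeOfImage) for the candidate at `start`, or None."""
    size = size_of_image(dump[start - base:])
    return None if size is None else (start, start + size)

# The search pattern, shown the way a dword display prints it.
print(f"{struct.unpack('<I', MZ_PATTERN)[0]:08x}")
```

SizeOfImage sits at the same offset (56) in both the 32-bit and 64-bit Optional Header layouts, which is why the sketch does not check the header's magic value. A candidate that returns None here is a reasonable hint that the hit was a false positive rather than a real image.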
I’m sure there’s a better way to do what I’m getting ready to show, but… This is the process I tend to go through. First we’re going to take a total shot in the dark by dumping the first 10,000 bytes of the file. Basically, we’re going to carve enough of the file to dissect it deep enough to get that actual SizeOfImage value, then go back and carve the actual correct size.
We know the starting offset is 0x00018000. The ending offset for this first shot at carving will be:
10,000 decimal = 0x2710; 0x18000 + 0x2710 = ending offset 0x1A710
So, the command to dump this file will be:
And when issuing the command we see:
Next we’ll open the file in CFF Explorer and let it parse out the headers and tell us the actual size of the PE:
CFF Explorer is telling us the actual PE size is 5 KB. That is awfully small for a PE – malware or otherwise (malware tends to be smaller than legitimate PE files, generally speaking). My initial thought is this is not an actual PE file, but just a false positive left in the slack space of this memory segment, but… Not only can CFF Explorer correctly parse the imports for this PE, but the import list looks like those we see in packed malware:
Because of that, we’ll play along and dump a “correct” copy of this PE. But I’m still suspicious of this file so let’s check out the second PE found that WinDbg doesn’t know about (at starting offset 0x20000):
Well, as we see above, I picked some random number as an ending offset, and just happened to dump the file perfectly, down to the exact byte value. Time to go buy a lottery ticket, since that’ll never happen again. The bottom red box in the image above also shows that PE metadata could be correctly parsed from this file, indicating it is indeed a valid file. Not only that, but based on the metadata alone, this also looks very much like malware. Now, to go back and correctly dump the first file:
So at this point we have two files and a couple of questions left. First, are these valid files that we carved – can we do analysis on them? Loading them into a tool like IDA Pro shows they can be correctly and fully disassembled and parsed, so the answer to that question is YES – we carved valid PE files and we did it correctly:
Next, we need to find out if these files are known in the wild already. If you have the luxury of being able to use VirusTotal, there’s no quicker way to answer that question. Some people tend to get up-in-arms over the use of VirusTotal, but we need to be realistic here. Consider the following:
This is a scenario where non-attribution is not an issue. Many government agencies and other organizations involved in certain types of cases should not and cannot use ANY public service for analysis (VirusTotal or otherwise), but this is not the case here. As an alternative, opponents of VirusTotal suggest you should just search services like VirusTotal for the hash of the file to see if it has been scanned before. Candidly speaking, I think this is a silly alternative suggestion. Theoretically, because the same sample can be packed many different ways, the hash is likely to be different between campaigns for even the same sample. More importantly – the reality is that when you’re dealing with PE files that have been extracted from memory, the hash will almost always be different (read: the hash will be meaningless). When a PE is loaded into memory, sections are realigned and addresses are resolved at runtime for that particular system. Taking the same sample (same by MD5 hash), running it on three different systems, and extracting that PE from memory on those systems could give you three completely different hash sets.
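To make the point about hashes concrete, here is a deliberately tiny Python sketch. It does not load a real PE. The `toy_image` helper and the load addresses are invented for illustration, and SHA-256 stands in for MD5 since any cryptographic hash behaves the same way here. The only thing it models is a single absolute address being fixed up differently depending on where the image landed in memory.

```python
import hashlib
import struct

def toy_image(load_base):
    """Build a few bytes of pretend code containing one absolute address fixed up at load time."""
    code = b"\x55\x8b\xec\xa1"                        # arbitrary bytes standing in for instructions
    pointer = struct.pack("<I", load_base + 0x3000)   # value depends on where the image was loaded
    return code + pointer + b"\xc3"

# Three hypothetical load addresses, as if the same sample ran on three systems.
for base in (0x00400000, 0x00B20000, 0x01230000):
    digest = hashlib.sha256(toy_image(base)).hexdigest()
    print(f"{base:08x}  {digest[:16]}...")
```

One changed pointer is enough to produce an unrelated digest, and a real in-memory image differs from its on-disk form in far more than one place.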
The topic of using services like VirusTotal is a religious debate, and depending on the situation my opinion on the topic changes, but in this case it will unquestionably save us a considerable amount of time and focus our efforts intelligently.
Well, we see two things above. First, it has a 35% detection rate, which is not bad. Second, most of the alerts (of the 15 that generated alerts) are generic, so we’re not really sure what this threat actually is. The screenshot is truncated, but there were a couple of hits for TDSS, which wouldn’t surprise me. So does this mean we really didn’t find a new family of malware? Well, we still have a lot of evidence indicating we’re dealing with some type of malware that is not THAT well-known, based on the fact we cannot find any references to most of the configuration-related strings we examined earlier. This is a perfect example of how VirusTotal can save you considerable time. At this point, I’d pencil in these extracted samples as something akin to TDSS and use my time more valuably by continuing to examine the process dump. Doing so yields more interesting results. The next non-resolved PE file is found in a memory segment that also contains a reference to the suspicious mutex we found in “part one” of this series, as well as another configuration file in the “unknown” format. Extracting this executable gives us the following:
This file can be considered either an “unknown malware sample” or a legitimate file that’s not malware (which is why none of the AV engines alerted to it). However, when statically analyzing the PE file (using a tool we built for internal R&D), we see it contains several characteristics common to malicious executables:
In this case, considering the amount of evidence we have at this point, I think it’s more likely we just extracted a publicly “new” type of malware. However, we have to be fair and say that while the list of evidence supporting the notion this is relatively new is fairly large, it’s all circumstantial at this point! Proving that the sample we just extracted is actually malicious, and showing that it’s something new, will be covered in a following part of this series, as this process will be quite involved and will draw heavily on both network traffic analysis and binary reversing and profiling. In other words, there is much more to come in this series! Posted with permission from the NetWitness Blog.