Precious Guidance and Reflection were both 3-star rated forensics challenges in the HTB Cyber Apocalypse CTF, and although I didn’t solve Reflection before the end of the CTF, I think they both warranted solutions. Neither was particularly long, but they were difficult to fully understand as each had their own kinks thrown into the challenge.
Precious Guidance involved malware analysis of SatelliteGuidance.vbs, which I later found out was based on the Ursnif dropper. I’ll have to reevaluate my usual methodology of cleaning out anything I think is junk by using echo statements and some deductive reasoning to find out that it runs a .NET dll. I can use the function that writes and deletes the DLL to grab it, and then decompile with dnSpy to find the flag.
Reflection involved some serious Volatility work in analyzing a memory dump from a machine. It’s pretty easy to find out that a Powershell script was loaded into memory with iex to download and reflectively inject a DLL into a notepad.exe process, but locating the DLL is a little tougher. After dumping the memory of notepad, I’ll reassemble the DLL, decompile, and find the flag in a powershell command.
Precious Guidance
Description
Miyuki has come across what seems to be a suspicious process running on one of her spaceship's navigation systems. After investigating the origin of this process, it seems to have been initiated by a script called "SatelliteGuidance.vbs". Eventually, one of your engineers informs her that she found this file in the spaceship's Intergalactic Inbox and thought it was an interactive guide for the ship's satellite operations. She tried to run the file but nothing happened. You and Miyuki start analysing it and notice you don't understand its code... it is obfuscated! What could it be and who could be behind its creation? Use your skills to uncover the truth behind the obfuscation layers.
Initial Analysis
We can open up the zip file and see a single SatelliteGuidance.vbs file. I’ll open it up with VS Code for some syntax highlighting, and we see the beast that we’re going to deal with.
This is ~700 lines long.
One of the major downsides to doing written solutions is that it’s really hard to highlight the methodology, mistakes, backtracking, etc. related to malware analysis and reverse engineering. I spent a long time on this challenge because of how massive it was and things I only noticed later. So although the writeup is going to look very streamlined, it definitely was not easy to find the solution immediately.
With that out of the way, the first thing I notice as I walk through this file is the repeated use of execute(polymerase(ARRAY)).
The execute() function seems to be native to VBScript, but polymerase() is defined somewhere in the middle of the file. When I first looked at this, I completely forgot to scrutinize polymerase() because I immediately jumped to printing out the output of the function, as opposed to trying to analyze it.
At this point, you might be wondering why these random words are strewn across the file. This is not some weird guessy CTF thing; it’s actually a technique used by threat actors to manipulate the entropy of the file (i.e. how random a file looks) to make it seem much more normal. From a forensics perspective, higher entropy typically indicates that a file is packed, compressed, or encrypted, which can give us insights into the nature of the file. In this case, the words are likely there for evasion, against AV/EDR that might flag high entropy.
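To make the entropy point concrete, here is a minimal sketch of a Shannon entropy calculation over raw bytes. The function name and the sample inputs are my own illustration, not anything taken from the challenge file.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# English-like filler scores well below evenly distributed bytes,
# which is why padding a script with dictionary words lowers its score.
filler = b"the quick brown fox jumps over the lazy dog " * 50
uniform = bytes(range(256)) * 10
print(f"filler:  {shannon_entropy(filler):.2f} bits/byte")
print(f"uniform: {shannon_entropy(uniform):.2f} bits/byte")
```

Packed or encrypted payloads tend to sit close to 8 bits per byte, so a scanner that thresholds on this number can be nudged by mixing in plain words.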
Coming back to the main point, the polymerase function, from a cursory glance, appears to iterate over the input object, likely an array, and then builds up a string (KkF) based on what’s in the array. We’ll come back to this in a little bit.
Discovering Stage 2
Rather than execute the script, we can attempt to dump out the objects that are being made with polymerase and passed to execute by replacing all execute calls with wscript.echo, which is basically a print statement in VBScript. I can use Find and Replace to do this for me. Once we do that, we can run the script again and discover a second stage.
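I did this with Find and Replace in the editor, but the same swap can be scripted. The following is a rough sketch that assumes every call is written in the execute(...) form with parentheses; the function name and sample line are made up for illustration.

```python
import re

# Matches the call name "execute" (any casing) followed by "(".
# The word boundary keeps names such as "myexecute(" untouched.
EXECUTE_CALL = re.compile(r"\bexecute\s*\(", re.IGNORECASE)


def defang(vbs_source: str) -> str:
    """Swap every execute( call for WScript.Echo( so payloads are printed, not run."""
    return EXECUTE_CALL.sub("WScript.Echo(", vbs_source)


sample = "execute(polymerase(Array(1, 2, 3)))"
print(defang(sample))
```

Always read the rewritten file before running it: anything outside an execute() call still runs as normal, so this only neutralizes the staged payloads.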
Once again, the file is too long to include in this writeup, but the short and simple of it is that a variety of functions are defined. More importantly, these are the names of the functions that are at the bottom of the original file.
I got caught in limbo here, struggling to really unpack everything that was going on. There are a lot of functions, a few of which are intended to just exit the program or delete the file, but I’ll give a brief synopsis of what each one does here.
femoral - Seems to be some kind of anti-sandbox function that checks the number of cores on the system
Kim - Some kind of time of day check, likely to stop debugger-based analysis
RKKOG - Checking the amount of RAM, again, another anti-sandbox technique
MWKz - Checking the Downloads folder for a specific file, 76795.txt (found using Procmon), just more anti-analysis.
LBUd - Checking for a bunch of processes like frida.exe or python.exe which can be used for analysis purposes
RCtu - Checking Disk size, yes, that’s right, more anti-analysis
hTGtM - Literally just a bait error message that sleeps for a while. This doesn’t do anything.
zWY - Sleep function to bypass EDR
DRYX - Creates an adobe.url shortcut in a temporary directory. This might just be another whole anti-analysis technique to check for internet connection, but it barely seemed to be used.
pooch - Writes the textual.m3u file using an “Adob.stream”, presumably the ADODB.Stream COM object
serenade - Runs the textual.m3u (definitely a DLL) with rundll32. There’s something else with calc.exe in a conditional statement, but it might be more anti-analysis
A coherent function was also called by many of these to try and delete the file.
There was also a SECRET that used the polymerase function, but my attempts to get it to work as is only resulted in errors for a while.
Understanding polymerase()
So clearly, there’s a lot of anti-analysis going on, which is very typical of a threat actor who has higher skill TTPs. However, since this malware is scripted, it’s a lot easier to choose what we want to execute. Now that we have a much better understanding of what these functions are doing, I really want to get to the bottom of this polymerase() function. I’ll copy the SECRET stuff and polymerase out to another file and see what I get back when I echo.
Running it, we get nothing back.
However, if I append this to the end of the original script, we get some lore.
C:\Users\sreisz\Desktop\precious_guidance
λ cscript SatelliteGuidance.vbs
...trim...
Dearest Miyuki,
If you are reading this message, it means that unfortunate events have led to our untimely death. Since Draeger's leadership, we have feared that a day might come when something horrible happens to us. For that reason, we wrote this obfuscated malware to be spawned automatically on the day of the Fifth Andromeda Alignment. We hoped it would spread via Intergalactic Communications and reach you one day. Having observed and admired your investigative skills since you started your training, we knew that you were the only one with the mindset, patience, and persistence required to receive this highly valuable message. We have been progressing a highly detailed map of Outer-galactic pathways for ultra-speed travel, a project initiated by your ancestors eons ago. These passages are hidden and unknown to anyone besides few of our trusted family and allies. All your life you have been destined to receive the key to this knowledge, and since we are not there to give it to you, this is our way of doing so. Dive deeper one more time, to retrieve the final key to this database. Whatever difficulties you face, we have always been proud and believe in you.
All our love,
Your Parents and Guardians.
Rip Miyuki’s parents. Weird way to communicate that but go off.
All of those random arrays throughout the script must actually come together to encode data!
Grabbing the Flag
The functions that we discussed earlier simply do a bunch of anti-analysis techniques and then run a DLL that is written to disk (and removed) using the polymerase function. If we can write that DLL to disk ourselves, we can analyze it further and understand what the ultimate goal of this malware is. I’m going to modify the hNZCG function to write the DLL to our current directory, and bypass all of the anti-analysis functions.
If I run this file, we get all of the output of our previous runs (because I didn’t really clean those up), but we get a new file.
Since the DLL is a .NET assembly, we can use dnSpy to decompile it, and we find that it is a backdoor. We also find that the file was originally compiled as intcomm.dll.
Notice that the password is built as what seems to be a hexstring. If we decode it using CyberChef (or your preferred method of unhex-ing data), we get the flag.
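If you would rather stay on the command line than open CyberChef, unhex-ing is a one-liner in Python. This is a minimal sketch; the hex string below is a placeholder I encoded myself, not the value from the DLL.

```python
def unhex(hex_string: str) -> str:
    """Decode a hex string into ASCII text, ignoring surrounding whitespace."""
    return bytes.fromhex(hex_string.strip()).decode("ascii")


# Placeholder input for illustration only.
print(unhex("4854427b6578616d706c657d"))  # prints HTB{example}
```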
Flag: HTB{TrAvEl_GuIdAncE_AftEr_LifE}
Reflection
Description
You and Miyuki have succeeded in dis-empowering Draeger's army in every possible way. Stopped their fuel-supply plan, arrested their ransomware gang, prevented massive phishing campaigns and understood their tactics and techniques in depth. Now it is the time for the final blow. The final preparations are completed. Everyone is in their stations waiting for the signal. This mission can only be successful if you use the element of surprise. Thus, the signal must remain a secret until the end of the operation. During some last-minute checks you notice some weird behaviour in Miyuki's PC. You must find out if someone managed to gain access to her PC before it's too late. If so, the signal must change. Time is limited and there is no room for errors. Download: http://134.209.177.115/forensics/forensics_reflection.zip
Initial Analysis
Unzipping the folder, we see one very large memory.raw file, which is very clearly a memory dump of the machine. I’ve already showcased the basics of Volatility, the de facto open source tool for memory analysis, during the HTB Cyber Santa CTF, so you can go read that if you’re not familiar with the tool.
The most important thing, that I’ve found, doing memory forensics and just DFIR as a whole, is that you need to be aware of what you know, and what you want to know (I’m pretty sure I stole this from 0xdf but it’s true). The inherent difficulty with memory forensics is being limited in what you can see, and figuring out how to piece those things together to assemble a timeline and idea of what TTPs might be at play, what might have been compromised, and to what extent. Lecturing aside, let’s do some initial checks.
I also recently discovered carlospolop’s autoVolatility script which can help automate the process in the sense that you run a lot of plugins and save the output. It’s kind of like the AutoRecon of memory forensics where it’s not great with context, but it can help save some time.
We always want to start off with an imageinfo scan (or alternatively, kdbgscan if you really want to get it right).
The first profile suggested is usually the right one to go with. From there, I like to run various ps* scans, like psscan and pstree, to identify processes, and also netscan to check for any anomalous network activity.
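For reference, the triage pass described here can be written down as a small command builder. This is a sketch assuming Volatility 2 is installed as vol.py; the profile value is a placeholder to be filled in from the imageinfo output, and the helper name is my own.

```python
import shlex


def vol_cmd(image, plugin, profile=None, extra=()):
    """Build a Volatility 2 command line as a list of arguments."""
    cmd = ["vol.py", "-f", image]
    if profile:
        cmd.append(f"--profile={profile}")
    cmd.append(plugin)
    cmd.extend(extra)
    return cmd


IMAGE = "memory.raw"
PROFILE = "<profile from imageinfo>"  # placeholder, not a real profile name

# imageinfo runs without a profile; the remaining plugins need one.
triage = [vol_cmd(IMAGE, "imageinfo")]
triage += [vol_cmd(IMAGE, p, PROFILE) for p in ("psscan", "pstree", "netscan")]
for cmd in triage:
    print(shlex.join(cmd))
```

Each list can be handed to subprocess.run with its output redirected to a file, which is roughly what scripts like autoVolatility do in bulk.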
Normally I truncate the outputs of these, but I thought it would be helpful to see the whole thing to get a better understanding of methodology. The currently running processes seem very typical of a normal Windows system. By that, I mean, there are no processes named something like svchost.exe or lsass.exe that are originating from processes other than those created by the operating system, there are no connections to weird IP addresses that are abnormal for a Windows machine, etc. Understanding what the machine normally looks like, and should look like, is key for investigative forensics.
Malware authors and threat actors will frequently try and blend in by hiding in plain sight. Maybe their Meterpreter shell is called svchost.exe to try and get you to glance over it, even though it has a weird parent PID. Maybe there’s some port open like 8443, which is normally used as an alternate HTTPS port, but you realize the machine isn’t a webserver.
In our case, it’s not really a case of hiding in plain sight, we mostly have to focus on the powershell.exe (PID 3424) and notepad.exe (PID 3244) as these are two processes that don’t have to be there, but aren’t immediately malicious.
Identifying “The Bad”
Volatility has a couple of plugins to help us recover information from Powershell and Notepad. The notepad plugin can be used to show what information might be stored in active notepad processes, but in this case, we don’t get anything back. Keep this in the back of your mind though.
We can use the consoles and cmdline plugins to return the output of anything that was in a terminal, and what command line arguments were run, respectively. Specifically honing in on stuff related to our processes of interest, we see the following:
It appears that Miyuki might have run a Powershell script called update.ps1. We can try to locate this file if it’s on disk using the filescan plugin. We can then take the offset that is returned to dump the file out to our machine. It won’t always look perfect, but it’s usually legible.
Well, well, well. It appears we’ve finally found the bad thing. windowsliveupdater.com is not a real Microsoft domain, it’s actually owned by one of the HTB staff, maklaris, and simply redirects to a Rick Roll. But, here it’s been used to host a sysdriver.ps1 file which has probably been executed in memory, and the cmdlet Invoke-ReflectivePEInjection has been run. Some googling shows that it’s part of the PowerSploit suite of tools.
Aside: What is Reflective DLL Injection?
Warning: Windows Internals content ahead.
Windows is complicated, so there are a lot of TTPs out there that do very seemingly complex things if you don’t know the OS very well. I’m going to explain this at a high-level, but I encourage you to do more reading if you want to learn more about some things that are super important to the DFIR and red teaming space.
Recall that DLL stands for “Dynamic Link Library”, and functions in a similar way to a shared object (.so) file in Linux, that is, it contains a library of functions for a .exe file to import and pull from. For a DLL to be used, it must be parsed by a loader, which maps it into the process and calls its entry point, DllMain, which differs from the main function of a normal executable file.
DLL Injection (not reflective), is where we take a DLL on-disk, and inject it into the running memory of another process. Here, we need our DLL to be parsed by the loader, so we allocate memory in the target process and drop in the path to the DLL we want to load, so that we can eventually call something like CreateRemoteThread to execute some function in the DLL while that other process is still running. Reflective DLL injection differs in that we are loading the DLL from memory, where it is not on the disk. We make use of the ReflectiveLoader function and do some memory gymnastics to get similar results, except there is very little evidence of the DLL having ever entered the system.
For further reading/viewing, you should check out ired.team and Sektor7.
From our perspective, the analyst’s perspective, reflective DLL injection means that we are not going to find this malicious DLL on disk. We can run all of the scans we want, we’re just not going to find it. The best resource I found to sum up this idea is the set of slides from this BlackHat conference talk about detecting it. The main point I want to reiterate from this talk is what they cite as the “Rootkit Paradox”:
“In Essence: While the rootkit tries to hide its existence, in order to do nasty stuff, its code must (at least once) be locatable and executable” (paraphrasing this paper)
While we’re not dealing with a rootkit, the point still stands. Although reflective DLL injection can be pretty evasive, there is some thread we have to be able to pull so that the OS knows that it’s there. Otherwise, it’s dead space.
The Search Continues - Failure
This is the point where I got stuck and couldn’t really progress. Normally, I’d use something like the malfind plugin to locate the DLL, but I wasn’t seeing the MZ magic bytes (at the top of every PE file). Knowing that the DLL was injected into notepad, I can dump the process executable and memory using procdump and memdump.
I’ll start by examining the memory dump using xxd. We know, based on the update.ps1 script, that the file is likely called winmgr.dll or something similar. However, we should also keep an open mind in case the name somehow changes in the middle because of decoupling strategies. I can pipe the output of xxd to less, and I’ll find this after searching for the winmgr string.
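Paging through xxd in less works, but the same search can be scripted. The sketch below looks for a string both as ASCII and as UTF-16LE, since Windows stores many strings wide; the helper names are mine, and the synthetic buffer only stands in for the real dump.

```python
def find_all(data: bytes, needle: bytes):
    """Yield every offset at which needle occurs in data (overlaps included)."""
    pos = data.find(needle)
    while pos != -1:
        yield pos
        pos = data.find(needle, pos + 1)


def search_dump(data: bytes, text: str):
    """Return sorted offsets of text encoded as ASCII and as UTF-16LE."""
    hits = set(find_all(data, text.encode("ascii")))
    hits.update(find_all(data, text.encode("utf-16-le")))
    return sorted(hits)


# Synthetic stand-in for a memory dump.
fake = b"\x00" * 4 + b"winmgr" + b"\x00" * 3 + "winmgr".encode("utf-16-le")
for offset in search_dump(fake, "winmgr"):
    print(hex(offset))
```

For the real file, read the memdump output with open(path, "rb").read() (or mmap it if it is too large) and jump to the printed offsets in a hex viewer.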
Although the magic bytes still aren’t here, observe the various headers like .text and .data, both of which apply to PE files. When solving this during the CTF, I actually found this exact slice, but dismissed it because it didn’t look entirely like a DLL, but there are more reasons why this makes sense to be our target.
WinExec is typically used to execute commands, which doesn’t really make sense to be used in notepad like this, but does make sense for malware.
Originally, I dismissed this for being too small, but in fact, this is one of the reasons I should have looked at this further. It would make sense for a threat actor to keep their payload smaller to minimize the detection surface, and with how sparse this DLL is, things just seem more suspicious.
Another one of my many faults when looking at this during the CTF was not realizing that the DLL headers are part of this region; they’re not just random bytes related to the DLL. We can scroll up to find the relevant DOS header, but we’re just lacking the magic MZ bytes.
This really isn’t that big of an issue, all we have to do is add them back in. I’ll use this StackExchange post to help me write a Python script to extract the bytes, because this hexdump is large. I’ll calculate the beginning and end, and dial them in by comparing the beginning to a normal DLL so I can make sure something like ghidra can read it.
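The script itself is not reproduced here, so the following is only a sketch of the idea. It assumes the two magic bytes were blanked in place (so they are overwritten rather than prepended), and the start and end offsets are whatever you settle on from the hexdump. The second helper is the "compare it to a normal DLL" check in code form.

```python
import struct


def carve_pe(dump: bytes, start: int, end: int) -> bytes:
    """Copy dump[start:end] and write the MZ magic over its first two bytes."""
    if not 0 <= start < end <= len(dump):
        raise ValueError("offsets fall outside the dump")
    blob = bytearray(dump[start:end])
    blob[0:2] = b"MZ"  # assumption: the magic was zeroed, not removed
    return bytes(blob)


def looks_like_pe(blob: bytes) -> bool:
    """Check for MZ at offset 0 and the PE signature at e_lfanew."""
    if len(blob) < 0x40 or blob[:2] != b"MZ":
        return False
    # e_lfanew is a little-endian 32-bit offset stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", blob, 0x3C)
    return blob[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

If looks_like_pe returns False after carving, the start offset is probably off by a few bytes; the DOS stub text ("This program cannot be run in DOS mode") is a handy landmark for lining it up.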
If I check how my output.dll is doing, we see that it looks pretty normal.
Unlike the previous challenge, this one is not .NET, meaning it’ll probably be harder to recover the full original code. However, we can stick it in ghidra and hope all goes well.
Unfortunately, it doesn’t. Ghidra doesn’t like the fact that there are so many null bytes so the file just gets treated as a DLL with literally nothing in it. I tried to remove the null bytes too just until it “felt right”, but even that wasn’t working.
The Search Ends - Grabbing The Flag
After consulting some people afterward, I found that, because of the paging at play here, I should have been using the vaddump plugin to investigate the Powershell process. I won’t explain everything about it because this has already been dragged on long enough, but I’ll link you to this Andrea Fortuna cheatsheet highlighting some basic usage, and this blog explaining what VAD is. To keep it very, very simple, vaddump will show up cleaner because it specifically looks at the VAD tree, which is a data structure that keeps track of various pages in memory. I’m checking Powershell not only because Goomba/st4ckh0und on the HTB discord said so, but because it makes more sense to check Powershell as that is the thing injecting the DLL, and is more likely to contain all of the bytes, properly.
Update - 5/20/22
So apparently I’ve gotten a few things wrong with my explanation, and even the author of the challenge also had a misconception that Goomba/st4ckh0und explained this morning/last night, and I thought it was worth interrupting this explanation to clarify some things up to this point.
(1) So, the memory dump approach actually does work if you use something like IDA. The reason vaddump works more nicely is because we’re dumping the virtual pages (which are aligned on a page boundary). When a PE file is loaded from disk, the in-file sections are written to these virtual pages, which is why the vaddump requires much less cleanup work.
When you dump it from memory like we were doing, we’re getting the raw binary data, as opposed to the desired PE file in its regular structure.
In conclusion, memdump still works, but you’re not actually dumping the DLL, you’re getting raw binary data as opposed to the full file. Hence, we see all of the null bytes in the middle that wouldn’t normally be there.
According to the author (thewildspirit), using dlldump with the --force option would have retrieved it despite the DLL not being present in the PEB list (due to how it was loaded). I haven’t tested it, nor do I really plan to, because it’s already taken me long enough to do this once :)
(2) Invoke-ReflectivePEInjection, the Powershell Script that injected the DLL, doesn’t actually do reflective DLL injection. I’ll let the screenshot explain.
It might seem overkill to draw that line, but each method has different symptoms that change exactly what you need to hunt for. This distinction is likely the reason the DLL wasn’t easily discoverable by the malfind (or similar) plugin: the injection wasn’t actually reflective, whereas many of the examples I was looking at online were able to use malfind/malfinddeep/etc. to pinpoint the injected DLL.
Hopefully by this point, I’ve covered all of my bases and I’m more correct about stuff than I was before. If not, feel free to reach out and clarify. Back to your normally scheduled programming.
Back to the Lab Again
I’ll start by hunting down which of my vaddumps have the DLL inside it by running a quick one-liner.
I get two files back from it, and I’ll start with the one with the most occurrences of the string, powershell.exe.3e74e488.0x01b20000-0x03b1ffff.dmp. I’ll use xxd to identify where exactly the DLL might be, based on the offsets identified by strings.
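The ranking step (which dump has the most occurrences) can be sketched in a few lines of Python. The directory name and search string are up to you; I am assuming the vaddump output files end in .dmp, as the filename above does, and the function name is my own.

```python
from pathlib import Path


def rank_dumps(dump_dir, needle: bytes):
    """Return (count, filename) pairs for dumps containing needle, most hits first."""
    results = []
    for path in sorted(Path(dump_dir).glob("*.dmp")):
        count = path.read_bytes().count(needle)
        if count:
            results.append((count, path.name))
    # Highest count first; ties fall back to alphabetical order.
    return sorted(results, key=lambda r: (-r[0], r[1]))
```

Calling rank_dumps on the vaddump output directory with a byte string such as b"winmgr" gives a shortlist to inspect by hand, with the most promising file first.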
Also note that for this one, the magic bytes are actually there.
I’ll write another Python script to extract the DLL, tinkering with values until I have it just where I want it.
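Since the magic bytes are present this time, the starting offset can be found programmatically instead of by eye. This is a sketch of one way to do it, not the script I actually used: it keeps only the MZ hits whose e_lfanew field points at a PE signature, which filters out the many coincidental "MZ" byte pairs in a large dump.

```python
import struct


def find_pe_headers(data: bytes):
    """Return offsets where MZ is followed by a valid PE signature at e_lfanew."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(data):
            # e_lfanew lives at 0x3C relative to the start of the DOS header.
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            sig = pos + e_lfanew
            if data[sig:sig + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return offsets
```

Each returned offset is a candidate start for the carve; the end still has to be dialed in by hand or from the section table.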
And drum roll for ghidra…
It worked! We have a clean decompile. Now, if we look at VoidFunc, we see many, many single characters/bytes that are eventually concatenated and called by WinExec. I’ll copy the bytes out into CyberChef, and move them around so that I can decode them to ASCII.
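The "move them around" step is just putting the single-byte assignments back in index order. A tiny sketch, assuming you have transcribed them from the decompiler as (index, byte) pairs; the sample values are made up and are not the bytes from VoidFunc.

```python
def rebuild_string(assignments):
    """Order (index, byte) pairs by index and decode up to the first NUL."""
    ordered = bytes(value for _, value in sorted(assignments))
    return ordered.split(b"\x00", 1)[0].decode("ascii")


# Hypothetical, deliberately shuffled sample.
sample = [(2, 0x6C), (0, 0x63), (1, 0x61), (3, 0x63), (4, 0x00)]
print(rebuild_string(sample))  # prints calc
```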
All that work, for a small encoded powershell command? I’ll copy the base64 string and decode it on the command line.
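For completeness, the decode can also be done in Python. This sketch assumes the string is a standard PowerShell -EncodedCommand argument, which is Base64 over UTF-16LE text; if the output looks garbled, try decoding as UTF-8 instead. The demo command is one I encoded myself, not the one from the DLL.

```python
import base64


def decode_powershell(encoded: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE text)."""
    return base64.b64decode(encoded).decode("utf-16-le")


# Hypothetical command, encoded here only to have something to decode.
demo = base64.b64encode("Write-Host hello".encode("utf-16-le")).decode("ascii")
print(decode_powershell(demo))
```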
I’ve gotta say, this was the most disappointing result possible, but at least we’ve done it. I’m pretty sure we all learned something along the way.