Intro
After sacrificing my soul, sanity, and sleep to play this year’s Cyber Apocalypse CTF, InactiveDirectory got 92nd! To be totally honest, not the best, but given that some of our better players were busy this week, I’ll take a top 100 finish. As per usual, I’ll be doing writeups on some of my favorite challenges throughout the event.
One of my favorite challenges from the event was in the Reversing category: Alien Saboteur. In addition to the vm
binary we get, we also get a file of an unknown format that needs to be processed by the binary. Some initial analysis makes it clear vm
is a virtual machine to run a novel file format, so the real challenge is taking apart the mystery file, as opposed to the vm
binary. I’ll write a scuffed disassembler by parsing the binary file, and slowly reverse this new layer of assembly to figure out where a password and the flag are coming from.
Description
You finally manage to make it into the main computer of the vessel, it's time to get this over with. You try to shutdown the vessel, however a couple of access codes unknown to you are needed. You try to figure them out, but the computer start speaking some weird language, it seems like gibberish...
Initial Analysis
Poking Around
When we unzip the download file, we’re presented with more than one file, but unfortunately, we have no idea what that second file is.
The executable is not stripped, so that definiely saves us time. However, I still wanted to figure out if bin
was any format I already knew about. Looking at the first few bytes in a hexdump, however, it appears to be something custom.
It’s definitely not just encrypted data. The first three bytes are “UwU”, which seems like made up magic bytes. A keen observer will also see that those other printable characters seem to print out “[Main Vessel Terminal]
”, but I’m not sure what purpose the other bytes serve. Running strings also doesn’t reveal all too much other than the symbols that are already present in the file.
Running It
I’ll get a quick overview of the behavior of the binary by running it, but it looks like we need to feed vm
the mystery bin
file to do anything.
It seems just like a standard “crackme” format, just asking for a password.
If I run the binary against any other file on my system, we get an error.
That seems like all of the information we’re going to get without disassembling.
Reversing vm
My preferred tool for disassembly/decompiling is Ghidra, although I mix in Cutter since Ghidra is garbage at finding main()
and lacks a nice graph view. Since the binary isn’t stripped, reversing isn’t all too challenging, since the function names are all there, so it’s mostly about how it’s all organized and used. I’ll be be presenting functions as far as I reversed them, so it will not be perfect, but hopefully good enough to understand.
The main
function is fairly simple. We open the file passed via command line arguments, and then we allocate some space on the heap equal to the size of the binary file. We then pass the pointer to this data and the size of the data to the vm_create
function, whose return value is then given to vm_run
.
vm_create
looks like a whole bunch of malloc
shenanigans, but ultimately, it’s reading the file from the first three bytes onward, which makes sense since the first three bytes were “UwU”, so we can skip that function for now. vm_run
is a little bit more interesting.
We are looping everytime the 4th byte after whereever our pointer is pointing is a null byte, and then calling vm_step
. Interestingly, nothing here actually increments or decrements that value, so you’d think it would just run infinitely. Looking at vm_step
, we see some fancy C going on.
The first part of this seems fairly clear. Not entirely sure what the 0x24
stuff is, but it generally seems like if the byte pointed to by our pointer is greater than 0x19
(25), then print “dead
”, and exit. This lines up with what we were seeing with our attempts to use this against any other file, as the metadata/ASCII in those files is already much greater than that hex file.
For the second part, I think it’s more helpful to look at the assembly.
Here, all we’re really doing is taking whatever is located at original_ops
, some kind of global variable, sticking it in the RAX register, and taking some offset from that, and putting it into the RDX register. The value of the pointer to our data is then restored, and we call RDX directly, which is interesting. Anytime we call a function like puts
, we do some lazy loading from libc and do a whole song and dance with the Global Offset Table that you read a quick summary about here.
I really want to understand what’s being put into that register, so I’m going to open my binary in gdb
, and then set a breakpoint at that instruction.
I can then run the binary, passing in the file via r bin
, and we see something interesting at the breakpoint.
The RDX register contains the address of the vm_putc
function, which we never saw in control flow from Ghidra. In fact, if we look at what functions are in the binary, we have stuff like vm_add
, vm_mov
, vm_input
, etc. which all have striking similarities to assembly instructions in x86. If I continue far enough with the same breakpoint, I see this vm_putc
function being called, while printing out that “Main Terminal” message from earlier, and eventually hit vm_mov
.
At this point, I think it’s abundantly clear we’re working with a virtual machine. For the unfamiliar, a virtual machine can be more than just what you see in VirtualBox/VMWare. Virtual Machines are simply defined as the emulation of a computer system. It doesn’t need to be a full operating system, it can be like the Java Virtual Machine (JVM) or the .NET Common Language Runtime (CLR), which take these custom formats that their languages produce, and translate them to machine instructions that the operating system knows how to use.
Our job, then, is to understand how this translation is happening, and what the instructions are doing. For that, we’ll need to take apart the bin
file ourselves, because I’m not good enough with GDB to debug any individual instruction.
Building a Disassembler
I’m not trying to make the IDA of what I’m going to call “UwU files”- all I want to do is dump out all of the virtual instructions that are being called, see what those functions are doing, and see what happens from there. Our first order of business is figuring out what opcode corresponds to what function.
If we go to original_ops
, we can get a sense for how the functions are indexed.
original_ops XREF[3]: Entry Point(*),
vm_step:00102400(*),
vm_step:00102407(R)
00105020 64 1a 10 undefine
00 00 00
00 00 24
00105020 64 undefined164h [0] ? -> 00101a64 XREF[3]: Entry Point(*),
vm_step:00102400(*),
vm_step:00102407(R)
00105021 1a undefined11Ah [1]
00105022 10 undefined110h [2]
00105023 00 undefined100h [3]
00105024 00 undefined100h [4]
00105025 00 undefined100h [5]
00105026 00 undefined100h [6]
00105027 00 undefined100h [7]
00105028 24 undefined124h [8] ? -> 00101b24
00105029 1b undefined11Bh [9]
0010502a 10 undefined110h [10]
0010502b 00 undefined100h [11]
0010502c 00 undefined100h [12]
0010502d 00 undefined100h [13]
0010502e 00 undefined100h [14]
0010502f 00 undefined100h [15]
...trim...
Ghidra didn’t exactly catch it, but based on what we know so far, it looks like it reads the byte at the current pointer, and then checks where that lands us in the original_ops
“array”. I copied these bytes out (and fixed the endianness), and put them in an array in my disassembler script I’m writing in Python. One problem: I don’t know what value corresponds to what function. If I do info functions
in pwndbg (before running the binary), I get some interesting output.
These offsets are very similar to the values stored in original_ops
, with the exception of an additional 0x100000
. Common sense tells us how these map, so we can just use Python to figure all of this out for us. Excuse the CTF-quality code.
Now, the indices of that original_ops
list in Python correspond with what’s in the virtual machine. Our next order of business is to recreate the vm_step
function. If this was an ideal world where I had infinite time to work on this, I would have loved to build out a proper debugger and disassembler that executed instructions and whatnot, but for the time being, we’ll have to settle with printing out the instruction, followed by the bytes after the instruction.
So, we start our vm_run
from the first byte, validate it, check the opcode, and print everything out accordingly. Looking at some of the vm_[instruction]
functions, it seems like every time they execute, they increment what is essentially the instruction pointer by 6, so I put that in as well. If we run it, we get a good chunk of assembly.
Well would you look at that! The first ~40-ish instructions are all vm_putc
, which looks like it’s just printing the character after the opcode. As I begin to understand what each instruction is doing, I’ll add it to a larger if
statement to translate the arguments to a nicer format.
Analyzing the Assembly in the Assembly
Stage 1
Finding the Password
After the initial banner printing, we come to this block of instructions.
We could try to break down each individual function from the virtual machine, but that’s a little excessive, and common sense can help us figure out the big picture. Notice the constant reuse of hex values such as 0x1e
, 0x1c
, 0x19
, etc. Intuitively, these must be register values. Assuming the functions were named as they should be, the structure lines up with normal ASM languages. The next order of business to really understand for each instruction is what the bytes after the first argument really mean. Looking at vm_mov
, for instance:
We’re taking the first byte after the opcode as a register, and then the values after that are being interpreted as some 32-bit value (u8
, u16
, and u32
are all “casting” functions for our VM), and the register is set to that value.
I repeated this process for every new instruction as it came up, and added specific parsing to my disassembler to get nicer output. You can find the full, CTF-quality script on GitHub, but our new output now looks like this:
This is now much easier to read, and much easier to break down.
- Instructions
0xfc
-0x108
: Initialize the0x1e
,0x1c
, and0x1d
registers with the values0xfa0
,0x0
, and0x11
, respectively. - Instructions
0x10e
-0x114
: Take the user’s input, and store it in address0x1e
tells us. - Instructions
0x11a
-0x126
: Add 1 to the0x1e
and0x1c
registers, and jump backward0x2d
bytes if the value of0x1c
is less than0x1d
. - Instructions
0x12c
-0x162
: XOR the values stored at the address pointed to by0x19
with0xa9
(stored in0x1b
). Compare this XORed value with the value stored at0x18
.
TL;DR: Take the user’s input, store it at an address, check if it’s 17 bytes long. If it is, XOR the value at 0x1004
with 0xa9
, and check if they’re equal.
So where are any of these values? If we take a hexdump of the bin
file (removing the UWU bytes because the code does too), we get an idea of what’s going on.
The user-supplied value will be stored at the 0xfa0
offset, and that long hexstring is the value that’s being XORed. If we extract that hex string, and XOR correctly, we find the first passcode.
If we supply this password to the binary, we move to the next stage. However, using it as the password for stage 2 fails. Weirdly, I’m able to copy and paste this multiple times before the binary finally kicks me out. it doesn’t look like it’s giving me more attempts, so it’s probably waiting for me to input the exact number of bytes it’s looking for before moving on.
Dealing with ptrace
Looking back at the dumped assembly, we can see what gets executed after the initial password check.
I’ll be honest, not entirely sure what purpose instructions 0x1d4
to 0x1e6
really serve, but the next instructions are interesting. vm_push
pushes the value stored in 0xf
(which is 0), onto the virtual stack three times, before finally calling vm_inv
. Looking at the source code, we see one of our longer functions.
The only thing you really need to focus on is that line including syscall()
. If we apply the values from our assembly, the line basically becomes:
Syscalls are essentially functions you can call to the operating system to do something. Every programmed function is a form of abstraction for these system calls. For instance, the puts()
function is built on top of the write
syscall in Linux. These calls are different from operating system to operating system, and architecture to architecture, so if you don’t know what corresponds to what, you may want to look up a table. Luckily, the Chromium docs have this laid out nicely for us here, although it is weird that that’s the spot where it was presented the best.
The syscall for 0x65
is ptrace()
. If you’re newer to more sophisticated reversing challenges (like myself), you’ve likely never encountered it before. According to the man pagesand , ptrace
controls a process’ ability to observe and view the execution of another process, and is fundamental to the ability to run stuff like gdb
. We can get a sense of the real implications of this by attempting to run it in gdb
.
Although we gave the correct password, we were locked out of the system. If we run strace
, we get an even better idea of what’s going on.
ptrace
is blocking our ability to debug the binary. So far, this isn’t really a problem, as we’ve just been doing static analysis, but in the event that we need to debug, I’d like to figure out a way around this. Luckily, the bytes dictating this are in bin
, so we can just create a patched-bin
with the relevant bytes changed. With some tweaking, I did this.
The braces in the hexdump diff were added by my to highlight what bytes I changed. The vm
binary has additional instructions that never actually get used. Instead of calling vm_inv
for a syscall, we call vm_nop
to skip the instruction. But, the program also checks for a successful ptrace
, so instead of jne
, we flip that to be the opposite je
. Now, if we open the patched version in gdb
, we should be able to debug as normal.
Decrypting Stage 2
With that out of the way, we have one last bit of assembly to look at.
This follows a very similar pattern to what we were seeing with the encrypted password. We’re looping starting from where 0x1e
points to, which is 0x77 * 6 = 0x2ca
, and then XORing with the value stored in 0x1b
, that is, 0x45
. Looking at the hexdump, we see the relevant ciphertext.
Knowing the properties of XOR, we can already see a lot of null bytes in between different bytes.
Which happens to be just like how the other opcodes were structured.
Which means this is probably just the beginning of the end.
Oh no.
Shoot, there’s more.
Well, we can add the decrypted bytes to our script, and run our disassembler on those new bytes.
Welp, we’re going to stage 2.
Stage 2
Permutations
I’ll modify my disassembler script to just add the location of the last instruction we were on for consistency, since I’m just running it on both distinct chunks. After the new phrase prompt, we see a single, unbroken chunk of instructions.
As per usual, this writeup is going to make me look very intelligent because I can just tell you exactly what’s happening now that I know the answer, but at this time, this took me 6+ hours to truly figure out, mainly due to a misunderstanding on how the jump instructions were working.
This first section is storing 36 bytes of user input into the location 0x1130
by looping over the set of bytes in the input.
This next section took me a long time to finally understand because of my inabililty to understand instructions 0x3c0
to 0x3d2
. We start off by initializing registers 0x1c
and 0x1a
at zero, acting as counter variables. For every character loaded in, we are moving it to the address 0x1130 + 0x15[0x1a]
. We repeat this 36 times.
If we look at the addresses pointed to by 0x1e
and 0x1f
, we get an idea of what the final task really is, and this movement will make more sense.
We see that we’re probably going to be working with four byte arrays. From what we’ve already seen, we know that 0x1130
contains the value we input (at this point, probably the flag) and that the value around 0x1190
defines how we’re moving characters around. If you notice, not one byte in those 36 bytes is greater than 0x24
, which we can reasonbly infer at this point must be the length of the flag. We can extract these bytes and convert them to a list of ints.
What’s really happening here is that this list is defining a permutation. In the first loop, the first character of the flag gets moved to the 19th position, the second gets moved to the 25th position, so on and so forth. Since we know how the flag is being permuted, we can write functions in Python to permute and then “unpermute” the characters.
We still need to figure out the next sections.
XOR
The next block we need to look at is this.
Now we’re working with the byte array located at 0x11f8
. Here, we actually have nested loops here, as we can we with the two vm_jle
instructions. The 0x10
register will store a character at a point at 0x11f8
, and the 0x14
register is storing the current character/byte at the 0x1130
array. We XOR those two values together, and we first increment the register keeping track of 0x1130
. Once we have looped through all of the values of 0x1130
, only then do we move on in the bytes at 0x11f8
.
In summary, we are XORing each byte in our now permuted flag, with every single byte in the bytes stored at 0x11f8
. Instead of rewriting this in Python, we can simplify this by XORing all of the bytes in the 0x11f8
byte array together, and then XORing with our mixed flag.
The final section just compares everything we’ve done to the final set of bytes.
Let’s summarize what we know:
- Our input is stored in
0x1130
, and goes through a permutation defined in0x1194
. - We then XOR every byte in
0x1130
with every byte stored at0x11f8
- The value at
0x1130
is compared to0x125c
Final Solve
Lucky for us, everything we’ve done is fully invertible, so we can just do the reverse of everything to find the flag.
Flag: HTB{5w1rl_4r0und_7h3_4l13n_l4ngu4g3}
This is probably the hardest rev challenge I’ve solved, so I’m super happy with how I pushed through and worked for this. I’m sure there are ways I could have used more dynamic analysis after patching ptrace
, but static was the only thing that was really making sense until the very end, where debugging helped show me the permutation that was happening.