Intro
Even though I couldn’t solve more than 5 real challenges at corCTF last year either, it’s one of my favorite CTF events, and I played again this edition, albeit mostly alone. The event was just as hard as, if not harder than, last year, and while I didn’t have enough time to solve some of the challenges I wanted to, I figured I’d at least write up the relatively easy challenges that I did finish. I still placed 78th out of 600+ teams, which is pretty funny considering I only solved maybe 3 challenges during the event.
force
Web - 118 solves
Description
Unbreakable vault door!
Author: larry
Challenge
We’re given source code for this challenge, and it’s a very, very minimal web application.
The task is fairly simple. When spawned, the server chooses a random number from 0 to 100,000. Using GraphQL, a user can submit a request with the flag() query, and if the numbers match, it prints the flag. However, you’re only allowed 10 requests, so even with 10 distinct guesses you only have about a 0.01% chance of hitting the right number.
We can open the app up to get a better idea of what this looks like.
If I submit this request as is, we’re told we have the wrong answer.
Solution
If you don’t know, GraphQL is an open-source query language for APIs that isn’t intrinsically tied to any one database or storage engine; the motivation behind it is the ability to be highly specific about what data you want to pull. In a more traditional REST API, you have set structures in place to query resources, which is nice because of how easy it is, but not always efficient. For instance, a developer might set up an endpoint like http://example.com/api/v1/users/<ID> to access information about a user by ID. However, what if you just want the email of that user, or just the registration date? This might not seem like a big problem, but when your user list becomes insanely large, optimization becomes necessary.
In GraphQL, we could define our API with certain types and structures, and create a query like so:
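As a rough stand-in for the query shown in the original post (the user type, its fields, and the endpoint URL are all hypothetical, not from the challenge), asking for just a user’s email might look like this:

```python
import requests

# Hypothetical GraphQL endpoint and schema, purely for illustration.
GRAPHQL_URL = "http://example.com/graphql"

# The query names exactly the fields we want back -- nothing more, nothing less.
query = """
query {
    user(id: 1) {
        email
    }
}
"""

resp = requests.post(GRAPHQL_URL, json={"query": query})
print(resp.json())  # e.g. {"data": {"user": {"email": "alice@example.com"}}}
```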
Now, anytime we call that query, we get the exact data back. When I see GraphQL, my first instinct is to approach it like I would an SQL database. GraphQL features introspection, a technique that allows you to use GraphQL queries to get more information about the database schema, and John Hammond shows this off a lot better in a video on an HTB Business CTF Challenge. However, it won’t be very useful here. I can submit the payload from PayloadsAllTheThings, and then use GraphQLVoyager to visualize this data, but all it returns is a single box.
If we think about the actual function of GraphQL here, it’s not actually storing any data. Although we’re allowed to submit arbitrary GraphQL requests to this server, flag is the only query defined in the source code, and when we submit it with a pin, the server receives that number and compares it with what was stored in memory. TL;DR: there’s no way to leak the number out like this.
However, GraphQL was created as a way to hone in on API requests, so surely there’s a way to make multiple requests in one go? GraphQL actually supports batching, meaning we can make multiple queries in a single request to the backend. We can try this with two just to confirm.
Nice! Given that there are 100,000 values we need to try, I’ll submit them 10,000 at a time so that we don’t overload the server.
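The original solve script isn’t reproduced here, but a minimal sketch of the idea (the instance URL, the pin argument name, and the exact response format are assumptions) could look like this, using GraphQL aliases to pack thousands of flag() calls into each request:

```python
import requests

URL = "https://force.example.ax/graphql"  # placeholder instance URL
BATCH = 10_000

for start in range(0, 100_000, BATCH):
    # Pack a batch of aliased flag() calls into a single GraphQL document.
    fields = " ".join(
        f"p{pin}: flag(pin: {pin})" for pin in range(start, start + BATCH)
    )
    resp = requests.post(URL, json={"query": "{ " + fields + " }"})
    if "corctf" in resp.text:
        # One of the aliases came back with the real flag.
        print(next(v for v in resp.json()["data"].values() if v and "corctf" in str(v)))
        break
```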
Running the script, we get our flag:
flag: corctf{S T O N K S}
This flag is definitely going to break the blog but it’s fine (Update: Looks like the whitespace is getting truncated, trust the longer one from the shell command).
fizzbuzz100
Crypto - 134 solves
Description
lsb oracles are pretty overdone... anyway here's fizzbuzz
Author: willwam
Challenge
We’re given source code to a remote server.
This was the first of a series of three challenges that built on the idea of FizzBuzz, a classic coding interview question and a game that kids in the UK play (I guess they have nothing better to do there). This challenge starts out with a pretty standard RSA implementation to encrypt the flag, this time adding random bytes to the beginning and end of the flag to prevent unintended solutions in the later versions. We’re then given access to a decryption oracle, with a twist. We can give it any RSA ciphertext as an integer, and it will use the private key to decrypt it. However, if it’s the original flag, the program will exit. Otherwise, if the plaintext is a multiple of 3, the oracle will print “Fizz”, a multiple of 5 will print “Buzz”, and a multiple of 15 will print “FizzBuzz”. If none of these conditions are met, we’re given the computed plaintext.
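Paraphrasing the handout, the oracle behaves roughly like this (a reconstruction from the description above, not the original source; key sizes, padding amounts, and I/O details are assumptions):

```python
import os
from Crypto.Util.number import getPrime, bytes_to_long

p, q = getPrime(1024), getPrime(1024)
n, e = p * q, 0x10001
d = pow(e, -1, (p - 1) * (q - 1))

# Random bytes on both ends to block shortcuts in the later fizzbuzz variants.
flag = os.urandom(16) + b"corctf{fake_flag_for_illustration}" + os.urandom(16)
m = bytes_to_long(flag)
print(n, e, pow(m, e, n))

while True:
    pt = pow(int(input("> ")), d, n)
    if pt == m:
        break                 # refuses to hand the flag back directly
    elif pt % 15 == 0:
        print("FizzBuzz")
    elif pt % 3 == 0:
        print("Fizz")
    elif pt % 5 == 0:
        print("Buzz")
    else:
        print(pt)
```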
Solution
Like last year’s corCTF, the easiest crypto challenge is a simple case of modular arithmetic. We may not be able to directly submit the ciphertext, but we can always modify it. We know that the ciphertext we receive is $c \equiv m^e \pmod{n}$. Now suppose we submit the ciphertext $c' \equiv 2^e \cdot c \pmod{n}$. Then, the decryption oracle works out like this:

$$(c')^d \equiv (2^e \cdot c)^d \equiv 2^{ed} \cdot m^{ed} \equiv 2m \pmod{n}$$

As long as $2m \bmod n$ isn’t a multiple of 3 or 5 (and if it is, we can just retry with a different multiplier), the oracle prints it back. We can easily compute the modular inverse of $2$ mod $n$, so we can multiply that by the output of the decryption oracle to get the flag.
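A minimal solve sketch along those lines, using pwntools (the host, port, and output format of the remote are assumptions; if $2m \bmod n$ happens to trip the Fizz/Buzz checks, just rerun with a different multiplier):

```python
from pwn import remote
from Crypto.Util.number import long_to_bytes

io = remote("example.be.ax", 1337)  # placeholder host/port

# Assume the server prints n, e, and the flag ciphertext, one per line.
n, e, ct = (int(io.recvline()) for _ in range(3))

# Submit c * 2^e mod n; the oracle decrypts it to 2m mod n and prints it.
io.sendline(str((ct * pow(2, e, n)) % n).encode())
two_m = int(io.recvline())

# Strip the blinding factor: m = (2m) * 2^{-1} mod n.
m = (two_m * pow(2, -1, n)) % n
print(long_to_bytes(m))
```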
flag: corctf{h4ng_0n_th15_1s_3v3n_34s13r_th4n_4n_LSB_0r4cl3...4nyw4y_1snt_f1zzbuzz_s0_fun}
cbc
Crypto - 59+1* solves
* I solved this immediately after the CTF was over because I just got too busy with life on Sunday, so I’m going to retroactively give myself the solve (they’re fake internet points anyway)
Description
who on earth is putting CLASSICAL BORING CRYPTOGRAPHY in my ctf
Author: willwam
Challenge
We’re given the output of the below Python program.
This code might seem a bit complicated with all of the list comprehensions, but the encryption scheme is very simple. Going from top to bottom, the random_alphastring
function will generate a random string of uppercase English characters with a specified length. The add_key
function is where the encrypting happens: for each character in the key and the plaintext block, we take the indices of the letters and add them together modulo the length of the alphabet. In this case, A = 0, B = 1, C = 2, etc. So, if we did add_key("C", "Z"), that would work out to be $(2 + 25) \bmod 26 = 1$, which corresponds to “B”. The TL;DR here is that this is basically just a Caesar cipher, except there is a different shift per character.
The interesting part is the cbc()
function. In Three-Eyed Oracle, I briefly discussed various block cipher modes, and this function is effectively applying Cipher Block Chaining (CBC) mode to this add_key()
encryption. Normally, CBC looks like this:
Courtesy: Wikipedia
The basic idea, in its normal use, is that every plaintext block is XOR-ed with the previous ciphertext block and then run through the block cipher. For the first block, we use a random set of bytes called an Initialization Vector (IV), which helps add to the overall randomness of the encryption. Here, however, the algorithm has been slightly modified.
Instead of XORs, we’re just applying the add_key()
function again. The only other thing that hasn’t been mentioned here is that the plaintext is padded out to a 16 byte boundary using X’s as seen in the pad()
function (e.g. pad("ABCDEF") = "ABCDEFXXXXXXXXXX"
).
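Written out, textbook CBC computes $C_i = E_k(P_i \oplus C_{i-1})$ with $C_0 = IV$, whereas this scheme effectively computes $C_i = \text{add\_key}(k, \text{add\_key}(C_{i-1}, P_i))$. A paraphrased reconstruction of the whole thing (not the original handout; names like ALPHABET and BLOCK_SIZE are mine) looks like this:

```python
import string

ALPHABET = string.ascii_uppercase   # A=0, B=1, ..., Z=25
BLOCK_SIZE = 16

def add_key(key, block):
    # Per-character shift: add the key letter's index to the block letter's index, mod 26.
    return "".join(
        ALPHABET[(ALPHABET.index(k) + ALPHABET.index(c)) % len(ALPHABET)]
        for k, c in zip(key, block)
    )

def pad(msg):
    # Pad with X's up to the next 16-character boundary.
    rem = len(msg) % BLOCK_SIZE
    return msg if rem == 0 else msg + "X" * (BLOCK_SIZE - rem)

def cbc_encrypt(key, iv, msg):
    msg = pad(msg)
    blocks = [msg[i:i + BLOCK_SIZE] for i in range(0, len(msg), BLOCK_SIZE)]
    out, prev = [], iv
    for block in blocks:
        # Chain with the previous ciphertext block, then apply the key:
        # add_key plays the role of both the XOR and the block cipher.
        ct = add_key(key, add_key(prev, block))
        out.append(ct)
        prev = ct
    return "".join(out)
```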
Solution
The big problem with this script is that the add_key()
function is fully reversible. In the add_key()
function, I can just do the reverse of k_a + pt_a, which is pt_a - k_a (subtracting the key’s index instead of adding it), and get back to the original value.
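In code, that inverse is just the same loop with a subtraction (a sketch mirroring the reconstruction above):

```python
import string

ALPHABET = string.ascii_uppercase

def remove_key(key, block):
    # Undo add_key: subtract the key letter's index instead of adding it, mod 26.
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % len(ALPHABET)]
        for k, c in zip(key, block)
    )

# e.g. add_key("C", "Z") == "B", and remove_key("C", "B") == "Z"
```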
This is a good first step; we can easily decrypt any ciphertext knowing the key. That said, we still don’t know the key. However, we might not even need the key. Consider the general way to decrypt a block cipher using CBC.
Courtesy: Wikipedia
Unlike CBC encryption, which can only be done one block at a time because it needs to be chained with the previous ciphertext block, CBC decryption can be done in parallel, as we already know all of the ciphertext. In our case, this means $P_i = \text{remove\_key}(k, \text{remove\_key}(C_{i-1}, C_i))$, with $C_0 = IV$. Observe that we can actually compute that inner remove_key() without knowing the key, leaving just a single key being applied to each block: $\text{remove\_key}(C_{i-1}, C_i) = \text{add\_key}(k, P_i)$. If this was AES, we’d be stuck, but because the algorithm is basically just a Caesar/Vigenere/shift cipher, we can crack this using frequency analysis. If you’re interested in learning exactly how frequency analysis works here, I recommend looking at Cryptopals Challenge 6, as they do this with XOR. However, writing your own algorithm to score the “English-ness” of candidate plaintexts can be a bit tedious and error prone, so it’s much easier to use someone else’s implementation like this online Vigenere cipher tool.
In summary, to solve this challenge, we can compute $\text{remove\_key}(C_{i-1}, C_i)$ for every ciphertext block, and then put the resulting “unchained ciphertext” into the Vigenere cipher solver, which should give us the key. The tool I linked didn’t apply the key it found to the whole plaintext, so I did that part in Python to get the output; a sketch of the unchaining step is below.
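Here is that unchaining step as a sketch (the IV and ciphertext are placeholders; the real values come from the challenge output):

```python
import string

ALPHABET = string.ascii_uppercase
BLOCK_SIZE = 16

def remove_key(key, block):
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % len(ALPHABET)]
        for k, c in zip(key, block)
    )

iv = "A" * BLOCK_SIZE                 # placeholder IV
ciphertext = "B" * (BLOCK_SIZE * 4)   # placeholder ciphertext

blocks = [ciphertext[i:i + BLOCK_SIZE] for i in range(0, len(ciphertext), BLOCK_SIZE)]
prevs = [iv] + blocks[:-1]

# Strip the chaining; what's left is a plain Vigenere encryption under the unknown key.
unchained = "".join(remove_key(p, c) for p, c in zip(prevs, blocks))
print(unchained)

# Once the Vigenere solver recovers the key, the plaintext is:
# "".join(remove_key(key, unchained[i:i + len(key)]) for i in range(0, len(unchained), len(key)))
```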
Fun plaintext.
flag: corctf{ATLEASTITSNOTAGENERICROTTHIRTEENCHALLENGEIGUESS}
utterly-deranged
Rev - 60 solves
Description
Can you untangle my twisted ~~CFG~~ web of lies?
Author: clubby789
Challenge
The binary is a seemingly innocent ELF executable. It’s stripped, which is annoying, but luckily PIE is disabled, meaning functions will always load into the same spots, making reversing a bit easier.
We don’t get much from strings, except what seems to be some messages that the program uses and a fake flag.
If we run it, we’re given a pretty basic “crackme”-style prompt, albeit with a weird warning. We can also confirm that the fake flag is indeed fake.
I can also try to run strace
and ltrace
, but nothing much comes of that either.
Solution
Unintended
Most people, including myself, solved it much differently than the author intended. I’ll start by walking through how I originally approached this, and then, with knowledge of the challenge post-event, explain why the unintended solution works and why deobfuscating the binary is a pain.
My go-to program for decompilation is Ghidra, and the warning immediately makes sense. After clicking around the identified functions to try and find the entry point or main
, I found a function that refused to decompile.
If I look at the disassembly in the center window, it’s clear something’s suspicious. Rather than see a variety of instructions moving values and registers around, I see a repeating pattern for a very long time.
It looks like we’re only going to be looking at assembly for this challenge. If I don’t have access to a decompiler, my preferred disassembly program is Rizin’s Cutter, a program that used to just be a frontend GUI for radare2, but has now evolved into its own ecosystem of tools.
Cutter is actually able to find the main function itself, and pretty easily produce a control flow graph (CFG), things that Ghidra isn’t as good at.
Oh. That’s what they meant in the description. The craziest part about this is that the gif doesn’t even do the CFG justice, it just keeps going. If we scroll through the disassembly of main, we find some code that resembles what we saw when running the program for the first time.
From a high level, after printing the “Abandon hope all ye who decompile here” message, we zero out ~128 bytes of memory at a specified address, and then move data stored at 0x417070 into that buffer. If we locate that address, it looks like random bytes, which could very likely be some kind of encrypted text.
Trying to guess some basic XOR keys didn’t work, so we’ll have to keep looking at the program. This is where the unintended path begins. During the CTF, I wasn’t really able to gather much from the obfuscated assembly, aside from the fact that it was doing basically nothing. I assumed that the real instructions were interspersed throughout the obfuscated/garbage assembly, so trying to actually deobfuscate the program would be a challenge. However, if we go to the end of the program, we see a very important function call.
memcmp()
will just compare the bytes located at two blocks of memory up to a specific length. If the memcmp() reports that the buffers are equal (i.e., it returns 0), then “Well done for peeling back the layers” is printed; otherwise, we get the “Better luck next time” message. If we open the program in GDB, we can actually see exactly what’s being compared. I’ll submit a random flag that’s 128 bytes long, since that’s the length parameter being passed into memcmp()
.
In the DISASM part of the output, we can see the addresses of the two blocks being compared, 0x7fffffffb3a0
and 0x7fffffffb310
. Note that these values are stored in RDI and RSI, respectively. If we inspect memory at both of those points, one of these buffers holds the random bytes we saw earlier, and the other appears to be our input, but encrypted.
See how the bytes in the second buffer exactly line up with what we saw in the hexdump earlier? As for the first buffer, notice that the first 7 bytes and the last byte match the second buffer. Those bytes in our plaintext are corctf{
and then the }
. This likely means that there’s some kind of XOR cipher at play here. Unfortunately, we already tried doing some guessing earlier and couldn’t find the key. In many cases, we wouldn’t be able to progress much further here. However, notice what happens when I submit corctf{BAAAAAA...}
instead of corctf{AAAAAAA...}
.
. By changing one character, only one byte in our ciphertext changed (look at the 8th byte, at the top right of the first buffer)! Knowing this, we can write a script that repeats this process to gradually brute-force each character of the flag. I know other people solved this using angr
, but I found it easier to use gdb’s Python scripting capabilities. This video from SloppyJoePirates helped me get a better grasp on some of the syntax used to script this stuff, but my final GDB script looks like this.
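I won’t reproduce the original here, but a sketch of what such a GDB brute-forcer can look like is below (the breakpoint address, which register holds which buffer, the character set, and the 128-byte length are assumptions for illustration):

```python
# solve.py -- run from the gdb prompt with: source solve.py
import gdb

MEMCMP_CALL = 0x401234  # placeholder: address of the call to memcmp (find it in the disassembly)
FLAG_LEN = 128
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789_{}"

gdb.execute("set confirm off")
gdb.execute("set pagination off")
gdb.execute(f"break *{MEMCMP_CALL:#x}")

def run_with(candidate):
    """Run the target with `candidate` as input, stop at memcmp, return (target, ours)."""
    with open("input.txt", "w") as f:
        f.write(candidate + "\n")
    gdb.execute("run < input.txt", to_string=True)
    inf = gdb.selected_inferior()
    ours = bytes(inf.read_memory(int(gdb.parse_and_eval("$rdi")), FLAG_LEN))    # our encrypted input
    target = bytes(inf.read_memory(int(gdb.parse_and_eval("$rsi")), FLAG_LEN))  # the encrypted flag
    return target, ours

flag = "corctf{"
while len(flag) < FLAG_LEN and not flag.endswith("}"):
    for ch in CHARSET:
        candidate = (flag + ch).ljust(FLAG_LEN, "A")
        target, ours = run_with(candidate)
        # If the byte we just changed now matches the target buffer, the guess was right.
        if ours[len(flag)] == target[len(flag)]:
            flag += ch
            print(flag)
            break
    else:
        break  # no character matched; bail out instead of looping forever

print("flag:", flag)
```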
The script honestly isn’t as complicated as I thought it would be. We can run gdb commands using gdb.execute()
, the to_string
parameter can be used to capture the output as a string for parsing later, and there’s some additional work you can do to read the registers. There’s definitely more documentation out there for this.
To run the script, we simply enter the GDB prompt, and then run source solve.py
.
flag: corctf{s331ng_thru_my_0p4qu3_pr3d1c4t35}
Failing to Deobfuscate
This was a funny interaction after everyone telling clubby about the unintended.
After the CTF was over, clubby shared what the source code was, mentioning that most of the effort went into modifying the compiler to obfuscate the code.
Surprisingly, in my analysis, I managed to identify most of these things. What I originally thought was a simple XOR cipher was actually a stream cipher (Salsa20), which is integral to why the unintended solution works. On this blog, we’ve mentioned AES a few times, which is basically the block cipher that everyone uses today. A block cipher is exactly what it sounds like: a cipher that works on blocks, that is, fixed-size groups of bytes. AES will encrypt the first 16 bytes, then the next 16, and so on until it’s done. In addition to this, pretty much every strong block cipher is designed so that a small change in the block will produce a completely different ciphertext. This concept goes by a lot of names (the avalanche effect, confusion and diffusion, etc.), but the core idea is that minor changes to the input shouldn’t leak anything through the output.
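As a quick illustration of that property (using pycryptodome; the key value is arbitrary):

```python
from Crypto.Cipher import AES

aes = AES.new(bytes(16), AES.MODE_ECB)  # any fixed 16-byte key works for the demo

ct1 = aes.encrypt(b"AAAAAAAAAAAAAAAA")
ct2 = aes.encrypt(b"BAAAAAAAAAAAAAAA")  # one byte of the block changed

print(sum(a != b for a, b in zip(ct1, ct2)))  # typically 16: the whole block changes
```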
Secure stream ciphers also incorporate these principles, but they apply them differently. Stream ciphers work by using the key to generate a long sequence of pseudorandom bytes (the keystream) and then combining that keystream with the plaintext, most commonly via XOR. It’s effectively a weaker, but more practical, one-time pad. Salsa20 is one of the more popular stream ciphers today, and while it’s a more recently published cipher (published in 2007, compared to 1998 for AES), Salsa is not considered weak.
Looking back at the source code, it is definitely a bad move to make the key and nonce an array of nothing. However, this isn’t the crux of the problem. Since Salsa’s randomness comes from how the keystream is generated rather than from any mixing of plaintext bytes with each other, a fixed key and nonce produce the same keystream every run. No matter how the key or nonce was set, the encryption therefore simplifies to XOR with a fixed, unknown keystream, where each plaintext byte only affects the corresponding ciphertext byte, which is exactly what lets us brute force the flag one character at a time.
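To see this concretely, here’s a small demonstration with pycryptodome’s Salsa20 (the all-zero key and nonce stand in for the handout’s empty arrays; the exact values don’t matter, only that they’re fixed):

```python
from Crypto.Cipher import Salsa20

KEY, NONCE = bytes(32), bytes(8)  # fixed key and nonce for the demo

def encrypt(pt):
    # Same key/nonce => same keystream every time, i.e. XOR with a fixed pad.
    return Salsa20.new(key=KEY, nonce=NONCE).encrypt(pt)

ct1 = encrypt(b"corctf{" + b"A" * 100 + b"}")
ct2 = encrypt(b"corctf{" + b"B" + b"A" * 99 + b"}")

print([i for i in range(len(ct1)) if ct1[i] != ct2[i]])  # [7]: only the changed byte differs
```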
In an alternate world where clubby understands how Salsa20 works, the intended is as brutal as it sounds: deobfuscate the binary. Even though clubby confirmed that there was compiler-level obfuscation, we already suspected such a thing from the assembly. Let’s take a closer look at an example of these unnecessary instructions.
The reason these instructions are useless is that they’re blatantly redundant. After moving the value of RSP into RAX, we add a huge hex value to it, move that value back, do a right bitshift of 0x3f bits, and simply test whether the lowest 8 bits are equal to 1. If they’re not, execution continues as normal; otherwise, the program exits. The check is constructed so that it always takes the “continue” path, so the program is guaranteed to operate as normal but is forced to run through all of these instructions.
As the flag itself hints, these bits of assembly are called opaque predicates. A blog post from Binary Ninja (CTF sponsors, who would have guessed) defines them as “a series of instructions which always evaluate to the same boolean result”. The impact on performance is rather minimal, but, as we’ve seen, they’re very effective at making static analysis tools like Cutter or Ghidra waste time analyzing code that doesn’t really matter. Doing some more searching on the topic, there is actually a good amount of academic literature on the subject, like this paper or this thesis.
Now, the Binary Ninja blog only highlights their own tool, which is expensive and inaccessible to me, and most of the academic papers don’t really provide concrete deobfuscators. Unfortunately, I wasn’t able to find a neat and clean way to deobfuscate the binary without breaking many other things. I’ve spent at least 6-8 hours outside of the CTF trying to upsolve this without abusing the memcmp(), but I truly haven’t found a solution, and I’m looking forward to reading a writeup from anyone who manages to get through this mess of a binary.
touch-grass
misc - 89 solves (how???)
Description
Vitamin D deficiency and complications from a sedentary lifestyle are plaguing the CTF community.
To help change this, the esteemed CoR tribunal has decided that touching grass in sunlight is a vital part of this CTF experience.
Trying to trick this system will result in your team being placed on a hall of SHAME - the entire world will know that you don't touch grass!
Authors: strellic, FizzBuzz101, 0x5a
Challenge
We’re presented with a website, but this is not a web challenge. After signing in with rCTF, we’re basically told this:
Solution
Although only 89 teams managed to solve this challenge, it’s much simpler than you think it is.
Go outside. Touch grass. Realize that you forgot to write the code on a paper to take the photo. Decide not to go back inside. Take in the fresh air. Be relieved you’re not looking at clubby’s monstrosity of a program. Lie down. Appreciate what little time we have on this planet. Sleep. Maybe you don’t need to slave over these challenges until 2 am. Maybe it’s time to really become an outdoors person and embrace what the Earth has to offer. Maybe the true solution all along was-
Ah, you forgot to take the photo, didn’t you? Just follow the instructions and touch grass. It’s that simple.
“Next CTF is smoke grass” —jazzpizazz