The stack and calling conventions in x86

What is the stack?

  • In abstract terms, a stack is a Last In First Out (LIFO) data structure. On x86 there is no separate stack hardware: the stack is just a region of ordinary memory reserved for that use, set up by the operating system for each process, and addressed through the stack pointer register. On x86 the stack grows downwards, toward lower memory addresses.

What does the stack do, and how is it split for each function call?

  • The main job of the stack is to keep track of the execution state of the running program and to maintain control flow across function calls. This means recording the important details of every function call: its arguments, its return address, and its local variables. This is done with "stack frames". A new stack frame is created on the stack each time a function is called, and it is discarded when that function returns.

Calling conventions

  • If an assembly program were completely self-contained, it would be free to use any calling convention it wanted. However, since programs usually need to interface with other code and with shared libraries (such as libc), a standard way of passing arguments and returning values is required. This is the role of calling conventions.
  • On Linux, there are mainly two calling conventions:
    • for 32-bit binaries: the standard is the cdecl convention. The steps below summarize everything that happens around a function call, not just the calling convention itself:
      1. The caller pushes the arguments of the function call onto the stack, from right to left.
      2. The call instruction pushes the return address (the value of eip, which points at the instruction after the call) onto the stack and jumps to the function, so that control flow can resume there after the function returns.
      3. The function pushes the caller's ebp onto the stack.
      4. ebp is assigned the current value of esp, so the caller's frame pointer is preserved on the stack and ebp now marks the base of the new frame.
      5. At this point esp and ebp hold the same address: that of the saved ebp, which was the last value pushed onto the stack.
      6. The function subtracts from esp to make space for its local variables, so that they can be set and accessed relative to ebp.
      7. The function's locals sit at addresses below ebp, since the stack grows down in memory, while its arguments sit above ebp (starting at ebp+8). This effectively creates a new stack frame.
      8. When the function has finished its work, it leaves any return value in eax and resets esp to ebp, discarding the locals.
      9. The value of ebp pushed earlier is popped back into ebp, restoring the caller's stack frame.
      10. The ret instruction pops the return address pushed by the call into eip, which hands control back to the instruction after the call.
      11. The caller removes the arguments from the stack (cdecl is a caller-cleanup convention), and the state of the program before the call is effectively restored.
    • [Attached figures omitted: a scanned PDF, a pasted image, and a screenshot of "Stack Frames in x86" from SignalShore]
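    • for 64-bit binaries, Linux uses the System V AMD64 calling convention:
      • The first six integer or pointer parameters are passed in the rdi, rsi, rdx, rcx, r8, and r9 registers.
      • The first eight floating-point parameters are passed in the SSE registers xmm0-xmm7.
      • Any additional arguments are passed on the stack.
      • An integer or pointer return value is returned in the rax register, while a floating-point return value is returned in xmm0.
      • rax, rcx, rdx, rsi, rdi, and r8-r11 are volatile (caller-saved).
      • rbx, rbp, and r12-r15 are nonvolatile (callee-saved).
      • The stack pointer must be 16-byte aligned at the point of the call.
    • for comparison, 64-bit Windows binaries use the Microsoft x64 calling convention:
      • The first four integer or pointer parameters are passed in the rcx, rdx, r8, and r9 registers.
      • The first four floating-point parameters are passed in the first four SSE registers, xmm0-xmm3.
      • The caller reserves space on the stack for arguments passed in registers. The called function can use this space to spill the contents of registers to the stack.
      • Any additional arguments are passed on the stack.
      • An integer or pointer return value is returned in the rax register, while a floating-point return value is returned in xmm0.
      • rax, rcx, rdx, and r8-r11 are volatile.
      • rbx, rbp, rdi, rsi, and r12-r15 are nonvolatile.
    • otherwise, frame setup and teardown work as in the 32-bit case, with rbp, rsp, and rip in place of ebp, esp, and eip.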

Some common uses of registers

  • EAX, RAX: Called the Accumulator register. It is used for I/O port access, arithmetic, interrupt calls, and function return values.

  • EBX, RBX: Called the Base register. It is used as a base pointer for memory access. It also receives some interrupt return values.

  • ECX, RCX: Called the Counter register. It is used as a loop counter and as the count for shifts. It also receives some interrupt values.

  • EDX, RDX: Called the Data register. It is used for I/O port access, arithmetic, and some interrupt calls.

  • CS: Holds the code segment in which your program runs. Changing its value might make the computer hang.

  • DS: Holds the data segment that your program accesses. Changing its value might give erroneous data.

  • ES, FS, GS: These are extra segment registers, available for far pointer addressing (video memory and the like). On modern 32-bit and 64-bit operating systems, where memory is flat, FS and GS are typically used to reach thread-local data.

  • SS: Holds the stack segment your program uses. It sometimes has the same value as DS. Changing its value can give unpredictable results, mostly data related.

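  • EDI, RDI (ES:EDI): Destination index register. It is used as the destination for string and memory array copying and setting, and for far pointer addressing with ES.

  • ESI, RSI (DS:ESI): Source index register. It is used as the source for string and memory array copying.

  • EBP, RBP (SS:EBP): Stack base pointer register. It holds the base address of the current stack frame.

  • ESP, RSP (SS:ESP): Stack pointer register. It holds the address of the top of the stack, which is the lowest address currently in use.

  • EIP, RIP (CS:EIP): Instruction pointer. It holds the offset of the next instruction. It cannot be written directly; it only changes through control-flow instructions such as jmp, call, and ret.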

The EFLAGS register

  • The EFLAGS register holds the state of the processor. It is modified by many instructions and is used for comparisons, conditional loops, and conditional jumps. The status bits describe the result of the last instruction that updated them; the remaining bits control how the processor operates. Here is a listing:

Bit     Label   Description

0       CF      Carry flag
2       PF      Parity flag
4       AF      Auxiliary carry flag
6       ZF      Zero flag
7       SF      Sign flag
8       TF      Trap flag
9       IF      Interrupt enable flag
10      DF      Direction flag
11      OF      Overflow flag
12-13   IOPL    I/O privilege level
14      NT      Nested task flag
16      RF      Resume flag
17      VM      Virtual 8086 mode flag
18      AC      Alignment check flag (486+)
19      VIF     Virtual interrupt flag
20      VIP     Virtual interrupt pending flag
21      ID      ID flag

  • Note from The Linux Programming Interface:

    • Function arguments and local variables: In C these are referred to as automatic variables, since they are automatically created when a function is called. These variables also automatically disappear when the function returns (since the stack frame disappears), and this forms the primary semantic distinction between automatic and static (and global) variables: the latter have a permanent existence independent of the execution of functions.

    #assembly #computer_architecture