
Sunday, November 21, 2010

Memory Segmentation

Writing down what you think you know is a good way of finding out what you don’t.

    When debugging a program, we see that the addresses of variables differ depending on whether a variable is local or global, static or not, initialized or not, or dynamically allocated. The memory of a compiled program is divided into five segments: text, data, bss, heap and stack. Each one holds specific things and has different properties.
  1. The Text Segment:
    • also known as the code segment; the program's instructions (assembled machine code) are located here
    • execution of instructions from here is non-linear (because of branches, loops and function calls) and is controlled by the EIP (instruction pointer) register. The processor reads the instruction EIP points to, adds the byte length of that instruction to EIP, and then executes the instruction.
    • write permission is disabled, to prevent modification of the code. If a write is attempted, the program is killed (on Unix-like systems, typically with a segmentation fault).
    • being read-only, it can be shared by multiple copies of the program running simultaneously
    • has a fixed size, since nothing in it ever changes
  2. The Data Segment:
    • used to store initialized global and static variables
    • is writable
    • has fixed size
  3. The BSS Segment:
    • used to store uninitialized global and static variables
    • is writable (same as data segment)
    • has fixed size (same as data segment)
  4. The Heap Segment:
    • is directly controlled by the programmer, who can allocate blocks of memory from it and use them as needed
    • its size is not fixed; it can grow or shrink as needed
    • growing and shrinking are managed by allocator and deallocator algorithms (malloc() and free() in C)
    • it grows toward higher memory addresses (drawn as downward in diagrams that put the low addresses at the top)
  5. The Stack Segment:
    • has variable size, like the heap, but on x86 it grows in the opposite direction, toward lower memory addresses
    • stores the local variables and context (a stack frame) for every function call
    • a stack frame contains:
      • the arguments that are passed to the function
      • the return address, i.e. the location EIP should point to after the function finishes
      • all the local variables used by the function
    • is a LIFO (last-in first-out) structure holding all the stack frames [1]
An image of the program memory:
The following example prints the addresses of variables that live in different segments of memory, showing how the variables are placed according to the picture. Running it under gdb also reveals that the instructions are located at the lowest addresses.
#include <stdio.h>
#include <stdlib.h>

int global_var;

int global_initialized_var = 5;

void function(void) {
   int stack_var; // This variable has the same name as the one in main() !

   printf("the function's stack_var is at address %p\n", (void *) &stack_var);
}

int main(void) {
   int stack_var; // Same name as the variable in function()
   static int static_initialized_var = 5;
   static int static_var;
   int *heap_var_ptr;

   heap_var_ptr = malloc(sizeof(int)); // room for one int on the heap
   if (heap_var_ptr == NULL)
      return 1;

   // %p expects a void *; the exact output format is implementation-defined
   printf("These variables are in the data segment.\n");
   printf("global_initialized_var is at address %p\n", (void *) &global_initialized_var);
   printf("static_initialized_var is at address %p\n\n", (void *) &static_initialized_var);

   printf("These variables are in the bss segment.\n");
   printf("static_var is at address %p\n", (void *) &static_var);
   printf("global_var is at address %p\n\n", (void *) &global_var);

   printf("This variable is in the heap segment.\n");
   printf("heap_var is at address %p\n\n", (void *) heap_var_ptr);

   printf("These variables are in the stack segment.\n");
   printf("stack_var is at address %p\n", (void *) &stack_var);
   function();

   free(heap_var_ptr);
   return 0;
}

References:
  1. Call stack (Wikipedia)
  2. Jon Erickson - Hacking: The Art of Exploitation, 2nd Edition
  3. Toby Opferman - Debug Tutorial Part 2: The Stack (on CodeProject)
