Saturday, 17 February 2018

Buffer overflow vulnerability

Stack overflows and buffer overflows are very interesting vulnerabilities: if exploited correctly, a hacker could gain complete control of the program or even the machine. This class of vulnerability is also hard to exploit.

In this post I will show an example of a buffer overflow vulnerability: we take a sample program and then walk through how it can be exploited. Let's first look at what a buffer overflow is.

 #include <stdio.h>

 void echo(void)
 {
   char msg[100];
   scanf("%s", msg);            /* no width limit on %s */
   printf("echo: %s\n", msg);
 }

Looks good right? How can someone hack this program? And even if they can, what can they do with it?

scanf doesn't know the size of the buffer. If a user enters a string of 100 characters or more (100 is already too many, because of the terminating null byte), the code will corrupt the program's stack. So what, the program might crash? Yep, it will. But a well-crafted exploit can do a lot more with this. Let me give you an example: you have a web server running an echo service like this one. If the server process runs as root, as some network services do, a hacker can send a malicious message that gives them control of a root shell. The hacker can then run arbitrary commands on your web server with root privileges. Woah! Practically, the hacker can do anything after that.
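For contrast, here is a minimal sketch of what a bounded read looks like. It is an illustration only, not part of the original program: read_word is a hypothetical helper, and it parses from a string with sscanf instead of reading stdin so that it is easy to try out.

```c
#include <stdio.h>

#define MSG_SIZE 100

/* Hypothetical helper (not in the original program): copies the first
 * whitespace-delimited word of `input` into `msg`, writing at most
 * MSG_SIZE bytes including the terminating null byte.
 * Returns 1 if a word was stored, 0 otherwise. */
int read_word(const char *input, char msg[MSG_SIZE])
{
    msg[0] = '\0';
    /* "%99s" caps the conversion at 99 characters, leaving room for '\0'. */
    return sscanf(input, "%99s", msg) == 1;
}
```

The same width works with scanf itself, as in scanf("%99s", msg); fgets(msg, sizeof msg, stdin) is another common choice.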

How do hackers do that? To understand how this works we need to look at the machine code. I am assuming an Intel x86 CPU (32-bit).

     push    ebp  
     mov    ebp, esp  
     sub    esp, 136  
     lea    eax, [ebp-108]  
     mov    DWORD PTR [esp+4], eax            ; *msg
     mov    DWORD PTR [esp], OFFSET FLAT:LC0  ; "%s\0"
     call    _scanf  
     lea    eax, [ebp-108]  
     mov    DWORD PTR [esp+4], eax            ; *msg
     mov    DWORD PTR [esp], OFFSET FLAT:LC1  ; "echo: %s\12\0"
     call    _printf  
command: gcc -S main.c -m32 -masm=intel

As expected, the compiler makes room for "char msg[100]" on the stack, then places a pointer to this array on the stack along with a pointer to the format string, and then calls the "_scanf" function. Let's look at the stack at this point:

 <-- caller's program counter -->  
 <--       EBP (4 bytes)      -->    ; push ebp  
 <--  empty room (136 bytes)  -->    ; sub esp, 136  

It looks like the compiler allocated more space than the array needs: some of it holds the outgoing call arguments at [esp] and [esp+4], and the rest is presumably there for stack alignment. The next instruction is interesting because it tells us the location of "msg" on the stack.

 lea  eax, [ebp-108]  

lea stands for load effective address, so basically "eax = ebp - 108". eax contains the pointer to msg, because it is passed as the second argument when calling _scanf. Great! So we know that out of the 136 bytes, the upper 108 bytes are where msg lives (100 bytes for the array, followed by 8 bytes of padding just below the saved EBP). Now our stack looks something like this:

 <-- caller's program counter -->  
 <--       EBP (4 bytes)      -->  
 <--      msg (108 bytes)     -->  
 <--  empty room (28 bytes)   -->  
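To make the layout concrete, the sketch below maps an offset from the start of msg to the stack slot it lands in. The numbers come from the listing above (msg is 100 bytes and starts at ebp-108, the saved EBP is 4 bytes, and the return address sits just above it); the enum and function names are made up for illustration.

```c
#include <stddef.h>

/* Illustrative names, not from the original post. */
enum slot {
    SLOT_MSG,          /* the array itself */
    SLOT_PADDING,      /* unused bytes between msg and the saved EBP */
    SLOT_SAVED_EBP,    /* pushed by "push ebp" */
    SLOT_RETURN_ADDR,  /* caller's program counter, pushed by "call" */
    SLOT_CALLER_FRAME  /* anything above belongs to the caller */
};

/* Which stack slot does byte number `off` of the input (0-based) overwrite? */
enum slot slot_for_offset(size_t off)
{
    if (off < 100) return SLOT_MSG;          /* ebp-108 .. ebp-9 */
    if (off < 108) return SLOT_PADDING;      /* ebp-8   .. ebp-1 */
    if (off < 112) return SLOT_SAVED_EBP;    /* ebp     .. ebp+3 */
    if (off < 116) return SLOT_RETURN_ADDR;  /* ebp+4   .. ebp+7 */
    return SLOT_CALLER_FRAME;
}
```

So input bytes 108 to 111 (counting from 0) land on the saved EBP, and bytes 112 to 115 land on the return address.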

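If a user enters more than 108 bytes of input, the extra bytes overwrite the saved EBP on the stack, which will crash the program later. But what if we go four bytes further and overwrite the caller's program counter (the return address)? Look at the last two instructions of the function, "leave" and "ret".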


"leave" is short-hand for:
     mov esp, ebp  
     pop ebp  
This is exactly the opposite of the first two instructions, where we push ebp and then copy esp to ebp.

So if we have corrupted the saved ebp on the stack, we control the value of the ebp register after the "leave" instruction executes.

The next instruction is "ret". It simply pops the caller's saved instruction pointer off the stack and jumps there. This is where things get interesting. We can corrupt the stack and, with it, the saved return address, so effectively we control both the program counter and ebp (the base pointer). Controlling the program counter is like inserting a jump instruction into the program, with whatever privileges the program is running with.
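The effect of "leave" and "ret" on a corrupted frame can be tried out safely in a toy model. The sketch below is a simulation under simplifying assumptions, not real exploit code: the struct and function names are invented, "addresses" are just indexes into a 256-byte array, and the null byte that scanf appends is ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256u

/* Toy model: three registers plus a small simulated stack. */
struct toy_cpu {
    uint32_t esp, ebp, eip;
    uint8_t mem[STACK_SIZE];
};

/* Start with an empty stack and a known caller ebp. */
void toy_init(struct toy_cpu *c, uint32_t ebp)
{
    memset(c, 0, sizeof *c);
    c->esp = STACK_SIZE;
    c->ebp = ebp;
}

static void push32(struct toy_cpu *c, uint32_t v)
{
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &v, sizeof v);
}

static uint32_t pop32(struct toy_cpu *c)
{
    uint32_t v;
    memcpy(&v, &c->mem[c->esp], sizeof v);
    c->esp += 4;
    return v;
}

/* Simulates: call echo, the prologue, an unbounded copy of `len`
 * input bytes into msg, then leave and ret. */
void toy_call_echo(struct toy_cpu *c, uint32_t return_addr,
                   const uint8_t *input, size_t len)
{
    push32(c, return_addr);          /* call: push caller's program counter */
    push32(c, c->ebp);               /* push ebp */
    c->ebp = c->esp;                 /* mov ebp, esp */
    c->esp -= 136;                   /* sub esp, 136 */

    uint32_t msg = c->ebp - 108;     /* lea eax, [ebp-108] */
    size_t room = STACK_SIZE - msg;  /* keep the simulation inside its array */
    memcpy(&c->mem[msg], input, len < room ? len : room);

    c->esp = c->ebp;                 /* leave: mov esp, ebp */
    c->ebp = pop32(c);               /*        pop ebp */
    c->eip = pop32(c);               /* ret: pop return address into eip */
}
```

Feeding it 50 bytes leaves ebp and eip at the caller's values, while feeding it 116 bytes of 'A' (0x41) leaves both registers at 0x41414141, which is the situation described above: the input decides where the program goes next.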

So, here's the exploit: we write machine code into memory that launches a shell using the execve system call. Then we corrupt the saved base pointer and instruction pointer so that execution jumps to our shellcode. Awesome! After that, we are running a shell inside this program.


  1. Great post, thanks. What is the information source?

    1. I am already familiar with this, but you can read more about it on Wikipedia.