Hacking lessons from Malcolm in the Middle

June 5, 2019
security asm exploits

Anyone who has been around computers for long enough has probably heard the term buffer overflow or the related term _stack overflow (_the buffers in question live on the stack). It’s one of the most basic but also the most prevalent computer security issues.

The stack is a fundamental data structure that every computer science student learns early on. It’s called FIFO or First-In-First-Out because that’s how data is added and removed (think of it as a crowded elevator; the person standing next to the door has to get out before anyone else can).

The stack is ideal for in a program where repetitive tasks are broken into separate functions and called hierarchically. The stack is used to remember where you are in the overall sequence of tasks and sub-tasks. This is best explained by a clip from Malcolm in the Middle.

https://www.youtube.com/watch?v=AbSehcT19u0

Replacing the light bulb requires securing the shelf, which requires greasing the drawer, which requires driving to the hardware store, which requires fixing the car. The stack is used to remember to grease the drawer when you’re done fixing the car, to secure the shelf when you’re done greasing the drawer, and to replace the light bulb when you’re done securing the shelf.

As each partially-completed task triggers a new, different task, the current one is put on hold. The place in the old one is saved, to be resumed later, and execution jumps to the start of the new task. If the new task calls a third, then the second one is put on hold like the first, the place within the second where the interruption happened is stored, and execution jumps to the third.

Each time a task finishes, the note from when it started is used to resume where things left off in the previous task. This is why a FIFO model works well. Each new frame gets pushed onto the top of the stack when it starts executing, and popped off when it finishes, leaving the previous top of the stack to be the top once more.

A graphic of the stack in memory
Each frame represents a new task that interrupted the previous

Notice how the “return link” is stored adjacent to the data. This is what allows a data buffer overflow to hijack code execution.

When one task needs to call another before it can continue, like in the video, a new stack frame is pushed onto the stack for each task, and a note about the previous task is stored. Within each frame, the compiler also sets aside some scratch space, for notes related to the task, and sometimes the data stored in this region is user-entered.

Here’s the rub: sometimes when the program reads user data into this space, it doesn’t verify where there’s enough space available to store everything the user entered, and the scratch space is immediately adjacent to the space that remembers what the previous task was. I think you see where this is going.

If the programmer wasn’t careful, we can ask the program to store more data in the stack frame than will fit. The excess data will spill over into somewhere adjacent.

If the program says “enter your name” and allocates enough space for twenty characters, but you write thirty, those extra ten characters can clobber adjacent memory regions with garbage data. The rest of the system wasn’t expecting this to happen; whatever checks are done on the data in those adjacent regions have already been done, and the system will take whatever data happens to be there and use it (garbage or not).

If you overwrite the location of the previous task with the last ten letters of your name, the program will crash when it tries to execute that. However, if, for example, you write the first twenty characters of your name normally and then instead of the last ten characters you put some machine code instructions, properly encoded, the system will continue to execute albeit with your instructions in place of the old ones.

If we carefully craft what we enter such that the excess that spills over happens to look like a record of what the program was doing before, we can hijack the process and have it do whatever arbitrary thing we want.

This could be as simple as skipping ahead to later levels in a game to injecting some custom code that downloads and executes malware. Once you have the CPU doing what you told it instead of what the programmer told it, you can pretty much do whatever you want.