Tutorial #4 - System Calls and Buffer Overflows

In this tutorial we are going to investigate the way in which a number of Unix system calls work. This will allow you to more clearly understand the security implications mentioned in the lectures. Additionally, we'll walk through a buffer overflow example and look at the way in which it works in detail.

In all cases you should carefully examine the source code and make sure that you can understand how it works!

Fork

This example demonstrates how the fork(2) system call functions, also making use of wait(2). The source code is available here - fork.c

Compile the source code:
$ gcc -o fork fork.c
Run the binary:
$ ./fork

The fork(2) system call creates a second, nearly identical copy of the calling process: the original is the parent and the new process is the child. After starting the binary, run ps xa | grep fork^[1] to observe both processes.

The parent process calls wait(2) to wait for the child to complete, so both processes terminate at the "same time"^[2].
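
The tutorial's fork.c is not reproduced here. As a rough sketch of the pattern described above, the parent/child split might look like the following. The helper name fork_and_wait and its child_code parameter are hypothetical and do not come from fork.c; waitpid() is used as the per-child variant documented in wait(2).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not taken from fork.c): forks a child that exits
 * with child_code, waits for it, and returns the exit code the parent
 * observed, or -1 on error. */
int fork_and_wait(int child_code)
{
    /* Flush first so buffered output is not duplicated in the child. */
    fflush(stdout);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: fork() returned 0. */
        printf("child:  pid=%ld parent=%ld\n",
               (long)getpid(), (long)getppid());
        fflush(stdout);
        _exit(child_code);
    }

    /* Parent: fork() returned the PID of the child. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    printf("parent: pid=%ld reaped child %ld\n", (long)getpid(), (long)pid);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called from main as fork_and_wait(7), the parent should get 7 back once the child has exited.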

Exec

This example demonstrates how execl(3), a library front end to the execve(2) system call, can be used to invoke another binary and replace the process that is currently executing. This example builds on the previous fork code, with the child process executing /bin/date. The code can be found here - exec.c

Compile the source code:
$ gcc -o exec exec.c
Run the program:
$ ./exec

A second process will be spawned, which is then replaced with the /bin/date binary. Note that the code after the execl(3) call is never executed, since the entire process image is replaced with the new binary. Try modifying the execl(3) call so that it attempts to execute a non-existent file (or one that is not executable). What happens?
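
The tutorial's exec.c is not reproduced here. The sketch below shows one way the fork-then-exec pattern could be written. The helper name spawn_and_wait is hypothetical, and the exit code 127 for a failed exec is a shell convention adopted here as an assumption, not something taken from exec.c.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not taken from exec.c): runs the program at `path`
 * with no extra arguments in a child process and returns its exit code,
 * or -1 if fork/waitpid fails or the child did not exit normally. */
int spawn_and_wait(const char *path)
{
    fflush(stdout);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execl(path, path, (char *)NULL);

        /* Only reached if execl() failed; on success it never returns. */
        fprintf(stderr, "execl %s: %s\n", path, strerror(errno));
        _exit(127); /* assumed "could not execute" code */
    }

    /* Parent: wait for the child (or the program that replaced it). */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this sketch, spawn_and_wait("/bin/date") would run date in the child, while a path that does not exist takes the error branch after execl() and the parent sees 127.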

Open

This example demonstrates one of the potential gotchas with using hardcoded filenames and not checking to see if the file already exists. The code is available here - open.c

Compile the source code:
$ gcc -o open open.c
Run the binary:
$ ./open

A file called test will have been created, made world readable and writable, before being deleted (unlinked).
Create another file and create a symbolic link called test which points to your file. For example:
$ touch passwd
$ ln -s passwd test
Make the file so that it is only readable/writable by you:
$ chmod 600 passwd
Check the permissions on the file and symbolic link:
$ ls -l passwd test
Run the binary again, then check the permissions on the file and symbolic link again - what happened?
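
The tutorial's open.c is not reproduced here. As a hedged sketch of one common way to avoid this kind of problem, a program can ask open(2) to fail when the name already exists and then change permissions through the file descriptor instead of the path. The helper name create_new_file is hypothetical and is not part of open.c.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from open.c): creates `path` only if
 * nothing with that name exists yet, then sets its permissions to `mode`.
 * Returns an open file descriptor, or -1 on error. */
int create_new_file(const char *path, mode_t mode)
{
    /* O_CREAT | O_EXCL makes open() fail with EEXIST if the name already
     * exists, including when the name is a symbolic link. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    /* fchmod() acts on the file we actually opened, not on whatever the
     * path name happens to refer to at this moment. */
    if (fchmod(fd, mode) < 0) {
        perror("fchmod");
        close(fd);
        unlink(path);
        return -1;
    }
    return fd;
}
```

With the symbolic link from the steps above in place, a call such as create_new_file("test", 0666) should fail and leave the permissions of passwd untouched.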

Root Shell

This example demonstrates the way in which real and effective UIDs affect code execution. For more information read the man page for setuid(2). The code is available here - rs.c. Please note that you will need a system on which you are permitted to have root access (e.g. your own!).

Compile the source code:
$ gcc -o rs rs.c
Make the file owned by root and setuid. An attacker would achieve this via privilege escalation, however we'll cheat and log in as root first:
$ su root
$ chown root rs
$ chmod 4711 rs
Run the binary from a normal user account:
$ ./rs

What happened? The id(1) or whoami(1) system tools might help you figure things out...
What happens if you run the binary without it being setuid root (this part you can try in the lab!)?
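
The tutorial's rs.c is not reproduced here. As a small aid for the questions above, the following sketch reports the two UIDs that the kernel tracks for a process. The helper name report_uids is hypothetical and is not part of rs.c.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from rs.c): prints the real and effective
 * UIDs of the calling process and returns 1 if they differ, 0 otherwise. */
int report_uids(void)
{
    uid_t ruid = getuid();  /* real UID: the user who started the process */
    uid_t euid = geteuid(); /* effective UID: used for permission checks */

    printf("real uid=%ld effective uid=%ld\n", (long)ruid, (long)euid);
    return ruid != euid;
}
```

In an ordinary binary the two values match and the function returns 0. In a binary that is setuid to a different user they should differ, which is the situation this section asks you to explore.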

PS. It would be a sensible idea to remove the binary (or at least the setuid permissions) once you've finished playing!

Buffer Overflow

Okay, now it is time to investigate an effective buffer overflow. Grab the source code from here - overflow.c

Compile the source code with debugging enabled (-g option to gcc):
$ gcc -g -o overflow overflow.c
Start the GNU debugger (GDB) with the binary as an argument:
$ gdb overflow
Set a breakpoint on the dumbFunction function:
(gdb) break dumbFunction
Run the program, which will break once execution reaches dumbFunction:
(gdb) run
Locate the current stack frame:
(gdb) info frame

The first part of the output should look something like the following:
Stack level 0, frame at 0xbfef85d0:
 eip = 0x80483ba in dumbFunction (overflow.c:15); saved eip 0x8048436

The first line provides us with the address of the stack frame and the saved return address is detailed at the end of the second line.
You can inspect four 32-bit words at the top of the stack, as follows:
(gdb) x/4w 0xbfef85d0

Where 0xbfef85d0 is the address of the stack frame, as given by the info frame command.
Try viewing 16 32-bit words, remembering that the stack grows towards lower memory, hence we have to decrease the address in addition to increasing the number of words:
(gdb) x/16w 0xbfef85a0
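
The start address used for the 16-word dump follows from simple arithmetic on the example frame address. The sketch below spells that arithmetic out. The helper names words_between and dump_end are hypothetical, and the addresses are the example values used in this tutorial, so the ones on your machine will differ.

```c
#include <stdint.h>

/* Number of 32-bit words between two addresses (hi must not be below lo). */
uint32_t words_between(uint32_t lo, uint32_t hi)
{
    return (hi - lo) / 4u;
}

/* First address past the end of an x/<nwords>w dump starting at `start`. */
uint32_t dump_end(uint32_t start, uint32_t nwords)
{
    return start + nwords * 4u;
}
```

For the example values, words_between(0xbfef85a0, 0xbfef85d0) is 12 and dump_end(0xbfef85a0, 16) is 0xbfef85e0. The dump therefore starts 12 words below the frame address and still includes the four words that the earlier x/4w command displayed.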

Reading from higher addresses towards lower ones, you should be able to see the three function arguments, which were pushed onto the stack in reverse order (0x00000003, 0x00000002, 0x00000001), followed by the return address and then the saved frame pointer. The return address will match the saved eip value displayed by the info frame command.
Determine the address of the buffer. This can be done using the print command:
(gdb) print &buffer

In both C and GDB, the ampersand character (&) means "address of". You should be able to locate the buffer within the stack memory.
Use the step command to step execution of the program.
After running step reinspect the stack and note any changes that have occurred. You should be able to explain the changes (compare the source code to the observed behaviour as you go).
After the program has finished type quit to exit GDB.

The program should print 5, since increasing the return address by seven bytes should skip the machine instructions that assign 6 to the variable i. You can view the assembly code in question by running gcc -S -o overflow.S overflow.c and inspecting the overflow.S file. Alternatively, run disassemble main at the GDB prompt. The latter will give you the addresses/offsets of the machine instructions.

A couple of hints: GDB has a help command and a reasonable man page. The code and examples given have been designed with the Intel x86 (IA32) architecture in mind. They will work on other architectures, but some changes may be needed to the memory address offsets. You may also prefer to use a visual debugger: DDD, a visual front end to GDB, is one such option and is installed on the machines in B1.11.

1. The arguments to ps vary between operating systems. Linux and Mac OS typically use ps xa whereas you may need to use ps -edalf on Solaris or IRIX. Consult man ps if in doubt.

2. Nothing happens at exactly the same time on a timeshared system; however, it is close enough for a human observer.