Secure coding is hard. When programmers develop software, their aim is usually to make it work, not to think about how it might be broken. Vulnerabilities creep in when, for example, a legacy function is used instead of a more secure one. This is one reason legacy software is especially vulnerable.
C is versatile and powerful, but it has one critical drawback: the security of C-based software depends on the knowledge of the programmer. A programmer who is well versed in secure coding will tend to write secure software. A programmer who is not, and this is the more common case, will leave loopholes in the software that can ultimately lead to an exploit.
For people like me, who know programming but are new to the security industry, it is very important to study vulnerable code and understand the possible consequences. This helps refine coding skills and develop an attacker attitude DURING the coding phase rather than AFTER coding the entire software.
In all honesty, it is quite cumbersome to study the complete source code of an application when looking for vulnerabilities like buffer overflows. Although this method has its own merits, it is not the easiest method to find simple vulnerabilities which may be critical. Such vulnerabilities must be immediately resolved and the easiest way to find them is through a technique called Fuzzing.
Fuzzing is a technique for finding “easy” vulnerabilities in code by sending “randomly” generated data to an executable. In general, there are three types of fuzzers:
- Mutation: A type of “dumb” fuzzing that takes existing input samples and corrupts them (flipping bits, inserting or deleting bytes) before feeding them to the executable. The fuzzer knows nothing about the input format, so many of its inputs are rejected early and the probability of finding deep bugs is not high.
- Generation: A type of “smart” fuzzing that builds inputs from scratch using a description of the input format, such as a grammar or a protocol specification. This type of fuzzing is better than dumb fuzzing in many cases because the program receives input that is close to what it expects.
- Evolutionary: These fuzzers use feedback from each “fuzz” (for example, which code paths an input exercised) to learn the format of the input over time.
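To make the difference between blind mutation and feedback-driven fuzzing concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: `toy_target` stands in for an instrumented program and simply reports which branches an input reached, and `evolve` keeps a mutated input only when it reaches a branch that has not been seen before. This is not how AFL is implemented; it only shows the general idea of using coverage as feedback.

```python
import random


def mutate(data: bytes, rng: random.Random, flips: int = 1) -> bytes:
    """Return a copy of `data` with `flips` randomly chosen bits inverted."""
    if not data:
        return data
    buf = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(buf))      # pick a byte
        buf[pos] ^= 1 << rng.randrange(8)  # invert one bit in it
    return bytes(buf)


def toy_target(data: bytes) -> frozenset:
    """Hypothetical stand-in for an instrumented program.

    Returns the set of branch names the input reached. Deeper branches
    are only reachable once the earlier bytes are correct.
    """
    hit = {"start"}
    if data[:1] == b"F":
        hit.add("magic1")
        if data[1:2] == b"U":
            hit.add("magic2")
            if data[2:3] == b"Z":
                hit.add("magic3")
    return frozenset(hit)


def evolve(seed: bytes, rounds: int, rng: random.Random):
    """Mutate inputs from the corpus; keep those that reach new branches."""
    corpus = [seed]
    seen = set(toy_target(seed))
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_target(candidate)
        if not coverage <= seen:           # reached something new
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

Calling `evolve(b"GTX", 5000, random.Random())` starts from a seed that is one bit away from each magic byte. Because every partial success is kept in the corpus, later mutations can build on it, which is what a purely random mutator cannot do.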
In this post, we’ll look at fuzzing with American Fuzzy Lop (AFL). It is an evolutionary fuzzer suited to programs that take input from STDIN or a file.
Why AFL?
There are a host of fuzzers in the wild including Peach and syzkaller. So, why AFL?
- The use-case. This is the most important point to consider. My use-case was to fuzz an application that takes input from a file. It is important to note that AFL does not have the capability to fuzz over networks.
- It is simple to install.
- AFL’s UI packs a ton of information including real-time statistics of the fuzzing process.
Set Up AFL
Setting up AFL is easy and I’ve made it easier for you by writing a simple (but crude) shell script which will install it for you! Run the script with your user privileges and it will install all dependencies, AFL and related tools. The shell script can be found here: https://github.com/nikhilh-20/enpm691_project/blob/master/install_afl.sh
Choose the Application to Fuzz
In this post, we’ll only look at fuzzing applications for which we have the source code. This is because AFL instruments the program at compile time so that it can monitor which parts of the code each input exercises. It is also possible to directly fuzz an executable without the source, but that is experimental and outside the scope of this post (tip: it requires QEMU).
Choose any open source project from GitHub for fuzzing. The more well-known your choice, the fewer vulnerabilities it will likely have. Others are looking for bugs too! One simple method that I use to find vulnerable code is GitHub Search. This is what I do:
- Search for a vulnerable function, say strcpy.
- The results will be in the millions. Go to the commits category of the results. This is where you’ll find those repositories where strcpy was used (or maybe removed). These repositories are a good starting point to begin fuzzing.
Instrumenting the Application
For privacy reasons, I cannot disclose the repository that I’m using.
Clone the git repository.
nikhilh@ubuntu:~$ git clone https://github.com/vuln; cd vuln
Set an environment variable, AFL_HARDEN=1. This activates certain code hardening options in AFL while compiling which makes it easier to detect memory corruption bugs.
nikhilh@ubuntu:~/vuln$ export AFL_HARDEN=1
Set certain compiler flags so that the application is compiled in a manner that makes it easy for us to find (and exploit) vulnerabilities. Ideally we would use environment variables to set what we need, but there’s quite a bit of customization, so we’ll directly edit the Makefile.
Ensure that the compiler used is afl-gcc or afl-clang instead of gcc and clang respectively. This is what allows AFL to instrument the source code.
Add compiler flags:
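-fno-stack-protector turns off the stack protector, which makes buffer overflows easier to exploit. Note that AFL_HARDEN=1 itself asks the compiler for -fstack-protector-all, so check the final compile command to see which setting actually takes effect.
-m32 builds a 32-bit binary. You need it only if you want a 32-bit build on a 64-bit machine; on a 32-bit machine the compiler already produces 32-bit code.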
Once you’re done with these changes, it’s time to compile the application. Run make. When you do, you MUST see statements such as these in the log:
Instrumented 123 locations (32-bit, hardened-mode, ratio 100%).
If you don’t see such statements, it means that AFL has not activated the application’s code for fuzzing. In other words, it has not instrumented the source code successfully.
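For larger builds, scanning the log by eye gets tedious. The following is a small Python sketch that looks for the summary line shown above in a captured build log. It assumes you saved the full output of make (stdout and stderr) to a file; the file name `build.log` used in the note below is hypothetical, and the pattern only relies on the words "Instrumented N locations" from the sample line.

```python
import re
from typing import List

# Matches the summary line printed during an instrumented build, e.g.
# "Instrumented 123 locations (32-bit, hardened-mode, ratio 100%)."
INSTRUMENTED = re.compile(r"Instrumented (\d+) locations")


def instrumented_counts(build_log: str) -> List[int]:
    """Return the location count from every 'Instrumented N locations' line."""
    return [int(m.group(1)) for m in INSTRUMENTED.finditer(build_log)]


def looks_instrumented(build_log: str) -> bool:
    """True if at least one compiled unit reported a non-zero count."""
    return any(n > 0 for n in instrumented_counts(build_log))
```

For example, `looks_instrumented(open("build.log").read())` returning False is a strong hint that the Makefile is still calling plain gcc or clang.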
Test Samples
AFL is an evolutionary fuzzer. It starts from initial test data (seed files) and mutates it, so it needs samples of the kind of data the target application expects. When targeting open source projects, this is easy to find. Just look in their test directory and you’ll find all the test data you need.
nikhilh@ubuntu:~/vuln$ mkdir afl_in afl_out
nikhilh@ubuntu:~/vuln$ cp test/* afl_in/
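Copying the whole test directory works, but AFL’s documentation recommends keeping seed files small, and duplicate samples only slow things down. As an optional alternative to the plain cp, here is a Python sketch that copies only small, non-duplicate files. The 1024-byte default and the choice to skip empty files are my assumptions, not AFL requirements; tune them for your target.

```python
import hashlib
import shutil
from pathlib import Path


def select_seeds(src_dir, dst_dir, max_bytes=1024):
    """Copy small, non-duplicate regular files from src_dir into dst_dir.

    Returns the list of copied file names in sorted order.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    seen_hashes = set()
    copied = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue                      # ignore sub-directories
        data = path.read_bytes()
        if len(data) == 0 or len(data) > max_bytes:
            continue                      # skip empty and oversized samples
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_hashes:
            continue                      # same content already copied
        seen_hashes.add(digest)
        shutil.copy2(path, dst / path.name)
        copied.append(path.name)
    return copied
```

With the layout used in this post, the call would be `select_seeds("test", "afl_in")`.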
Fuzzing Begins
Now that we have our test samples, we’re ready to fuzz!
Oh, wait… we also need to change where the application’s crash notifications go. When an application crashes, the kernel can produce a core dump (basically, a snapshot of the process’s memory stored in a file to help in debugging). On many systems, the core dump notification is sent to an external core handler by default. We don’t want this. Why? The handler adds a delay, and by the time the notification reaches AFL, the crash may be classified as a timeout rather than a crash.
nikhilh@ubuntu:~/vuln$ sudo su
[sudo] password for nikhilh:
root@ubuntu:/home/nikhilh/vuln# echo core > /proc/sys/kernel/core_pattern
root@ubuntu:/home/nikhilh/vuln# exit
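If you want to confirm the setting before starting a long run, the check is simple: on Linux, a core_pattern whose first character is `|` pipes core dumps to an external program. Here is a small Python sketch of that check. The function names are mine, and it only reads the file; it does not change anything.

```python
from pathlib import Path

CORE_PATTERN = Path("/proc/sys/kernel/core_pattern")


def pipes_to_handler(pattern: str) -> bool:
    """A pattern that starts with '|' sends core dumps to an external program."""
    return pattern.startswith("|")


def check_core_pattern(path: Path = CORE_PATTERN) -> str:
    """Describe the current core_pattern setting in one line."""
    if not path.exists():
        return "core_pattern not found (not a Linux system?)"
    pattern = path.read_text().strip()
    if pipes_to_handler(pattern):
        return "external crash handler configured: " + pattern
    return "ok: " + pattern
```

After the echo command above, `check_core_pattern()` should report the plain value `core`. The setting does not survive a reboot, so it is worth re-checking at the start of each session.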
NOW, we are ready to fuzz!
nikhilh@ubuntu:~/vuln$ afl-fuzz -i afl_in -o afl_out -S slaveX -- ./vuln @@
Command line flags used:
- -i: The test input directory. This is where we stored the initial test data.
- -o: The directory where AFL writes useful information regarding crashes, hangs, etc.
- -S: The Slave mode. Basically, AFL skips the deterministic stage and randomly tweaks the input, causing non-deterministic fuzzing.
- -M: The Master mode, which adds deterministic fuzzing. This basically means that the input is walked through systematically and every bit is modified in some way. It is not used in the command above. (This is slow! … Obviously.)
- @@: The position on the target’s command line where the input test file will be. AFL substitutes this for you automatically. If your executable takes input from STDIN, then this is not needed.
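The -M and -S options are meant to be combined: several AFL instances can share one output directory, typically one master and any number of slaves. As a sketch, the Python helper below builds those command lines using only the flags described above. The instance names `master` and `slave1`, `slave2`, and so on are hypothetical; any unique names work.

```python
from typing import List


def afl_commands(target: str, in_dir: str, out_dir: str,
                 secondaries: int = 1,
                 file_input: bool = True) -> List[List[str]]:
    """Build argv lists for one -M instance plus `secondaries` -S instances.

    All instances share the same output directory. Set file_input=False
    for targets that read from STDIN, which drops the @@ placeholder.
    """
    tail = ["--", target] + (["@@"] if file_input else [])
    base = ["afl-fuzz", "-i", in_dir, "-o", out_dir]
    commands = [base + ["-M", "master"] + tail]
    for n in range(1, secondaries + 1):
        commands.append(base + ["-S", f"slave{n}"] + tail)
    return commands
```

Printing `" ".join(cmd)` for each entry of `afl_commands("./vuln", "afl_in", "afl_out", secondaries=2)` gives three commands, each of which you would run in its own terminal.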
Fuzzing Results
Results will take some time to show. People often fuzz for more than 24 hours (and may end up with nothing). In my case, I think the application was a little too vulnerable, so I had 516 unique crashes within the hour. However, that doesn’t mean that there are 516 vulnerabilities! AFL counts a crash as unique when it is reached through a new execution path, so many of these crashes can share a single root cause.
You can quit the fuzzing session with a Ctrl-C.
Analysis Phase
Now that we have results, we need to analyze them to see which ones are exploitable. To this end, we will use afl-collect, a utility from afl-utils, which is a separate collection of helper tools for AFL. This will have been installed through the installation script as well.
nikhilh@ubuntu:~/afl-utils$ afl-collect -d crashes.db -e gdb_script -r -rr ~/vuln/afl_out/slaveX ./output_dir -- ~/vuln/vuln
To understand what each command line flag does, refer to its help section.
nikhilh@ubuntu:~/afl-utils$ afl-collect --help
If you see lines such as these in the output, celebrate! You’ve found something interesting to try and exploit.
*** GDB+EXPLOITABLE SCRIPT OUTPUT ***
…
…
[00001] slaveX:id:000000,sig:11,src:000000,op:havoc,rep:2……………:EXPLOITABLE [PossibleStackCorruption (7/22)]
…
[00041] slaveX:id:000046,sig:11,src:000004,op:havoc,rep:4……………:EXPLOITABLE [StackBufferOverflow (6/22)]
…
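With hundreds of crashes, a quick tally of the classifications is more useful than reading the listing line by line. Here is a Python sketch that counts the (classification, description) pairs at the end of each line. The line format is assumed from the sample output above and nothing more, so real output may need a different pattern.

```python
import re
from collections import Counter

# Matches the tail of a report line such as
# "...:EXPLOITABLE [StackBufferOverflow (6/22)]"
LINE_TAIL = re.compile(r":(\w+) \[(\w+) \((\d+)/(\d+)\)\]\s*$")


def tally(report: str) -> Counter:
    """Count (classification, description) pairs in the report text."""
    counts = Counter()
    for line in report.splitlines():
        match = LINE_TAIL.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Calling `tally(text).most_common()` on the saved report lists the most frequent crash types first, which is a reasonable order in which to start triage.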
The listing also shows which input caused the application to crash. In this case, the file id:000046,sig:11,src:000004,op:havoc,rep:4 caused a StackBufferOverflow in the application. Such files can be found under afl_out/slaveX/crashes/ (here, ~/vuln/afl_out/slaveX/crashes/).
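The crash file name itself carries useful information as comma-separated key:value pairs. The Python sketch below splits a name like the one above into its fields and translates the sig field into a signal name. The field values are kept as strings, and the helper names are mine.

```python
import signal


def parse_crash_name(name: str) -> dict:
    """Split 'id:000046,sig:11,...' into {'id': '000046', 'sig': '11', ...}."""
    fields = {}
    for part in name.split(","):
        key, sep, value = part.partition(":")
        if sep:                      # ignore anything without a colon
            fields[key] = value
    return fields


def crash_signal(name: str) -> str:
    """Return the name of the signal recorded in the crash file name."""
    fields = parse_crash_name(name)
    return signal.Signals(int(fields["sig"])).name
```

For the file above, the sig field is 11, which is SIGSEGV, a segmentation fault. The src and op fields record which queue entry the crashing input was derived from and which mutation stage produced it.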
Done!
That’s it for a quickstart into fuzzing! The process is really simple and very convenient since it is all automated. The next step would be to analyze why the input caused a Buffer Overflow and search for a way to exploit it. Remember that not all vulnerabilities can lead to an exploit.
If you have any questions, please leave them in the comments section below and I’ll get back to you as soon as possible! If you like the article, don’t forget to like and follow! Thanks for reading!