- Walking bit flips: the first and most rudimentary strategy employed by afl involves performing sequential, ordered bit flips. The stepover is always one bit; the number of bits flipped in a row varies from one to four. Across a large and diverse corpus of input files, the observed yields are:
- Flipping a single bit: ~70 new paths per one million generated inputs,
- Flipping two bits in a row: ~20 additional paths per million generated inputs,
- Flipping four bits in a row: ~10 additional paths per million inputs.
- Walking byte flips: a natural extension of walking bit flip approach, this method relies on 8-, 16-, or 32-bit wide bitflips with a constant stepover of one byte. This strategy discovers around ~30 additional paths per million inputs, on top of what could have been triggered with shorter bit flips. It should be fairly obvious that each pass takes approximately one execve() call per one byte of the input file, making it surprisingly cheap, but also limiting its potential yields in absolute terms.
- Simple arithmetics: to trigger more complex conditions in a deterministic fashion, the third stage employed by afl attempts to subtly increment or decrement existing integer values in the input file; this is done with a stepover of one byte. The experimentally chosen range for the operation is -35 to +35; past these bounds, fuzzing yields drop dramatically. In particular, the popular option of sequentially trying every single value for each byte (equivalent to arithmetics in the range of -128 to +127) helps very little and is skipped by afl. When it comes to the implementation, the stage consists of three separate operations. First, the fuzzer attempts to perform subtraction and addition on individual bytes. With this out of the way, the second pass involves looking at 16-bit values, using both endians - but incrementing or decrementing them only if the operation would have also affected the most significant byte (otherwise, the operation would simply duplicate the results of the 8-bit pass). The final stage follows the same logic, but for 32-bit integers. The yields for this method vary depending on the format - ranging from ~2 additional paths per million in JPEG to ~8 per million in xz. The cost is relatively high, averaging around 20 execve() calls per one byte of the input file - but can be significantly improved with only a modest impact on path coverage by sticking to +/- 16.
- Known integers: the last deterministic approach employed by afl relies on a hardcoded set of integers chosen for their demonstrably elevated likelihood of triggering edge conditions in typical code (e.g., -1, 256, 1024, MAX_INT-1, MAX_INT). The fuzzer uses a stepover of one byte to sequentially overwrite existing data in the input file with one of the approximately two dozen "interesting" values, using both endians (the writes are 8-, 16-, and 32-bit wide). The yields for this stage are between 2 and 5 additional paths per one million tries; the average cost is roughly 30 execve() calls per one byte of input file.
- Stacked tweaks: with deterministic strategies exhausted for a particular input file, the fuzzer continues with a never-ending loop of randomized operations that consist of a stacked sequence of:
- Single-bit flips,
- Attempts to set "interesting" bytes, words, or dwords (both endians),
- Addition or subtraction of small integers to bytes, words, or dwords (both endians),
- Completely random single-byte sets,
- Block deletion,
- Block duplication via overwrite or insertion,
- Block memset.
- Test case splicing: this is a last-resort strategy that involves taking two distinct input files from the queue that differ in at least two locations; and splicing them at a random location in the middle before sending this transient input file through a short run of the "stacked tweaks" algorithm. This strategy usually discovers around 20% additional execution paths that are unlikely to trigger using the previous operation alone. (Of course, this method requires a good, varied corpus of input files to begin with; afl generates one automatically, but for other tools, you may have to construct it manually.)
August 08, 2014
Successful fuzzers live and die by their fuzzing strategies. If the changes made to the input file are too conservative, the fuzzer will achieve very limited coverage. If the tweaks are too aggressive, they will cause most inputs to fail parsing at a very early stage, wasting CPU cycles and spewing out messy test cases that are difficult to investigate and troubleshoot. Designing the mutation engine for a new fuzzer has more to do with art than science. But one of the interesting side effects of the design of american fuzzy lop is that it provides a rare feedback loop: you can carefully measure what types of changes to the input file actually result in the discovery of new branches in the code, and which ones just waste your time or money. This data is particularly easy to read because the fuzzer also approaches every new input file by going through a series of progressively more complex, but exhaustive and deterministic fuzzing strategies - say, sequential bit flips and simple arithmetics - before diving into purely random behaviors. The reason for this is the desire to generate the simplest and most elegant test cases first; but the design also provides a very good way to quantify how much value each new strategy brings in to the table - and whether we need it at all. The measurements of afl fuzzing efficiency for reasonably-sized test cases are remarkably consistent across a variety of real-world binary formats - anything ranging from image files (JPEG, PNG, GIF, WebP) to archives (gzip, xz, tar) - and because of this, I figured that sharing the data more broadly will be useful to folks who are working on fuzzers of their own. So, let's dive in: