Preprint survey -- Fuzzing: On the Exponential Cost of Vulnerability Discovery

27 May, 2020

Paper: Fuzzing: On the Exponential Cost of Vulnerability Discovery
Authors: Marcel Böhme and Brandon Falk

Summary

This is a very interesting paper that takes a look at the cost of vulnerability discovery as a function of the input generation power (which the authors summarize as the number of machines used to generate inputs) brought to bear against the problem. The authors' analysis is informed by a large set of fuzzing experiments, conducted using what seems to me as a sound methodology. Let's take a program for which a certain number of vulnerabilities are known. The gist of the phenomena observed hinges around at what cost some things may be found.

The number of already-found vulnerabilities (and already covered code) that can be found again in a certain time budget grows linearly with the input generation power.
The time to find a new vulnerability grows exponentially with the input generation power.

My interpretation

The authors' theoretical analysis of this phenomenon looks a lot like the birthday paradox: given the fuzzing methodologies under study (coverage-guided, mutation-driven, uninformed by application constraints), low-hanging fruits are easy to pick. The authors go so far as comparing vulnerability discovery on a program to building a collectible card collection: your first pack obviously gives you nearly only new cards, but after you bought several packs, your collection will largely look the same and contain a bunch of duplicates whatever the pack contents.

This metaphor presented in the paper's conclusion illuminates approaches to breaking the exponential barrier to new vulnerability discovery. Collectible cards are not equal: many are made rarer than others by the publisher, so that they may become coveted collection items, and drive people to buy ever more packets. Thus, when building a collection, it is true that most people get roughly the same core of common cards, but in addition, they get several rarer cards, and maybe one very rare card. When comparing collections, groups of medium-rare cards have a much smaller intersection than the common cores, and the rarest cards are seldom shared. This is seemingly where the metaphor breaks: it seems everybody has to pay the same exponential cost to find the next vulnerability.

I posit, and I am participating to an research project that will test this hypothesis, that the exponential cost barrier can be broken by making fuzzing smarter. The AFL approach is "dumb" in the sense that it generates inputs to maximize a coverage function, without heed to the expected structure or semantics of inputs for the application. Thus, by taking such constraints into account, we could avoid generating and testing many inputs that otherwise fall onto highly common rejection code paths. In addition, the walking over the application's code by an input generates state information. Using mere coverage of features or branches to guide the generation of new inputs drops this state information. Taking into account the order that code branches are visited (in addition to the number of times they are visited) could help generate inputs that drive towards more vulnerabilities.

However, increasing smartness remains a losing arms race for vulnerability research. Should my hypothesis hold and one manages to find new vulnerabilities to software at a cost less than an exponential function of the input generation power, the barrier would merely be displaced. After a while, the set of vulnerabilities that can be found using the smarter fuzzer nears exhaustion and the cost for finding the next vulnerability becomes exponentially high again. So if we go the make-it-smarter route, it ultimately stretches as long as the dumb road; except that we do manage to find a bunch more vulnerabilities for our trouble.