Differential paper quickie: LangFuzz vs. Grammarinator

03 Jul, 2020

Papers under consideration:

Holler et al., Fuzzing with Code Fragments, presented at Usenix 2012

Hodován et al., Grammarinator: A Grammar-Based Open Source Fuzzer, presented at A-TEST '18, USA, 2018, DOI 10.1145/3278186.3278193. -- On GitHub

The use of generators for synthesizing complex fuzzing inputs needs no more justification: so-called dumb generation is helpless to make samples that satisfy the most elementary syntactic constraint, and simple mutation approaches do not fare much better. However, generators are a far cry from a silver bullet: they are complex pieces of software meticulously tailored to a single system under test that require constant tweaking to exhaustively explore it. Also, their typical outputs are bloated code rambles that trigger bugs unreliably, and demand tedious programmer pruning to zero in on the issue it triggers. They thus stand as high-risk/high-reward gambits.

The LangFuzz meta-generator tool of Holler et al. changes this risk-reward trade-off by lowering the cost of generator development. LangFuzz generates a fuzzer from two input artifacts: a context-free grammar for the samples to generate, and a corpus of reference samples. The former artifact is a common building block of the system under test, and the latter is typically produced as part of the development of the system, in the form of a regression test suite. LangFuzz works by combining generation with mutation. On the one hand, it leverages the grammar to parse the sample corpus, building a pool of code fragments that can derive from every non-terminal. On the other hand, it generates random abstract syntax trees from the context-free grammar, down to a certain depth: non-terminals not yet resolved after this step are replaced with appropriate code fragments from the pool. The main issue with this approach is that, much like naïve regular generators, the samples made through LangFuzz generators can be hamstrung by trivial semantic problems, such as identifiers used before being defined. LangFuzz handles only this one particular semantic issue by tracking defined identifiers throughout the generation process. The reason is that the authors wanted to keep the input set to the meta-generator as simple as possible, and thereby avoid imposing any unduly tedious requirement on the user.

The Grammarinator tool is very similar to LangFuzz. Given how Grammarinator was presented at a seemingly less prestigious conference, some 6 years after LangFuzz, suggests that the authors built the tool unaware of this previous contribution, and learnt of it during the paper review process. In any case, Grammarinator goes one step beyond LangFuzz: in addition to building a generator out of a context-free grammar and a corpus of samples, Grammarinator leverages ad hoc semantic rules, which can be woven through grammar definitions in the ANTLR format.

The LangFuzz authors have run a very good performance analysis of their generators, in comparison with hand-tailored generators. In the case of Javascript generation in particular, the authors found that LangFuzz-based generators found defects that eluded their comparison point, jsfunfuzz. However, the opposite was also true, and the rate of defect discovery was higher with jsfunfuzz. Their conclusion is thus that meta-generators are complementary to tailored generators. Moreover, their leveraging of any test case as input makes meta-generators uniquely apt at finding defects similar to others previously found. This ability effectively eludes tailored fuzzers, which require hand tweaking to select classes of samples to emulate.

I believe this concept, and its open source realization in Grammarinator, is none short of extraordinary. I still mean to explore the use of simple semantic rules to facilitate the ingestion of random grammar derivations, and Grammarinator seemingly gives me the opportunity on a silver plate -- and similarity generation is icing on this cake. I'm going to dive in, hopefully I come up with pearls.