- The code behind many of the most-cited, seminal publications on security-themed symbolic execution remains non-public; this is particularly true for Mayhem and SAGE. Implementation secrecy is fairly atypical in the security community, is usually viewed with distrust, and makes it difficult to independently evaluate, replicate, or build on top of the published results.
- The research often fails to fully acknowledge the limitations of the underlying methods - even though the experiments themselves frequently seem designed to work around these flaws. For example, the famed Mayhem experiment helped identify thousands of bugs, but most of them seemed to be remarkably trivial and affected only very obscure, seldom-used software packages with no significance to security. It is likely that the framework struggled with more practical issues in higher-value targets - a prospect that, especially if not addressed head-on, can lead to cynical responses and discourage further research.
- Published comparisons to more established vulnerability-hunting techniques are almost always retrospective; for example, after the discovery of Heartbleed, several teams claimed that their tools would have found the bug. But analyses that look for ways to reach an already-known fault condition are very susceptible to cognitive bias. Perhaps more importantly, it is always tempting to ask why the tools are not tasked with producing a steady stream of similarly high-impact, headline-grabbing bugs.
February 04, 2015
There is no serious disagreement that symbolic execution has a remarkable potential for programmatically detecting broad classes of security vulnerabilities in modern software. Fuzzing, in comparison, is an extremely crude tool: it's the banging-two-rocks-together way of doing business, as contrasted with brain surgery. Because of this, it comes as no surprise that for the past decade or so, the topic of symbolic execution and related techniques has been the mainstay of almost every self-respecting security conference around the globe. The tone of such presentations is often lofty: the slides and research papers are frequently accompanied by claims of extraordinary results and proclamations of the imminent demise of less sophisticated tools. Yet, despite the crippling and obvious limitations of fuzzing and the virtues of symbolic execution, there is one jarring discord: I'm fairly certain that around 70% of all remote code execution vulnerabilities disclosed in the past few years trace back to fairly "dumb" fuzzing tools, with the pattern showing little change over time. The remaining 30% is attributable almost exclusively to manual work - be it systematic code reviews, or just aimlessly poking the application in hopes of seeing it come apart. When you dig through public bug trackers, vendor advisories, and CVE assignments, the mark left by symbolic execution can be seen only with a magnifying glass. This is an odd discrepancy, and one that is sometimes blamed on the practitioners being backward, stubborn, and ignorant. This may be true, but only to a very limited extent; ultimately, most geeks are quick to embrace the tools that serve them well. I think that the disconnect has its roots elsewhere: