Hackers break parsers, exploiting flaws in these ubiquitous software tools to compromise systems. Galois created DaeDaLus, a toolchain that uses formal methods to produce precise parsers securing vulnerable systems.
Parsers are used in nearly every system to convert external data (packets, files, or bits) to internal data the software can use. These critical tools are everywhere – from loading web browsers (rendering HTML and CSS) to AI/ML natural language processing (tokenizing language) to IoT systems (handling data exchanges) and autonomous systems receiving command instructions. Most data formats lack clear and explicit specifications, and few parsers are written in safe-by-default languages. As a result they are prime targets for memory-safety attacks. In addition, their ambiguity allows parsers to process malicious data files as if they were safe, leading to security breaches, system failures, and data corruption.
Galois’ DaeDaLus toolchain eliminates parser vulnerabilities by using formal methods to automatically define precise format specifications and generate safe-by-construction, high-performance parsers that prevent most significant types of parser errors.
DaeDaLus is a toolchain and a memory-safe language designed to precisely describe even the most detailed features of complex data formats. Engineers use the Format Analysis Workbench (FAW) to understand existing formats, then write detailed format descriptions – eliminating ambiguity and preventing security risks, such as buffer overflow attacks.
The compiler uses those DaeDaLus descriptions to automatically generate correct, executable parsers written in a conventional language (e.g., C++), that can be deployed into existing bodies of code exactly where needed. Developers can modify and refine the code if needed, maintaining correctness with the built-in type system.
By eliminating format ambiguity and generating safe, human-auditable parsers, DaeDaLus dramatically improves the security of otherwise unsafe systems. This highly automated toolchain safeguards software ecosystems from a wide range of malware attacks, ensuring data integrity and system reliability.
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-19-C-0073. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA).