ISCAP Proceedings - 2025

Louisville, KY - November 2025



ISCAP Proceedings: Abstract Presentation


BinWORDS: Binary Word Origination for Rapid Dictionary Synthesis


Andrew Kramer
Dakota State University

Abstract
Fuzzing has emerged as a highly effective technique for identifying vulnerabilities in compiled binaries; however, its effectiveness depends strongly on the fuzzer’s ability to discover and explore new program states. Incorporating a tailored dictionary of relevant tokens enables the mutation engine to generate inputs that are more likely to trigger new states, accelerating path exploration and increasing the likelihood of exposing subtle flaws. Unfortunately, building such dictionaries is often a non-trivial, labor-intensive process, particularly for black-box binaries where no source code or documentation is available and manual reverse engineering is required. This limits the speed and scale at which dictionaries can be constructed for fuzzing real-world software targets. To address this challenge, we present BinWORDS (Binary Word Origination for Rapid Dictionary Synthesis), a novel framework that aids the construction of fuzzing dictionaries through automated binary analysis. BinWORDS employs a variety of static and dynamic analysis techniques to identify candidate tokens directly from compiled executables, without source code or documentation. Core methods include extracting string literals, identifying integer constants in comparison operations, tracking stack variable initialization patterns, and scanning data segments at runtime for dynamically generated values. Additionally, the framework provides a plugin interface to support future extensions. Using these techniques, BinWORDS constructs a baseline dictionary of relevant tokens with minimal manual effort. By augmenting manual dictionary generation with automated tooling, BinWORDS lowers the barrier to launching effective fuzzing campaigns against closed-source targets.
Although this research is still ongoing, we anticipate that BinWORDS will provide faster path discovery, improved code coverage, and increased rates of vulnerability detection compared to fuzzing without a dictionary, at a fraction of the time that manual dictionary construction traditionally requires.
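
To make two of the core methods named in the abstract more concrete, the following is a minimal sketch, not the BinWORDS implementation. It assumes that string literals can be approximated by runs of printable ASCII bytes, that integer constants have already been recovered from comparison instructions by some other analysis, and that the output uses the name="value" dictionary syntax accepted by AFL-style fuzzers. All function names (extract_strings, int_tokens, escape, to_dictionary) and the token_N naming scheme are hypothetical and chosen for illustration only.

```python
import re
import struct


def extract_strings(blob: bytes, min_len: int = 4) -> list[bytes]:
    """Return unique printable-ASCII runs of at least min_len bytes,
    in order of first appearance (a rough stand-in for string literals)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(pattern.findall(blob)))


def int_tokens(value: int) -> list[bytes]:
    """Encode an unsigned 32-bit constant in both byte orders.
    Assumption: the constant was found by a separate comparison analysis."""
    return [struct.pack("<I", value), struct.pack(">I", value)]


def escape(token: bytes) -> str:
    """Hex-escape quotes, backslashes, and non-printable bytes so the
    token can sit inside a double-quoted dictionary entry."""
    out = []
    for b in token:
        if b in (0x22, 0x5C) or b < 0x20 or b > 0x7E:
            out.append("\\x%02x" % b)
        else:
            out.append(chr(b))
    return "".join(out)


def to_dictionary(tokens: list[bytes]) -> str:
    """Render tokens as one name="value" entry per line."""
    return "\n".join(
        f'token_{i}="{escape(t)}"' for i, t in enumerate(tokens)
    )


if __name__ == "__main__":
    # Hypothetical stand-in for the bytes of a compiled executable.
    sample = b"\x00\x01GET /index\x00ab\x00MAGIC\xff\x00GET /index\x00"
    tokens = extract_strings(sample) + int_tokens(0xDEADBEEF)
    print(to_dictionary(tokens))
```

Hex-escaping every byte outside the safe printable range keeps the output unambiguous regardless of which escape sequences a particular fuzzer's dictionary parser supports. A real tool would read section contents through a binary-format parser rather than scanning raw bytes, and would filter out low-value strings before writing the dictionary.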