Binary Analysis

Members and Collaborators


There is currently a large gap between the security expertise available and the effort required to consistently evaluate the risks of software deployed in real-world applications, reduce its attack surface, lower the chances of a security breach, and limit the damage of a successful attack. These problems include vulnerability discovery, i.e., the process of finding known or unknown security bugs (vulnerabilities) in commodity software, and intrusion detection, i.e., the process of detecting attacks that attempt to exploit vulnerabilities. A partial long-term solution to these problems is to train more security experts. However, because modern software keeps growing in size, complexity, application scope, and unpredictability (to name only some of the parameters at stake), automated techniques for program analysis and vulnerability discovery are the foundation of any future scalable approach.

Our current focus is on the following subprojects.

Detecting and patching vulnerabilities in binary executable programs

We are investigating novel approaches for detecting and patching vulnerabilities directly in binary executable programs, at scale. Our initial focus has been the sanitization of input parsing routines against buffer overflows. We are currently extending this work to additional classes of vulnerabilities.

Reconstructing program semantics from binary

There is a large semantic gap between the source code of a program and its binary representation. The lack of semantic information in binaries makes reverse engineering a daunting task and limits the scope and accuracy of analysis techniques. We aim to bridge this gap by automatically reconstructing information about program semantics, reintroducing parts of what was lost during compilation, such as variable types and the identity of known library functions.

Mitigating algorithmic-complexity-based DoS attacks

In this research we look to identify vulnerabilities in common applications where a well-crafted input can slow the application down far more than a typical input of the same size. For example, this may occur if the application uses hash tables and the inputs all hash into the same bucket, so that every insertion or lookup has to scan one long chain. This effort is part of the larger Leader project, described here.
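The hash-table example can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the project's tooling: `ChainedHashTable`, its deliberately weak byte-sum hash, and the helper functions are all hypothetical names introduced here. It counts key comparisons while inserting two key sets of equal size, one ordinary and one crafted so that every key lands in the same bucket.

```python
import itertools


class ChainedHashTable:
    """Toy separate-chaining hash table that counts key comparisons.

    Hypothetical illustration only; the hash function is deliberately weak.
    """

    def __init__(self, num_buckets=64):
        self.buckets = [[] for _ in range(num_buckets)]
        self.comparisons = 0

    def _weak_hash(self, key):
        # Sum of byte values: any two permutations of the same characters collide.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._weak_hash(key)]
        for i, (existing_key, _) in enumerate(bucket):
            self.comparisons += 1
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))


def benign_keys(n):
    """Ordinary-looking keys whose byte sums differ from one another."""
    return [f"key{i:04d}" for i in range(n)]


def colliding_keys(n):
    """Distinct keys with identical byte sums (permutations of one string)."""
    perms = itertools.permutations("abcdefg")  # 5040 distinct permutations
    return ["".join(p) for p in itertools.islice(perms, n)]


def count_insert_comparisons(keys):
    """Insert all keys into a fresh table and return the comparison count."""
    table = ChainedHashTable()
    for key in keys:
        table.insert(key, None)
    return table.comparisons


if __name__ == "__main__":
    n = 500
    print("benign   :", count_insert_comparisons(benign_keys(n)))
    print("colliding:", count_insert_comparisons(colliding_keys(n)))
```

With the crafted keys, the i-th insertion scans all i keys already in the single chain, so n insertions cost n(n-1)/2 comparisons in total, while the benign keys are spread over several shorter chains. Production hash tables use stronger, often randomized, hash functions, but the same quadratic blow-up appears whenever an attacker can predict collisions.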

Software and Datasets


This material is based upon work supported by the National Science Foundation under grants #1319215 and #1815495, and by the Science and Technology Directorate of the United States Department of Homeland Security under contract number D15PC00184. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Department of Homeland Security.