Supporting Live and Versatile Malware Analysis in Testbeds
Members
Overview
We advocate for publicly accessible live
malware experimentation testbeds. We introduce new
advancements for high-fidelity transparent emulation
and fine-grain automatic containment that make such experimentation safe and useful to researchers, and we are working on a complete, extensible live-malware experimentation framework. Our framework, aided by our new technologies, facilitates a qualitative leap from current experimentation practices. It enables specific, detailed and
quantitative understanding of risk, and safe, fully automated experimentation by novice users, with maximum
utility to the researcher.
The figure above shows our proposed architecture for live malware experimentation on a public testbed. While the layout is similar to other frameworks, the
functionalities of the highlighted components (shown in
blue in the Figure) are novel and provide a qualitative
advancement.
In our framework, all malware communication with the
Internet is examined and contained by a dedicated Warden machine. The Warden sits between the Inmate Network where malware executes on testbed machines
and the outside world. Using a fine-grained firewall and
policy engine, the Warden chooses one of
the following actions to take with malware traffic: (1)
drop, (2) rewrite, (3) rate-limit, (4) forward to the Internet, or (5) redirect to Smart Impersonator services,
which mimic public Internet servers.
In addition to the firewall and policy engine, the Warden supports monitoring and data persistence. It continuously collects and stores network traces to capture all communication exchanged between the framework and the Internet. The Warden also keeps a history for each experiment, recording information such as the testbed user, the malware studied, and the experimental environment.
The Inmate Network consists of a mix of machines,
some of which run VM software (e.g., QEMU,
VMware, OpenVZ and Xen), while others
are bare metal machines. Since malware limits its behavior
if it detects virtualization, machines that run VM
images would also run our Hi-Fidelity Emulator (HFE)
to defeat virtualization checks.
Hi-Fidelity Emulation
The challenges for hi-fidelity emulation are:
- Create a comprehensive list of differences between a VM and a bare metal machine.
- Create "lies" to hide these differences.
For more information, take a look at:
Cardinal Pill Testing of System Virtual Machines
Managing Malware's External Communication
Malware's external communication must be tightly managed to balance the utility of experimentation to the researcher with the risk external communication poses to
the Internet. To support a range of testbed users, with differing experimental needs, traffic from each experiment
needs to be subject to its own set of policies affecting
external communication. But knowledge about malware
behavior learned in one experiment is shared across
experiments. This allows policies to evolve from
more restrictive to more permissive as the testbed learns what to
expect from each malware's communication.
We observe that once we allow traffic out of our framework it is impossible to guarantee that there will be no
risk to the Internet. Even the most benign looking traffic,
such as a single HTTP GET message, can be malicious
if generated by a multitude of machines, simultaneously,
to overwhelm a victim destination. We manage this risk
by using the following four-step containment approach
to handle each malware communication attempt:
- Contain it and evaluate whether it is a necessary communication for the malware.
- If necessary, redirect it to a Smart Impersonator. Try a Random Impersonator first. If that does not expose sufficient malware behavior, switch to a Custom Impersonator if one is available.
- If a Custom Impersonator is not available, run the Symbolic Execution Engine to build one.
- If a Custom Impersonator cannot be built (i.e., the malware communication is unforgeable), let the communication out to the Internet and observe.
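The four steps above can be read as a decision cascade. The sketch below is one possible encoding, assuming each condition has already been reduced to a boolean; the function name `choose_step` and its parameters are hypothetical and only illustrate the ordering of the steps.

```python
from enum import Enum


class Step(Enum):
    KEEP_CONTAINED = "keep contained"
    RANDOM_IMPERSONATOR = "redirect to Random Impersonator"
    CUSTOM_IMPERSONATOR = "redirect to Custom Impersonator"
    BUILD_CUSTOM_IMPERSONATOR = "build Custom Impersonator via symbolic execution"
    RELEASE_AND_OBSERVE = "let out to the Internet and observe"


def choose_step(necessary, random_sufficient, custom_available, forgeable):
    """Pick the handling of one communication attempt.

    All four inputs are assumed booleans:
      necessary         - the malware needs this communication to proceed
      random_sufficient - a Random Impersonator exposes enough behavior
      custom_available  - a Custom Impersonator already exists
      forgeable         - symbolic execution can build a Custom Impersonator
    """
    if not necessary:
        return Step.KEEP_CONTAINED
    if random_sufficient:
        return Step.RANDOM_IMPERSONATOR
    if custom_available:
        return Step.CUSTOM_IMPERSONATOR
    if forgeable:
        return Step.BUILD_CUSTOM_IMPERSONATOR
    # Unforgeable communication: the last resort is real Internet access.
    return Step.RELEASE_AND_OBSERVE
```

Ordering the checks this way means traffic reaches the Internet only after every more restrictive option has been ruled out.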
To evaluate whether a communication is necessary for the malware, we collect several measures of malware activity:
- Number of system calls.
- Number of unique system calls.
- Entropy of system calls.
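These three measures can be computed from a recorded system call trace. The sketch below assumes the trace is a list of system call names and that "entropy" means the Shannon entropy (in bits) of the system call frequency distribution; the page does not specify the exact definition, so treat that choice as an assumption.

```python
import math
from collections import Counter


def syscall_measures(trace):
    """Compute activity measures for a list of system call names.

    Returns the total number of calls, the number of distinct calls, and
    the Shannon entropy (bits) of the call frequency distribution.
    """
    counts = Counter(trace)
    total = len(trace)
    if total == 0:
        return {"total": 0, "unique": 0, "entropy": 0.0}
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return {"total": total, "unique": len(counts), "entropy": entropy}


measures = syscall_measures(["open", "read", "read", "close"])
# total = 4, unique = 3, entropy = 1.5 bits
```

One plausible use is to compare these measures with and without a given communication allowed: a large drop in activity when the communication is blocked would suggest the malware depends on it.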
For more information, take a look at:
Malware Communication Analysis
Publications