Improving Experimentation on Testbeds


Members and Collaborators

Overview

Current testbed experiments are often ad hoc, manual, complex, and hard to repeat and reuse. This is mostly because we currently lack ways to capture, standardize, and encode experiment behavior. In this research we look for ways to structure and automate testbed experimentation so that experiments become repeatable and reusable.

CLASSNET

The Community Labeling and Sharing of Security and Networking Test datasets (CLASSNET) project supports network and security research by providing new, labeled, rich, and diverse datasets to the research community. First, the project develops a framework for collaborative, community-driven enrichment and labeling of data, enabling use of our datasets for machine learning in networking and security. Second, it makes data available to researchers through multiple methods, ensuring privacy of data while enabling flexible data computation. Finally, it generates diverse continuous (constantly and automatically updated) and curated (selected by humans) datasets for research use.

SEARCCH

SEARCCH is a collaborative, community-driven platform for cybersecurity research artifact cataloguing that facilitates sharing and reuse. Artifacts that can be catalogued include tools, data, experiment methodologies and setups, publications, and the like.

SEARCCH builds and maintains a database of metadata about research artifacts that are housed in different places on the internet. It lowers the barrier to sharing these artifacts through automated submission assistant tools that process and extract metadata from artifacts stored in standard locations such as GitHub.

SEARCCH helps researchers to rapidly find relevant artifacts that will help with their own research by enabling searching over domain-specific keywords and other metadata. In addition to authors, license information, and keywords, SEARCCH also stores information about relationships between related artifacts, making it easier to find multiple artifacts associated with a particular research effort.
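To make the idea of keyword search plus artifact relationships concrete, here is a minimal sketch. It is not SEARCCH's schema or API: the `Artifact` fields, the identifiers, and the sample catalog entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    # Hypothetical metadata record; the real catalog stores more fields
    # (authors, license information, and so on).
    artifact_id: str
    title: str
    keywords: set
    related: set = field(default_factory=set)  # ids of related artifacts


def search(catalog, keyword):
    """Return artifacts whose keywords contain the given term (case-insensitive)."""
    wanted = keyword.lower()
    return [a for a in catalog.values()
            if wanted in {k.lower() for k in a.keywords}]


def with_related(catalog, artifact_id):
    """Return the ids of an artifact and everything reachable via 'related' links."""
    seen, stack = set(), [artifact_id]
    while stack:
        current = stack.pop()
        if current in seen or current not in catalog:
            continue
        seen.add(current)
        stack.extend(catalog[current].related)
    return sorted(seen)


# Made-up catalog entries, for illustration only.
catalog = {
    "tool-1": Artifact("tool-1", "DDoS traffic generator", {"DDoS", "traffic"}, {"data-1"}),
    "data-1": Artifact("data-1", "Attack trace dataset", {"DDoS", "dataset"}, {"paper-1"}),
    "paper-1": Artifact("paper-1", "Evaluation paper", {"DDoS"}),
    "tool-2": Artifact("tool-2", "DNS scanner", {"DNS"}),
}

dns_hits = [a.artifact_id for a in search(catalog, "dns")]
research_effort = with_related(catalog, "tool-1")
```

Following the `related` links from one tool collects the dataset and publication that belong to the same research effort, which is the kind of lookup the relationship metadata is meant to support.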

SEARCCH also fosters a community around these artifacts. It allows researchers to share with the community both the location of their artifacts and their experience with different artifacts.

DEW

Distributed Experiment Workflow, or DEW, is a new experiment representation. DEW enables the researcher to separate the definition of an experiment from its realization. It encodes the desired behavior of an experiment at a high level as a scenario (e.g. “generate attack from A to B, wait 10 seconds, turn on defense at C”), and uses bindings to provide sufficient detail on how each action in the scenario can be realized on the testbed (e.g. use script attack.py with specific parameters for the attack action). DEW further encodes, via constraints, only those features of the testbed topology that matter for the experiment (e.g. “use Ubuntu OS on C”).

When an experiment is to be realized on the testbed, the constraints section of DEW is used to generate a resource request for the testbed. Once the experiment is allocated (physical nodes are reserved and loaded with the operating system), the scenario and bindings are used, along with allocation details, to produce scripts that run on the nodes.

When the researcher runs the experiment on the testbed, they parameterize and run these scripts, possibly interspersed with manual actions, which produces a run history. Together, the DEW representation, the node allocations, and the run history form a complete record of an experiment, which can be shared and reused by others.
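The flow from scenario, bindings, and constraints to a resource request and runnable commands can be sketched as follows. This is an illustrative model only, not DEW's actual syntax or tooling: the `Step` and `Experiment` classes, the `--target` flag, the `defense.sh` script, the host names, and the use of `ssh` to reach nodes are all assumptions made for the example.

```python
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    action: str                   # name looked up in bindings, or "wait"
    actor: Optional[str] = None   # logical node that performs the action
    target: Optional[str] = None  # logical node the action is aimed at
    seconds: int = 0              # only used by "wait"


@dataclass
class Experiment:
    scenario: list     # ordered list of Step
    bindings: dict     # action name -> command template
    constraints: dict  # logical node -> required properties


def resource_request(exp):
    """List every logical node the experiment mentions, with its constraints."""
    nodes = set(exp.constraints)
    for step in exp.scenario:
        nodes.update(n for n in (step.actor, step.target) if n)
    return {n: exp.constraints.get(n, {}) for n in sorted(nodes)}


def generate_commands(exp, allocation):
    """Turn scenario steps into shell commands using bindings and allocation."""
    commands = []
    for step in exp.scenario:
        if step.action == "wait":
            commands.append(f"sleep {step.seconds}")
            continue
        target_host = allocation.get(step.target, "")
        command = exp.bindings[step.action].format(target=target_host)
        commands.append(f"ssh {allocation[step.actor]} {shlex.quote(command)}")
    return commands


# "generate attack from A to B, wait 10 seconds, turn on defense at C"
exp = Experiment(
    scenario=[
        Step("attack", actor="A", target="B"),
        Step("wait", seconds=10),
        Step("defense", actor="C"),
    ],
    bindings={
        "attack": "python3 attack.py --target {target}",  # flag is hypothetical
        "defense": "./defense.sh start",                  # script is hypothetical
    },
    constraints={"C": {"os": "Ubuntu"}},
)

request = resource_request(exp)
allocation = {"A": "pc1", "B": "pc2", "C": "pc3"}  # made-up physical hosts
commands = generate_commands(exp, allocation)
```

Only node C carries a constraint, so the request leaves A and B unconstrained. The same experiment can be realized on a different allocation by changing only the `allocation` mapping.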

LegoTG

LegoTG is a framework for composable traffic generation with custom blueprints. Our framework facilitates easy reconfiguration, sharing and customization of traffic generation in testbed experiments, and easy adoption of new code by users.

We decompose traffic generation into orthogonal tiers: traffic feature selection, feature models, and traffic realization. We then divide realization into separate functionalities, so that each functionality can be achieved by a stand-alone customizable component, which we call a “TGblock”. This modularization facilitates high code reuse, as researchers can combine TGblocks from different traffic generators and modify, replace, or add TGblocks to fit their needs.
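The composition idea can be sketched in a few lines. This is not LegoTG's code or interface: treating a TGblock as a function over a shared state dictionary, and the specific blocks shown (`constant_size_model`, `flow_size_model`, `scheduler`), are assumptions made purely for illustration.

```python
import random
from typing import Callable, Dict, List

# Assumed interface: a block takes the shared state and returns it,
# possibly with new keys added.
TGBlock = Callable[[Dict], Dict]


def constant_size_model(size_bytes: int) -> TGBlock:
    """Feature-model block: every flow has the same size."""
    def block(state: Dict) -> Dict:
        state["sizes"] = [size_bytes] * state["num_flows"]
        return state
    return block


def flow_size_model(mean_bytes: float, seed: int = 0) -> TGBlock:
    """Alternative feature-model block: exponentially distributed flow sizes."""
    rng = random.Random(seed)

    def block(state: Dict) -> Dict:
        state["sizes"] = [int(rng.expovariate(1.0 / mean_bytes)) + 1
                          for _ in range(state["num_flows"])]
        return state
    return block


def scheduler(interval_s: float) -> TGBlock:
    """Realization block: assign evenly spaced start times to the flows."""
    def block(state: Dict) -> Dict:
        state["schedule"] = [(i * interval_s, size)
                             for i, size in enumerate(state["sizes"])]
        return state
    return block


def run_blueprint(blocks: List[TGBlock], state: Dict) -> Dict:
    """Run the blocks in order, passing the state from one to the next."""
    for block in blocks:
        state = block(state)
    return state


# A blueprint is an ordered list of blocks.
result = run_blueprint([constant_size_model(1500), scheduler(0.5)],
                       {"num_flows": 3})

# Swapping one block changes the traffic model without touching the scheduler.
result_random = run_blueprint([flow_size_model(1500.0, seed=1), scheduler(0.5)],
                              {"num_flows": 3})
```

Because each block only reads and writes agreed-upon keys in the state, a researcher can replace the size model or the scheduler independently, which mirrors the reuse that the tiered decomposition is meant to enable.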

Software and Datasets

Publications


This material is based upon work supported by the National Science Foundation under Grants #1127388 and #1835608. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.