Critter@home: Content-Rich Traffic Trace Repository from Real-Time, Anonymous, User Contributions

Members

Overview

Critter@home is a project to connect researchers to content-rich data from anonymous Internet end-users. Our goal is to facilitate the safe sharing of data and provide a platform for end-users to contribute data and be part of the research process.

Problem Statement

Networking and cybersecurity research critically need publicly available, fresh and diverse application-level data, for data mining and for validation.

Sharing content-rich network data carries enormous privacy risks, because it is rich with personal and private information (PPI) that Internet criminals can monetize, e.g., human names, social security numbers, phone numbers, usernames, passwords, and credit card numbers.

Key Insights

Critter@home is a continuously updated archive of content-rich network data, contributed by volunteer users. Data contributors join the Critter overlay whenever online, offering their data to interested researchers. Privacy of data contributors is protected in multiple ways:

Our work relies in part on the secure query framework called Patrol, developed by PI Mirkovic under another NSF-funded project. This framework allows only queries about aggregate features of the data, such as counts and distributions, and preserves user privacy by applying k-anonymity and l-diversity principles.
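To make the k-anonymity and l-diversity idea concrete, here is a minimal sketch of how an aggregate count could be gated by both thresholds. It is not Patrol's implementation: the function name `release_counts`, the `(group_key, sensitive_value)` record layout, and the example thresholds are all assumptions made for illustration.

```python
from collections import defaultdict


def release_counts(records, k, l):
    """Return per-group counts, suppressing groups that are too small or too uniform.

    records: iterable of (group_key, sensitive_value) pairs (hypothetical layout).
    k: minimum number of records a group needs (k-anonymity style threshold).
    l: minimum number of distinct sensitive values a group needs (l-diversity style threshold).
    """
    groups = defaultdict(list)
    for group_key, sensitive_value in records:
        groups[group_key].append(sensitive_value)

    released = {}
    for group_key, values in groups.items():
        # Release a count only if the group is large enough and diverse enough.
        if len(values) >= k and len(set(values)) >= l:
            released[group_key] = len(values)
    return released


# Hypothetical records: (application protocol, some sensitive attribute).
example_records = [
    ("http", "a"), ("http", "b"), ("http", "a"),
    ("dns", "x"), ("dns", "x"), ("dns", "x"),
    ("ssh", "p"),
]
example_result = release_counts(example_records, k=3, l=2)
```

In this example the "dns" group has enough records but only one distinct sensitive value, and the "ssh" group has a single record, so only the "http" count would be released.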

Critter@home Architecture

The Query Process

A query submitted to Critter goes through the following five-step process.

  1. A researcher submits a query via the public portal.
  2. Critter clients connect and poll for new queries via an anonymizing network.
  3. The researcher's stored query is sent to clients.
  4. Patrol processes the query if the Query Policy permits, and returns encrypted results along with information on how a contributor wants its response aggregated.
  5. Aggregated results are stored and can be retrieved.
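The five steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the classes `Query`, `Portal`, and `Client`, their methods, and the feature name are hypothetical, the Query Policy is reduced to a set of allowed features, and the anonymizing network, encryption of results, and per-contributor aggregation preferences are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    query_id: int
    feature: str  # hypothetical name of an aggregate feature


@dataclass
class Portal:
    pending: list = field(default_factory=list)
    responses: dict = field(default_factory=dict)

    def submit(self, query):
        # Step 1: a researcher submits a query, which is stored.
        self.pending.append(query)
        self.responses[query.query_id] = []

    def poll(self):
        # Steps 2-3: a connected client polls and receives the stored queries.
        return list(self.pending)

    def respond(self, query_id, value):
        # Step 4 (receiving side): collect a client's result (encryption omitted).
        self.responses[query_id].append(value)

    def aggregate(self, query_id):
        # Step 5: aggregate stored results so the researcher can retrieve them.
        return sum(self.responses[query_id])


@dataclass
class Client:
    local_data: dict       # feature name -> locally computed aggregate value
    allowed_features: set  # stand-in for the contributor's Query Policy

    def answer(self, portal):
        for query in portal.poll():
            permitted = query.feature in self.allowed_features
            # Step 4 (client side): answer only if the policy permits the query.
            if permitted and query.feature in self.local_data:
                portal.respond(query.query_id, self.local_data[query.feature])


portal = Portal()
portal.submit(Query(query_id=1, feature="http_request_count"))
clients = [
    Client({"http_request_count": 5}, {"http_request_count"}),
    Client({"http_request_count": 7}, set()),  # policy forbids this query
    Client({"http_request_count": 2}, {"http_request_count"}),
]
for client in clients:
    client.answer(portal)
total = portal.aggregate(1)
```

The second client holds data but its policy does not permit the query, so it contributes nothing to the aggregate.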

Software

Publications


This material is based upon work supported by the National Science Foundation under Grant No. 1224035. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.