Introducing Packet Café

Charlie L
Published in Cyber Reboot
May 1, 2020

IQT Labs is excited to open source a new Cyber Reboot project called Packet Café, an easy-to-use, automated network traffic analysis platform. This post provides the background, motivation, and description of the project.

Already know about the project and want to take Packet Café for a spin? Go here: https://iqtlabs.gitbook.io/packet-cafe/deployment/prerequisites

In case you need more convincing, read on . . .

The strength of any machine learning model depends on the quality and diversity of its training data. Poseidon and NetworkML, two projects Cyber Reboot has been working on for the last couple of years, require network traffic data so that machine learning can predict what types of devices are communicating on a network.

Cybersecurity research using computer network traffic relies on relatively few, often outdated public datasets. Not only does this inhibit cybersecurity-related network traffic research, but it also impedes Cyber Reboot’s own machine learning and network traffic research.

The network traffic dataset most often used for computer security research (KDD99) was created over 20 years ago and has been the subject of sustained criticism. While other research groups have created newer network traffic datasets, these efforts often use simulated rather than real computer networks. The result is stifled network traffic and machine learning research.

The increasing complexity of global computer networks over the years has brought with it many kinds of network vulnerabilities. In both the private and public sectors, new techniques for observing network traffic behavior are vital to safeguarding an organization’s assets. Machine learning can provide a deeper understanding of behavior in network traffic data but requires modern datasets on which to train.

Packet Café initially set out to build a new, public dataset of high-quality, modern computer network traffic to drive cybersecurity research and innovation for our collective benefit. This effort quickly evolved into an exploration of the legal and policy challenges and risks associated with creating and publishing datasets built from real-world network traffic. The most significant of those issues include:

  • Lack of clarity as to whether, when, and where an IP address may be “personal data” subject to various data protection laws.
  • Lack of protections for much more tangible personal identifiers found in MAC addresses.
  • Broadly written laws that may include IP addresses as personal data, even when they do not reveal anything about an individual.
  • Broadly written laws that do not adequately distinguish between data that could conceivably identify a person and network traffic data that could not.
  • Lack of specificity, caselaw, or regulations around these laws to better anticipate how regulators or the public would view our efforts.

Collectively, these issues posed insurmountable hurdles for our research and innovation efforts, even though those efforts were solely for the public good. They also made it nearly impossible to assess the risk of undertaking such work.

Given the challenges and risks surrounding the creation and publication of a modern, high-quality, diverse, and real-world network traffic dataset, the Cyber Reboot team pivoted. Instead, we built (and have now open sourced!) Packet Café as a platform for running network traffic captures through a pipeline of analysis tools. The purpose of these tools is to increase the transparency of what is actually contained in network traffic captures. Furthermore, most of these tools provide analysis without using any payload data.

All of this lets teams with limited resources and time get automated analysis of their own network traffic captures without having to be experts in the tooling.

Packet Café is built for easy-to-use automated network traffic analysis. It is configured to be modular and allow for a pipeline of tools that are triggered by different inputs and outputs.
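One way to picture tools that are "triggered by different inputs and outputs" is a small registry in which each tool declares the kind of data it consumes and the kind it produces. The sketch below is illustrative only: the tool names, the data kinds (`pcap`, `flows`, `count`, `report`), and the `run_pipeline` helper are hypothetical and are not taken from Packet Café's code.

```python
from collections import deque
from typing import Callable, NamedTuple


class Tool(NamedTuple):
    name: str
    consumes: str                     # kind of input that triggers this tool
    produces: str                     # kind of output the tool emits
    run: Callable[[object], object]


def run_pipeline(tools, kind, payload):
    """Run every tool triggered by `kind`, then feed each output onward.

    Assumes the tool graph has no cycles; a cycle would never terminate.
    """
    results = {}
    pending = deque([(kind, payload)])
    while pending:
        current_kind, data = pending.popleft()
        for tool in tools:
            if tool.consumes == current_kind:
                output = tool.run(data)
                results[tool.name] = output
                pending.append((tool.produces, output))
    return results


def extract_flows(packets):
    # Collapse packets into unique (source, destination) pairs.
    return sorted({(p["src"], p["dst"]) for p in packets})


# Hypothetical tools: stand-ins, not Packet Café's actual analytic processes.
TOOLS = [
    Tool("flow_extractor", "pcap", "flows", extract_flows),
    Tool("packet_counter", "pcap", "count", len),
    Tool("flow_summary", "flows", "report", lambda flows: {"unique_flows": len(flows)}),
]

packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
    {"src": "10.0.0.3", "dst": "10.0.0.1"},
]
results = run_pipeline(TOOLS, "pcap", packets)
print(results["flow_summary"])  # {'unique_flows': 2}
```

Here the two tools that consume `pcap` both fire on the upload, and `flow_summary` fires only once `flow_extractor` has produced `flows`. Adding or removing a tool means editing the registry, not the dispatch loop.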

This service accepts PCAP files and runs them through the pipeline of tools, returning the automated analysis in JSON format. These JSON objects can then be consumed directly via the API and fed into other systems such as SIEMs, searched and filtered through the included JSON viewer, or explored through the visualizations provided in the Packet Café frontend.
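As a rough illustration of consuming such results downstream, the sketch below filters a JSON document and reshapes each record into a flat event of the sort that could be forwarded to a SIEM. The JSON layout and field names (`tool`, `results`, `source_ip`, `protocol`, `packets`) are assumptions made for this example; the real output schema differs per tool and should be checked against the project's documentation.

```python
import json

# Hypothetical result payload; real Packet Café output will differ per tool.
raw = """
{
  "tool": "example-tool",
  "results": [
    {"source_ip": "10.0.0.1", "protocol": "DNS", "packets": 12},
    {"source_ip": "10.0.0.2", "protocol": "HTTP", "packets": 40},
    {"source_ip": "10.0.0.1", "protocol": "HTTP", "packets": 7}
  ]
}
"""


def to_events(document, protocol=None):
    """Flatten one tool's results into per-record events, optionally filtered by protocol."""
    events = []
    for record in document.get("results", []):
        if protocol is not None and record.get("protocol") != protocol:
            continue
        # Tag each record with the tool that produced it.
        events.append({"tool": document.get("tool"), **record})
    return events


document = json.loads(raw)
http_events = to_events(document, protocol="HTTP")
for event in http_events:
    # One JSON object per line is a common ingestion format for log pipelines.
    print(json.dumps(event, sort_keys=True))
```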

Packet Café is built as a series of components, each with its own purpose, and components can be added or removed as needed. There are seven major components and any number of analytic processes, or tools, that can be included. By default, there are nine analytic processes included.

Packet Café Component Design

An end user uploads PCAP files and explores the results of the analytic processes through the ui component. The ui component was built with ReactJS and can be scaled out to n instances behind the load balancer.

The lb component is the load balancer, which routes requests to instances of both the ui component and the web component. It allows both of those components to scale out to n instances as needed.

The web component serves up a RESTful API for retrieving results from the analytic processes and passing them on to the ui component.

Optionally, there is an admin component that serves up a RESTful API for making requests about the service at a global level to identify sessions, IDs, and files.

The messenger component is a RabbitMQ server which brokers messages between the web components, the worker components, and the analytic processes.

The worker component is responsible for taking requests from the web component to process files and for spinning up the analytic processes that handle those files. These processes can be run in parallel as well as in a pipeline, where the output of one analytic process becomes the input of the next. The worker component can be scaled out to n workers and maintains the state and status of jobs in the redis component.

Finally, the redis component is a Redis server which stores state and status of jobs across the entire system.
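The worker and redis roles described above can be sketched in a few lines. This is a simplified, single-process illustration, not Packet Café's implementation: a plain dictionary stands in for the Redis server, a thread pool stands in for containerized analytic processes, and the tool functions, job ID, and status strings are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

status_store = {}   # stand-in for the Redis server
status_lock = Lock()


def set_status(job_id, tool, state):
    with status_lock:
        status_store[(job_id, tool)] = state


def run_tool(job_id, name, func, data):
    """Run one analytic step and record its state before and after."""
    set_status(job_id, name, "running")
    try:
        result = func(data)
    except Exception:
        set_status(job_id, name, "failed")
        raise
    set_status(job_id, name, "complete")
    return result


def process_file(job_id, data, parallel_tools, chained_tools):
    # Independent tools run side by side on the same input.
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(run_tool, job_id, name, func, data)
            for name, func in parallel_tools.items()
        }
        results = {name: future.result() for name, future in futures.items()}
    # Chained tools run in order, each consuming the previous tool's output.
    current = data
    for name, func in chained_tools:
        current = run_tool(job_id, name, func, current)
        results[name] = current
    return results


sizes = [60, 1500, 60, 576]  # hypothetical packet sizes in bytes
results = process_file(
    "job-1",
    sizes,
    parallel_tools={"count": len, "total_bytes": sum},
    chained_tools=[("dedupe", lambda xs: sorted(set(xs))), ("largest", max)],
)
print(results)
print(status_store)
```

Keeping status in a store separate from the workers is what lets any number of workers, and the web component, agree on where a job stands.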

All of these components are built to be run in containers and controlled through docker-compose for ease of deployment.

Packet Café aims to lower the barrier to understanding what is actually in a network traffic capture file (PCAP) and to provide insight without requiring the user to be a networking expert.

Continuing efforts include collaborating with our Viz team to visualize packet data and analysis more effectively.

Head on over to our GitHub page and give it a spin if you’re interested. It can run locally on a single machine or be deployed into cloud environments. Feedback and contributions are welcome!
