R³ACE - Replicable and Reproducible Real-World Autonomous Cyber Environments

Motivation

Existing cyber environments that incorporate low-fidelity models and high levels of abstraction [1], [2] are valuable for exploring more fundamental research problems in autonomous cyber defence (ACD), such as task and entity generalisation.

However, there remain concerns about the reality gap between cyber environments and real-world cyber systems [3].

Environments attempting to close the reality gap have varied in their approach, robustness and public availability [4], [5].

Moreover, the decision problems presented by the more realistic environments that are currently available are extremely challenging: it is difficult both to make progress and to measure it. In these environments, research faces the complexity of realism combined with scale.

Contributions to Autonomous Cyber Defence (ACD) research:

Explore the challenges of realism, without the challenge of scale - focusing on replicability and reproducibility.
Specifically, train and evaluate ACD agents against continuous-time, concurrently-running cyber systems. This is a departure from cyber environments that expose discrete-time, turn-based finite state machines.
Investigate both physical and virtual systems, with accessible and reproducible infrastructure.

Overview

Objectives

Make publicly accessible a real-world cyber environment that models a minimum viable cyber system - minimum in its scale and complexity, prioritising replicability, reproducibility and public accessibility.
Define a simple decision problem using the R³ACE infrastructure, providing a reference implementation.

A minimum viable cyber system

Our design for a minimum viable cyber system (there are surely others) consists of 3 machines networked together via a switch.

Central to the task of evaluating the defence of a cyber system is a consensus on its utility - the value or service that the system provides, upon which stakeholders depend. Simulating this utility is an important and challenging task in the design of cyber environments.

We propose that one of the machines, the 'web client', finds utility in receiving OK (status code = 200) responses to GET requests it sends to another machine, the 'web server'. The third machine, the 'adversary', has the objective of carrying out a successful cyber attack. An attack is deemed successful if the utility of the system is compromised, i.e. the web client does not receive OK (status code = 200) responses to GET requests sent to the web server.

The system can be instantiated with either physical hardware (e.g. a network of Raspberry Pis) or virtualised hardware.

A Simple Decision Problem

Scenario

The web client sends a stream of GET requests at a constant rate, $R$. The adversary carries out a Denial-of-Service (DoS) attack. A 'blue agent' (the blue program, see the Reference Implementation section below) runs on the web server machine with the objective of ensuring that web client GET requests receive OK (status code = 200) responses.

Decision Problem

The blue agent has access to the following information:

The web client request rate, $R$.
The average rate of OK (status code = 200) responses received by the web client.
The IP addresses of the other two machines in the network, though it is not known which machine is the adversary.
A list of the IP addresses on the firewall blocklist. Requests to the HTTP server (on the web server machine) from these IP addresses are blocked by the system firewall - they do not reach the HTTP server.

The blue agent is able to execute the following actions:

Add or remove IP addresses from the web server blocklist.
Do nothing.
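
The observations and actions listed above can be written down as plain data types. The following is a minimal sketch in Python, not the types used by the reference implementation: every name here (`Observation`, `Block`, `Unblock`, `DoNothing`, `apply_action`) is hypothetical, and the IP addresses in the usage note are placeholders from a documentation range.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Observation:
    """What the blue agent can see (field names are assumptions)."""
    request_rate: float               # R, the constant web client request rate
    ok_rate: float                    # average rate of OK responses seen by the client
    candidate_ips: Tuple[str, str]    # the other two machines; their roles are unknown
    blocklist: FrozenSet[str]         # IP addresses currently blocked by the firewall


@dataclass(frozen=True)
class Block:
    ip: str


@dataclass(frozen=True)
class Unblock:
    ip: str


@dataclass(frozen=True)
class DoNothing:
    pass


Action = Union[Block, Unblock, DoNothing]


def apply_action(blocklist: FrozenSet[str], action: Action) -> FrozenSet[str]:
    """Return the blocklist that would result from taking `action`."""
    if isinstance(action, Block):
        return blocklist | {action.ip}
    if isinstance(action, Unblock):
        return blocklist - {action.ip}
    return blocklist
```

For example, `apply_action(frozenset(), Block("192.0.2.20"))` yields a blocklist containing only `192.0.2.20`, and `DoNothing()` leaves any blocklist unchanged.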

Evaluation

The agent is evaluated against the cumulative number of OK (status code = 200) responses received by the web client, as a fraction of the total number of requests made by the web client.
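
As a worked illustration of this metric, the sketch below computes the fraction of requests that received an OK response. The function name and the choice to return 0.0 when no requests have been made are assumptions made for this example, not part of the reference implementation.

```python
def evaluation_score(ok_responses: int, total_requests: int) -> float:
    """Cumulative OK responses as a fraction of all requests made by the web client."""
    if ok_responses < 0 or total_requests < 0:
        raise ValueError("counts must be non-negative")
    if ok_responses > total_requests:
        raise ValueError("cannot receive more OK responses than requests sent")
    if total_requests == 0:
        return 0.0  # assumption: no requests yet means no utility delivered
    return ok_responses / total_requests


# A client sending at R = 20 requests/s for 60 s makes 1200 requests.
# If 900 of them receive an OK response, the score is 0.75.
score = evaluation_score(ok_responses=900, total_requests=1200)
```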

TL;DR

The blue agent knows the current rate of OK responses received by the client and must figure out which of the two IP addresses belongs to the web client, blocking the other IP address which belongs to the adversary.
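
One way to picture the exploration this implies is an explore-then-commit rule: block each candidate in turn, observe the resulting OK rate, and keep the block that preserved the most utility. This is a sketch under stated assumptions and not necessarily the basic policy shipped with the reference implementation. `measure_ok_rate` is a hypothetical stand-in for "block this IP, wait a while, read the OK rate", and `simulated_ok_rate` is a toy substitute for the real system with placeholder IP addresses.

```python
from typing import Callable, Sequence


def explore_then_commit(
    candidate_ips: Sequence[str],
    measure_ok_rate: Callable[[str], float],
) -> str:
    """Return the IP address that should stay on the blocklist.

    Each candidate is blocked once (exploration); the candidate whose
    blocking left the highest OK rate is assumed to be the adversary.
    """
    observed = {ip: measure_ok_rate(ip) for ip in candidate_ips}
    return max(observed, key=observed.get)


def simulated_ok_rate(blocked_ip: str) -> float:
    """Toy environment: blocking the client yields no OK responses."""
    client_ip = "192.0.2.10"   # placeholder address
    request_rate = 10.0        # R, requests per second
    return 0.0 if blocked_ip == client_ip else request_rate


adversary = explore_then_commit(["192.0.2.10", "192.0.2.20"], simulated_ok_rate)
```

In the real system the measurement is noisy and takes wall-clock time, so a practical policy would likely average over a trial period before committing.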

Reference Implementation

The reference implementation implements the decision problem described above, on R³ACE infrastructure. The following software components have been developed:

serving (HTTP server): An executable running on the web server machine, responding to GET requests with OK (status code = 200) responses.
getting (HTTP client): An executable running on the web client machine, sending GET requests to the HTTP server at a constant rate and emitting a signal whenever an OK (status code = 200) response is received.
blue (blue agent): An executable running on the web server machine, managing the HTTP server blocklist, and parsing the signal emitted by the web client to evaluate the current average rate of OK responses.
policy-server (Python policy server): A Python HTTP server that can be used to run Python policies against the R³ACE environment. For ease, this repo implements the Gymnasium interface (the successor to OpenAI Gym) for training RL agents.
got (analysis): A repository with a library of helper functions for parsing and plotting the log files written by the above executables. A catalogue of experiments (each with a .ipynb notebook, plots and some basic analysis) serves as a reference point for designing, running and analysing experiments.
eyes (parsing): A dependency of got, a library of types and parsing functions. The got README describes how to install eyes as a dependency.
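
To illustrate how a "current average rate of OK responses" could be derived from a stream of OK signals, here is a minimal sliding-window estimator. It is a sketch only: the class name, the window length and the use of explicit timestamps are assumptions, and the blue program may compute its average differently.

```python
from collections import deque


class SlidingWindowRate:
    """Estimate events per second over the most recent `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> None:
        """Record one OK signal received at `timestamp` (seconds)."""
        self._timestamps.append(timestamp)

    def rate(self, now: float) -> float:
        """Drop signals at or before the window start, then return the rate."""
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds


estimator = SlidingWindowRate(window_seconds=2.0)
for t in [0.0, 0.5, 1.0, 1.5, 2.0]:
    estimator.record(t)
current_rate = estimator.rate(now=2.0)  # 4 signals in the last 2 s
```

A short window reacts quickly to the start of an attack but is noisier; a long window is smoother but slower to reflect the effect of a blocklist change.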

It is also necessary to extend the R³ACE infrastructure for this decision problem, to provide an out-of-band communication channel between the web client machine and the web server machine. This enables the blue program to receive the OK response signals emitted by the web client, reliably, even during a successful attack launched by the adversary.

In this reference implementation the out-of-band channel of communication is provided by:

For the physical infrastructure: a wired serial connection (using a Null Modem cable).
For the virtual infrastructure: a UDP connection on a separate private LAN, isolated from the main R³ACE network.

As such, the getting software and the blue software support both serial and UDP protocols for emitting and receiving information on the out-of-band channel.
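
The UDP variant of the out-of-band channel can be sketched with Python's standard `socket` module. This is illustrative only: the payload `b"OK"`, the function names and the loopback address are assumptions, and the real getting and blue programs define their own message format.

```python
import socket

OK_SIGNAL = b"OK"  # hypothetical payload; the real wire format may differ


def open_receiver(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket for the blue side (port 0 lets the OS pick a port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(1.0)
    return sock


def emit_ok(sock: socket.socket, address) -> None:
    """Client side: send one datagram for each OK response received."""
    sock.sendto(OK_SIGNAL, address)


def receive_ok(sock: socket.socket) -> bool:
    """Blue side: return True if an OK signal arrives before the timeout."""
    try:
        payload, _sender = sock.recvfrom(16)
    except socket.timeout:
        return False
    return payload == OK_SIGNAL


receiver = open_receiver()
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
emit_ok(sender, receiver.getsockname())
got_signal = receive_ok(receiver)
sender.close()
receiver.close()
```

UDP is connectionless, so the client never blocks waiting for the server, which suits a signal that must keep flowing while the main network is under attack.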

Below is an example plot, generated by an R³ACE experiment. A basic policy solves the decision problem described above. After a period of exploration the policy correctly keeps the adversary in the firewall blocklist, allowing requests from the web client through to the HTTP server. The got library was used for parsing and plotting the information in the log files gathered from the environment.

Quick Questions

How is this different to other Cyber Environments?

In contrast with 'AI Gym' environments, such as those available in the OpenAI/gym project [6], R³ACE does not expose a turn-based API (e.g. the environment, a finite state machine, 'waits' while the policy computes an action, after which the environment is 'stepped' forwards to the agent's next turn).

Important

R³ACE is a real computer network (cyber infrastructure) with a cyber defence software program, blue, running on one of the machines.

The `blue` program

This software:

Fetches information from the cyber system (the surrounding compute network).
Uses a policy to decide what (if any) action to take.
Executes this action, causing a side effect in the cyber system (e.g. an IP address is added to a block list).
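
The three steps above form a loop that runs in wall-clock time: the cyber system does not pause while the policy is deciding. A minimal Python sketch of such a loop follows. The function and parameter names are hypothetical and this is not the blue program's actual interface; `clock` and `sleep` are injectable only so the loop can be exercised without real waiting.

```python
import time
from typing import Any, Callable, Optional


def run_blue_loop(
    observe: Callable[[], Any],
    decide: Callable[[Any], Optional[Any]],
    act: Callable[[Any], None],
    period_seconds: float,
    duration_seconds: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat observe -> decide -> act until `duration_seconds` has elapsed.

    Returns the number of iterations completed. The surrounding system keeps
    running throughout; nothing here 'steps' the environment.
    """
    start = clock()
    iterations = 0
    while clock() - start < duration_seconds:
        observation = observe()          # fetch information from the cyber system
        action = decide(observation)     # the policy may return None for 'do nothing'
        if action is not None:
            act(action)                  # side effect, e.g. a blocklist update
        iterations += 1
        sleep(period_seconds)
    return iterations
```

Because time keeps passing while `decide` runs, a slow policy sees fewer observations per second, which is one of the ways this setting differs from a turn-based environment.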

The `markov` library

The above design for a program should in fact be useful for the application of ACD to many realistic, or indeed real-world, cyber systems. As such, we have designed and documented an abstract software interface, the markov library, to generalise over different cyber systems and policies.

How do you train or evaluate policies?

In R³ACE, a policy is a software implementation of the RLPolicyType interface. This interface is one of the building blocks that make up the modular software interface for the blue program.

At present, two policies are implemented and the Command Line Interface (CLI) for the blue program determines which policy is used. One of the policy implementations, the ServerPolicy, makes HTTP requests to a policy server running elsewhere (e.g. on the same machine, or another machine on the same network). This may be useful for defining and running policies in another language, e.g. Python. We have developed a Python policy server (policy-server) that exposes the R³ACE environment as a Gymnasium environment.

Training or evaluation occurs when the cyber system is running (all network hosts are up with applications running, e.g. servers, clients, logging daemons) and the blue program is running (with an embedded policy, as discussed). The distinction between training and evaluation comes down to whether or not the policy implementation is 'self-optimising' at that point in time (e.g. the program mutating the policy based on the rewards returned by the reward function).
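
To make the policy-server idea concrete, the sketch below shows a tiny HTTP server, built on Python's standard library, that receives an observation as JSON and replies with an action. It is not the policy-server package from this project: the JSON field names, the action vocabulary, the port and the placeholder rule inside `decide` are all assumptions, and the real ServerPolicy wire format may differ.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decide(observation: dict) -> dict:
    """Placeholder policy: if service is degraded, block the first unblocked candidate."""
    if observation["ok_rate"] >= 0.5 * observation["request_rate"]:
        return {"action": "do_nothing"}
    blocked = set(observation["blocklist"])
    for ip in observation["candidate_ips"]:
        if ip not in blocked:
            return {"action": "block", "ip": ip}
    return {"action": "do_nothing"}


class PolicyHandler(BaseHTTPRequestHandler):
    """Answer each POSTed observation with a JSON-encoded action."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        observation = json.loads(self.rfile.read(length))
        body = json.dumps(decide(observation)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the policy server until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), PolicyHandler).serve_forever()
```

Calling `serve()` starts the server. Whether the policy is being trained or evaluated would then depend on whether `decide` updates its own parameters between requests.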

Getting Started

You could get started in a number of ways. In order of increasing ambition:

Run the reference implementation (with the basic policy provided), against the Simple Decision Problem to reproduce the plot above.
Use the reference implementation and the Simple Decision Problem as a Cyber Environment: Develop new agent policies, train and evaluate against different cyber attacks.
Design a more complex decision problem on the R³ACE minimum viable system. Implement the required functionality (perhaps new observations or actions). Train agents to solve this problem.
Inspired by the R³ACE approach to Replicable and Reproducible Real-World Cyber Environments, design an alternative cyber system, perhaps increasing the scale or complexity of the infrastructure.

Set Up R³ACE Infrastructure - Physical or Virtual

How are the machines reproducibly configured?

The machines run NixOS [7], a Linux distribution based on the Nix package management system [8]. Producing "reproducible, declarative and reliable systems" is central to Nix's design.

This repository contains operating system (OS) declaration files for each of the 3 machines. These OS configurations can be reproducibly installed onto both physical and virtual machines.

Find out more about NixOS at nixos.org, or see some quick tips at ./docs/nixos-tips.md, which includes tips on packaging your own software as a Nix package, making it installable onto one of the NixOS machines.

TL;DR: An OS is declared with a flake.nix file, but most of the OS configuration can be found in a configuration.nix file, which is imported by the flake.nix file. In short, the two files together with any other ancillary documents (SSL certificates, config files for programs to be run on the machines, etc.) make up the full configuration of each machine.

Using physical infrastructure: Raspberry Pis

To install a configuration onto a Raspberry Pi Model 4B, follow these steps:

Log on to a development machine (Linux or MacOS), i.e. not the target Raspberry Pi. This machine must have an SD reader (use an SD card reader USB dongle).
Download the latest release of NixOS as an SD card image from here (or instead select a particular release version).
Insert the Raspberry Pi's SD card and identify the path to the card (e.g. something like /dev/sda on Linux or /dev/disk4 on MacOS). On MacOS, running diskutil list can help to find the SD card device.
Decompress the SD card image with unzstd <the downloaded image.zst>.
Flash the image onto the SD card with sudo dd if=<path to decompressed sd image (a .img file)> of=<path to SD card (e.g. /dev/sda)> bs=4096 conv=fsync status=progress.
Insert the SD card into the Raspberry Pi and boot the device with a display (via the mini HDMI port) and keyboard connected. The Pi will boot into the nixos user, which has password-less sudo privileges.

Carry out the following steps on the Raspberry Pi:

On the Pi, create a new user with sudo useradd <new username>. For convenience, this username should match a user configured in the desired configuration.nix file that will be copied to the Raspberry Pi later (e.g. one of the configuration.nix files in this repo).
Add a password for that user with sudo passwd <new username>.
Run nixos-generate-config to generate two files: /etc/nixos/configuration.nix and /etc/nixos/hardware-configuration.nix.
Shut down the device with sudo shutdown.

Carry out the following steps on a separate Linux machine:

Insert the SD card, and navigate to the NIXOS_SD partition of the SD card, which is the Linux file system.
Replace the auto-generated configuration.nix file with the desired replacement. Edit the replacement so that its system.stateVersion number matches that of the auto-generated file; this is the NixOS release version.
Do not replace the hardware-configuration.nix file.
Add any other required files to /etc/nixos, e.g. a flake.nix file, or a certificate for a certificate authority that is referenced in configuration.nix (this is the case with kleene in this repo).

Carry out the following steps on the Raspberry Pi:

Insert the SD card and boot.
Configure a (temporary) Wi-Fi connection:

sudo systemctl start wpa_supplicant

followed by

wpa_cli

to configure the interface

> add_network

(Which tells you which network you've added; in my case 0.)

> set_network 0 ssid "MY WIFI"
OK
> set_network 0 psk "NETWORK PASSWORD"
OK
> enable_network 0
OK
<3>CTRL-EVENT-SCAN-STARTED
<3>CTRL-EVENT-SCAN-RESULTS
<3>Trying to associate with SSID 'MY WIFI'
<3>Associated with xx:xx:xx:xx:xx:xx
... some other stuff
> save_config
OK

Test the new configuration before switching into it with: sudo nixos-rebuild test. This will likely throw an error. See below.

If using a flake.nix, the configuration targets a certain hostname, described by nixosConfigurations."<hostname>".<etc..>.
This hostname must match the system hostname at the time of running nixos-rebuild commands.
The hostname will be nixos until changed.
The system hostname is changed by setting the networking.hostName = "<new hostname>"; property in the configuration.nix file.
Workaround: set the desired new hostname in configuration.nix, but build the configuration for the current hostname (nixos) by setting that name in flake.nix.
After the first reboot the system hostname will have been updated to the new hostname.
Future configurations must now be built for the new hostname, by updating the value in flake.nix.

Using virtual infrastructure

As an alternative to booting NixOS on physical hardware, virtual machine (VM) images or 'virtual disks' (with the OS pre-installed) can be built from the NixOS config files and the network between the machines can be virtualised.

Pre-built aarch64 VMWare VM disks are available here for the 3 machines in this project.

These VMs should run with minimal set-up when hosted on a laptop or PC with an ARM processor e.g. ARM-linux, or MacOS with Apple Silicon chips (M1, M2, ...).

./docs/building-vm-images.md includes details about building VM images and virtual disks (such as the pre-built VMWare VM disks made available). In principle, the software used to build the images (nixos-generators) also supports building images for many other platforms: VirtualBox, Amazon EC2, Docker, Azure - though building for these platforms has not been tested during this project.

Tutorial for MacOS

Download VMWare Fusion Pro 13 (available for free), either directly from Broadcom (requires making an account with Broadcom) or via Homebrew (easier download and installation with brew install --cask vmware-fusion).
Create a 'host-only' private LAN network within VMWare across which the guest VMs will be able to communicate:
- Open VMWare.
- Go to the application settings.
- Open the 'Network' tab.
- Add a new custom network with the + button.
- Disable the box Allow virtual machines on this network to connect to external networks (using NAT). We do not want to allow the guest VMs to be connected to the external networks that your host machine is connected to.
- Disable the box to Connect the host Mac to this network.
- Enable the box to Provide addresses on this network via DHCP.
- Enter the Subnet IP: 172.0.0.0.
- Enter the Subnet Mask: 255.255.255.0.
- Leave MTU: System Configuration.
- Double-click on the name of the network to change it to r3ace.
- Click Apply.
- Now disable Provide addresses on this network via DHCP. The settings for the Subnet IP and Subnet Mask will go grey but remain configured with the values above. This is important.
- Click Apply again.
Create another private network that will be shared between only the 'web client' and 'web server'. This network, isolated from the first, will be an out-of-band communication channel between the client and server. To do so, repeat step two with the following differences:
- Enter the Subnet IP: 172.0.1.0.
- Name the network r3ace-udp. Follow all of the other steps, as before, in sequence.
Create a third private network that will enable us to send requests from the VM host to the VMs, or open SSH connections to them. The VMs may be disconnected from this network at a later date to 'air-gap' the VMs from your host Mac for safety (e.g. running experiments involving harmful software). To do so, repeat step two with the following differences:
- Enable the box to Connect the host Mac to this network.
- Enter the Subnet IP: 172.0.2.0.
- Name the network ssh.
Create a new virtual machine from a pre-built VM disk of your choice. It's recommended to set up hilbert first, because the networking can be tested by making HTTPS requests to the web server installed on the hilbert VM disk.
- Open VMWare and click the + icon to create a new VM.
- Click Create a custom virtual machine and then Continue.
- For Choose Operating System, pick Other and Other 64-bit arm and Continue.
- Select Use an existing virtual disk, then click Choose virtual disk... to select the VM disk file. This should be a file with a .vmdk file extension. Ensure that Make a separate copy of the virtual disk is selected. Click Continue.
- Click Customize Settings to bring up the settings for the VM once it has been created. Choose an appropriate name for the new VM and click Save to create it.
Adjust the settings of the newly created VM:
- Under Processors and Memory, allocate appropriate resources to the VM guest (e.g. if you have an 8-core M1 Mac with 32 GB of RAM then perhaps allocate 2 cores and 4 GB of memory to the VM).
- Under Network Adapter ensure that appropriate network(s) are selected. post should be connected to only the r3ace network; hilbert and kleene to both the r3ace network and the r3ace-udp network.
- VMs can be connected to the ssh network during setup (if necessary) so that ssh and scp can be used between the VM host and the VMs.
- Before booting for the first time, enter the correct MAC address for each network adapter. The MAC addresses are configured in the Advanced options drop-down of the settings for each network adapter. The MAC addresses for each VM and each network adapter are as follows:

| VM | `r3ace` | `r3ace-udp` | `ssh` |
| --- | --- | --- | --- |
| `hilbert` | `02:00:00:00:03:00` | `02:00:00:00:03:01` | `02:00:00:00:03:02` |
| `kleene` | `02:00:00:00:02:00` | `02:00:00:00:02:01` | `02:00:00:00:02:02` |
| `post` | `02:00:00:00:04:00` | n/a | `02:00:00:00:04:02` |

Boot the VM.
The users set up on each VM are as follows:

| VM | username | password |
| --- | --- | --- |
| `hilbert` | `blue` | `changeme` |
| `kleene` | `green` | `changeme` |
| `post` | `red` | `changeme` |

Once logged in, new passwords can be set for all users.
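
The network layout described in this tutorial can be captured as data and sanity-checked before configuring VMWare by hand. The sketch below is only an illustration: the subnets and MAC addresses are copied from the steps above (a 255.255.255.0 mask is written as /24), while the dictionary names and the `check_layout` helper are hypothetical.

```python
import ipaddress

SUBNETS = {
    "r3ace": ipaddress.ip_network("172.0.0.0/24"),
    "r3ace-udp": ipaddress.ip_network("172.0.1.0/24"),
    "ssh": ipaddress.ip_network("172.0.2.0/24"),
}

MAC_ADDRESSES = {
    "hilbert": {"r3ace": "02:00:00:00:03:00", "r3ace-udp": "02:00:00:00:03:01", "ssh": "02:00:00:00:03:02"},
    "kleene": {"r3ace": "02:00:00:00:02:00", "r3ace-udp": "02:00:00:00:02:01", "ssh": "02:00:00:00:02:02"},
    "post": {"r3ace": "02:00:00:00:04:00", "ssh": "02:00:00:00:04:02"},
}


def is_locally_administered(mac: str) -> bool:
    """True if the locally-administered bit (0x02) of the first octet is set."""
    return bool(int(mac.split(":")[0], 16) & 0x02)


def check_layout() -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    names = list(SUBNETS)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if SUBNETS[first].overlaps(SUBNETS[second]):
                problems.append(f"subnets {first} and {second} overlap")
    seen = set()
    for vm, adapters in MAC_ADDRESSES.items():
        for network, mac in adapters.items():
            if network not in SUBNETS:
                problems.append(f"{vm}: unknown network {network}")
            if not is_locally_administered(mac):
                problems.append(f"{vm}: {mac} is not locally administered")
            if mac in seen:
                problems.append(f"{vm}: duplicate MAC address {mac}")
            seen.add(mac)
    return problems


problems = check_layout()
```

All of the listed MAC addresses begin with `02`, which marks them as locally administered and so avoids clashing with vendor-assigned hardware addresses.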

References

[1] “Cyber Operations Research Gym.” 2022. GitHub. https://github.com/cage-challenge/CybORG.

[2] Andrew, Alex, Sam Spillard, Joshua Collyer, and Neil Dhir. 2022. “Developing Optimal Causal Cyber-Defence Agents via Cyber Security Simulation.” In International Conference on Machine Learning (ICML).

[3] Lohn, Andrew, Ant Burke, Anna Knack, and Krystal Jackson. 2023. “Autonomous Cyber Defence: A Roadmap from Lab to Ops.” CETaS Research Reports.

[4] Oesch, Sean, Amul Chaulagain, Brian Weber, Matthew Dixson, Amir Sadovnik, Benjamin Roberson, Cory Watson, and Phillipe Austria. 2024. “Towards a High Fidelity Training Environment for Autonomous Cyber Defense Agents.” In Proceedings of the 17th Cyber Security Experimentation and Test Workshop, 91–99. CSET ’24. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3675741.3675752.

[5] Hammar, Kim, and Rolf Stadler. 2023. “Digital Twins for Security Automation.” In NOMS 2023-2023 IEEE/IFIP Network Operations and Management Symposium, 1–6. https://doi.org/10.1109/NOMS56928.2023.10154288.

[6] Brockman, Greg, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. “OpenAI Gym.”

[7] Dolstra, Eelco, Andres Löh, and Nicolas Pierron. 2010. “NixOS: A Purely Functional Linux Distribution.” Journal of Functional Programming 20 (5–6): 577–615. https://doi.org/10.1017/S0956796810000195.

[8] Dolstra, Eelco, Merijn de Jonge, and Eelco Visser. 2004. “Nix: A Safe and Policy-Free System for Software Deployment.” In Proceedings of the 18th USENIX Conference on System Administration, 79–92. LISA ’04. USA: USENIX Association.
