The primary module code is located inside the csl directory.
At the module level:
- datasets.py: Adds datasets into the automatic training framework.
- synthesizers.py: Adds synthetic models into the automatic training framework.
- data_generators: Directory containing definitions of the synthetic models that generate data, including GANs and VAEs.
- utils: Utility functions for visualizations, managing directories, etc.
Tutorials for running the module code are located in the notebooks directory. These notebooks contain runnable examples of synthetic data generation with different datasets, models, and privacy options.
Research code and ongoing experiments that have not been formally added to the module are located in the experiments directory. Particular experiments of interest include:
- asynch: Experiments for the collaborative training framework.
- immediate_sensitivity: Experiments for differential privacy implementations and evaluations.
- task3: Experiments for synthetic data generation.
The attack script attack.py performs the adapted conditional Hayes attack on
conditional MNIST GANs and the standard Hayes attack on unconditional CelebA
WGANs developed by Two Six Technologies as part of DARPA's CSL program.
As stated in the docstring, the attack may be executed with a command like
python -m attack \
--gpu 0 \
--generate_samples \
--compute_fid \
--save \
--data_dir /path/to/data/ \
--model_dir /path/to/models/ \
--model_name MNIST-Baseline \
--checkpoint_min 100 \
--checkpoint_step 100 \
--checkpoint_max 1000
and should run without code modifications under two assumptions:
- The code is executed such that the repos csl-gan/ and csl-gan/opacus/ are subdirectories of your current working directory.
- The --data_dir and --model_dir arguments point to the directories that contain your specific data and model directories. For instance, your csl-gan MNIST GAN might source data from /path/to/data/MNIST/ with model directory /path/to/models/MNIST-Baseline/.
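The two assumptions above can be checked before launching a long run. The following is a minimal sketch, not part of the repository: check_layout and build_command are hypothetical helper names, the flags are copied from the example command, and the code assumes that the model directory is the --model_dir path joined with --model_name (as the /path/to/models/MNIST-Baseline/ example suggests) and that csl-gan/ sits directly under the working directory with opacus/ inside it.

```python
import sys
from pathlib import Path


def check_layout(cwd, data_dir, model_dir, model_name):
    """Return a list of problems; an empty list means the layout looks usable.

    Hypothetical helper: it only mirrors the two assumptions stated in the text.
    """
    cwd, data_dir, model_dir = Path(cwd), Path(data_dir), Path(model_dir)
    problems = []

    # Assumption 1: csl-gan/ and csl-gan/opacus/ live under the working directory.
    for repo in (cwd / "csl-gan", cwd / "csl-gan" / "opacus"):
        if not repo.is_dir():
            problems.append(f"missing repository directory: {repo}")

    # Assumption 2: --data_dir and --model_dir are the parent directories.
    if not data_dir.is_dir():
        problems.append(f"missing data directory: {data_dir}")
    # Assumed convention: the model lives in <model_dir>/<model_name>/.
    if not (model_dir / model_name).is_dir():
        problems.append(f"missing model directory: {model_dir / model_name}")

    return problems


def build_command(data_dir, model_dir, model_name, gpu=0,
                  checkpoint_min=100, checkpoint_step=100, checkpoint_max=1000):
    """Assemble the example command as an argument list (flags taken from the text)."""
    return [
        sys.executable, "-m", "attack",
        "--gpu", str(gpu),
        "--generate_samples",
        "--compute_fid",
        "--save",
        "--data_dir", str(data_dir),
        "--model_dir", str(model_dir),
        "--model_name", model_name,
        "--checkpoint_min", str(checkpoint_min),
        "--checkpoint_step", str(checkpoint_step),
        "--checkpoint_max", str(checkpoint_max),
    ]


if __name__ == "__main__":
    issues = check_layout(Path.cwd(), "/path/to/data/", "/path/to/models/", "MNIST-Baseline")
    for issue in issues:
        print(issue)
    if not issues:
        # Print the command rather than running it, so the sketch has no side effects.
        print(" ".join(build_command("/path/to/data/", "/path/to/models/", "MNIST-Baseline")))
```

The returned argument list can be passed to subprocess.run once the check reports no problems. The sketch reports every missing directory at once instead of stopping at the first one, which tends to be more convenient when setting up a new machine.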