Get the data from the drive. The easiest option is to download the pickled water_dataset.pkl and place it in project2/pickled_data. Then run test.py, located in project2/src. For quick results, we recommend setting the number of particles NMC to 100,000 or fewer.
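As a quick check that the dataset is in the right place before running test.py, the file can be loaded by hand. The snippet below is only a minimal sketch: the helper name load_pickled_dataset is hypothetical and not part of the repository, and the path assumes the script is run from the repository root.

```python
import pickle
from pathlib import Path

# Location given in the instructions above, relative to the repository root.
DATASET_PATH = Path("project2") / "pickled_data" / "water_dataset.pkl"


def load_pickled_dataset(path):
    """Return the object stored in the pickle file at `path`.

    Hypothetical helper for illustration; the project may load the data differently.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected the dataset at {path}; download it first.")
    with path.open("rb") as handle:
        return pickle.load(handle)
```

Calling load_pickled_dataset(DATASET_PATH) returns whatever object was pickled. Only unpickle files from a source you trust, since unpickling can execute arbitrary code.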
Beyond libraries used in the course, the following libraries were used:
- pandas for data management and analysis.
- sklearn for logistic regression.
- zipfile for zipped data parsing.
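To illustrate the zipped-data parsing mentioned in the list, here is a small sketch that reads one CSV member out of an archive using only the standard library. The function name and the member name passed to it are hypothetical, and the project's actual parsing code may look different. The file object returned by archive.open could equally be handed to pandas.read_csv.

```python
import csv
import io
import zipfile


def read_csv_rows_from_zip(zip_source, member_name):
    """Return the rows of one CSV member of a zip archive as dictionaries.

    `zip_source` may be a path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as raw:
            # Decode the binary stream so the csv module can read it as text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```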
Contains all the machine-learning-related parts of the code. The file cgan.py contains superclasses for cGAN generators, critics and hyperparameters, as well as a few other convenience methods. Each component of the ML system then has its own file inside the same folder. To re-train a model, run its corresponding file.
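The superclass layout described here could look roughly like the sketch below. This is not the content of cgan.py: every class name, method name and hyperparameter field is an assumption made for illustration, and the toy subclass uses plain Python lists instead of a neural network.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CGANHyperparameters:
    """Hypothetical hyperparameter container; field names are illustrative."""
    latent_dim: int = 8
    learning_rate: float = 1e-4
    batch_size: int = 64


class Generator(ABC):
    """Hypothetical base class shared by all generators."""

    def __init__(self, hyperparameters):
        self.hyperparameters = hyperparameters

    @abstractmethod
    def generate(self, condition, noise):
        """Map a condition and a noise vector to a generated sample."""


class Critic(ABC):
    """Hypothetical base class shared by all critics."""

    def __init__(self, hyperparameters):
        self.hyperparameters = hyperparameters

    @abstractmethod
    def score(self, condition, sample):
        """Return a scalar score for a sample given its condition."""


class ToyGenerator(Generator):
    """Toy subclass: adds the noise to the condition element-wise."""

    def generate(self, condition, noise):
        return [c + n for c, n in zip(condition, noise)]
```

With this layout, each component file would define its own subclasses and only override the abstract methods, while the shared setup stays in the base classes.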
Contains files for data parsing and handling.
Saves models to disk for future loading.