Sonification of convolutional neural networks
Currently, we have a modified pretrained SqueezeNet v1.0 running on some home images and dumping the activations at certain layers as pickled NumPy arrays.
To do this, run `python3 test_squeezenet.py` from `/src/models/`.
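For reference, here is a minimal sketch of how the dump might look, using torchvision's pretrained `squeezenet1_0` with forward hooks. The hooked layer indices and output file names are illustrative, not necessarily what `test_squeezenet.py` uses:

```python
# Sketch of the activation dump; the hooked layer indices and output
# file names are illustrative, not the exact ones used in this repo.
import pickle

import torch
from PIL import Image
from torchvision import models, transforms

model = models.squeezenet1_0(pretrained=True).eval()
activations = {}

def save_activation(name):
    # Forward hook that stashes a layer's output as a NumPy array.
    def hook(module, inputs, output):
        activations[name] = output.detach().cpu().numpy()
    return hook

# Hook a few layers of the feature extractor (indices chosen for illustration).
for idx in (3, 7, 12):
    model.features[idx].register_forward_hook(save_activation(f"features_{idx}"))

preprocess = transforms.Compose([
    transforms.Resize(256),  # resize the short edge to 256 px
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("mixing_bowl.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    model(image)

for name, act in activations.items():
    with open(f"{name}.pkl", "wb") as f:
        pickle.dump(act, f)
```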
`utils.py` has a function that writes an activation to a sound. It's pretty rad. Right now it writes out the audio for the first activation from a 1024 px image of `mixing_bowl.jpg`.
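Something along these lines (a sketch only: the function name, the pickled shape `(1, C, H, W)`, and the sample rate are assumptions, not necessarily what `utils.py` does):

```python
# Sketch of activation-to-audio conversion; the function name, the
# pickled shape (1, C, H, W), and the sample rate are assumptions.
import pickle

import numpy as np
from scipy.io import wavfile

def activation_to_wav(pkl_path, wav_path, sample_rate=44100):
    with open(pkl_path, "rb") as f:
        act = pickle.load(f)                      # (1, C, H, W)
    # Flatten the first filter's response map into a 1-D signal.
    signal = act[0, 0].ravel().astype(np.float32)
    # Center and normalize to [-1, 1] so the WAV doesn't clip.
    signal -= signal.mean()
    peak = np.abs(signal).max()
    if peak > 0:
        signal /= peak
    wavfile.write(wav_path, sample_rate, signal)

activation_to_wav("features_3.pkl", "first_activation.wav")
```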
- Dump images in `test_squeezenet.py`
- Explore the activations at different layers
- Made sounds for each activation using two generation schemes (see the sketch after this list):
    - Concatenating each filter at a level
    - Summing all filters at a level
- Put as much of this on a GPU as possible:
    - Use PyTorch as much as possible
    - Make sure we can still use the CPU (in case we run this on a Pi, for instance)
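A sketch of the two generation schemes mentioned above, on one activation of assumed shape `(1, C, H, W)`, kept in PyTorch so it can use a GPU when available and fall back to the CPU:

```python
# Sketch of the two generation schemes; the activation shape is a
# placeholder, and the device check covers the GPU/CPU requirement.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
act = torch.rand(1, 64, 56, 56, device=device)   # placeholder activation

# Scheme 1: concatenate each filter at a level. Flattening (C, H, W) in
# row-major order lays the per-filter maps out one after another.
concat_signal = act[0].flatten()

# Scheme 2: sum all filters at a level, then flatten the summed map.
sum_signal = act[0].sum(dim=0).flatten()

print(concat_signal.shape)  # torch.Size([200704]) = 64 * 56 * 56
print(sum_signal.shape)     # torch.Size([3136])   = 56 * 56
```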
- Activations in this repo are generated from images with a short-edge length of 256 (a resize sketch follows this list):
    - This means that the activations are very small
    - To get higher-fidelity activations, we must use larger images, but we quickly run out of memory (and storage on GitHub)
- I've tried putting larger activations in, but my machine (8GB RAM) runs out of memory on anything larger than 1024
- I put the activations for `image_size=1024` in here
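For reference, a small sketch of how input size is controlled, assuming torchvision's `Resize` semantics (an integer argument resizes the short edge), with the example image from above:

```python
# Sketch of how image size drives activation size; an int passed to
# torchvision's Resize scales the short edge to that length.
from PIL import Image
from torchvision import transforms

for short_edge in (256, 1024):
    img = transforms.Resize(short_edge)(Image.open("mixing_bowl.jpg"))
    # Larger inputs yield larger, higher-fidelity activations, at the
    # cost of much more memory during the forward pass.
    print(short_edge, img.size)
```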