Merge pull request #305 from aditkumar72/readmes
Added and updated README files for vision models #218
ToucheSir authored Sep 28, 2021
2 parents 83d3015 + d7ad415 commit 2bd86ff
Showing 18 changed files with 340 additions and 10 deletions.
45 changes: 45 additions & 0 deletions vision/cdcgan_mnist/README.md
@@ -0,0 +1,45 @@
# Conditional DC-GAN

<img src="../cdcgan_mnist/output/img_for_readme.png" width="440"/>

[Source](https://arxiv.org/pdf/1411.1784.pdf)

## Model Info

A Generative Adversarial Network (GAN) pits two models against each other: a _Generator model G(z)_ and a _Discriminator model D(x)_. G tries to capture the distribution of the training data, while D estimates the probability that a sample came from the training data rather than from G. During training, the Generator learns a mapping from a _prior distribution p(z)_ to the _data space G(z)_, and the Discriminator D(x) outputs the probability that a given x comes from the actual training data.
This model can be extended with an additional input y on which both models are conditioned. y can be any kind of auxiliary information, for example class labels. _The conditioning is achieved by feeding y to both the Generator, G(z|y), and the Discriminator, D(x|y)_.
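
As a minimal sketch of what "feeding y" can look like, the snippet below one-hot encodes a class label and concatenates it with each model's other input. The helper `onehot`, the latent size of 100, and the flattened 28×28 image are assumptions made for illustration; `cGAN_mnist.jl` may combine the label with its inputs differently.

```julia
# Minimal sketch: condition a GAN on a class label by concatenating a
# one-hot encoding of y with the other input. All names are hypothetical.

# One-hot encode a digit label (0-9) as a Float32 vector of length nclasses.
onehot(label::Integer, nclasses::Integer = 10) =
    Float32[i == label + 1 ? 1 : 0 for i in 1:nclasses]

latent_dim = 100                      # assumed size of the noise vector z
z = randn(Float32, latent_dim)        # sample from the prior p(z)
y = onehot(7)                         # condition on the digit 7

# Generator input for G(z|y): the noise and the label side by side.
generator_input = vcat(z, y)

# Discriminator input for D(x|y): a flattened image plus the same label.
x = rand(Float32, 28 * 28)            # stand-in for a 28×28 MNIST image
discriminator_input = vcat(x, y)

println(length(generator_input))      # 110
println(length(discriminator_input))  # 794
```

Because the same y reaches both models, the Discriminator can reject a realistic digit that does not match its label, which is what pushes the Generator to respect the condition.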

## Training

```shell
cd vision/cdcgan_mnist
julia --project cGAN_mnist.jl
```

## Results

1000 training steps

![1000 training steps](../cdcgan_mnist/output/cgan_steps_001000.png)

3000 training steps

![3000 training steps](../cdcgan_mnist/output/cgan_steps_003000.png)

5000 training steps

![5000 training steps](../cdcgan_mnist/output/cgan_steps_005000.png)

10000 training steps

![10000 training steps](../cdcgan_mnist/output/cgan_steps_010000.png)

11725 training steps

![11725 training steps](../cdcgan_mnist/output/cgan_steps_011725.png)

## References

* [Mirza, M. and Osindero, S., “Conditional Generative Adversarial Nets”, <i>arXiv e-prints</i>, 2014.](https://arxiv.org/pdf/1411.1784.pdf)

* [Training a Conditional DC-GAN on CIFAR-10](https://medium.com/@utk.is.here/training-a-conditional-dc-gan-on-cifar-10-fce88395d610)
7 changes: 5 additions & 2 deletions vision/cdcgan_mnist/cGAN_mnist.jl
@@ -185,5 +185,8 @@ function train(; kws...)
return Flux.onecold.(cpu(fixed_labels))
end

-cd(@__DIR__)
-fixed_labels = train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end


Binary file added vision/cdcgan_mnist/output/img_for_readme.png
26 changes: 26 additions & 0 deletions vision/conv_mnist/README.md
@@ -0,0 +1,26 @@
# LeNet-5

![LeNet-5](../conv_mnist/docs/LeNet-5.png)

[Source](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)

## Model Info

At a high level, LeNet (LeNet-5) consists of two parts:
(i) _a convolutional encoder consisting of two convolutional layers_;
(ii) _a dense block consisting of three fully-connected layers_.

Each convolutional block consists of a convolutional layer, a sigmoid activation function, and a subsequent average pooling operation. Each convolutional layer uses a 5×5 kernel. These layers map spatially arranged inputs to a number of two-dimensional feature maps, typically increasing the number of channels: the first convolutional layer has 6 output channels, while the second has 16. Each 2×2 pooling operation (stride 2) halves the height and width, reducing the number of activations by a factor of 4. The convolutional block emits an output with shape (batch size, number of channels, height, width).
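
The shape arithmetic described above can be checked with a few lines of plain Julia. This is a sketch under stated assumptions: a 28×28 MNIST input, no padding by default, and hypothetical helper names `out_size` and `lenet_feature_size`. The layers in `conv_mnist.jl` may use different padding or activations.

```julia
# Output length of a convolution or pooling window along one spatial axis.
out_size(n, k; stride = 1, pad = 0) = (n + 2pad - k) ÷ stride + 1

# Trace one spatial axis through the two convolutional blocks.
# `pad` applies to the first convolution only (an assumption of this sketch).
function lenet_feature_size(n; pad = 0)
    n = out_size(n, 5; pad = pad)     # first 5×5 convolution
    n = out_size(n, 2; stride = 2)    # 2×2 pooling, stride 2
    n = out_size(n, 5)                # second 5×5 convolution
    n = out_size(n, 2; stride = 2)    # 2×2 pooling, stride 2
    return n
end

side = lenet_feature_size(28)         # assumes a 28×28 input, no padding
println((side, side, 16))             # (4, 4, 16)
println(side * side * 16)             # 256 inputs to the first dense layer
```

With a padding of 2 on the first convolution, `lenet_feature_size(28; pad = 2)` gives 5, so the dense block would receive 5 × 5 × 16 = 400 inputs instead. Note that Flux stores image batches in (width, height, channels, batch size) order, the reverse of the order quoted above.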

## Training

```shell
cd vision/conv_mnist
julia --project conv_mnist.jl
```

## References

* [Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)

* [Aston Zhang, Zachary C. Lipton, Mu Li and Alexander J. Smola, "Dive into Deep Learning", 2020](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)
5 changes: 4 additions & 1 deletion vision/conv_mnist/conv_mnist.jl
@@ -157,4 +157,7 @@ function train(; kws...)
end
end

-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end

Binary file added vision/conv_mnist/docs/LeNet-5.png
39 changes: 39 additions & 0 deletions vision/dcgan_mnist/README.md
@@ -0,0 +1,39 @@
# Deep Convolutional GAN (DC-GAN)

![dcgan_gen_disc](../dcgan_mnist/output/dcgan_generator_discriminator.png)
[Source](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html)

## Model Info

A DC-GAN is a direct extension of the GAN that explicitly uses convolutional and transposed convolutional layers in the discriminator and generator, respectively. The discriminator is made up of strided convolutional layers, batch norm layers, and LeakyReLU activations. The generator is composed of transposed convolutional layers, batch norm layers, and ReLU activations.
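
To make the roles of the two layer types concrete, the sketch below computes how a strided convolution shrinks a feature map and how a transposed convolution enlarges it. The 4×4 kernel, stride 2, padding 1, and the 7 → 14 → 28 progression are illustrative assumptions, not values read from `dcgan_mnist.jl`; the helper names are hypothetical.

```julia
# Output length of a strided convolution along one axis (discriminator side).
conv_size(n, k; stride = 1, pad = 0) = (n + 2pad - k) ÷ stride + 1

# Output length of a transposed convolution along one axis (generator side).
convtranspose_size(n, k; stride = 1, pad = 0) = (n - 1) * stride - 2pad + k

# Generator: two transposed convolutions upsample a 7×7 map to 28×28.
g1 = convtranspose_size(7, 4; stride = 2, pad = 1)
g2 = convtranspose_size(g1, 4; stride = 2, pad = 1)

# Discriminator: two strided convolutions downsample 28×28 back to 7×7.
d1 = conv_size(28, 4; stride = 2, pad = 1)
d2 = conv_size(d1, 4; stride = 2, pad = 1)

println((g1, g2))   # (14, 28)
println((d1, d2))   # (14, 7)
```

With these settings the two paths mirror each other, which is why the generator and discriminator in a DC-GAN are often drawn as reflections of one another.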

## Training

```shell
cd vision/dcgan_mnist
julia --project dcgan_mnist.jl
```

## Results

2000 training steps

![2000 training steps](../dcgan_mnist/output/dcgan_steps_002000.png)

5000 training steps

![5000 training steps](../dcgan_mnist/output/dcgan_steps_005000.png)

8000 training steps

![8000 training steps](../dcgan_mnist/output/dcgan_steps_008000.png)

9380 training steps

![9380 training steps](../dcgan_mnist/output/dcgan_steps_009380.png)

## References

* [Radford, A. et al.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, http://arxiv.org/abs/1511.06434, (2015).](https://arxiv.org/pdf/1511.06434v2.pdf)

* [pytorch.org/tutorials/beginner/dcgan_faces_tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html)
6 changes: 4 additions & 2 deletions vision/dcgan_mnist/dcgan_mnist.jl
@@ -148,5 +148,7 @@ function train(; kws...)
save(@sprintf("output/dcgan_steps_%06d.png", train_steps), output_image)
end

-cd(@__DIR__)
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end

20 changes: 20 additions & 0 deletions vision/mlp_mnist/README.md
@@ -0,0 +1,20 @@
# Multilayer Perceptron (MLP)

![mlp](../mlp_mnist/docs/mlp.svg)

[Source](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)

## Model Info

An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP is trained with a supervised learning technique called backpropagation. Its multiple layers and nonlinear activations distinguish an MLP from a linear perceptron: it can separate data that is not linearly separable.
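
XOR is the classic example of data that is not linearly separable. The sketch below is a two-layer MLP in plain Julia that computes it. The weights are hand-picked for illustration rather than learned by backpropagation, and the names `W1`, `b1`, `W2`, `b2`, and `mlp` are hypothetical, not taken from `mlp_mnist.jl`.

```julia
# Nonlinear activation applied by the hidden layer.
relu(x) = max(x, zero(x))

# Hand-picked weights (an illustration, not the result of training).
W1 = [1.0 1.0; 1.0 1.0]   # hidden layer: 2 inputs → 2 hidden units
b1 = [0.0, -1.0]
W2 = [1.0 -2.0]           # output layer: 2 hidden units → 1 output
b2 = [0.0]

# Forward pass: affine map, nonlinearity, affine map.
mlp(x) = (W2 * relu.(W1 * x .+ b1) .+ b2)[1]

for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0])
    println(x, " => ", mlp(x))
end
```

The network returns 1.0 for the two mixed inputs and 0.0 for the other two. Removing `relu` collapses the two layers into a single linear map, which cannot produce this pattern.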

## Training

```shell
cd vision/mlp_mnist
julia --project mlp_mnist.jl
```

## Reference

* [Aston Zhang, Zachary C. Lipton, Mu Li and Alexander J. Smola, "Dive into Deep Learning", 2020](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)