
Commit

Merge branch 'dims' of https://github.com/aditkumar72/model-zoo into dims
aditkumar72 committed Jun 19, 2021
2 parents a2bb58f + 27bf1a6 commit 87c934a
Showing 18 changed files with 358 additions and 10 deletions.
45 changes: 45 additions & 0 deletions vision/cdcgan_mnist/README.md
@@ -0,0 +1,45 @@
# Conditional DC-GAN

<img src="../cdcgan_mnist/output/img_for_readme.png" width="440"/>

[Source](https://arxiv.org/pdf/1411.1784.pdf)

## Model Info

Generative Adversarial Networks have two models, a _Generator model G(z)_ and a _Discriminator model D(x)_, in competition with each other. G tries to estimate the distribution of the training data, while D estimates the probability that a sample came from the original training data rather than from G. During training, the Generator learns a mapping from a _prior distribution p(z)_ to the _data space G(z)_, and the Discriminator D(x) outputs the probability that a given x came from the actual training data.
This model can be extended to include additional inputs, y, on which both models are conditioned. y can be any kind of auxiliary information, for example class labels. _The conditioning is achieved by simply feeding y to both the Generator, G(z|y), and the Discriminator, D(x|y)_.
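
To illustrate the conditioning, here is a minimal Flux sketch of a generator that consumes the noise vector z concatenated with a one-hot label y. The layer sizes and kernel shapes are assumptions for illustration, not the dimensions used by `cGAN_mnist.jl`:

```julia
using Flux

# Hypothetical sizes for illustration; not the script's actual hyperparameters.
latent_dim, n_classes = 100, 10

# G(z|y): the label enters simply as extra input features concatenated with z.
generator = Chain(
    Dense(latent_dim + n_classes => 7 * 7 * 128, relu),
    x -> reshape(x, 7, 7, 128, :),
    ConvTranspose((4, 4), 128 => 64; stride = 2, pad = 1),
    BatchNorm(64, relu),
    ConvTranspose((4, 4), 64 => 1, tanh; stride = 2, pad = 1),  # 28×28×1 images
)

z = randn(Float32, latent_dim, 8)          # a batch of 8 noise vectors
y = Flux.onehotbatch(rand(0:9, 8), 0:9)    # 8 random class labels, one-hot
fake = generator(vcat(z, y))               # G(z|y) -> 28×28×1×8 array

# D(x|y) is conditioned the same way, e.g. by appending y to the features
# the discriminator sees before its final dense layer.
```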

## Training

```shell
cd vision/cdcgan_mnist
julia --project cGAN_mnist.jl
```

## Results

1000 training steps

![1000 training steps](../cdcgan_mnist/output/cgan_steps_001000.png)

3000 training steps

![3000 training steps](../cdcgan_mnist/output/cgan_steps_003000.png)

5000 training steps

![5000 training steps](../cdcgan_mnist/output/cgan_steps_005000.png)

10000 training steps

![10000 training steps](../cdcgan_mnist/output/cgan_steps_010000.png)

11725 training steps

![11725 training steps](../cdcgan_mnist/output/cgan_steps_011725.png)

## References

* [Mirza, M. and Osindero, S., “Conditional Generative Adversarial Nets”, <i>arXiv e-prints</i>, 2014.](https://arxiv.org/pdf/1411.1784.pdf)

* [Training a Conditional DC-GAN on CIFAR-10](https://medium.com/@utk.is.here/training-a-conditional-dc-gan-on-cifar-10-fce88395d610)
7 changes: 5 additions & 2 deletions vision/cdcgan_mnist/cGAN_mnist.jl
Original file line number Diff line number Diff line change
@@ -185,5 +185,8 @@ function train(; kws...)
     return Flux.onecold.(cpu(fixed_labels))
 end
 
-cd(@__DIR__)
-fixed_labels = train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
+
Binary file added vision/cdcgan_mnist/output/img_for_readme.png
32 changes: 32 additions & 0 deletions vision/conv_mnist/README.md
@@ -0,0 +1,32 @@
# LeNet-5

![LeNet-5](../conv_mnist/docs/LeNet-5.png)

[Source](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)

## Model Info

At a high level, LeNet (LeNet-5) consists of two parts:
(i) _a convolutional encoder consisting of two convolutional layers_;
(ii) _a dense block consisting of three fully-connected layers_.

The basic units in each convolutional block are a convolutional layer, a sigmoid activation function, and a subsequent average pooling operation. Each convolutional layer uses a 5×5 kernel and a sigmoid activation function. These layers map spatially arranged inputs to a number of two-dimensional feature maps, typically increasing the number of channels: the first convolutional layer has 6 output channels, while the second has 16. Each 2×2 pooling operation (stride 2) reduces dimensionality by a factor of 4 via spatial downsampling. The convolutional block emits an output with shape (batch size, number of channels, height, width).
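
This description translates almost line for line into Flux. Below is a minimal sketch; the `pad = 2` on the first layer is an assumption to keep MNIST's 28×28 inputs at the sizes shown, and note that Flux stores convolutional data as width × height × channels × batch rather than the batch-first shape quoted above:

```julia
using Flux

lenet5 = Chain(
    Conv((5, 5), 1 => 6, sigmoid; pad = 2),   # 28×28×1 -> 28×28×6
    MeanPool((2, 2)),                         # 2×2, stride 2 -> 14×14×6
    Conv((5, 5), 6 => 16, sigmoid),           # -> 10×10×16
    MeanPool((2, 2)),                         # -> 5×5×16
    Flux.flatten,                             # -> 400 features
    Dense(400 => 120, sigmoid),
    Dense(120 => 84, sigmoid),
    Dense(84 => 10),                          # one score per digit class
)

x = rand(Float32, 28, 28, 1, 1)               # a dummy MNIST-shaped image
size(lenet5(x))                               # (10, 1)
```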

## Training

```shell
cd vision/conv_mnist
julia --project conv_mnist.jl
```

## References

* [Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)

* [Zhang, A., Lipton, Z. C., Li, M., and Smola, A. J., <i>Dive into Deep Learning</i>, 2020.](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)
5 changes: 4 additions & 1 deletion vision/conv_mnist/conv_mnist.jl
@@ -157,4 +157,7 @@ function train(; kws...)
     end
 end
 
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
Binary file added vision/conv_mnist/docs/LeNet-5.png
39 changes: 39 additions & 0 deletions vision/dcgan_mnist/README.md
@@ -0,0 +1,39 @@
# Deep Convolutional GAN (DC-GAN)

![dcgan_gen_disc](../dcgan_mnist/output/dcgan_generator_discriminator.png)
[Source](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html)

## Model Info

A DC-GAN is a direct extension of the GAN, except that it explicitly uses convolutional layers in the discriminator and transposed-convolution layers in the generator. _The discriminator is made up of strided convolution layers, batch norm layers, and LeakyReLU activations. The generator is composed of transposed-convolution layers, batch norm layers, and ReLU activations_.
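
A minimal Flux sketch of the two networks just described; the widths, kernel sizes, and latent dimension are illustrative assumptions, not the values used by `dcgan_mnist.jl`:

```julia
using Flux

# Discriminator: strided convolutions, batch norm, LeakyReLU, as described.
discriminator = Chain(
    Conv((4, 4), 1 => 64; stride = 2, pad = 1),    # 28×28 -> 14×14
    x -> leakyrelu.(x, 0.2f0),
    Conv((4, 4), 64 => 128; stride = 2, pad = 1),  # 14×14 -> 7×7
    BatchNorm(128, x -> leakyrelu(x, 0.2f0)),
    Flux.flatten,
    Dense(7 * 7 * 128 => 1),                       # real/fake logit
)

# Generator: transposed convolutions, batch norm, ReLU, as described.
generator = Chain(
    Dense(100 => 7 * 7 * 256),
    x -> reshape(x, 7, 7, 256, :),
    ConvTranspose((4, 4), 256 => 128; stride = 2, pad = 1),      # 7×7 -> 14×14
    BatchNorm(128, relu),
    ConvTranspose((4, 4), 128 => 1, tanh; stride = 2, pad = 1),  # -> 28×28×1
)
```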

## Training

```shell
cd vision/dcgan_mnist
julia --project dcgan_mnist.jl
```

## Results

2000 training steps

![2000 training steps](../dcgan_mnist/output/dcgan_steps_002000.png)

5000 training steps

![5000 training steps](../dcgan_mnist/output/dcgan_steps_005000.png)

8000 training steps

![8000 training steps](../dcgan_mnist/output/dcgan_steps_008000.png)

9380 training steps

![9380 training steps](../dcgan_mnist/output/dcgan_steps_009380.png)

## References

* [Radford, A. et al.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, http://arxiv.org/abs/1511.06434, (2015).](https://arxiv.org/pdf/1511.06434v2.pdf)

* [pytorch.org/tutorials/beginner/dcgan_faces_tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html)
6 changes: 4 additions & 2 deletions vision/dcgan_mnist/dcgan_mnist.jl
@@ -144,5 +144,7 @@ function train(; kws...)
     save(@sprintf("output/dcgan_steps_%06d.png", train_steps), output_image)
 end
 
-cd(@__DIR__)
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
26 changes: 26 additions & 0 deletions vision/mlp_mnist/README.md
@@ -0,0 +1,26 @@
# Multilayer Perceptron (MLP)

![mlp](../mlp_mnist/docs/mlp.svg)

[Source](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)

## Model Info

An MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP is trained with a supervised learning technique called backpropagation. Its multiple layers and nonlinear activations distinguish an MLP from a linear perceptron and let it separate data that is not linearly separable.
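
For concreteness, here is a minimal Flux sketch of such a network for MNIST-shaped inputs; the hidden width of 32 is an arbitrary assumption, not necessarily what `mlp_mnist.jl` uses:

```julia
using Flux

mlp = Chain(
    Flux.flatten,            # input layer: 28×28×1 image -> 784-vector
    Dense(784 => 32, relu),  # hidden layer with a nonlinear activation
    Dense(32 => 10),         # output layer: one score per digit class
)

x = rand(Float32, 28, 28, 1, 1)   # a dummy MNIST-shaped input
size(mlp(x))                      # (10, 1)
```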

## Training

```shell
cd vision/mlp_mnist
julia --project mlp_mnist.jl
```

## Reference

* [Zhang, A., Lipton, Z. C., Li, M., and Smola, A. J., <i>Dive into Deep Learning</i>, 2020.](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)