
Commit

Merge branch 'dims' of https://github.com/aditkumar72/model-zoo into dims
aditkumar72 committed Jun 19, 2021
2 parents a2bb58f + 27bf1a6 commit 87c934a
Showing 18 changed files with 358 additions and 10 deletions.
45 changes: 45 additions & 0 deletions vision/cdcgan_mnist/README.md
@@ -0,0 +1,45 @@
# Conditional DC-GAN

<img src="../cdcgan_mnist/output/img_for_readme.png" width="440"/>

[Source](https://arxiv.org/pdf/1411.1784.pdf)

## Model Info

Generative Adversarial Networks have two models, a _Generator model G(z)_ and a _Discriminator model D(x)_, in competition with each other. G tries to estimate the distribution of the training data, while D estimates the probability that a sample came from the original training data rather than from G. During training, the Generator learns a mapping from a _prior distribution p(z)_ to the _data space G(z)_, and the Discriminator D(x) outputs the probability that a given x came from the actual training data.
This model can be extended to include additional inputs, y, on which both models are conditioned. y can be any kind of auxiliary information, for example class labels. _The conditioning is achieved by simply feeding y to both the Generator, G(z|y), and the Discriminator, D(x|y)_.
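
To illustrate the conditioning, here is a minimal Flux sketch of a generator that consumes the noise vector z concatenated with a one-hot label y. The layer sizes and kernel shapes are assumptions for illustration, not the dimensions used by `cGAN_mnist.jl`:

```julia
using Flux

# Hypothetical sizes for illustration; not the script's actual hyperparameters.
latent_dim, n_classes = 100, 10

# G(z|y): the label enters simply as extra input features concatenated with z.
generator = Chain(
    Dense(latent_dim + n_classes => 7 * 7 * 128, relu),
    x -> reshape(x, 7, 7, 128, :),
    ConvTranspose((4, 4), 128 => 64; stride = 2, pad = 1),
    BatchNorm(64, relu),
    ConvTranspose((4, 4), 64 => 1, tanh; stride = 2, pad = 1),  # 28×28×1 images
)

z = randn(Float32, latent_dim, 8)          # a batch of 8 noise vectors
y = Flux.onehotbatch(rand(0:9, 8), 0:9)    # 8 random class labels, one-hot
fake = generator(vcat(z, y))               # G(z|y) -> 28×28×1×8 array

# D(x|y) is conditioned the same way, e.g. by appending y to the features
# the discriminator sees before its final dense layer.
```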

## Training

```shell
cd vision/cdcgan_mnist
julia --project cGAN_mnist.jl
```

## Results

1000 training steps

![1000 training steps](../cdcgan_mnist/output/cgan_steps_001000.png)

3000 training steps

![3000 training steps](../cdcgan_mnist/output/cgan_steps_003000.png)

5000 training steps

![5000 training steps](../cdcgan_mnist/output/cgan_steps_005000.png)

10000 training steps

![10000 training steps](../cdcgan_mnist/output/cgan_steps_010000.png)

11725 training steps

![11725 training steps](../cdcgan_mnist/output/cgan_steps_011725.png)

## References

* [Mirza, M. and Osindero, S., “Conditional Generative Adversarial Nets”, <i>arXiv e-prints</i>, 2014.](https://arxiv.org/pdf/1411.1784.pdf)

* [Training a Conditional DC-GAN on CIFAR-10](https://medium.com/@utk.is.here/training-a-conditional-dc-gan-on-cifar-10-fce88395d610)
7 changes: 5 additions & 2 deletions vision/cdcgan_mnist/cGAN_mnist.jl
Original file line number Diff line number Diff line change
@@ -185,5 +185,8 @@ function train(; kws...)
     return Flux.onecold.(cpu(fixed_labels))
 end
 
-cd(@__DIR__)
-fixed_labels = train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
+
Binary file added vision/cdcgan_mnist/output/img_for_readme.png
32 changes: 32 additions & 0 deletions vision/conv_mnist/README.md
@@ -0,0 +1,32 @@
# LeNet-5

![LeNet-5](../conv_mnist/docs/LeNet-5.png)

[Source](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)

## Model Info

At a high level, LeNet (LeNet-5) consists of two parts:
(i) _a convolutional encoder consisting of two convolutional layers_;
(ii) _a dense block consisting of three fully-connected layers_.

The basic units in each convolutional block are a convolutional layer, a sigmoid activation function, and a subsequent average pooling operation. Each convolutional layer uses a 5×5 kernel and a sigmoid activation function. These layers map spatially arranged inputs to a number of two-dimensional feature maps, typically increasing the number of channels: the first convolutional layer has 6 output channels, while the second has 16. Each 2×2 pooling operation (stride 2) reduces dimensionality by a factor of 4 via spatial downsampling. The convolutional block emits an output with shape (batch size, number of channels, height, width).
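
This description translates almost line for line into Flux. Below is a minimal sketch; the `pad = 2` on the first layer is an assumption to keep MNIST's 28×28 inputs at the sizes shown, and note that Flux stores convolutional data as width × height × channels × batch rather than the batch-first shape quoted above:

```julia
using Flux

lenet5 = Chain(
    Conv((5, 5), 1 => 6, sigmoid; pad = 2),   # 28×28×1 -> 28×28×6
    MeanPool((2, 2)),                         # 2×2, stride 2 -> 14×14×6
    Conv((5, 5), 6 => 16, sigmoid),           # -> 10×10×16
    MeanPool((2, 2)),                         # -> 5×5×16
    Flux.flatten,                             # -> 400 features
    Dense(400 => 120, sigmoid),
    Dense(120 => 84, sigmoid),
    Dense(84 => 10),                          # one score per digit class
)

x = rand(Float32, 28, 28, 1, 1)               # a dummy MNIST-shaped image
size(lenet5(x))                               # (10, 1)
```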

## Training

```shell
cd vision/conv_mnist
julia --project conv_mnist.jl
```

## References

* [Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)

* [Zhang, A., Lipton, Z. C., Li, M., and Smola, A. J., <i>Dive into Deep Learning</i>, 2020.](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)
5 changes: 4 additions & 1 deletion vision/conv_mnist/conv_mnist.jl
@@ -157,4 +157,7 @@ function train(; kws...)
     end
 end
 
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
Binary file added vision/conv_mnist/docs/LeNet-5.png
39 changes: 39 additions & 0 deletions vision/dcgan_mnist/README.md
@@ -0,0 +1,39 @@
# Deep Convolutional GAN (DC-GAN)

![dcgan_gen_disc](../dcgan_mnist/output/dcgan_generator_discriminator.png)
[Source](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html)

## Model Info

A DC-GAN is a direct extension of the GAN, except that it explicitly uses convolutional layers in the discriminator and transposed-convolution layers in the generator. _The discriminator is made up of strided convolution layers, batch norm layers, and LeakyReLU activations. The generator is composed of transposed-convolution layers, batch norm layers, and ReLU activations_.
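
A minimal Flux sketch of the two networks just described; the widths, kernel sizes, and latent dimension are illustrative assumptions, not the values used by `dcgan_mnist.jl`:

```julia
using Flux

# Discriminator: strided convolutions, batch norm, LeakyReLU, as described.
discriminator = Chain(
    Conv((4, 4), 1 => 64; stride = 2, pad = 1),    # 28×28 -> 14×14
    x -> leakyrelu.(x, 0.2f0),
    Conv((4, 4), 64 => 128; stride = 2, pad = 1),  # 14×14 -> 7×7
    BatchNorm(128, x -> leakyrelu(x, 0.2f0)),
    Flux.flatten,
    Dense(7 * 7 * 128 => 1),                       # real/fake logit
)

# Generator: transposed convolutions, batch norm, ReLU, as described.
generator = Chain(
    Dense(100 => 7 * 7 * 256),
    x -> reshape(x, 7, 7, 256, :),
    ConvTranspose((4, 4), 256 => 128; stride = 2, pad = 1),      # 7×7 -> 14×14
    BatchNorm(128, relu),
    ConvTranspose((4, 4), 128 => 1, tanh; stride = 2, pad = 1),  # -> 28×28×1
)
```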

## Training

```shell
cd vision/dcgan_mnist
julia --project dcgan_mnist.jl
```

## Results

2000 training steps

![2000 training steps](../dcgan_mnist/output/dcgan_steps_002000.png)

5000 training steps

![5000 training steps](../dcgan_mnist/output/dcgan_steps_005000.png)

8000 training steps

![8000 training steps](../dcgan_mnist/output/dcgan_steps_008000.png)

9380 training steps

![9380 training steps](../dcgan_mnist/output/dcgan_steps_009380.png)

## References

* [Radford, A. et al.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, http://arxiv.org/abs/1511.06434, (2015).](https://arxiv.org/pdf/1511.06434v2.pdf)

* [pytorch.org/tutorials/beginner/dcgan_faces_tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html)
6 changes: 4 additions & 2 deletions vision/dcgan_mnist/dcgan_mnist.jl
@@ -144,5 +144,7 @@ function train(; kws...)
     save(@sprintf("output/dcgan_steps_%06d.png", train_steps), output_image)
 end
 
-cd(@__DIR__)
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
+
26 changes: 26 additions & 0 deletions vision/mlp_mnist/README.md
@@ -0,0 +1,26 @@
# Multilayer Perceptron (MLP)

![mlp](../mlp_mnist/docs/mlp.svg)

[Source](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)

## Model Info

An MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP is trained with a supervised learning technique called backpropagation. Its multiple layers and nonlinear activations distinguish an MLP from a linear perceptron and let it separate data that is not linearly separable.
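
For concreteness, here is a minimal Flux sketch of such a network for MNIST-shaped inputs; the hidden width of 32 is an arbitrary assumption, not necessarily what `mlp_mnist.jl` uses:

```julia
using Flux

mlp = Chain(
    Flux.flatten,            # input layer: 28×28×1 image -> 784-vector
    Dense(784 => 32, relu),  # hidden layer with a nonlinear activation
    Dense(32 => 10),         # output layer: one score per digit class
)

x = rand(Float32, 28, 28, 1, 1)   # a dummy MNIST-shaped input
size(mlp(x))                      # (10, 1)
```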

## Training

```shell
cd vision/mlp_mnist
julia --project mlp_mnist.jl
```

## Reference

* [Zhang, A., Lipton, Z. C., Li, M., and Smola, A. J., <i>Dive into Deep Learning</i>, 2020.](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)