Merge pull request #149 from FluxML/darsnack/fix-docs-link
Fix issues post move to Franklin.jl
darsnack authored Oct 14, 2022
2 parents a2cedb3 + 157dcb2 commit e7d446f
Showing 28 changed files with 23 additions and 23 deletions.
Binary file added _assets/blogposts/2019-03-05-dp-vs-rl/bptt.png
2 changes: 1 addition & 1 deletion _layout/navbar.html
@@ -13,7 +13,7 @@
<a class="nav-link" href="/getting_started/">Getting Started</a>
</li>
<li class="nav-item">
-<a class="nav-link" href="{{website_url}}/Flux.jl/" target="_blank">Docs</a>
+<a class="nav-link" href="https://fluxml.ai/Flux.jl/" target="_blank">Docs</a>
</li>
<li class="nav-item">
<a class="nav-link" href="/blog/">Blog</a>
20 changes: 10 additions & 10 deletions blogposts/2019-03-05-dp-vs-rl.md
@@ -10,7 +10,7 @@ We've discussed the idea of [differentiable programming](https://fluxml.ai/2019/

Differentiation is what makes deep learning tick; given a function $y = f(x)$ we use the gradient $\frac{dy}{dx}$ to figure out how a change in $x$ will affect $y$. Despite the mathematical clothing, gradients are actually a very general and intuitive concept. Forget the formulas you had to stare at in school; let's do something more fun, like throwing stuff.

-![](/assets/2019-03-05-dp-vs-rl/trebuchet-basic.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/trebuchet-basic.gif)

When we throw things with a trebuchet, our $x$ represents a setting (say, the size of the counterweight, or the angle of release), and $y$ is the distance the projectile travels before landing. If you're trying to aim, the gradient tells you something very useful – whether a change in aim will increase or decrease the distance. To maximise distance, just follow the gradient.

@@ -31,19 +31,19 @@ Now we have that, let's do something interesting with it.

A simple way to use this is to aim the trebuchet at a target, using gradients to fine-tune the angle of release; this kind of thing is common under the name of _parameter estimation_, and we've [covered examples like it before](https://julialang.org/blog/2019/01/fluxdiffeq). We can make things more interesting by going meta: instead of aiming the trebuchet given a single target, we'll optimise a neural network that can aim it given _any_ target. Here's how it works: the neural net takes two inputs, the target distance in metres and the current wind speed. The network spits out trebuchet settings (the mass of the counterweight and the angle of release) that get fed into the simulator, which calculates the achieved distance. We then compare to our target, and _backpropagate through the entire chain_, end to end, to adjust the weights of the network. Our "dataset" is a randomly chosen set of targets and wind speeds.

-![](/assets/2019-03-05-dp-vs-rl/trebuchet-flow.png)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/trebuchet-flow.png)

A nice property of this simple model is that training it is _fast_, because we've expressed exactly what we want from the model in a fully differentiable way. Initially it looks like this:

-![](/assets/2019-03-05-dp-vs-rl/trebuchet-miss.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/trebuchet-miss.gif)

After about five minutes of training (on a single core of my laptop's CPU), it looks like this:

-![](/assets/2019-03-05-dp-vs-rl/trebuchet-hit.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/trebuchet-hit.gif)

If you want to try pushing it, turn up the wind speed:

-![](/assets/2019-03-05-dp-vs-rl/trebuchet-wind.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/trebuchet-wind.gif)

It's only off by 16cm, or about 0.3%.

@@ -55,7 +55,7 @@ This is about the simplest possible control problem, which we use mainly for ill

A more recognisable control problem is [CartPole](https://gym.openai.com/envs/CartPole-v0/), the "hello world" for reinforcement learning. The task is to learn to balance an upright pole by nudging its base left or right. Our setup is broadly similar to the trebuchet case: a [Julia implementation](https://github.com/tejank10/Gym.jl) means we can directly treat the reward produced by the environment as a loss. ∂P allows us to switch seamlessly from model-free to model-based RL.

-![](/assets/2019-03-05-dp-vs-rl/cartpole-flow.png)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/cartpole-flow.png)

The astute reader may notice a snag. The action space for cartpole – nudge left or right – is discrete, and therefore not differentiable. We solve this by introducing a _differentiable discretisation_, defined [like so](https://github.com/FluxML/model-zoo/blob/cdda5cad3e87b216fa67069a5ca84a3016f2a604/games/differentiable-programming/cartpole/DiffRL.jl#L32):

@@ -74,22 +74,22 @@ In other words, we force the gradient to behave as if $f$ were the identity func

The results speak for themselves. Where RL methods need to train for hundreds of episodes before solving the problem, the ∂P model only needs around 5 episodes to win conclusively.

-![](/assets/2019-03-05-dp-vs-rl/cartpole.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/cartpole.gif)


## The Pendulum & Backprop through Time

An important aim for RL is to handle _delayed reward_, when an action doesn't help us until several steps in the future. ∂P allows this too, and in a very familiar way: when the environment is differentiable, we can actually train the agent using backpropagation through time, just like a recurrent net! In this case the environmental state becomes the "hidden state" that changes between time steps.

-![](/assets/2019-03-05-dp-vs-rl/bptt.png)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/bptt.png)

To demonstrate this technique we looked at the [pendulum](https://github.com/openai/gym/wiki/Pendulum-v0) environment, where the task is to swing a pendulum until it stands upright, keeping it balanced with minimal effort. This is hard for RL models; after around 20 episodes of training the problem is solved, but often the route to a solution is visibly sub-optimal. In contrast, BPTT can beat the [RL leaderboard](https://github.com/openai/gym/wiki/Leaderboard#pendulum-v0) in _a single episode of training_. It's instructive to actually watch this episode unfold; at the beginning of the recording the strategy is random, and the model improves over time. The pace of learning is almost alarming.

-![](/assets/2019-03-05-dp-vs-rl/pendulum-training.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/pendulum-training.gif)

Despite only experiencing a single episode, the model generalises well to handle any initial angle, and has something pretty close to the optimal strategy. When restarted the model looks more like this.

-![](/assets/2019-03-05-dp-vs-rl/pendulum-dp.gif)
+![](/assets/blogposts/2019-03-05-dp-vs-rl/pendulum-dp.gif)

This is just the beginning; we'll get the real wins applying DP to environments that are too hard for RL to work with at all, where rich simulations and models already exist (as in much of engineering and the sciences), and where interpretability is an important factor (as in medicine).

4 changes: 2 additions & 2 deletions blogposts/2020-06-29-acclerating-flux-torch.md
@@ -12,8 +12,8 @@ For popular object detection models - ResNet50, ResNet101 and VGG19 - we compare

~~~
<p float="middle">
-<img src="/assets/2020-06-29-acclerating-flux-torch/combined_benchmarks_2.png">
-<img src="/assets/2020-06-29-acclerating-flux-torch/resnet101.png" height="300">
+<img src="/assets/blogposts/2020-06-29-acclerating-flux-torch/combined_benchmarks_2.png">
+<img src="/assets/blogposts/2020-06-29-acclerating-flux-torch/resnet101.png" height="300">
</p>
~~~

12 changes: 6 additions & 6 deletions blogposts/2020-12-20-Flux3D.md
@@ -12,7 +12,7 @@ Performing 3D vision tasks involve preparing datasets to fit a certain represent

~~~
<div style="text-align:center">
-<img width="400" src="/assets/2020-12-20-Flux3D/visualize_anim.gif">
+<img width="400" src="/assets/blogposts/2020-12-20-Flux3D/visualize_anim.gif">
</div>
~~~

@@ -36,9 +36,9 @@ Kaolin is a popular 3D vision library based on PyTorch. Flux3D.jl is overall fas

~~~
<p float="middle">
-<img src="/assets/2020-12-20-Flux3D/bm_pcloud.png">
-<img src="/assets/2020-12-20-Flux3D/bm_trimesh.png">
-<img src="/assets/2020-12-20-Flux3D/bm_metrics.png">
+<img src="/assets/blogposts/2020-12-20-Flux3D/bm_pcloud.png">
+<img src="/assets/blogposts/2020-12-20-Flux3D/bm_trimesh.png">
+<img src="/assets/blogposts/2020-12-20-Flux3D/bm_metrics.png">
</p>
~~~

@@ -158,7 +158,7 @@ Additonally, 3D structures and all relevant transforms, as well as metrics, are

~~~
<div style="text-align:center">
-<img width="300" src="/assets/2020-12-20-Flux3D/fitmesh_anim.gif">
+<img width="300" src="/assets/blogposts/2020-12-20-Flux3D/fitmesh_anim.gif">
</div>
~~~

@@ -183,7 +183,7 @@ julia> vbox(

~~~
<div style="text-align:center">
-<img src="/assets/2020-12-20-Flux3D/visualize.png">
+<img src="/assets/blogposts/2020-12-20-Flux3D/visualize.png">
</div>
~~~

2 changes: 1 addition & 1 deletion blogposts/2021-12-1-flux-numfocus.md
@@ -6,7 +6,7 @@ author = "Dhairya Gandhi, Logan Kilpatrick"

~~~
<p float="middle">
-<img src="/assets/2021-12-1-flux-numfocus/flux_numfocus.png" height="300">
+<img src="/assets/blogposts/2021-12-1-flux-numfocus/flux_numfocus.png" height="300">
</p>
~~~

6 changes: 3 additions & 3 deletions tutorialposts/2021-10-08-dcgan-mnist.md
@@ -11,7 +11,7 @@ This is a beginner level tutorial for generating images of handwritten digits us

A GAN is composed of two sub-models - the **generator** and the **discriminator** acting against one another. The generator can be considered as an artist who draws (generates) new images that look real, whereas the discriminator is a critic who learns to tell real images apart from fakes.

-![](/assets/2021-10-8-dcgan-mnist/cat_gan.png)
+![](/assets/tutorialposts/2021-10-8-dcgan-mnist/cat_gan.png)

The GAN starts with a generator and discriminator which have very little or no idea about the underlying data. During training, the generator progressively becomes better at creating images that look real, while the discriminator becomes better at telling them apart. The process reaches equilibrium when the discriminator can no longer distinguish real images from fakes.

@@ -25,7 +25,7 @@ This tutorial demonstrates the process of training a DC-GAN on the [MNIST datase
~~~
<br><br>
<p align="center">
-<img src="/assets/2021-10-8-dcgan-mnist/output.gif" align="middle" width="200">
+<img src="/assets/tutorialposts/2021-10-8-dcgan-mnist/output.gif" align="middle" width="200">
</p>
~~~

@@ -361,7 +361,7 @@ save("./output.gif", gif_mat)
```
<br>
<p align="center">
-<img src="/assets/2021-10-8-dcgan-mnist/output.gif" align="middle" width="200">
+<img src="/assets/tutorialposts/2021-10-8-dcgan-mnist/output.gif" align="middle" width="200">
</p>

## Resources & References
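
Note on the pattern: every image-path change in this commit inserts a section directory (`blogposts` or `tutorialposts`) between `/assets/` and the date-slugged post folder. The sketch below shows one way such a rewrite could be scripted. It is an illustration only, not how the commit was produced: the helper name `rewrite_asset_paths` and the regex are hypothetical, and it assumes every post asset folder begins with a `YYYY-M-D-` style slug.

```python
import re

# Hypothetical helper, not part of this commit.
# Assumption: post asset folders start with a date slug such as
# "2019-03-05-" or "2021-10-8-". The lookahead leaves the slug in place.
_ASSET_PREFIX = re.compile(r"/assets/(?=\d{4}-\d{1,2}-\d{1,2}-)")


def rewrite_asset_paths(text: str, section: str) -> str:
    """Rewrite /assets/<slug>/... to /assets/<section>/<slug>/...

    Paths that already contain a section directory are left unchanged,
    because the lookahead only matches when a date slug follows /assets/.
    """
    return _ASSET_PREFIX.sub(f"/assets/{section}/", text)


if __name__ == "__main__":
    old_line = "![](/assets/2019-03-05-dp-vs-rl/bptt.png)"
    print(rewrite_asset_paths(old_line, "blogposts"))
```

Because the pattern only matches when a date slug directly follows `/assets/`, running the helper twice on the same text should be a no-op.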