
update slides #233

Open · wants to merge 9 commits into base: master
4 changes: 2 additions & 2 deletions DP/README.md
@@ -24,7 +24,7 @@

**Required:**

-- David Silver's RL Course Lecture 3 - Planning by Dynamic Programming ([video](https://www.youtube.com/watch?v=Nd1-UUMVfz4), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/DP.pdf))
+- David Silver's RL Course Lecture 3 - Planning by Dynamic Programming ([video](https://www.youtube.com/watch?v=Nd1-UUMVfz4), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/DP.pdf))

**Optional:**

@@ -47,4 +47,4 @@

- Implement Gambler's Problem
- [Exercise](Gamblers%20Problem.ipynb)
-- [Solution](Gamblers%20Problem%20Solution.ipynb)
+- [Solution](Gamblers%20Problem%20Solution.ipynb)
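The Gambler's Problem exercise above can be approached with value iteration (Sutton & Barto, Example 4.3). Below is a minimal sketch assuming the standard formulation — a coin with heads probability `P_H = 0.4`, legal stakes of `1..min(s, 100 - s)`, and reward +1 only for reaching the goal capital — not the notebook's reference solution:

```python
# Illustrative value iteration for the Gambler's Problem; the constants
# below (GOAL, P_H, theta) are assumptions, not taken from the notebooks.

GOAL = 100
P_H = 0.4  # probability the coin comes up heads

def value_iteration(p_h=P_H, theta=1e-9):
    """Return V, where V[s] estimates the win probability from capital s."""
    V = [0.0] * (GOAL + 1)
    V[GOAL] = 1.0  # terminal "win" state carries the only reward
    while True:
        delta = 0.0
        for s in range(1, GOAL):
            # Bellman optimality backup over all legal stakes from state s.
            best = max(
                p_h * V[s + a] + (1 - p_h) * V[s - a]
                for a in range(1, min(s, GOAL - s) + 1)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place (Gauss-Seidel) update
        if delta < theta:
            break
    return V

V = value_iteration()
print(round(V[50], 3))  # 0.4: from capital 50, staking everything wins with p_h
```

For a subfair coin (`p_h < 0.5`) bold play is optimal from capital 50, so `V[50]` converges to `p_h` exactly, which is a handy sanity check for the exercise.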
2 changes: 1 addition & 1 deletion DQN/README.md
@@ -24,7 +24,7 @@

- [Human-Level Control through Deep Reinforcement Learning](http://www.readcube.com/articles/10.1038/nature14236)
- [Demystifying Deep Reinforcement Learning](https://ai.intel.com/demystifying-deep-reinforcement-learning/)
-- David Silver's RL Course Lecture 6 - Value Function Approximation ([video](https://www.youtube.com/watch?v=UoPei5o4fps), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/FA.pdf))
+- David Silver's RL Course Lecture 6 - Value Function Approximation ([video](https://www.youtube.com/watch?v=UoPei5o4fps), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/FA.pdf))

**Optional:**

2 changes: 1 addition & 1 deletion FA/README.md
@@ -24,7 +24,7 @@

**Required:**

-- David Silver's RL Course Lecture 6 - Value Function Approximation ([video](https://www.youtube.com/watch?v=UoPei5o4fps), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/FA.pdf))
+- David Silver's RL Course Lecture 6 - Value Function Approximation ([video](https://www.youtube.com/watch?v=UoPei5o4fps), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/FA.pdf))
- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/RLbook2018.pdf) - Chapter 9: On-policy Prediction with Approximation
- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/RLbook2018.pdf) - Chapter 10: On-policy Control with Approximation

9 changes: 5 additions & 4 deletions Introduction/README.md
@@ -8,7 +8,8 @@
### Summary

- Reinforcement Learning (RL) is concerned with goal-directed learning and decision-making.
-- In RL an agent learns from experiences it gains by interacting with the environment. In Supervised Learning we cannot affect the environment.
+- In RL an agent learns from experience gained by interacting with the environment. In Supervised Learning the learner cannot affect the environment; it learns from a fixed dataset.
+- Moreover, RL provides evaluative feedback (how good the chosen action was), whereas Supervised Learning provides instructive feedback (the correct answer, regardless of what was chosen).
- In RL rewards are often delayed in time and the agent tries to maximize a long-term goal. For example, one may need to make seemingly suboptimal moves to reach a winning position in a game.
- An agent interacts with the environment via states, actions and rewards.

@@ -18,12 +19,12 @@
**Required:**

- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/RLbook2018.pdf) - Chapter 1: The Reinforcement Learning Problem
-- David Silver's RL Course Lecture 1 - Introduction to Reinforcement Learning ([video](https://www.youtube.com/watch?v=2pWv7GOvuf0), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/intro_RL.pdf))
+- David Silver's RL Course Lecture 1 - Introduction to Reinforcement Learning ([video](https://www.youtube.com/watch?v=2pWv7GOvuf0), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/intro_RL.pdf))
- [OpenAI Gym Tutorial](https://gym.openai.com/docs)

**Optional:**

-N/A
+- [RL vs Supervised Learning Blog](https://www.quora.com/What-is-the-difference-between-supervised-learning-and-reinforcement-learning)


### Exercises
4 changes: 2 additions & 2 deletions MC/README.md
@@ -31,8 +31,8 @@

**Optional:**

-- David Silver's RL Course Lecture 4 - Model-Free Prediction ([video](https://www.youtube.com/watch?v=PnHCvfgC_ZA), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MC-TD.pdf))
-- David Silver's RL Course Lecture 5 - Model-Free Control ([video](https://www.youtube.com/watch?v=0g4j2k_Ggc4), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/control.pdf))
+- David Silver's RL Course Lecture 4 - Model-Free Prediction ([video](https://www.youtube.com/watch?v=PnHCvfgC_ZA), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/MC-TD.pdf))
+- David Silver's RL Course Lecture 5 - Model-Free Control ([video](https://www.youtube.com/watch?v=0g4j2k_Ggc4), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/control.pdf))


### Exercises
2 changes: 1 addition & 1 deletion MDP/README.md
@@ -26,7 +26,7 @@
**Required:**

- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/RLbook2018.pdf) - Chapter 3: Finite Markov Decision Processes
-- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf))
+- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/MDP.pdf))


### Exercises
2 changes: 1 addition & 1 deletion PolicyGradient/README.md
@@ -32,7 +32,7 @@

**Required:**

-- David Silver's RL Course Lecture 7 - Policy Gradient Methods ([video](https://www.youtube.com/watch?v=KHZVXao4qXs), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/pg.pdf))
+- David Silver's RL Course Lecture 7 - Policy Gradient Methods ([video](https://www.youtube.com/watch?v=KHZVXao4qXs), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/pg.pdf))

**Optional:**

4 changes: 2 additions & 2 deletions TD/README.md
@@ -29,8 +29,8 @@
**Required:**

- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/RLbook2018.pdf) - Chapter 6: Temporal-Difference Learning
-- David Silver's RL Course Lecture 4 - Model-Free Prediction ([video](https://www.youtube.com/watch?v=PnHCvfgC_ZA), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MC-TD.pdf))
-- David Silver's RL Course Lecture 5 - Model-Free Control ([video](https://www.youtube.com/watch?v=0g4j2k_Ggc4), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/control.pdf))
+- David Silver's RL Course Lecture 4 - Model-Free Prediction ([video](https://www.youtube.com/watch?v=PnHCvfgC_ZA), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/MC-TD.pdf))
+- David Silver's RL Course Lecture 5 - Model-Free Control ([video](https://www.youtube.com/watch?v=0g4j2k_Ggc4), [slides](https://www.davidsilver.uk/wp-content/uploads/2020/03/control.pdf))

**Optional:**
