An implementation of basic RL algorithms
This repo only includes simple algorithms on simple environments.
Current algorithms:
- Offline Lambda-return (grid world)
- REINFORCE (CartPole-v1)
- Actor-Critic (CartPole-v1)
- Q-learning (Taxi-v3, FrozenLake-v1)
- Deep Q-Network (CartPole-v1)
- Prioritized Experience Replay (PER) (Breakout-v5)
- N-step SARSA (FrozenLake-v1)
- SARSA (FrozenLake-v1)
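
To give a feel for the tabular methods in the list, here is a minimal Q-learning sketch. It is not taken from this repo's code: the five-state chain environment, the names `step` and `q_learning`, and the hyperparameters are all assumptions made for illustration, and it uses only the Python standard library so it runs without Gym installed.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy chain dynamics: reward 1.0 only when the terminal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # with no bootstrapping from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

SARSA differs from this sketch only in the target: it bootstraps from the action actually taken in `next_state` rather than from `max(q[next_state])`.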