- Gymnasium
- DeepMind Control Suite wrapper
- ELU activation
- Optional state-action merging layer index (Critic model)
- Optimized critic
- Optimized server
- `backend.epsilon()` from Keras backend
- update default `config.yaml`
- `.fit()`
- AgentCallback
- Render environments to WanDB
- Grouping of runs in WanDB
- SampleToInsertRatio rate limiter (sketch below)
- Global Gradient Clipping to avoid exploding gradients (sketch below)
- Softplus for numerical stability (sketch below)
- YAML configuration file
- LogCosh instead of Huber loss (sketch below)
- Critic network with Add layer applied on state & action branches (sketch below)
- Custom uniform initializer
- XLA (Accelerated Linear Algebra) compiler
- Optimized Replay Buffer (google-deepmind/reverb#90)
- split into Agent, Learner, Tester and Server
- Fixed creation of the saving path for models
- Fixed model's `summary()`
- Reverb
- `setup.py` (package is available on PyPI)
- split into Agent, Learner and Tester
- Use custom model and layer for defining Actor-Critic
- MultiCritic - concatenating multiple critic networks into one network (sketch below)
- Truncated Quantile Critics
- update Dockerfile
- update `README.md`
- formatted code with Black & Flake8
- fixed Critic model
- Add Huber loss
- In test mode, render to a video file
- Normalized observations with the min-max method (sketch below)
- Remove TD3 algorithm
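
The SampleToInsertRatio rate limiter comes from DeepMind Reverb. A minimal sketch of wiring it into a Reverb table follows; the table name, the 32:1 ratio, and the buffer sizes are placeholder assumptions, not values taken from this project's `config.yaml`.

```python
import reverb

# Hypothetical table definition; all sizes and the 32:1 ratio are placeholders.
table = reverb.Table(
    name="experience",
    sampler=reverb.selectors.Uniform(),
    remover=reverb.selectors.Fifo(),
    max_size=1_000_000,
    # Keep sampling and inserting roughly at 32 samples per insert, blocking
    # whichever side runs too far ahead once the error buffer is exceeded.
    rate_limiter=reverb.rate_limiters.SampleToInsertRatio(
        samples_per_insert=32.0,
        min_size_to_sample=1_000,
        error_buffer=1_000.0,
    ),
)

server = reverb.Server(tables=[table], port=8000)
```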
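
For global gradient clipping, a sketch using the Keras optimizer option is shown below; the clip value of 1.0 and the learning rate are assumptions, not the project's settings.

```python
import tensorflow as tf

# global_clipnorm rescales the whole gradient vector when its global norm
# exceeds the threshold (1.0 here is an assumed value).
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-4, global_clipnorm=1.0)

# The same effect can be achieved manually inside a custom training step:
#   grads = tape.gradient(loss, model.trainable_variables)
#   grads, _ = tf.clip_by_global_norm(grads, 1.0)
#   optimizer.apply_gradients(zip(grads, model.trainable_variables))
```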
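
Softplus is typically used to keep a predicted scale (for example the policy's standard deviation) strictly positive without the overflow risk of `exp()`. The head below is only an illustrative sketch; the action dimension and layer sizes are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

action_dim = 6  # placeholder

latent = layers.Input(shape=(256,), name="latent_features")
mean = layers.Dense(action_dim, name="mean")(latent)
# Softplus maps any real input to a positive value and grows linearly for
# large inputs, which is numerically safer than exponentiating a raw log-std.
std = layers.Dense(action_dim, activation="softplus", name="std")(latent)

policy_head = Model(inputs=latent, outputs=[mean, std])
```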
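
Log-cosh behaves like a squared error for small residuals and like an absolute error for large ones, similar to Huber but smooth everywhere. A minimal usage sketch with placeholder tensors:

```python
import tensorflow as tf

log_cosh = tf.keras.losses.LogCosh()

td_target = tf.constant([[1.0], [2.0], [3.0]])
q_value = tf.constant([[1.1], [1.5], [4.0]])

loss = log_cosh(td_target, q_value)  # scalar mean log-cosh error
```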
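
The critic with an Add layer on the state and action branches can be sketched with the Keras functional API as below; the layer widths, the ELU activations on both branches, and the single merge point are assumptions rather than the project's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_critic(state_dim: int, action_dim: int) -> Model:
    state_input = layers.Input(shape=(state_dim,), name="state")
    action_input = layers.Input(shape=(action_dim,), name="action")

    # Separate branches for state and action, merged by element-wise addition
    # instead of concatenation (both branches must end with the same width).
    s = layers.Dense(400, activation="elu")(state_input)
    s = layers.Dense(300, activation="elu")(s)
    a = layers.Dense(300, activation="elu")(action_input)

    x = layers.Add()([s, a])
    x = layers.Dense(300, activation="elu")(x)
    q_value = layers.Dense(1, name="q_value")(x)

    return Model(inputs=[state_input, action_input], outputs=q_value)
```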
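
A MultiCritic that concatenates several critic sub-networks into one model can be sketched as follows; the number of critics and all layer sizes are placeholders, and a full Truncated Quantile Critics head would output several quantiles per critic rather than a single Q-value.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_multi_critic(state_dim: int, action_dim: int, n_critics: int = 2) -> Model:
    state_input = layers.Input(shape=(state_dim,), name="state")
    action_input = layers.Input(shape=(action_dim,), name="action")
    merged = layers.Concatenate()([state_input, action_input])

    # Independent critic branches whose outputs are concatenated into one tensor.
    q_values = []
    for i in range(n_critics):
        x = layers.Dense(256, activation="elu")(merged)
        x = layers.Dense(256, activation="elu")(x)
        q_values.append(layers.Dense(1, name=f"q_value_{i}")(x))

    outputs = layers.Concatenate(axis=-1, name="q_values")(q_values)
    return Model(inputs=[state_input, action_input], outputs=outputs)
```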
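
Min-max normalization rescales each observation dimension into [0, 1] using the observation space bounds. A small sketch; the bounds would typically come from `env.observation_space.low` / `.high` and are assumed to be finite.

```python
import numpy as np

def min_max_normalize(obs: np.ndarray, low: np.ndarray, high: np.ndarray) -> np.ndarray:
    # Element-wise rescaling into [0, 1]; assumes finite, non-equal bounds.
    return (obs - low) / (high - low)

obs = np.array([0.0, 5.0, -1.0])
low = np.array([-1.0, 0.0, -2.0])
high = np.array([1.0, 10.0, 2.0])
print(min_max_normalize(obs, low, high))  # [0.5  0.5  0.25]
```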