Benchmarks

Pytest badge Pylint badge Unit coverage badge Integration coverage badge PyPI - License

About Benchmarks

Benchmarks is a tool to monitor and log reinforcement learning experiments. You build or find any compatible agent (it only needs an act method), you build or find a gym environment, and Benchmarks makes them interact together! Benchmarks also ships with both TensorBoard and Weights & Biases integrations for beautiful, shareable experiment tracking! Benchmarks is also cross-platform: that's why no agents are built into Benchmarks itself.

You can build and run your own Agent in a clear and shareable manner!

import benchmarks as rl
import gym

class MyAgent(rl.Agent):

   def act(self, observation, greedy=False):
      """ How the Agent acts given an observation. """
      ...
      return action

   def learn(self):
      """ How the Agent learns from its experiences. """
      ...
      return logs

   def remember(self, observation, action, reward, done, next_observation=None, info=None, **kwargs):
      """ How the Agent remembers its experiences. """
      ...

env = gym.make('FrozenLake-v0', is_slippery=True) # This could be any gym-like environment!
agent = MyAgent(env.observation_space, env.action_space)

pg = rl.Playground(env, agent)
pg.fit(2000, verbose=1)

Note that 'learn' and 'remember' are optional, so this framework can also be used for baselines!
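For instance, a pure baseline only needs act. A minimal sketch (the RandomAgent class below is illustrative, not something shipped with Benchmarks):

import benchmarks as rl
import gym

class RandomAgent(rl.Agent):
   """ A baseline that acts randomly and never learns. """

   def __init__(self, observation_space, action_space):
      self.action_space = action_space

   def act(self, observation, greedy=False):
      # Sample a random action; no learn or remember needed for a baseline
      return self.action_space.sample()

env = gym.make('FrozenLake-v0', is_slippery=True)
agent = RandomAgent(env.observation_space, env.action_space)

pg = rl.Playground(env, agent)
pg.fit(2000, verbose=1)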

You can log any custom metrics that your Agent/Env gives you and even choose how to aggregate them across different timescales. See the metric codes for more details.

metrics = [
    ('reward~env-rwd', {'steps': 'sum', 'episode': 'sum'}),
    ('handled_reward~reward', {'steps': 'sum', 'episode': 'sum'}),
    'value_loss~vloss',
    'actor_loss~aloss',
    'exploration~exp',
]

pg.fit(2000, verbose=1, metrics=metrics)
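Reading the example above, the part before ~ appears to be the metric name looked up in the logs, the part after ~ its short display name, and the optional dict the aggregation function per timescale; this reading is inferred from the example, so check the metric codes documentation for the exact syntax.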

The Playground gives you clean logs tailored to your needs through the verbose parameter:

  • Verbose 1: episode cycles - if your environment runs many quick episodes.
  • Verbose 2: episode - to log each individual episode.
  • Verbose 3: step cycles - if your environment runs many quick steps but has long episodes.
  • Verbose 4: step - to log each individual step.
  • Verbose 5: detailed step - to debug each individual step (with observations, actions, ...).

Example screenshots for each verbose level are available under docs/_static/images/ (logs-verbose-1.png through logs-verbose-5.png).

The Playground also lets you add Callbacks with ease, for example the WandbCallback to get a nice experiment-tracking dashboard using Weights & Biases!
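A minimal sketch of attaching it, assuming WandbCallback is importable from benchmarks.callbacks and that fit accepts a callbacks list (the import path and constructor arguments are assumptions, not confirmed API; check the documentation for the actual signature):

import benchmarks as rl
from benchmarks.callbacks import WandbCallback  # assumed import path

# Assumed constructor arguments; see the callbacks documentation
callback = WandbCallback(run_name='my-experiment')
pg.fit(2000, verbose=1, metrics=metrics, callbacks=[callback])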

Installation

Install Benchmarks by running:

pip install benchmarks

Documentation

See the latest complete documentation for more details.
See the development documentation to see what's coming!

Contribute

Support

If you are having issues, please contact us on Discord.

License

The project is licensed under the GNU LGPLv3 license.
See LICENCE, COPYING and COPYING.LESSER for more details.