forked from ray-project/ray
Commit
[RLlib] A2C + A3C move to `algorithms` folder and re-name into A2C/A3C (from ...Trainer). (ray-project#25314)
Showing 30 changed files with 181 additions and 112 deletions.
@@ -1,4 +1,21 @@
-from ray.rllib.agents.a3c.a3c import A3CConfig, A3CTrainer, DEFAULT_CONFIG
-from ray.rllib.agents.a3c.a2c import A2CConfig, A2CTrainer
+from ray.rllib.algorithms.a2c.a2c import (
+    A2CConfig,
+    A2C as A2CTrainer,
+    A2C_DEFAULT_CONFIG,
+)
+from ray.rllib.algorithms.a3c.a3c import A3CConfig, A3C as A3CTrainer, DEFAULT_CONFIG
+from ray.rllib.utils.deprecation import deprecation_warning
 
-__all__ = ["A2CConfig", "A2CTrainer", "A3CConfig", "A3CTrainer", "DEFAULT_CONFIG"]
+
+__all__ = [
+    "A2CConfig",
+    "A2C_DEFAULT_CONFIG",  # deprecated
+    "A2CTrainer",
+    "A3CConfig",
+    "A3CTrainer",
+    "DEFAULT_CONFIG",  # A3C default config (deprecated)
+]
+
+deprecation_warning(
+    "ray.rllib.agents.a3c", "ray.rllib.algorithms.[a3c|a2c]", error=False
+)
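The hunk above keeps the old `ray.rllib.agents.a3c` import path working by re-exporting the renamed classes under their old names and emitting a deprecation warning at import time. A minimal, self-contained sketch of that alias-and-warn pattern follows. It uses the standard-library `warnings` module rather than RLlib's own `deprecation_warning` helper, and `NewAlgo`, `OldTrainer`, and `load_legacy_namespace` are hypothetical names used only for illustration.

```python
import warnings


class NewAlgo:
    """Stand-in for a renamed class such as A2C (hypothetical)."""

    def train(self):
        return "trained"


def load_legacy_namespace():
    """Mimic importing a legacy package that only re-exports new names.

    Returns a dict playing the role of the old module's namespace and
    emits a DeprecationWarning, much as the shim in the diff warns on import.
    """
    warnings.warn(
        "old.path is deprecated; use new.path instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # The old name is an alias for the new class, not a copy of it.
    return {"OldTrainer": NewAlgo}


# Record warnings so the effect of the "import" can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = load_legacy_namespace()

OldTrainer = legacy["OldTrainer"]
```

Because the old name is only an alias, code that still refers to it gets exactly the new class; the warning is the sole behavioral difference.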
@@ -0,0 +1,19 @@
+# Advantage Actor-Critic (A2C)
+
+## Overview
+
+[Advantage Actor-Critic](https://arxiv.org/pdf/1602.01783.pdf) proposes two distributed model-free on-policy RL algorithms, A3C and A2C.
+These algorithms are distributed versions of the vanilla Policy Gradient (PG) algorithm with different distributed execution patterns.
+The paper suggests accelerating training via scaling data collection, i.e. introducing worker nodes,
+which carry copies of the central node's policy network and collect data from the environment in parallel.
+This data is used on each worker to compute gradients. The central node applies each of these gradients and then sends updated weights back to the workers.
+
+In A2C, the worker nodes synchronously collect data. The collected data forms a giant batch of data,
+from which the central node (the central policy) computes gradient updates.
+
+
+## Documentation & Implementation of A2C:
+
+**[Detailed Documentation](https://docs.ray.io/en/master/rllib-algorithms.html#a2c)**
+
+**[Implementation](https://github.com/ray-project/ray/blob/master/rllib/algorithms/a2c/a2c.py)**
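The synchronous execution pattern the README describes for A2C (all workers collect data, the fragments are concatenated into one large batch, and the central node computes a single gradient update) can be sketched on a toy problem. This is only an illustration of the data flow under stated assumptions, not RLlib's implementation: the "workers" run sequentially in one process, the loss is a plain squared error for fitting `y = 3x` rather than an actor-critic loss, and `collect`, `gradient`, and `sync_step` are hypothetical helpers.

```python
def collect(worker_id, n):
    """Hypothetical worker rollout: n samples (x, y) drawn from y = 3x."""
    return [(float(worker_id + i), 3.0 * (worker_id + i)) for i in range(1, n + 1)]


def gradient(w, batch):
    """Gradient w.r.t. w of the mean of 0.5 * (w*x - y)**2 over the batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def sync_step(w, num_workers=4, samples_per_worker=8, lr=0.01):
    """One synchronous iteration: collect everywhere, then update centrally."""
    # 1) Every worker collects with the same weights; the central node
    #    waits until all fragments have arrived (simulated sequentially).
    fragments = [collect(wid, samples_per_worker) for wid in range(num_workers)]
    # 2) The fragments are concatenated into one large train batch.
    batch = [sample for fragment in fragments for sample in fragment]
    # 3) A single gradient is computed centrally and applied; the new
    #    weight is what would be broadcast back to the workers.
    return w - lr * gradient(w, batch), len(batch)


w = 0.0
for _ in range(50):
    w, batch_size = sync_step(w)
```

In the asynchronous (A3C) pattern described in the overview, each worker would instead compute a gradient on its own fragment and the central node would apply those gradients one at a time as they arrive.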
@@ -0,0 +1,3 @@
+from ray.rllib.algorithms.a2c.a2c import A2CConfig, A2C, A2C_DEFAULT_CONFIG
+
+__all__ = ["A2CConfig", "A2C", "A2C_DEFAULT_CONFIG"]
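Downstream code that has to run against Ray versions from both before and after this move could resolve the class at import time. The sketch below is one possible approach, not something this commit provides: `first_available` is a hypothetical helper, and the two module paths and attribute names are taken from the hunks in this diff. If neither path can be imported (for example, Ray is not installed), the result is `None`.

```python
import importlib


def first_available(candidates):
    """Return the first attribute that can be imported, or None.

    candidates is an iterable of (module_path, attribute_name) pairs,
    tried in order.
    """
    for module_path, attr in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # Module (or one of its parent packages) is unavailable.
        found = getattr(module, attr, None)
        if found is not None:
            return found
    return None


# Prefer the new location from this commit, then fall back to the legacy alias.
A2C = first_available(
    [
        ("ray.rllib.algorithms.a2c", "A2C"),
        ("ray.rllib.agents.a3c", "A2CTrainer"),
    ]
)
```

Trying the new path first avoids triggering the deprecation warning that the legacy package emits on import.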