-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate bigym training pipeline #1
Conversation
…, need to check if this is correct
@@ -218,6 +218,7 @@ def __init__( | |||
|
|||
@staticmethod | |||
def transform_from_tanh(action, action_stats, min_max_margin): | |||
action = action.clip(-1.0, 1.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is because the action head of ACT is just a simple linear layer, which doesn't transform policy output to [-1, 1], and it may go out of bounds during execution so we have to clip it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to clip the action in def act
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richielo Would this break Genima?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it could be better to keep this line, and don't modify ACT as we're not sure if it will break genima or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think it would be better to leave this line here. As we always assume actions to be bounded by [-1, 1] given transformations in robobase.
It could be good to check again if we should include low_dim_obs normalization for running ACT in BiGym |
Found that the official implementation actually does the normalization, but we didn't do that in our previous code. I can add a wrapper here to align with the official one. But it shouldn't matter too much I guess. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me with some minor comments
This PR integrates the BiGym training pipeline with robobase and verifies the pipeline with ACT