feature(pu): add UniZero algo. and related configs/utils/envs/models #232

Merged: 184 commits into main on Jul 3, 2024

Conversation

@puyuan1996 (Collaborator) commented on Jun 11, 2024

Resolved review threads:
lzero/worker/muzero_evaluator.py (outdated)
lzero/policy/utils.py (outdated)
lzero/model/utils.py (outdated)
lzero/model/utils.py (outdated)
lzero/entry/eval_muzero.py (outdated)
lzero/entry/train_muzero.py (outdated)
lzero/entry/train_muzero.py (outdated)
lzero/entry/train_unizero.py (outdated)
lzero/model/common.py (outdated)
lzero/model/efficientzero_model.py (outdated)
lzero/model/efficientzero_model.py (outdated)
lzero/model/common.py
lzero/model/common.py (outdated)
lzero/model/unizero_model.py (outdated)
lzero/model/unizero_world_models/transformer.py (outdated)
lzero/model/unizero_world_models/transformer.py (outdated)
lzero/model/unizero_world_models/tokenizer.py (outdated)
lzero/model/unizero_world_models/tokenizer.py
lzero/model/unizero_world_models/tokenizer.py
Review thread on lzero/entry/train_unizero.py (outdated, resolved):

if cfg.policy.use_priority:
    replay_buffer.update_priority(train_data, log_vars[0]['value_priority_orig'])

# Clear caches and precompute positional embedding matrices
Member: Move this part to the __del__ method of world_model.

Collaborator (Author): This has to be called once per train epoch, which is why I added a new recompute_pos_emb_diff_and_clear_cache() method in UniZero.
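
A minimal, runnable sketch of the control flow being debated here: why the cleanup lives in an explicitly-called method rather than in __del__. The WorldModel stub and the loop are hypothetical; only the method name recompute_pos_emb_diff_and_clear_cache comes from the reply above.

class WorldModel:
    """Hypothetical stand-in for UniZero's world model."""

    def __init__(self) -> None:
        # KV caches filled during a train epoch's forward passes.
        self.kv_cache = {}

    def recompute_pos_emb_diff_and_clear_cache(self) -> None:
        # Clear stale KV caches; per the reply above, the real method also
        # precomputes the positional-embedding matrices for the next epoch.
        self.kv_cache.clear()


world_model = WorldModel()
for epoch in range(3):  # stand-in for the train loop in train_unizero.py
    ...  # one train epoch populates world_model.kv_cache
    # Called once per train epoch. __del__ would fire only when the model is
    # garbage-collected, far too late for per-epoch cache maintenance.
    world_model.recompute_pos_emb_diff_and_clear_cache()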

Review thread on lzero/model/unizero_world_models/kv_caching.py (outdated, resolved):
"""
assert embed_dim % num_heads == 0
self._n, self._cache, self._size = num_samples, None, None
self._reset = lambda n: torch.empty(n, num_heads, max_tokens, embed_dim // num_heads,
Member: Why use a lambda function here rather than writing the implementation directly in the reset method?

Collaborator (Author): Using a lambda function here avoids passing a lot of parameters around; the effect is the same.
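
To make the trade-off concrete, here is a runnable sketch of the two styles. Only the constructor arguments visible in the snippet above are taken from the PR; the class names and everything else are illustrative, not the actual kv_caching.py code.

import torch


class KVCacheWithLambda:
    """The lambda closes over num_heads, max_tokens, and embed_dim from
    __init__, so reset() needs no parameters of its own."""

    def __init__(self, num_samples: int, num_heads: int, max_tokens: int, embed_dim: int) -> None:
        assert embed_dim % num_heads == 0
        self._n, self._cache, self._size = num_samples, None, None
        self._reset = lambda n: torch.empty(n, num_heads, max_tokens, embed_dim // num_heads)

    def reset(self) -> None:
        self._cache = self._reset(self._n)
        self._size = 0


class KVCacheWithAttributes:
    """Equivalent without the lambda: store the shape parameters on self
    and build the tensor inside reset() directly."""

    def __init__(self, num_samples: int, num_heads: int, max_tokens: int, embed_dim: int) -> None:
        assert embed_dim % num_heads == 0
        self._n, self._num_heads, self._max_tokens, self._embed_dim = num_samples, num_heads, max_tokens, embed_dim
        self._cache, self._size = None, None

    def reset(self) -> None:
        self._cache = torch.empty(self._n, self._num_heads, self._max_tokens,
                                  self._embed_dim // self._num_heads)
        self._size = 0


cache = KVCacheWithLambda(num_samples=8, num_heads=4, max_tokens=16, embed_dim=64)
cache.reset()
assert cache._cache.shape == (8, 4, 16, 16)

Both versions build the same (num_samples, num_heads, max_tokens, head_dim) cache tensor; the lambda simply trades four extra attributes for one closure.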

More resolved review threads:
lzero/model/unizero_world_models/kv_caching.py (outdated)
lzero/policy/unizero.py
lzero/model/common.py (outdated)
lzero/model/common.py
lzero/model/common.py (outdated)
lzero/model/common.py (outdated)
@puyuan1996 added the enhancement (New feature or request) and style (Code or comments formatting) labels on Jul 3, 2024.
@puyuan1996 merged commit 4e65afa into main on Jul 3, 2024. 0 of 6 checks passed.
Labels: algorithm (New algorithm), config (New or improved configuration), enhancement (New feature or request), style (Code or comments formatting)
Projects: None yet
Participants: 5