
OpenPCDet may fail to update learnable nn.Parameter weights, i.e. the A and D matrices #25

Open
magnetmoment opened this issue Dec 2, 2024 · 4 comments

Comments

@magnetmoment

There is a classic bug in OpenPCDet: with certain optimizers, e.g. adam_onecycle, weights registered directly as nn.Parameter are never updated by gradient descent. In this project, have Mamba's A and D matrices (nn.Parameter) therefore been left un-updated the whole time? In theory both of them are learnable.
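A minimal sketch (hypothetical toy model, not code from either repository) of the failure mode being described: if the optimizer's parameter list is collected only from child modules, a tensor registered directly on the model via nn.Parameter, like Mamba's A and D, receives gradients but is never stepped.

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)          # lives in a child module
        self.A = nn.Parameter(torch.ones(4))   # registered directly on the model

m = Toy()
# Collecting parameters only from child modules misses m.A entirely:
child_params = [p for c in m.children() for p in c.parameters()]
opt = torch.optim.Adam(child_params)

out = (m.linear(torch.randn(2, 4)) * m.A).sum()
out.backward()
opt.step()

# m.A got a gradient but was never updated, because it is absent
# from the optimizer's parameter list.
print(m.A.grad is not None)               # True
print(torch.equal(m.A, torch.ones(4)))    # True: A is unchanged
```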

@AlmoonYsl
Collaborator

AlmoonYsl commented Dec 2, 2024

@magnetmoment
Hello, all parameters are updated. We have already fixed this OpenPCDet issue; you can refer to the following code in the repository:

modules = named_children(('', model), '')
params = model.named_parameters()
other_params = list()
for p in params:
    if p[0] not in modules:
        name = p[0].split('.')
        m = model
        for n in name:
            if n.isnumeric():
                m = m[int(n)]
            else:
                m = getattr(m, n)
        p = p[1]
        if isinstance(m, torch.nn.parameter.Parameter) and hasattr(m, '_no_weight_decay'):
            setattr(p, '_no_weight_decay', m._no_weight_decay)
        other_params.append(p)
optimizer = OptimWrapper.create(
    optimizer_func, 3e-3, get_layer_groups(model),
    params=iter(other_params), wd=optim_cfg.WEIGHT_DECAY, true_wd=True, bn_wd=True
)

@magnetmoment
Author

ok

@weiyuuuuyiew

@magnetmoment (quoting AlmoonYsl's reply above, including the optimizer-fix snippet)

Hello author, I noticed an interesting phenomenon. I previously trained your model with the optimizer from the OpenPCDet repository, and one epoch took 2 h; after recently seeing your answer to this issue, I switched to your fixed optimizer, and each epoch now takes 1 h. Your change in the init file seemingly only adds computation, yet the training time was cut in half. This may be unrelated to the project, but I would be grateful for an explanation.

@AlmoonYsl
Collaborator

@weiyuuuuyiew
Hello, you may need to carefully compare the two training runs for differences; normally this phenomenon should not occur.
