Unneeded elements in state dict #4645

Closed
ternaus opened this issue Nov 12, 2020 · 1 comment · Fixed by #4685
Labels
bug Something isn't working checkpointing Related to checkpointing help wanted Open to be worked on

ternaus commented Nov 12, 2020

State_dict keys after training resnet18:

dict_keys(['model.conv1.weight', 'model.bn1.weight', 'model.bn1.bias', 'model.bn1.running_mean', 'model.bn1.running_var', 'model.bn1.num_batches_tracked', 'model.layer1.0.conv1.weight', 'model.layer1.0.bn1.weight', 'model.layer1.0.bn1.bias', 'model.layer1.0.bn1.running_mean', 'model.layer1.0.bn1.running_var', 'model.layer1.0.bn1.num_batches_tracked', 'model.layer1.0.conv2.weight', 'model.layer1.0.bn2.weight', 'model.layer1.0.bn2.bias', 'model.layer1.0.bn2.running_mean', 'model.layer1.0.bn2.running_var', 'model.layer1.0.bn2.num_batches_tracked', 'model.layer1.1.conv1.weight', 'model.layer1.1.bn1.weight', 'model.layer1.1.bn1.bias', 'model.layer1.1.bn1.running_mean', 'model.layer1.1.bn1.running_var', 'model.layer1.1.bn1.num_batches_tracked', 'model.layer1.1.conv2.weight', 'model.layer1.1.bn2.weight', 'model.layer1.1.bn2.bias', 'model.layer1.1.bn2.running_mean', 'model.layer1.1.bn2.running_var', 'model.layer1.1.bn2.num_batches_tracked', 'model.layer2.0.conv1.weight', 'model.layer2.0.bn1.weight', 'model.layer2.0.bn1.bias', 'model.layer2.0.bn1.running_mean', 'model.layer2.0.bn1.running_var', 'model.layer2.0.bn1.num_batches_tracked', 'model.layer2.0.conv2.weight', 'model.layer2.0.bn2.weight', 'model.layer2.0.bn2.bias', 'model.layer2.0.bn2.running_mean', 'model.layer2.0.bn2.running_var', 'model.layer2.0.bn2.num_batches_tracked', 'model.layer2.0.downsample.0.weight', 'model.layer2.0.downsample.1.weight', 'model.layer2.0.downsample.1.bias', 'model.layer2.0.downsample.1.running_mean', 'model.layer2.0.downsample.1.running_var', 'model.layer2.0.downsample.1.num_batches_tracked', 'model.layer2.1.conv1.weight', 'model.layer2.1.bn1.weight', 'model.layer2.1.bn1.bias', 'model.layer2.1.bn1.running_mean', 'model.layer2.1.bn1.running_var', 'model.layer2.1.bn1.num_batches_tracked', 'model.layer2.1.conv2.weight', 'model.layer2.1.bn2.weight', 'model.layer2.1.bn2.bias', 'model.layer2.1.bn2.running_mean', 'model.layer2.1.bn2.running_var', 'model.layer2.1.bn2.num_batches_tracked', 
'model.layer3.0.conv1.weight', 'model.layer3.0.bn1.weight', 'model.layer3.0.bn1.bias', 'model.layer3.0.bn1.running_mean', 'model.layer3.0.bn1.running_var', 'model.layer3.0.bn1.num_batches_tracked', 'model.layer3.0.conv2.weight', 'model.layer3.0.bn2.weight', 'model.layer3.0.bn2.bias', 'model.layer3.0.bn2.running_mean', 'model.layer3.0.bn2.running_var', 'model.layer3.0.bn2.num_batches_tracked', 'model.layer3.0.downsample.0.weight', 'model.layer3.0.downsample.1.weight', 'model.layer3.0.downsample.1.bias', 'model.layer3.0.downsample.1.running_mean', 'model.layer3.0.downsample.1.running_var', 'model.layer3.0.downsample.1.num_batches_tracked', 'model.layer3.1.conv1.weight', 'model.layer3.1.bn1.weight', 'model.layer3.1.bn1.bias', 'model.layer3.1.bn1.running_mean', 'model.layer3.1.bn1.running_var', 'model.layer3.1.bn1.num_batches_tracked', 'model.layer3.1.conv2.weight', 'model.layer3.1.bn2.weight', 'model.layer3.1.bn2.bias', 'model.layer3.1.bn2.running_mean', 'model.layer3.1.bn2.running_var', 'model.layer3.1.bn2.num_batches_tracked', 'model.layer4.0.conv1.weight', 'model.layer4.0.bn1.weight', 'model.layer4.0.bn1.bias', 'model.layer4.0.bn1.running_mean', 'model.layer4.0.bn1.running_var', 'model.layer4.0.bn1.num_batches_tracked', 'model.layer4.0.conv2.weight', 'model.layer4.0.bn2.weight', 'model.layer4.0.bn2.bias', 'model.layer4.0.bn2.running_mean', 'model.layer4.0.bn2.running_var', 'model.layer4.0.bn2.num_batches_tracked', 'model.layer4.0.downsample.0.weight', 'model.layer4.0.downsample.1.weight', 'model.layer4.0.downsample.1.bias', 'model.layer4.0.downsample.1.running_mean', 'model.layer4.0.downsample.1.running_var', 'model.layer4.0.downsample.1.num_batches_tracked', 'model.layer4.1.conv1.weight', 'model.layer4.1.bn1.weight', 'model.layer4.1.bn1.bias', 'model.layer4.1.bn1.running_mean', 'model.layer4.1.bn1.running_var', 'model.layer4.1.bn1.num_batches_tracked', 'model.layer4.1.conv2.weight', 'model.layer4.1.bn2.weight', 'model.layer4.1.bn2.bias', 
'model.layer4.1.bn2.running_mean', 'model.layer4.1.bn2.running_var', 'model.layer4.1.bn2.num_batches_tracked', 'model.fc.weight', 'model.fc.bias', 

'train_accuracy.correct', 'train_accuracy.total', 'val_accuracy.correct', 'val_accuracy.total'])

The state dict contains the keys
'train_accuracy.correct', 'train_accuracy.total', 'val_accuracy.correct', and 'val_accuracy.total'. These are metrics that are logged during training.

If this is a feature, it is a strange one. I now have to delete these keys manually before I can use the checkpoint with the resnet18 model.
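For illustration, the manual cleanup described above could look roughly like the sketch below. It works on any plain mapping of key names to values, so it runs without a checkpoint file. The helper name `drop_metric_keys` and its parameters are hypothetical and not part of any library. Stripping the `model.` prefix is an assumption about what a bare resnet18 would expect when loading the weights.

```python
from typing import Any, Dict, Tuple


def drop_metric_keys(
    state_dict: Dict[str, Any],
    metric_prefixes: Tuple[str, ...] = ("train_accuracy.", "val_accuracy."),
    strip_prefix: str = "model.",
) -> Dict[str, Any]:
    """Return a copy of state_dict without metric entries.

    Hypothetical helper: keys starting with any of metric_prefixes are
    dropped, and strip_prefix (if non-empty) is removed from the start
    of the remaining keys.
    """
    cleaned: Dict[str, Any] = {}
    for key, value in state_dict.items():
        # Skip the metric states that ended up in the checkpoint.
        if key.startswith(metric_prefixes):
            continue
        # Assumption: the bare model expects keys without the wrapper prefix.
        if strip_prefix and key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        cleaned[key] = value
    return cleaned


# Tiny stand-in for the real state dict; the values would normally be tensors.
example = {
    "model.conv1.weight": "w",
    "model.fc.bias": "b",
    "train_accuracy.correct": 0,
    "val_accuracy.total": 0,
}
cleaned = drop_metric_keys(example)
print(sorted(cleaned))
```

With a real checkpoint, the same function would be applied to the loaded state dict before passing the result to the model's `load_state_dict`.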

@ternaus added the bug and help wanted labels Nov 12, 2020

Vozf commented Nov 12, 2020

Use Metric.add_state(..., persistent=False) to fix this, but I agree that True as the default is strange.

@Borda changed the title from "[Bug] Uneeded elements in state dict" to "Unneeded elements in state dict" Nov 12, 2020
@Borda added the checkpointing label Nov 12, 2020
@SkafteNicki linked a pull request Nov 15, 2020 that will close this issue