Deterministically get activation_index, fixed indentation, added support for python3 #4
base: master
Conversation
Hi, thanks a lot for this.
I will be able to test this over the weekend.
I'm not sure I understand it correctly, so I added a comment about part of the new code. Can you please explain how it works?
finetune.py (Outdated)
for layer, (name, module) in enumerate(self.model.features._modules.items()):
    x = module(x)
    if isinstance(module, torch.nn.modules.conv.Conv2d):
        x.register_hook(self.compute_rank)
self.compute_rank is now a function that returns a function (hook). It looks like the PyTorch hook mechanism will call compute_rank, which will just return hook as a function object (but won't run it), so self.filter_ranks won't be computed anywhere.
self.compute_rank now returns a function (hook). So when self.compute_rank(activation_index) is called, hook (a partial function that has captured the local variable activation_index) is passed in as the callback function for register_hook.
So when the gradients are updated, hook is called, but it doesn't need to calculate the activation_index because it was already given when self.compute_rank(activation_index) was called.
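To make that concrete, here is a minimal self-contained sketch of the idea (the class name and the rank formula here are assumed for illustration, not the exact PR code; self.activations, self.filter_ranks and compute_rank follow the names used in this thread):

import torch


class PrunerSketch:
    """Toy stand-in for the fine-tuner class discussed above (assumed name)."""

    def __init__(self):
        self.activations = []   # conv outputs saved during the forward pass
        self.filter_ranks = {}  # activation_index -> per-filter score

    def compute_rank(self, activation_index):
        # Called once per conv layer during the forward pass; it returns the
        # actual hook. The returned closure remembers activation_index, so it
        # never has to reconstruct the index from shared mutable state.
        def hook(grad):
            activation = self.activations[activation_index]
            # Illustrative per-filter score (the real criterion lives in
            # finetune.py): |sum over batch and spatial dims of activation * grad|.
            self.filter_ranks[activation_index] = (
                (activation * grad).sum(dim=(0, 2, 3)).abs()
            )
        return hook

Each conv layer gets its own hook with its own bound index, so the scores land in the right slot regardless of the order in which the hooks fire.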
Thanks.
But if so, then wasn't the intention to do:
x.register_hook(self.compute_rank(activation_index))
self.activations.append(x)
Otherwise x isn't appended to self.activations and can't be used from within hook, and PyTorch isn't registering the gradient callback to the partial function returned by self.compute_rank.
It's hard to explain, but here's a code snippet that explains what partial functions do.
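(A minimal stand-in for that snippet, showing what binding an argument with functools.partial, or an equivalent closure, does:)

from functools import partial


def compute_rank(activation_index, grad):
    print(f"hook for activation {activation_index} received grad {grad}")


# partial "freezes" activation_index; what remains is a one-argument
# callable that something like register_hook can invoke later with grad.
hook_for_index_3 = partial(compute_rank, 3)
hook_for_index_3("fake-grad")  # -> hook for activation 3 received grad fake-grad


# A closure does the same job without functools:
def make_hook(activation_index):
    def hook(grad):
        print(f"hook for activation {activation_index} received grad {grad}")
    return hook


make_hook(3)("fake-grad")      # -> hook for activation 3 received grad fake-grad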
Hey @kendricktan, just wanted to make sure you saw my latest comment.
Does it make sense?
Oh whoops, you are completely right: it should be registering the hook with the partial function and appending x to the activations, not the other way around. I should have slept before committing this. I'll change it when I have time, thanks.
Fixed in 212f1b5.
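For reference, a sketch of the corrected ordering discussed above (attribute names follow the earlier snippets; the commit is the authoritative version):

def forward(self, x):
    self.activations = []
    activation_index = 0
    for layer, (name, module) in enumerate(self.model.features._modules.items()):
        x = module(x)
        if isinstance(module, torch.nn.modules.conv.Conv2d):
            # Save the activation first so the hook can look it up by index...
            self.activations.append(x)
            # ...then register the closure returned by compute_rank, which
            # already carries this activation_index.
            x.register_hook(self.compute_rank(activation_index))
            activation_index += 1
    return x  # classifier head omitted in this sketch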
Hello, thank you for the blog post and the code. I ran your code but got a problem: "python finetune.py
How can I find out the owner of a phone number? Whoever knows, please help: 89635264714, this scoundrel!
Hi there, just wanted to say thank you for the blog post and the code example. I noticed that the function compute_rank in finetune.py is mutating a global state, namely grad_index, to calculate activation_index. See:
pytorch-pruning/finetune.py
Line 73 in 7c3a5af
While it's fine on a single GPU, I noticed that it becomes non-deterministic when pruning/training on multiple GPUs.
This pull request solves that issue and also adds support for Python 3.
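For context, the pattern being replaced looks roughly like this (a reconstruction, not a verbatim copy of the upstream code): the hook derives its index from a shared counter that it also mutates, so the result depends on the order in which hooks fire, and that order is not stable across multiple GPUs.

def compute_rank(self, grad):
    # Reconstructed sketch of the old hook: the index comes from shared,
    # mutable state instead of being bound at registration time.
    activation_index = len(self.activations) - self.grad_index - 1
    # ... rank computation using self.activations[activation_index] ...
    self.grad_index += 1  # call order decides the index, hence non-deterministic on multi-GPU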