
Deterministically get activation_index, fixed indentation, added support for python3 #4

Open · wants to merge 3 commits into base: master

Conversation

@kendricktan:

Hi there, just wanted to say thank you for the blog post and the code example. I noticed that the function compute_rank in finetune.py mutates shared state, namely grad_index, to calculate activation_index.

See:

activation_index = len(self.activations) - self.grad_index - 1

While that's fine on a single GPU, I noticed that it becomes non-deterministic when pruning/training on multiple GPUs.

This pull request solves that issue and also adds support for Python 3.
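
For illustration, a minimal sketch of the counter-based pattern this PR replaces; this is a hedged reconstruction, not the exact finetune.py code:

def compute_rank(self, grad):
    # The index each hook computes depends on the global order in which
    # hooks fire: fine on one GPU, non-deterministic across GPU replicas.
    activation_index = len(self.activations) - self.grad_index - 1
    # ... accumulate filter ranks for self.activations[activation_index] ...
    self.grad_index += 1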

@jacobgil (Owner) left a comment:

Hi, thanks a lot for this.
I will be able to test this over the weekend.
I'm not sure I understand it correctly; I added a comment about part of the new code. Can you please explain how it works?

finetune.py Outdated
for layer, (name, module) in enumerate(self.model.features._modules.items()):
    x = module(x)
    if isinstance(module, torch.nn.modules.conv.Conv2d):
        x.register_hook(self.compute_rank)
@jacobgil (Owner):

self.compute_rank is now a function that returns a function (hook). It looks like the PyTorch hook mechanism will call compute_rank, which will return hook as a function object (but won't run it), and self.filter_ranks won't be computed anywhere.

@kendricktan (Author):

self.compute_rank now returns a function (hook). So when self.compute_rank(activation_index) is called, hook (a partial function that captures the local variable activation_index) is passed in as the callback for register_hook.

So when the gradients are updated, hook is called, but it doesn't need to calculate activation_index because it was already given when self.compute_rank(INDEX) was called.
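
For illustration, a minimal runnable sketch of this mechanism; the PrunerSketch class and the rank value it stores are placeholders, not the actual finetune.py implementation:

import torch

class PrunerSketch:
    def __init__(self):
        self.filter_ranks = {}

    def compute_rank(self, activation_index):
        def hook(grad):
            # activation_index is captured from the enclosing call, so no
            # shared grad_index counter is needed inside the hook.
            self.filter_ranks[activation_index] = grad.abs().mean().item()
        return hook

pruner = PrunerSketch()
x = torch.randn(4, 3, requires_grad=True)
activation = x * 2                                # stand-in for a conv output
activation.register_hook(pruner.compute_rank(0))  # the returned hook is registered
activation.sum().backward()                       # hook fires with the gradient
print(pruner.filter_ranks)                        # {0: <mean abs gradient>}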

@jacobgil (Owner):

Thanks.
But if so, then wasn't the intention to do:
x.register_hook(self.compute_rank(activation_index))
self.activations.append(x)

Otherwise x isn't appended to self.activations and can't be used from within hook, and PyTorch isn't registering the gradient callback to the partial function from self.compute_rank.
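
For context, a hedged, self-contained sketch of the corrected ordering described above (register the hook returned by compute_rank(activation_index), then append x); TinyPruner and its rank formula are placeholders, not the actual finetune.py code:

import torch
import torch.nn as nn

class TinyPruner:
    def __init__(self):
        self.model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.ReLU(), nn.Conv2d(4, 4, 3))
        self.activations = []
        self.filter_ranks = {}

    def compute_rank(self, activation_index):
        def hook(grad):
            activation = self.activations[activation_index]
            # Placeholder rank: mean |activation * gradient| for this layer.
            self.filter_ranks[activation_index] = (activation * grad).abs().mean().item()
        return hook

    def forward(self, x):
        activation_index = 0
        for layer, module in enumerate(self.model):
            x = module(x)
            if isinstance(module, nn.Conv2d):
                x.register_hook(self.compute_rank(activation_index))
                self.activations.append(x)
                activation_index += 1
        return x

pruner = TinyPruner()
out = pruner.forward(torch.randn(1, 3, 16, 16))
out.sum().backward()
print(pruner.filter_ranks)  # one deterministic entry per Conv2d activation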

@kendricktan (Author):

It's hard to explain, but here's a code snippet that shows what partial functions (closures) do.

def f(a):
    def F(b):
        return a + b   # F captures a from the enclosing call to f(a)
    return F

>>> fun = f(10)
>>> fun(3)
13

@jacobgil (Owner) left a comment:

Hey @kendricktan, just wanted to make sure you saw my latest comment.
Does it make sense?


@kendricktan (Author):

Oh whoops, you are completely right: it should register the hook with the partial function and append x to the activations, not the other way around. I should have slept before committing this. I'll change it when I have time, thanks.

@kendricktan (Author):

Fixed in 212f1b5.

@jiayouba120035:

Hello, thank you for the blog post and the code. I ran your code but hit a problem: "python finetune.py --train" shows a test accuracy of about 50%, while the train accuracy is > 95%. I really don't know what's wrong with my setup, so I'm asking for your help. I suspect the data isn't loaded correctly: the test path is /../../test2, the folder "test" is inside the folder test2, and the pictures are in the folder "test". Is the data loaded correctly? I am new to Python. Thank you in advance for your help.

@Aleks1977:

How can you trace the owner from a phone number? Whoever knows, please help with 89635264714, this scoundrel!
