Embed LR schedule and initialization with the model #36
Comments
Weights can be initialized in model's As for lr schedule, I think we can just do something like
if hasattr(model, 'lr_schedule'):
    lr = model.lr_schedule(epoch)
else:
    lr = args.lr * (0.1 ** (epoch // 30)) |
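The fallback in the snippet above can be wrapped in a small helper. This is a minimal sketch in plain Python, not code from the ImageNet example: `resolve_lr`, `PlainModel` and `LinearDecayModel` are hypothetical names, the linear decay values are made up for illustration, and `base_lr` stands in for `args.lr`.

```python
def resolve_lr(model, epoch, base_lr):
    """Return the learning rate for `epoch`.

    Uses the model's own `lr_schedule` if it defines one; otherwise falls
    back to the step decay from the comment above (x0.1 every 30 epochs).
    """
    if hasattr(model, 'lr_schedule'):
        return model.lr_schedule(epoch)
    return base_lr * (0.1 ** (epoch // 30))


class PlainModel:
    """Hypothetical model without a schedule of its own."""


class LinearDecayModel:
    """Hypothetical model that ships a linear decay over 60 epochs."""

    def lr_schedule(self, epoch):
        return 0.04 * max(0.0, 1.0 - epoch / 60)
```

With this shape the training loop only calls `resolve_lr(model, epoch, args.lr)` and never needs to know whether the model carries its own schedule.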
I've been trying to put weight initialization as part of the model, since it often seems particular to the type of architecture. I added it to the ResNet definition and I'm going to add it to the VGG model def: I'm not sure about the learning rate schedule. It seems awkward to put it as part of the model definition, but as you point out, hard-coding it in the ImageNet example isn't ideal either. |
@apaszke @colesbury Thanks, I will do the same for SqueezeNet. Are there pre-defined initialization functions/classes for popular initialization schemes (e.g. like in Neon or Keras)? @colesbury I think ideally we should provide a default learning schedule as a part of |
Yes, we should add them in nn somewhere. |
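For reference, two widely used schemes reduce to a single scale computed from the layer's fan-in and fan-out. The sketch below is plain Python with hypothetical helper names (`glorot_uniform_bound`, `he_normal_std`, `init_weights`). It only illustrates what such helpers would compute and is not an existing `nn` API.

```python
import math
import random


def glorot_uniform_bound(fan_in, fan_out):
    """Glorot/Xavier uniform: sample from U(-b, b), b = sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in):
    """He/Kaiming normal for ReLU layers: sample from N(0, std^2), std = sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, scheme="he_normal", rng=random):
    """Return a fan_out x fan_in weight matrix as nested lists."""
    if scheme == "glorot_uniform":
        b = glorot_uniform_bound(fan_in, fan_out)
        draw = lambda: rng.uniform(-b, b)
    elif scheme == "he_normal":
        std = he_normal_std(fan_in)
        draw = lambda: rng.gauss(0.0, std)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return [[draw() for _ in range(fan_in)] for _ in range(fan_out)]
```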
What do you think about this kind of regime inside of the model? |
Can't open it, are you sure the link is correct and the repo is public? |
How about https://raw.githubusercontent.com/eladhoffer/convNet.pytorch/master/models/alexnet.py |
That's one way to approach it, but I'm not sure if it's the most convenient one. Having a function that returns an optimizer for a given epoch seems more powerful. |
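To make the comparison concrete, one possible shape for a model-level training regime is a table keyed by starting epoch, resolved once per epoch by the training loop. This is a hypothetical sketch, not the contents of the linked file: the `regime` layout, its keys and values, and `settings_for_epoch` are all assumptions.

```python
def settings_for_epoch(regime, epoch):
    """Merge all regime entries whose start epoch is <= `epoch`.

    Later entries override earlier ones, so each entry only needs to
    list the hyperparameters that change at that point.
    """
    settings = {}
    for start in sorted(regime):
        if start > epoch:
            break
        settings.update(regime[start])
    return settings


# Hypothetical regime that a model definition could carry.
regime = {
    0: {"optimizer": "SGD", "lr": 1e-2, "momentum": 0.9, "weight_decay": 5e-4},
    10: {"lr": 5e-3},
    20: {"lr": 1e-3, "weight_decay": 0.0},
}
```

A table like this is easy to read and serialize, while a function that returns an optimizer for a given epoch can encode arbitrary logic. That is the trade-off raised in the comment above.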
Is there/will there be a nice way to adapt the learning rate or momentum but keep other state in the optimizer, e.g. for Adam? |
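One approach that keeps the optimizer's accumulated state is to edit the hyperparameters in place instead of constructing a new optimizer. `torch.optim` optimizers expose a `param_groups` list of dicts holding entries such as `'lr'`. The helper below (`set_hyperparams` is a hypothetical name) mutates those entries and works on any object with that attribute, so the stub class is enough to try it without PyTorch.

```python
def set_hyperparams(optimizer, **updates):
    """Overwrite hyperparameters in every param group, in place.

    Only the dict entries change; per-parameter buffers that the optimizer
    keeps elsewhere (e.g. Adam's moment estimates) are left untouched.
    """
    for group in optimizer.param_groups:
        for name, value in updates.items():
            if name not in group:
                raise KeyError(f"param group has no hyperparameter {name!r}")
            group[name] = value


class StubOptimizer:
    """Stand-in with a `param_groups` / `state` layout, for illustration only."""

    def __init__(self):
        self.param_groups = [{"lr": 1e-3, "betas": (0.9, 0.999)}]
        self.state = {"w": {"step": 100}}
```

For example, `set_hyperparams(optimizer, lr=new_lr)` at the start of each epoch changes the step size while leaving `optimizer.state` as it was.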
Should this issue be moved to the |
Since you are also discussing weight initialization here, what about DenseNet if we want to use a non-pretrained model?
I tried to implement SqueezeNet as a torchvision model and train it via the ImageNet example, and found that it doesn't converge as is. The reference code differs in two aspects: the learning rate schedule and the weight initialization. In PyTorch these aspects are hard-coded inside the ImageNet example, but I think it makes sense to make them part of the model definition in torch.vision. What's your position on it?