
Embed LR schedule and initialization with the model #36

Open · Maratyszcza opened this issue Jan 20, 2017 · 11 comments

@Maratyszcza (Contributor) commented Jan 20, 2017

I tried to implement SqueezeNet as a torchvision model and train it with the ImageNet example, and found that it doesn't converge as is. The reference code differs in two aspects:

  • All but the last convolution are initialized with the Xavier (Glorot) initializer; the last one is initialized from a normal distribution with stddev 0.01.
  • The learning rate is linearly decreased (polynomial schedule with power=1).

In PyTorch, these aspects are hard-coded in the ImageNet example, but I think it makes sense to make them part of the model definitions in torch.vision. What's your position on this?
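For concreteness, a minimal sketch of those two behaviors, written with the torch.nn.init helpers that exist in today's PyTorch (they had not been added yet when this issue was filed); the function names are hypothetical:

import torch.nn as nn

def init_squeezenet_weights(model, final_conv):
    # Xavier/Glorot for every conv except the final classifier conv,
    # which is drawn from N(0, 0.01) as in the reference code.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            if m is final_conv:
                nn.init.normal_(m.weight, mean=0.0, std=0.01)
            else:
                nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)

def poly_lr(base_lr, epoch, total_epochs, power=1.0):
    # Polynomial decay; power=1 gives the linear schedule described above.
    return base_lr * (1.0 - epoch / total_epochs) ** power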

@apaszke (Contributor) commented Jan 20, 2017

Weights can be initialized in the model's __init__, so that has nothing to do with the ImageNet example, right?

As for the LR schedule, I think we can just do something like:

if hasattr(model, 'lr_schedule'):
    lr = model.lr_schedule(epoch)
else:
    lr = args.lr * (0.1 ** (epoch // 30))
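The model-side half of that hook might look like this (a hypothetical sketch; lr_schedule is not an attribute any torchvision model actually defines):

import torch.nn as nn

class ToyNet(nn.Module):
    # Hypothetical model exposing the lr_schedule hook checked above.
    base_lr = 0.04
    num_epochs = 170

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def lr_schedule(self, epoch):
        # Linear (power=1 polynomial) decay from base_lr down to 0.
        return self.base_lr * (1.0 - epoch / self.num_epochs)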

@colesbury (Member)

I've been trying to put weight initialization in the model, since it often seems particular to the type of architecture. I added it to the ResNet definition and I'm going to add it to the VGG model def as well:
https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L112
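For reference, the weight init in the linked resnet.py amounts to roughly this, rewritten here as a standalone helper with a hypothetical name:

import math
import torch.nn as nn

def init_resnet_weights(model):
    # He-style normal init for convs, unit gamma / zero beta for
    # batch norm, as in the linked model definition.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, math.sqrt(2.0 / n))
        elif isinstance(m, nn.BatchNorm2d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()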

I'm not sure about the learning rate schedule. It seems awkward to put it in the model definition, but as you point out, hard-coding it in the ImageNet example isn't ideal either.

@Maratyszcza (Contributor, Author)

@apaszke @colesbury Thanks, I will do the same for SqueezeNet. Are there pre-defined functions/classes for popular initialization schemes (e.g., as in Neon or Keras)?

@colesbury I think ideally we should provide a default learning schedule as part of the torch.vision models and let users override it via command-line arguments.

@apaszke (Contributor) commented Jan 20, 2017

Yes, we should add them in nn somewhere.

@eladhoffer (Contributor) commented Jan 21, 2017

What do you think about this kind of regime inside the model?
https://github.com/eladhoffer/convNet.pytorch/blob/master/models/alexnet.py
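For readers who can't open the link: the pattern there is a per-model "regime" list that the training loop consults each epoch. Roughly (the exact values here are illustrative, not copied from the repo):

# A list of hyper-parameter changes keyed by starting epoch.
regime = [
    {'epoch': 0, 'optimizer': 'SGD', 'lr': 1e-2,
     'weight_decay': 5e-4, 'momentum': 0.9},
    {'epoch': 10, 'lr': 5e-3},
    {'epoch': 15, 'lr': 1e-3, 'weight_decay': 0},
]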

@apaszke (Contributor) commented Jan 21, 2017

Can't open it, are you sure the link is correct and the repo is public?

@eladhoffer (Contributor)

@apaszke (Contributor) commented Jan 21, 2017

That's one way to approach it, but I'm not sure if it's the most convenient one. Having a function that returns an optimizer for a given epoch seems more powerful.
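A sketch of that idea (hypothetical factory; note that rebuilding the optimizer discards its internal state, which is what the next comment asks about):

import torch.optim as optim

def optimizer_for_epoch(model, epoch, base_lr=0.01, total_epochs=90):
    # Return a freshly configured optimizer with the scheduled LR.
    lr = base_lr * (1.0 - epoch / total_epochs)
    return optim.SGD(model.parameters(), lr=lr, momentum=0.9)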

@alykhantejani (Contributor)

Is there, or will there be, a nice way to adapt the learning rate or momentum while keeping the other state in the optimizer, e.g. for Adam?
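One idiom that does this: mutate the optimizer's param_groups in place, which leaves the per-parameter state (e.g. Adam's running moments) untouched:

def set_lr(optimizer, new_lr):
    # Only the hyper-parameter changes; optimizer.state is preserved.
    for group in optimizer.param_groups:
        group['lr'] = new_lr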

@alykhantejani (Contributor)

Should this issue be moved to the pytorch repo instead?

@vfdev-5 (Collaborator) commented Dec 7, 2017

Since you're also discussing weight initialization here: what about DenseNet, when we want to use a non-pretrained model?
According to the official Caffe implementation, convolutions are initialized with something like kaiming_normal.
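A minimal sketch of that, assuming the Caffe "msra" filler maps to Kaiming-normal convs plus the usual batch-norm and bias defaults (helper name hypothetical):

import torch.nn as nn

def init_densenet_weights(model):
    # Kaiming-normal convs, unit gamma / zero beta for batch norm,
    # zero bias for linear layers.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_normal_(m.weight)
        elif isinstance(m, nn.BatchNorm2d):
            nn.init.constant_(m.weight, 1)
            nn.init.constant_(m.bias, 0)
        elif isinstance(m, nn.Linear):
            nn.init.constant_(m.bias, 0)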
