find_learning_rate() Issues #982

Closed
BernierCR opened this issue Aug 8, 2019 · 1 comment · Fixed by #1119
Labels
question Further information is requested

Comments

BernierCR commented Aug 8, 2019

Hello. So I've been building a TextClassifier, and I'm now at the learning-rate optimization step. However, I don't think the find_learning_rate function is working correctly, and I don't know quite enough about it to debug it very well.

1. The model is supposed to take a while to train, but it completes almost instantly when run through this function.

2. It's supposed to try 100 different learning rates. I get 87 iterations instead of 100, and they all use the same starting learning rate.

3. I can't see what it's doing at all. There is no output from the training step; it suddenly reports that it's done without having produced anything of value.
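In case it helps, here is roughly what I'm running (a minimal sketch; `corpus` and `classifier` are set up as in the flair text-classification tutorial, and the output path is a placeholder):

```python
from flair.trainers import ModelTrainer

# corpus and classifier built beforehand per the flair tutorial
trainer = ModelTrainer(classifier, corpus)

# Expected: 100 iterations sweeping the learning rate upward.
# Observed: it stops at 87 iterations, all at the starting learning rate.
learning_rate_tsv = trainer.find_learning_rate(
    'resources/classifier',  # placeholder output directory
    iterations=100,
)
```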

I don't know if this is a small bug, a user error, or something bigger.

It almost feels like it's running 87 mini-batches at the starting learning rate and then deciding it's done when it's barely started.

Worst-case scenario, I can use hyperopt to check a few learning rates overnight. But it would be really cool if this method could be made to work; I just need a little guidance to help debug it.

Thanks so much.

BernierCR added the question label on Aug 8, 2019

BernierCR commented Aug 8, 2019

I've been investigating and reading up on the theory. The finder is supposed to start from an untrained model and advance one mini-batch per iteration, so the fast run is expected. Point 1 is invalid.

The real issue is that scheduler.step(1) is supposed to increment the learning rate each time, but it isn't. I will have to dig deeper into the scheduler.

Point 2, though, still seems valid: if it runs out of mini-batches before `iterations` is reached, shouldn't it start over from the beginning of the batches? If I ask for 1000 iterations, I want 1000 iterations; it's up to me to decide whether that number makes any sense.
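To illustrate what I'd expect (this is not flair's actual code, just a sketch of the intended behaviour: the learning rate grows geometrically every mini-batch, and the data loader is cycled so the sweep always reaches `iterations` steps):

```python
import itertools

def lr_range_test(model, loss_fn, loader, optimizer,
                  start_lr=1e-7, end_lr=10.0, iterations=100):
    # Multiplicative factor so the learning rate sweeps from
    # start_lr to end_lr over exactly `iterations` steps.
    gamma = (end_lr / start_lr) ** (1.0 / iterations)
    for group in optimizer.param_groups:
        group['lr'] = start_lr

    history = []
    batches = itertools.cycle(loader)  # restart when mini-batches run out
    for _ in range(iterations):
        inputs, targets = next(batches)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        history.append((optimizer.param_groups[0]['lr'], loss.item()))
        for group in optimizer.param_groups:
            group['lr'] *= gamma  # the increment scheduler.step(1) should perform
    return history
```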
