Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace adam(b1=0) by rmsprop for schedule_free #1025

Merged
merged 1 commit into from
Jul 25, 2024

Conversation

copybara-service[bot]
Copy link

Replace adam(b1=0) by rmsprop for schedule_free

@copybara-service copybara-service bot force-pushed the test_655920595 branch 2 times, most recently from b9f7560 to 36bc598 Compare July 25, 2024 13:40
@vroulet
Copy link
Collaborator

vroulet commented Jul 25, 2024

Closes #1006

@copybara-service copybara-service bot force-pushed the test_655920595 branch 2 times, most recently from ae376c6 to 3c8f16f Compare July 25, 2024 14:01
@nasyxx
Copy link
Contributor

nasyxx commented Oct 2, 2024

Is it possible to rename the schedule_free_adamw to schedule_free_rms or mention in the document that it is actually using rms. Although when beta1 is 0, they are almost the same.

@vroulet
Copy link
Collaborator

vroulet commented Oct 2, 2024

  1. What do you mean by they are "almost" the same? There is a test that checked that they match. If you found non-negligible discrepancies, please let us know: file a bug with a minimal reproducing example.
  2. Provided that there is no difference between the wrapped version and the shortuct version (see tests), the implementation matches the implementation of schedule_free with adamw as intended by the authors so we won't rename it.
  3. I can add indeed a note that the implementation is using rmsprop for efficiency reasons, that's a good point. Could you open an issue on that, so that I don't forget? You may also directly create the PR that adds this information on the docstring.

@nasyxx
Copy link
Contributor

nasyxx commented Oct 2, 2024

  1. What do you mean by they are "almost" the same? There is a test that checked that they match. If you found non-negligible discrepancies, please let us know: file a bug with a minimal reproducing example.

The tests are matched. I meant when b1 is 0, Adam's and RMS's proposals are almost the same.

  1. I can add indeed a note that the implementation is using rmsprop for efficiency reasons, that's a good point. Could you open an issue on that, so that I don't forget? You may also directly create the PR that adds this information on the docstring.

Yes, I just checked another issue and found that it reduces memory usage.

@vroulet
Copy link
Collaborator

vroulet commented Oct 2, 2024

I meant when b1 is 0, Adam's and RMS's proposals are almost the same.

Are they "exactly" or "almost" the same? To me they are exactly the same. If not, please let us know what are exactly the differences.

@nasyxx
Copy link
Contributor

nasyxx commented Oct 2, 2024

Are they "exactly" or "almost" the same? To me they are exactly the same. If not, please let us know what are exactly the differences.

The difference between Adam and RMSprop is that Adam has bias. However, yes, with bias_correction=True in RMS, they should be exactly the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants