Fix LAMB optimizer regex parsing #1532
Conversation
Thanks @jarednielsen for this pull request. I'll take a longer look tomorrow to get the big picture; for the moment, some quick thoughts after 2 minutes:
If there is a way we can fit both in this pull request, that'd be perfect :)
Thanks for the quick response! I'm all for backwards compatibility. I'm just not seeing how any usage of this parameter could possibly succeed. For example, using
What backwards-compatible behavior would we want to keep? I suppose if your regex was literally just a single character and you only had one pattern to match, then that use case would break. So maybe
I'm not so sure backwards compatibility here is what's needed, since passing a string variable (as enforced by typeguard) would fail. This isn't so much an improvement as a bug fix. The model garden is one of the more popular repos depending on Addons, so I think it's important we patch this onto 0.8.4.
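To make the failure mode concrete, here is a minimal standalone sketch of the string-iteration bug. The helper name mirrors the optimizer's internal check, but this is illustrative code, not the actual Addons implementation:

```python
import re

def do_use_weight_decay(param_name, exclude_from_weight_decay):
    # Mirrors the buggy loop: iterating over a *string* yields
    # single characters, which then get treated as regex patterns.
    for pattern in exclude_from_weight_decay:
        if re.search(pattern, param_name) is not None:
            return False  # excluded from weight decay
    return True

# Documented usage: a comma-separated string of patterns.
# The loop sees "b", "i", "a", "s", ",", "L", ... as patterns, so
# nearly any parameter name matches some character and decay is skipped.
print(do_use_weight_decay("dense/kernel", "bias,LayerNorm"))  # False

# A list behaves as intended, but typeguard rejects it upstream.
print(do_use_weight_decay("dense/kernel", ["bias", "LayerNorm"]))  # True
```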
@jarednielsen would you mind adding a test case for calling the optimizer with these parameters, so this type of thing would be caught in the future please?
@seanpmorgan Sure, added the test!
```diff
@@ -401,3 +401,11 @@ def test_get_config(self):
         opt = lamb.LAMB(1e-4)
         config = opt.get_config()
         self.assertEqual(config["learning_rate"], 1e-4)
+
+    def test_exclude_weight_decay(self):
```
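For reference, a self-contained sketch of the kind of regression check such a test can make. The helper below is a stand-in for the optimizer's internal pattern matching after the fix (a list of regex strings), not the real Addons test:

```python
import re

def do_use_weight_decay(param_name, patterns):
    # Stand-in for LAMB's internal check after the fix:
    # `patterns` is a list of regex strings, not a single string.
    for p in patterns or []:
        if re.search(p, param_name):
            return False
    return True

def test_exclude_weight_decay():
    exclude = ["bias", "LayerNorm"]
    # Parameters matching a pattern should skip weight decay...
    assert not do_use_weight_decay("encoder/bias", exclude)
    assert not do_use_weight_decay("LayerNorm/gamma", exclude)
    # ...and everything else should keep it.
    assert do_use_weight_decay("dense/kernel", exclude)

test_exclude_weight_decay()
```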
Could you add a test for `exclude_from_layer_adaption` as well please?
Done
Thanks a lot for the pull request! That's some great investigation work and a great solution!
* Fix type for LAMB optimizer `exclude_from_weight_decay`
* Add import
* Add optional wrapper
* Add test
* Layer adaption test
* Typo
See my issue at #1530
The LAMB optimizer declares that its `exclude_from_weight_decay` argument should take in a comma-separated string of regex patterns. However, the code expects a list of regex patterns and instead iterates through each character in the string. Thus nearly every call to `_do_use_weight_decay()` returns False. I attempted to pass in a list to circumvent this bug, but this leads to a `typeguard` error, so there's no easy way around it in the meantime. A similar bug exists for `exclude_from_layer_adaption`.

Two proposed fixes, and I'm happy to contribute either:

1. Change the type annotation from `Optional[str]` to `List[str]`. This would be preferred, and it matches the style of other implementations in the TensorFlow repo. See here for a list of examples. This is the current PR.
2. Apply `.split(',')` to `exclude_from_weight_decay` and `exclude_from_layer_adaption` in the constructor.
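A minimal sketch of what fix 2 could look like, assuming the constructor normalizes a legacy comma-separated string into a list of patterns (the helper name is illustrative, not the Addons code):

```python
from typing import List, Optional, Union

def normalize_patterns(
    exclude: Union[str, List[str], None]
) -> Optional[List[str]]:
    # Accept the legacy comma-separated string for backwards
    # compatibility, but store a list of regex patterns internally
    # so the matching loop iterates over patterns, not characters.
    if isinstance(exclude, str):
        return exclude.split(",")
    return exclude

print(normalize_patterns("bias,LayerNorm"))  # ['bias', 'LayerNorm']
print(normalize_patterns(["bias"]))          # ['bias']
print(normalize_patterns(None))              # None
```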