Fix saving bug #1073
Conversation
See keras-team#1042 for a repro. This was actually introduced when we added output activations to our classifiers, along with a faulty assumption we made about serializing activations: `keras.activations.get(None) == keras.activations.linear`. This means that when you round trip through saving/serializing, `activation=None` becomes `activation=keras.activations.linear`. That was causing us to choose an incorrect loss, which correctly tripped an error we added to the base `Task` object to detect an incorrect loss, which in turn caused some odd serialization errors to be printed during normal saving. There is probably more to do here, both with improving our compilation experience overall and with surfacing better errors for saving. Also, someday soon I hope to consolidate some of our model testing code.
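For readers following along, here is a minimal standalone sketch (not code from this PR) of the round-trip behavior described above; exact serialization output may vary slightly across Keras versions:

```python
from tensorflow import keras

# get(None) falls back to the linear activation...
activation = keras.activations.get(None)
print(activation == keras.activations.linear)  # True

# ...so a config saved with activation=None comes back as linear, not None.
config = keras.activations.serialize(activation)  # typically the string "linear"
restored = keras.activations.get(config)
print(restored == keras.activations.linear)  # True
```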
/gcbrun

I'll have a look at this and try to reproduce it! Thanks @mattdangerw 👍🏻
@@ -113,12 +113,27 @@ def test_classifier_fit_no_xla(self):
        self.classifier.fit(self.preprocessed_dataset)

    def test_serialization(self):
Could the serialization and saving tests be combined? The latter is a superset of the former, and they are both pretty slow.
Saving is still 2-3x slower than serialization, so the net effect of pushing more test cases into the saving path would actually be an even slower test suite.

Is your goal to make these faster, or less code?

- If the former, we could consider other follow ups, like not making a dataset in `setUp`. That's an easy win that would actually speed things up; see the sketch below.
- If the latter, let's just focus on consolidating our testing code across models! Something I would really prioritize, just lacking bandwidth.
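A minimal sketch of that `setUp` follow-up (hypothetical test names and toy data, not this repo's actual test code): build the dataset in a helper and call it only from the tests that fit, so construction cost is not paid by every test in the class.

```python
import tensorflow as tf


class ClassifierTest(tf.test.TestCase):
    def _preprocessed_dataset(self):
        # Hypothetical helper: built on demand rather than in setUp, so tests
        # that never call fit() skip the cost entirely.
        features = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        labels = tf.constant([0, 1])
        return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

    def test_fit(self):
        dataset = self._preprocessed_dataset()
        model = tf.keras.Sequential([tf.keras.layers.Dense(2, activation="softmax")])
        model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
        model.fit(dataset, epochs=1)

    def test_serialization(self):
        # No dataset needed here, and none is built.
        model = tf.keras.Sequential([tf.keras.layers.Dense(2, activation="softmax")])
        self.assertIn("layers", model.get_config())
```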
@@ -186,9 +186,10 @@ def __init__(
        self.dropout = dropout

        # Default compilation
        logit_output = self.activation == keras.activations.linear
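To make the intent of that line concrete, here is a small standalone sketch (a hypothetical helper, not the actual `Task` code) of how a `logit_output` check like this can drive a default loss: when the activation resolves to linear, the model emits raw logits, so the loss should set `from_logits=True`.

```python
from tensorflow import keras


def default_loss(activation):
    # Hypothetical helper mirroring the check above: activations.get(None)
    # resolves to linear, which means the model outputs raw logits.
    activation = keras.activations.get(activation)
    logit_output = activation == keras.activations.linear
    return keras.losses.SparseCategoricalCrossentropy(from_logits=logit_output)


print(default_loss(None).get_config()["from_logits"])       # True
print(default_loss("softmax").get_config()["from_logits"])  # False
```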
I would love to understand this better: what is the result of `activations.get(None)`? I don't see a `None` converted to `linear` anywhere in these Tasks, so it seems like they must be equivalent at some level.
Read the PR description above :). It literally answers this question verbatim.

Here are the lines in tf.keras, if you're interested:
https://github.com/keras-team/keras/blob/5849a0953a644bd6af51b672b32a235510d4f43d/keras/activations.py#L683-L684
Thanks!