
Fix saving bug #1073

Merged (1 commit, Jun 14, 2023)

Conversation

mattdangerw (Member)

See #1042 for repro.

This was actually introduced when we added output activations to our classifiers, along with a faulty assumption we made when serializing activations.

`keras.activations.get(None) == keras.activations.linear`

This means that when you round trip through saving/serialization, `activation=None` becomes `activation=keras.activations.linear`. That caused us to choose an incorrect loss, which correctly tripped an error we added to the base `Task` object to detect an incorrect loss, which in turn caused some odd serialization errors to be printed during normal saving.
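
To make the round trip concrete, here is a minimal sketch (assuming tf.keras semantics at the time; these are all standard `keras.activations` calls):

```python
import keras

# `get(None)` falls back to the identity (linear) activation.
assert keras.activations.get(None) == keras.activations.linear

# So an activation saved as None comes back as linear after a round trip.
serialized = keras.activations.serialize(keras.activations.get(None))
restored = keras.activations.deserialize(serialized)
assert restored == keras.activations.linear
```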

Probably more to do here... both with improving our compilation experience overall and surfacing better errors for saving. Also, someday soon I hope to consolidate some of our model testing code.

mattdangerw (Member Author)

/gcbrun

alvarobartt

I'll have a look at this and try to reproduce it! Thanks @mattdangerw 👍🏻

mattdangerw requested a review from jbischof on June 9, 2023.
@@ -113,12 +113,27 @@ def test_classifier_fit_no_xla(self):
        self.classifier.fit(self.preprocessed_dataset)

    def test_serialization(self):
Contributor

Could the serialization and saving tests be combined? The latter is a superset of the former, and they are both pretty slow.

mattdangerw (Member Author), Jun 12, 2023

Saving is still 2-3x slower than serialization, so the net effect of pushing more test cases into the saving path would actually be an even slower test suite.

Is your goal to make these faster, or to have less code?

  • If the former, we could consider other follow-ups, like not making a dataset in setUp (see the sketch after this list). That's an easy win that would actually speed things up.
  • If the latter, let's just focus on consolidating our testing code across models! That's something I would really like to prioritize, just lacking bandwidth.
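
For illustration, a minimal sketch of that setUp follow-up; the test class and dataset contents here are hypothetical, not the actual keras-nlp test code:

```python
import unittest

import tensorflow as tf

class ClassifierTest(unittest.TestCase):
    # Hypothetical: build the dataset once for the whole class instead of
    # per-test in setUp(), so each individual test stops paying the cost.
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.preprocessed_dataset = tf.data.Dataset.from_tensor_slices(
            (["the quick brown fox", "the slow brown fox"], [1, 0])
        ).batch(2)

    def test_dataset_is_shared(self):
        self.assertIsNotNone(self.preprocessed_dataset)
```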

@@ -186,9 +186,10 @@ def __init__(
        self.dropout = dropout

        # Default compilation
        logit_output = self.activation == keras.activations.linear
Contributor

I would love to understand this better: what is the result of `activations.get(None)`? I don't see a `None` converted to `linear` anywhere in these Tasks, so it seems like they must be equivalent at some level.

mattdangerw (Member Author)

Read the PR description above :); it literally answers this question verbatim.

Here are the relevant lines of tf.keras, if interested:
https://github.com/keras-team/keras/blob/5849a0953a644bd6af51b672b32a235510d4f43d/keras/activations.py#L683-L684
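
For context, a minimal sketch of how this check can drive default compilation; the helper and the loss choice are illustrative, not the exact `Task` code:

```python
import keras

def default_loss(activation):
    # Hypothetical helper: since `activations.get(None)` returns `linear`,
    # this check treats `activation=None` and an explicit linear activation
    # the same way, before and after a save/load round trip.
    logit_output = keras.activations.get(activation) == keras.activations.linear
    return keras.losses.SparseCategoricalCrossentropy(from_logits=logit_output)
```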

jbischof (Contributor) left a comment

Thanks!
