Fix saving bug #1073
Conversation
See keras-team#1042 for a repro. This was actually introduced when we added output activations to our classifiers, along with a faulty assumption we made about serializing activations: `keras.activations.get(None) == keras.activations.linear`. This means that when you round trip through saving/serializing, `activation=None` becomes `activation=keras.activations.linear`. That was causing us to choose an incorrect loss, which correctly tripped an error we added to the base `Task` object to detect an incorrect loss, which in turn caused some odd serialization errors to be printed during normal saving. There is probably more to do here, both with improving our compilation experience overall and with surfacing better errors for saving. Also, someday soon I hope to consolidate some of our model testing code.
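For readers following along, here is a minimal standalone sketch (not code from this PR) of the round-trip behavior described above; exact serialization output may vary slightly across Keras versions:

```python
from tensorflow import keras

# get(None) falls back to the linear activation...
activation = keras.activations.get(None)
print(activation == keras.activations.linear)  # True

# ...so a config saved with activation=None comes back as linear, not None.
config = keras.activations.serialize(activation)  # typically the string "linear"
restored = keras.activations.get(config)
print(restored == keras.activations.linear)  # True
```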
/gcbrun

I'll have a look at this and try to reproduce it! Thanks @mattdangerw 👍🏻
@@ -113,12 +113,27 @@ def test_classifier_fit_no_xla(self):
        self.classifier.fit(self.preprocessed_dataset)

    def test_serialization(self):
Could the serialization and saving tests be combined? The latter is a superset of the former, and they are both pretty slow.
Saving is still 2-3x slower than serialization, so the net effect of pushing more test cases into the saving path would actually be an even slower test suite.

Is your goal to make these faster, or less code?

- If the former, we could consider other follow ups, like not making a dataset in `setUp`. That's an easy win that would actually speed things up; see the sketch below.
- If the latter, let's just focus on consolidating our testing code across models! Something I would really prioritize, just lacking bandwidth.
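A minimal sketch of that `setUp` follow-up (hypothetical test names and toy data, not this repo's actual test code): build the dataset in a helper and call it only from the tests that fit, so construction cost is not paid by every test in the class.

```python
import tensorflow as tf


class ClassifierTest(tf.test.TestCase):
    def _preprocessed_dataset(self):
        # Hypothetical helper: built on demand rather than in setUp, so tests
        # that never call fit() skip the cost entirely.
        features = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        labels = tf.constant([0, 1])
        return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

    def test_fit(self):
        dataset = self._preprocessed_dataset()
        model = tf.keras.Sequential([tf.keras.layers.Dense(2, activation="softmax")])
        model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
        model.fit(dataset, epochs=1)

    def test_serialization(self):
        # No dataset needed here, and none is built.
        model = tf.keras.Sequential([tf.keras.layers.Dense(2, activation="softmax")])
        self.assertIn("layers", model.get_config())
```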
@@ -186,9 +186,10 @@ def __init__(
        self.dropout = dropout

        # Default compilation
        logit_output = self.activation == keras.activations.linear
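To make the intent of that line concrete, here is a small standalone sketch (a hypothetical helper, not the actual `Task` code) of how a `logit_output` check like this can drive a default loss: when the activation resolves to linear, the model emits raw logits, so the loss should set `from_logits=True`.

```python
from tensorflow import keras


def default_loss(activation):
    # Hypothetical helper mirroring the check above: activations.get(None)
    # resolves to linear, which means the model outputs raw logits.
    activation = keras.activations.get(activation)
    logit_output = activation == keras.activations.linear
    return keras.losses.SparseCategoricalCrossentropy(from_logits=logit_output)


print(default_loss(None).get_config()["from_logits"])       # True
print(default_loss("softmax").get_config()["from_logits"])  # False
```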
I would love to understand this better: what is the result of `activations.get(None)`? I don't see a `None` converted to `linear` anywhere in these Tasks, so it seems like they must be equivalent at some level.
Read the PR description above :). It literally answers this question verbatim.

Here are the lines in tf.keras, if you're interested:
https://github.com/keras-team/keras/blob/5849a0953a644bd6af51b672b32a235510d4f43d/keras/activations.py#L683-L684
Thanks!