
Handle trainable and name in the backbone base class #680

Merged
2 commits merged into keras-team:master on Jan 19, 2023

Conversation

mattdangerw (Member)
No description provided.

@mattdangerw requested a review from jbischof on January 18, 2023 at 21:18
@jbischof (Contributor) left a comment:

Thanks!

@@ -27,6 +27,12 @@ class Backbone(keras.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def get_config(self):
@jbischof (Contributor) commented:

Q: Couldn't you technically chain all the way up to keras.Model for this?

@mattdangerw (Member, Author) commented on Jan 18, 2023:

We cannot, because of a pain point with functional subclassed models.

By default, get_config on a functional model returns a huge nested description of every layer, and that config can only be restored with from_config on the functional keras.Model class itself. There is no way to use the standard functional-model get_config and from_config and still get back a BertBackbone; you would just get back a vanilla functional model (with none of the subclass's special properties, methods, docstrings, etc.).

So essentially, for functional subclasses you have to "break the chain" of get_config and from_config calling up to super, which is unusual; everywhere else in Keras you do want to chain to super. But at least we are able to hide this pain point from our downstream model implementations now. And we could potentially prioritize some work in core Keras at some point to improve the "functional subclass" experience and remove this weird gotcha.

I can leave a (much shorter than this) comment explaining this on the backbone class.
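
A minimal sketch of the pattern described above (not the actual KerasNLP code): the base class stops the get_config/from_config chain at itself and serializes only the constructor arguments it owns, assumed here to be just name and trainable as in this PR's title; subclasses then extend that dict with their own constructor arguments.

import keras


class Backbone(keras.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def get_config(self):
        # Deliberately do not chain to super().get_config(). The functional
        # keras.Model implementation would return a nested description of
        # every layer, which from_config() can only restore as a plain
        # functional model, not as this subclass.
        return {
            "name": self.name,
            "trainable": self.trainable,
        }

    @classmethod
    def from_config(cls, config):
        # Rebuild the subclass directly from its constructor arguments.
        # A subclass (e.g. BertBackbone) overrides get_config() to add its
        # own arguments on top of the base dict, so cls(**config) returns
        # an instance of the right class.
        return cls(**config)

With this in the base class, a downstream backbone only has to merge its own constructor arguments into super().get_config() to round-trip correctly through get_config and from_config.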

@jbischof (Contributor) replied:

Thanks for the explanation! Lifelong learner 🤓

@mattdangerw merged commit c747ad8 into keras-team:master on Jan 19, 2023