
Port albert transformer checkpoint #1767

Merged
5 commits merged on Aug 19, 2024

Conversation

cosmo3769
Contributor

Hi @mattdangerw @ariG23498,

Ported the ALBERT transformers checkpoint to KerasNLP. Please check. Thank you!

hf_weight_key="albert.encoder.embedding_hidden_mapping_in.bias",
)

keras_prefix = backbone.get_layer("group_0_inner_layer_0")
Member

This isn't really a prefix, it's a layer. Maybe change the name?
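For illustration only, a minimal sketch of the suggested rename (the new variable name is an assumption, not the code that landed in the PR):

    # `get_layer` returns a Keras layer object rather than a string prefix, so a
    # name like `keras_block` describes it more accurately than `keras_prefix`.
    keras_block = backbone.get_layer("group_0_inner_layer_0")
    hf_prefix = "albert.encoder.albert_layer_groups.0.albert_layers.0."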

)

keras_prefix = backbone.get_layer("group_0_inner_layer_0")
hf_prefix = "albert.encoder.albert_layer_groups.0.albert_layers.0."
Member

Also, this seems too hardcoded. Both Keras and HF have config support for multiple groups, but this assumes that num_hidden_groups=1 and maybe that inner_group_num=1 too.

If it's easy to write this in such a way that we are not assuming those constants we should just do that. Otherwise, we should at least throw if num_hidden_groups > 1 or inner_group_num > 1, which could be true for a user upload.
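For illustration, a minimal sketch of the two options above. The config attribute names (`num_groups`, `num_inner_repetitions`) and the `group_{g}_inner_layer_{i}` layer naming are assumptions based on the snippet in this diff, not necessarily the exact conversion code:

    # Option 1: fail loudly for configs the converter does not yet handle.
    if backbone.num_groups > 1 or backbone.num_inner_repetitions > 1:
        raise ValueError(
            "Conversion currently assumes num_hidden_groups=1 and "
            "inner_group_num=1 in the Hugging Face config."
        )

    # Option 2: loop over every group and inner layer instead of hardcoding 0.
    for group in range(backbone.num_groups):
        for inner in range(backbone.num_inner_repetitions):
            keras_block = backbone.get_layer(f"group_{group}_inner_layer_{inner}")
            hf_prefix = (
                f"albert.encoder.albert_layer_groups.{group}"
                f".albert_layers.{inner}."
            )
            # ... port attention and feedforward weights for this block ...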

Contributor Author

Yeah, makes sense. I have made changes to fix it. Does this approach work?

cosmo3769 requested a review from mattdangerw on August 16, 2024 at 13:23
@mattdangerw
Member

This looks good to me! I pushed some formatting fixes to limit line length, but will pull this in.

mattdangerw added a commit to mattdangerw/keras-hub that referenced this pull request Aug 19, 2024
Just noticed while porting keras-team#1767
that the default learning rate for our classifier does not work
for albert pretrained checkpoints. Let's lower it for this model
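As a rough illustration of that follow-up (the preset name, learning rate, and `AlbertClassifier` usage below are assumptions, not the exact change in the commit):

    import keras
    import keras_nlp

    # Build a classifier from an ALBERT preset and override the default optimizer
    # with a smaller learning rate, which is typically more stable for ALBERT
    # pretrained checkpoints.
    classifier = keras_nlp.models.AlbertClassifier.from_preset(
        "albert_base_en_uncased", num_classes=2
    )
    classifier.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )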
@mattdangerw
Member

Here's a colab to test encoder-style checkpoints like this, if it's helpful: https://colab.research.google.com/gist/mattdangerw/a1fedc952d28c8395021a94b77e75872/albert-test.ipynb
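Along the same lines, a minimal local numerics check one might run; this is only a sketch, and the `hf://` preset handle plus the backbone's input/output keys are assumptions about how KerasNLP exposes converted checkpoints:

    import numpy as np
    import torch
    import keras_nlp
    from transformers import AlbertModel, AutoTokenizer

    # Load the reference Hugging Face model and the converted KerasNLP backbone.
    hf_model = AlbertModel.from_pretrained("albert/albert-base-v2")
    tokenizer = AutoTokenizer.from_pretrained("albert/albert-base-v2")
    backbone = keras_nlp.models.AlbertBackbone.from_preset("hf://albert/albert-base-v2")

    hf_inputs = tokenizer(["The quick brown fox."], return_tensors="pt")
    with torch.no_grad():
        hf_out = hf_model(**hf_inputs).last_hidden_state.numpy()

    keras_out = backbone(
        {
            "token_ids": hf_inputs["input_ids"].numpy(),
            "segment_ids": hf_inputs["token_type_ids"].numpy(),
            "padding_mask": hf_inputs["attention_mask"].numpy(),
        }
    )["sequence_output"]

    # The two encoders should agree to within a small numerical tolerance.
    print(np.max(np.abs(hf_out - np.asarray(keras_out))))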

@mattdangerw mattdangerw merged commit c60e9ad into keras-team:master Aug 19, 2024
7 checks passed
mattdangerw added a commit that referenced this pull request Aug 20, 2024
Just noticed while porting #1767
that the default learning rate for our classifier does not work
for albert pretrained checkpoints. Let's lower it for this model
pkgoogle pushed a commit to pkgoogle/keras-hub that referenced this pull request Aug 22, 2024
* port albert

* update test

* resolve comments

* changed name

* minor formatting fixes

---------

Co-authored-by: Matt Watson <[email protected]>
pkgoogle pushed a commit to pkgoogle/keras-hub that referenced this pull request Aug 22, 2024
Just noticed while porting keras-team#1767
that the default learning rate for our classifier does not work
for albert pretrained checkpoints. Let's lower it for this model