
Added model card links for all pretrained models. #795

Merged: 6 commits into keras-team:master on Mar 8, 2023

Conversation

Cyber-Machine (Contributor)

Fixes: #789

@mattdangerw I have added a model_card field for all pre-trained models under metadata.

I have used model cards provided by Hugging Face that are officially maintained by the organization or mentioned in the official paper for each model: BART and OPT are maintained by Facebook, DeBERTa-V3 by Microsoft, RoBERTa, DistilBERT, and XLM-RoBERTa by the HF maintainers, and FNet by Google.

Since not all BERT variants are officially maintained on Hugging Face, I have added a link to the README file for all BERT models.

For GPT-2, I have added a link to the model card in their GitHub repository.
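For reference, the change itself is one new key per preset entry. A minimal sketch of the DistilBERT preset after the change, assuming the metadata nesting described above (description and weight fields elided; only "model_card" is new):

"distil_bert_base_en_uncased": {
    "metadata": {
        "description": "...",  # unchanged
        "params": 66362880,
        "official_name": "DistilBERT",
        "path": "distil_bert",
        "model_card": "https://huggingface.co/distilbert-base-uncased",
    },
    "weights_url": "...",  # unchanged
    "weights_hash": "...",  # unchanged
},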

I am thinking of rendering the model card in one of the following ways:

My plan is to integrate the model card from the first option into keras-io.

Awaiting your suggestions on this PR.

@abheesht17 (Collaborator) left a comment

Comment on lines +39 to +44
"weights_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/model.h5",
"weights_hash": "d8e10327107e5c5e20b45548a5028619",
"spm_proto_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/vocab.spm",
"spm_proto_hash": "1613fcbf3b82999c187b09c9db79b568",
},
"deberta_v3_small_en": {

There seems to be a lot of code delta unrelated to the change we want to make in this PR. We should avoid making such changes; make only those that are necessary for the PR.

@Cyber-Machine (Contributor, Author) replied:

I have rearranged the dictionary order so that metadata comes first, to keep it consistent with the other presets; there is not much change apart from that.
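Concretely, the reordering just moves the metadata block ahead of the weight fields, so the deberta_v3_extra_small_en entry quoted above ends up shaped like this (a sketch; metadata contents elided):

"deberta_v3_extra_small_en": {
    "metadata": {
        # description, params, official_name, path, model_card
    },
    "weights_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/model.h5",
    "weights_hash": "d8e10327107e5c5e20b45548a5028619",
    "spm_proto_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/vocab.spm",
    "spm_proto_hash": "1613fcbf3b82999c187b09c9db79b568",
},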

@Cyber-Machine (Contributor, Author)

> @Cyber-Machine - in my opinion, we should not add HF links for any model except DistilBERT (DistilBERT is by folks from HF... so we can add the HF link there). But for other models, we should link the README of the official repo.
>
> Listing them down below:
> - BART: https://github.com/facebookresearch/fairseq/blob/main/examples/bart/README.md
> - DeBERTa: https://github.com/microsoft/DeBERTa/blob/master/README.md
> - FNet: https://github.com/google-research/google-research/blob/master/f_net/README.md
> - OPT: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/README.md
> - RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/roberta/README.md
> - XLM-RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/xlmr/README.md
>
> I might be wrong though... so we'll have to check with @mattdangerw!

I was thinking the same while adding them, but Hugging Face provides a model card for each preset, and each is officially maintained either by the organization itself or by the HF maintainers.

In fact, the DeBERTa README links its presets to the Hugging Face repositories.

@abheesht17 (Collaborator)

Ohh, I was checking a few model cards on Hugging Face. Most of them have a disclaimer. For example:

https://huggingface.co/facebook/bart-base

> Disclaimer: The team releasing BART did not write a model card for this model so this model card has been written by the Hugging Face team.

https://huggingface.co/google/fnet-base

> Disclaimer: This model card has been written by <contributor>.

https://huggingface.co/facebook/opt-350m

> Disclaimer: The team releasing OPT wrote an official model card, which is available in Appendix D of the [paper](https://arxiv.org/pdf/2205.01068.pdf). Content from this model card has been written by the Hugging Face team.

And so on. So most of the model cards have not been written by the team that came up with the model.

@Cyber-Machine (Contributor, Author)

> Ohh, I was checking a few model cards on Hugging Face. Most of them have a disclaimer. [...] So most of the model cards have not been written by the team that came up with the model.

Understood. So I'll remove the model cards that were not officially written by the authors.

Apart from DistilBERT and DeBERTa, I will add links to the READMEs in the authors' official repositories, and for OPT a link to the model card in the paper's appendix.

Cyber-Machine requested a review from abheesht17 on March 2, 2023.
@abheesht17 (Collaborator)

Let's wait for Matt's opinion on this.

@mattdangerw (Member) left a comment

Thanks @Cyber-Machine and @abheesht17 for the review!

I think what the two of you came up with is good. For model cards we should make an attempt to link to what was authored by the original researchers, so this SGTM.

We will probably need a disclaimer on the keras.io side explaining that we will link a model card where available, and a general README if none has been provided.
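On the rendering side, that could reduce to a one-line lookup plus a page-level disclaimer. A hypothetical sketch (preset_doc_link and the markdown-link output are illustrative, not existing keras-io code):

def preset_doc_link(preset):
    # "model_card" holds either an official model card URL or, where no
    # official card exists, the repository README (the convention settled
    # on in this thread).
    url = preset["metadata"].get("model_card")
    return f"[Model Card]({url})" if url else ""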

Resolved (outdated) review threads on keras_nlp/models/opt/opt_presets.py and keras_nlp/models/xlm_roberta/xlm_roberta_presets.py.
Cyber-Machine re-requested review from mattdangerw and abheesht17 on March 3, 2023.
@@ -24,6 +24,7 @@
"params": 66362880,
"official_name": "DistilBERT",
"path": "distil_bert",
"model_card": "https://huggingface.co/distilbert-base-uncased",
@abheesht17 (Collaborator):

Instead of linking this, let's link the model card section explicitly. So, in this case, https://huggingface.co/distilbert-base-uncased#distilbert-base-model-uncased. Same for the other DistilBERT presets.

@mattdangerw (Member):

I'm OK with the main link, actually. What is here is what you get when you click the model card tab on Hugging Face. (And I would somewhat suspect that anchor link might change down the line anyway.)

"Trained on the C4 dataset."
),
"params": 82861056,
"params": 236945408,
@mattdangerw (Member):

I think this is just the diff being confusing, right? This isn't a new number, though.

"Trained on the C4 dataset."
),
"params": 82861056,
"params": 236945408,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just the diff being confusing right? This isn't a new number though.

@mattdangerw (Member)

Merging; these nightly failures are unrelated and fixed elsewhere.

mattdangerw merged commit 50e9248 into keras-team:master on Mar 8, 2023.
Successfully merging this pull request may close these issues.

Add model card links on keras.io for all pre-trained models (#789)