Added model card links for all pretrained models. #795
Conversation
@Cyber-Machine - in my opinion, we should not add HF links for any model except DistilBERT (DistilBERT is by folks from HF...so, we can add the HF link there). But for other models, we should link the README of the official repo.
Listing them below:
BART: https://github.com/facebookresearch/fairseq/blob/main/examples/bart/README.md
DeBERTa: https://github.com/microsoft/DeBERTa/blob/master/README.md
FNet: https://github.com/google-research/google-research/blob/master/f_net/README.md
OPT: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/README.md
RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/roberta/README.md
XLM-RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/xlmr/README.md
I might be wrong though...so, we'll have to check with @mattdangerw!
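To make the suggestion concrete, here is a rough sketch of what a preset entry could look like with `model_card` pointing at the official fairseq README instead of a Hugging Face page. This is illustrative only: the field names mirror the metadata layout visible in this PR's diffs, and the preset name, description, and omitted fields are placeholders.

```python
# Illustrative only: a BART preset whose "model_card" points at the official
# fairseq README rather than a Hugging Face model card. Field names follow
# the metadata layout shown elsewhere in this PR; other values are placeholders.
"bart_base_en": {
    "metadata": {
        "description": "Placeholder description for the preset.",
        "official_name": "BART",
        "path": "bart",
        "model_card": (
            "https://github.com/facebookresearch/fairseq/"
            "blob/main/examples/bart/README.md"
        ),
    },
    # config, weights_url, weights_hash, etc. omitted for brevity
},
```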
"weights_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/model.h5", | ||
"weights_hash": "d8e10327107e5c5e20b45548a5028619", | ||
"spm_proto_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/vocab.spm", | ||
"spm_proto_hash": "1613fcbf3b82999c187b09c9db79b568", | ||
}, | ||
"deberta_v3_small_en": { |
There seems to be a lot of code delta unrelated to the change we want to make in this PR. We should avoid making such changes; make only the changes that are necessary for the PR.
I have rearranged the dictionary order so that metadata comes first, to make it consistent with the other presets; there is not much change apart from that.
I was thinking the same while adding them, but Hugging Face provides a model card for each preset, officially maintained either by the organization itself or by the HF maintainers. In fact, the DeBERTa presets in the README link to the Hugging Face repository.
Ohh, I was checking a few model cards on Hugging Face, and most of them have a disclaimer. For example:
https://huggingface.co/facebook/bart-base
https://huggingface.co/google/fnet-base
https://huggingface.co/facebook/opt-350m
etc. So, most of the model cards have not been written by the team which came up with the model.
Understood. So I'll remove the model cards that were not officially written by the authors. Apart from DistilBERT and DeBERTa, I will add links to the official READMEs written by the authors, and a link to the OPT model's appendix.
Let's wait for Matt's opinion on this.
Thanks @Cyber-Machine and @abheesht17 for the review!
I think what the two of you came up with is good. For model cards we should make an attempt to link to what was authored by the original researchers, so this SGTM.
We will probably need a disclaimer on the keras.io side explaining that we will link a model card where available, and a general README if none has been provided.
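As a hypothetical sketch of that fallback rule on the docs side (not existing keras.io code; the helper name and the fallback URL are made up for illustration), something like this would prefer the model card when a preset supplies one and fall back to a README otherwise:

```python
# Hypothetical docs-side helper: prefer a preset's "model_card" link when
# present, otherwise fall back to a general README URL for the model family.
def docs_link(metadata, fallback_readme_url):
    url = metadata.get("model_card")
    if url:
        return "Model Card", url
    return "README", fallback_readme_url


# Example usage with the DistilBERT metadata shown in this PR's diff;
# the fallback URL is a placeholder.
label, url = docs_link(
    {
        "official_name": "DistilBERT",
        "path": "distil_bert",
        "model_card": "https://huggingface.co/distilbert-base-uncased",
    },
    fallback_readme_url="https://example.com/model-family/README.md",
)
print(label, url)  # Model Card https://huggingface.co/distilbert-base-uncased
```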
@@ -24,6 +24,7 @@
         "params": 66362880,
         "official_name": "DistilBERT",
         "path": "distil_bert",
+        "model_card": "https://huggingface.co/distilbert-base-uncased",
Instead of linking this, let's link the model card explicitly. So, in this case, https://huggingface.co/distilbert-base-uncased#distilbert-base-model-uncased. Same for the other DistilBERT presets.
I'm OK with the main link, actually. What is here is what you get when you click the model card tab on Hugging Face. (And I would somewhat suspect that anchor link might change down the line anyway.)
"Trained on the C4 dataset." | ||
), | ||
"params": 82861056, | ||
"params": 236945408, |
Have you used https://github.com/keras-team/keras-nlp/blob/master/tools/count_preset_params.py to count the #parameters?
I think this is just the diff being confusing, right? This isn't a new number though.
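For anyone wanting to sanity-check these numbers without the script, a minimal sketch (assuming a recent keras-nlp install with access to the hosted presets; the preset name here is only an example):

```python
# Sketch: load a preset backbone and compare its parameter count against the
# "params" field in the preset config, using the standard Keras count_params().
import keras_nlp

backbone = keras_nlp.models.DistilBertBackbone.from_preset(
    "distil_bert_base_en_uncased"  # example preset; any preset name works
)
print(backbone.count_params())  # expected to match the config's "params" value
```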
"Trained on the C4 dataset." | ||
), | ||
"params": 82861056, | ||
"params": 236945408, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just the diff being confusing right? This isn't a new number though.
Merging; these nightly failures are unrelated and fixed elsewhere.
Fixes: #789
@mattdangerw I have added `model_card` for all pre-trained models under `metadata`.
I have used the model cards provided by Hugging Face, which are officially maintained by the organization or mentioned in the official paper for each of the models. The BART & OPT models are maintained by Facebook, DeBERTa-V3 by Microsoft, RoBERTa, DistilBERT, and XLM-RoBERTa by the HF maintainers, and FNet by Google.
Since not all BERT models are officially maintained on Hugging Face, I have added a link to the README file for all BERT models.
For GPT-2, I have added a link to the model card present on their GitHub.
I am thinking of rendering the model card in one of the following ways:
My plan is to integrate the model card from the first way into keras-io.
Awaiting your suggestions on this PR.