Added model card links for all pretrained models. #795
Conversation
@Cyber-Machine - in my opinion, we should not add HF links for any model except DistilBERT (DistilBERT is by folks from HF...so, we can add the HF link there). But for other models, we should link the README of the official repo.
Listing them below:
BART: https://github.com/facebookresearch/fairseq/blob/main/examples/bart/README.md
DeBERTa: https://github.com/microsoft/DeBERTa/blob/master/README.md
FNet: https://github.com/google-research/google-research/blob/master/f_net/README.md
OPT: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/README.md
RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/roberta/README.md
XLM-RoBERTa: https://github.com/facebookresearch/fairseq/blob/main/examples/xlmr/README.md
I might be wrong though...so, we'll have to check with @mattdangerw!
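To make the suggestion concrete, here is a rough sketch of what a preset entry could look like with `model_card` pointing at the official fairseq README instead of a Hugging Face page. This is illustrative only: the field names mirror the metadata layout visible in this PR's diffs, and the preset name, description, and omitted fields are placeholders.

```python
# Illustrative only: a BART preset whose "model_card" points at the official
# fairseq README rather than a Hugging Face model card. Field names follow
# the metadata layout shown elsewhere in this PR; other values are placeholders.
"bart_base_en": {
    "metadata": {
        "description": "Placeholder description for the preset.",
        "official_name": "BART",
        "path": "bart",
        "model_card": (
            "https://github.com/facebookresearch/fairseq/"
            "blob/main/examples/bart/README.md"
        ),
    },
    # config, weights_url, weights_hash, etc. omitted for brevity
},
```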
"weights_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/model.h5", | ||
"weights_hash": "d8e10327107e5c5e20b45548a5028619", | ||
"spm_proto_url": "https://storage.googleapis.com/keras-nlp/models/deberta_v3_extra_small_en/v1/vocab.spm", | ||
"spm_proto_hash": "1613fcbf3b82999c187b09c9db79b568", | ||
}, | ||
"deberta_v3_small_en": { |
There seems to be a lot of code delta unrelated to the change we want to make in this PR. We should avoid making such changes; make only the changes that are necessary for the PR.
I have rearranged the dictionary order so that metadata comes first, to make it consistent with the other presets; there is not much change apart from that.
I was thinking the same while adding them, but Hugging Face provides a model card for each preset, officially maintained either by the organization itself or by the HF maintainers. In fact, the DeBERTa presets in the README link to the Hugging Face repository.
Ohh, I was checking a few model cards on Hugging Face, and most of them have a disclaimer. For example:
https://huggingface.co/facebook/bart-base
https://huggingface.co/google/fnet-base
https://huggingface.co/facebook/opt-350m
etc. So, most of the model cards have not been written by the team which came up with the model.
Understood. So I'll remove the model cards that were not officially written by the authors. Apart from DistilBERT and DeBERTa, I will add links to the official READMEs written by the authors, and a link to the OPT model's appendix.
Let's wait for Matt's opinion on this.
Thanks @Cyber-Machine and @abheesht17 for the review!
I think what the two of you came up with is good. For model cards we should make an attempt to link to what was authored by the original researchers, so this SGTM.
We will probably need a disclaimer on the keras.io side explaining that we will link a model card where available, and a general README if none has been provided.
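As a hypothetical sketch of that fallback rule on the docs side (not existing keras.io code; the helper name and the fallback URL are made up for illustration), something like this would prefer the model card when a preset supplies one and fall back to a README otherwise:

```python
# Hypothetical docs-side helper: prefer a preset's "model_card" link when
# present, otherwise fall back to a general README URL for the model family.
def docs_link(metadata, fallback_readme_url):
    url = metadata.get("model_card")
    if url:
        return "Model Card", url
    return "README", fallback_readme_url


# Example usage with the DistilBERT metadata shown in this PR's diff;
# the fallback URL is a placeholder.
label, url = docs_link(
    {
        "official_name": "DistilBERT",
        "path": "distil_bert",
        "model_card": "https://huggingface.co/distilbert-base-uncased",
    },
    fallback_readme_url="https://example.com/model-family/README.md",
)
print(label, url)  # Model Card https://huggingface.co/distilbert-base-uncased
```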
@@ -24,6 +24,7 @@
         "params": 66362880,
         "official_name": "DistilBERT",
         "path": "distil_bert",
+        "model_card": "https://huggingface.co/distilbert-base-uncased",
Instead of linking this, let's link the model card explicitly. So, in this case, https://huggingface.co/distilbert-base-uncased#distilbert-base-model-uncased. Same for the other DistilBERT presets.
I'm OK with the main link, actually. What is here is what you get when you click the model card tab on Hugging Face. (And I would somewhat suspect that anchor link might change down the line anyway.)
"Trained on the C4 dataset." | ||
), | ||
"params": 82861056, | ||
"params": 236945408, |
Have you used https://github.com/keras-team/keras-nlp/blob/master/tools/count_preset_params.py to count the #parameters?
I think this is just the diff being confusing, right? This isn't a new number though.
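For anyone wanting to sanity-check these numbers without the script, a minimal sketch (assuming a recent keras-nlp install with access to the hosted presets; the preset name here is only an example):

```python
# Sketch: load a preset backbone and compare its parameter count against the
# "params" field in the preset config, using the standard Keras count_params().
import keras_nlp

backbone = keras_nlp.models.DistilBertBackbone.from_preset(
    "distil_bert_base_en_uncased"  # example preset; any preset name works
)
print(backbone.count_params())  # expected to match the config's "params" value
```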
"Trained on the C4 dataset." | ||
), | ||
"params": 82861056, | ||
"params": 236945408, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just the diff being confusing right? This isn't a new number though.
Merging; these nightly failures are unrelated and fixed elsewhere.
Fixes: #789
@mattdangerw I have added `model_card` for all pre-trained models under `metadata`.
I have used the model cards provided by Hugging Face, which are officially maintained by the organization or mentioned in the official paper for each of the models. The BART & OPT models are maintained by Facebook, DeBERTa-V3 by Microsoft, RoBERTa, DistilBERT, and XLM-RoBERTa by the HF maintainers, and FNet by Google.
Since not all BERT models are officially maintained on Hugging Face, I have added a link to the README file for all BERT models.
For GPT-2, I have added a link to the model card present on their GitHub.
I am thinking of rendering the model card in one of the following ways:
My plan is to integrate the model card from the first way into keras-io.
Awaiting your suggestions on this PR.