Add additional repo card utils from modelcards repo #940

Merged
merged 63 commits into from
Sep 2, 2022
63 commits
960845a
:construction: wip - add modelcard utils
nateraw Jul 6, 2022
930f6cc
:construction: use original regex, fix metadata update
nateraw Jul 6, 2022
88f604e
:fire: remove identical_ok kwarg
nateraw Jul 6, 2022
ee260ae
:white_check_mark: add repocard_data tests
nateraw Jul 6, 2022
6becdde
:construction: wip - add default modelcard template
nateraw Jul 11, 2022
a8e280e
:white_check_mark: Add modelcard tests
nateraw Jul 11, 2022
9ebb475
:construction: wip - add back repo_type
nateraw Jul 12, 2022
70d6aa1
:memo: update modelcard template
nateraw Jul 12, 2022
b6a413d
[#940] Preserve newlines in existing card files (#949)
julien-c Jul 25, 2022
bfe85ca
[#940] Better inheritance for repo card utils (#956)
nateraw Jul 26, 2022
7b57650
:lipstick: style
nateraw Aug 10, 2022
193922b
:heavy_plus_sign: add jinja dep to setup.py testing extras
nateraw Aug 10, 2022
cc394da
:fire: remove unnecessary jinja import
nateraw Aug 10, 2022
1fefeff
:bulb: fix regex comment
nateraw Aug 10, 2022
0526eb5
:art: reformatting
nateraw Aug 10, 2022
b0dd013
:memo: fix doctests even though we aren't running them
nateraw Aug 10, 2022
a06c975
:memo: fix doctests even though we aren't using them
nateraw Aug 10, 2022
e175298
:white_check_mark: update tests
nateraw Aug 11, 2022
2ca3382
:memo: add some docs
nateraw Aug 22, 2022
f0f09c4
:white_check_mark: fix flake8 on logging
nateraw Aug 22, 2022
1e14031
:memo: update docstrings to use python code blocks
nateraw Aug 22, 2022
4fc196e
:memo: add a docs page
nateraw Aug 22, 2022
17e4837
:memo: add api docs
nateraw Aug 22, 2022
99f91d8
:memo: update repocard package reference in docs
nateraw Aug 23, 2022
ce31d5a
:memo: update ref to CardData in docs
nateraw Aug 23, 2022
0f93ece
:memo: update docs
nateraw Aug 23, 2022
81e3d3a
:memo: update docs
nateraw Aug 23, 2022
fc0c652
:memo: update docs
nateraw Aug 23, 2022
12537bc
:memo: docs
nateraw Aug 23, 2022
379cae6
:memo: update model cards guide
nateraw Aug 23, 2022
e4365ac
:memo: update cards reference docs
nateraw Aug 23, 2022
3d551db
:memo: add note that local filepaths dont use repo_type
nateraw Aug 23, 2022
2a2afc5
:memo: update docstrings
nateraw Aug 23, 2022
53c3033
:memo: update docstrings
nateraw Aug 23, 2022
b17e26d
:white_check_mark: add some dataset card/data tests
nateraw Aug 23, 2022
5707566
:memo: update docstrings and dataset card template
nateraw Aug 23, 2022
29740ce
:lipstick: style
nateraw Aug 23, 2022
705cfb8
:memo: add evalresult docstring
nateraw Aug 23, 2022
2e12967
:memo: update some docstrings
nateraw Aug 23, 2022
62d12f1
:memo: fix link
nateraw Aug 23, 2022
6e22a57
:art: use defaultdict instead of dict
nateraw Aug 23, 2022
1fb224b
:memo: update model cards guide
nateraw Aug 23, 2022
2f562fc
:memo: update model cards guide
nateraw Aug 23, 2022
ebed54f
:lipstick: style
nateraw Aug 23, 2022
e3d1e9d
:memo: update ref docs
nateraw Aug 23, 2022
1c04ae1
:white_check_mark: update tests
nateraw Aug 23, 2022
d8ae0d8
:white_check_mark: update tests
nateraw Aug 23, 2022
3c20044
:memo: add docstring for repocard, update guide
nateraw Aug 23, 2022
6f6d00d
Apply suggestions from code review
nateraw Aug 24, 2022
c518fd2
Apply suggestions from code review
nateraw Aug 24, 2022
1a5c5c4
:heavy_minus_sign: remove Jinja2 from install_requires
nateraw Aug 24, 2022
6a6e585
:truck: use internal testing repo for docstring example
nateraw Aug 24, 2022
1f3f38d
:art: kwargs only for ModelCardData/DatasetCardData
nateraw Aug 24, 2022
8ffca13
:goal_net: rename caught error to ValueError
nateraw Aug 24, 2022
473a0bd
:white_check_mark: fix tests
nateraw Aug 24, 2022
8e44759
Update src/huggingface_hub/repocard.py
nateraw Aug 24, 2022
4ea3998
:lipstick: style
nateraw Aug 24, 2022
8ea02af
:memo: add typehints to repocard.py
nateraw Aug 24, 2022
1217995
:memo: update model cards docs
nateraw Sep 1, 2022
999bbee
:memo: add docstring for _detect_line_ending helper fn
nateraw Sep 1, 2022
ad4eced
Apply suggestions from code review
Wauplin Sep 2, 2022
07c97e5
Update src/huggingface_hub/repocard_data.py
Wauplin Sep 2, 2022
6d4129e
Update src/huggingface_hub/repocard_data.py
Wauplin Sep 2, 2022
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,2 @@
include src/huggingface_hub/templates/modelcard_template.md
include src/huggingface_hub/templates/datasetcard_template.md
4 changes: 4 additions & 0 deletions docs/source/_toctree.yml
@@ -19,6 +19,8 @@
title: Interact with Discussions and Pull Requests
- local: how-to-cache
title: Manage the Cache
- local: how-to-model-cards
title: Create and Share Model Cards
title: "Guides"
- sections:
- local: package_reference/repository
@@ -37,4 +39,6 @@
title: Discussions and Pull Requests
- local: package_reference/cache
title: Cache-system reference
- local: package_reference/cards
title: Repo Cards and Repo Card Data
title: "Reference"
315 changes: 315 additions & 0 deletions docs/source/how-to-model-cards.mdx
@@ -0,0 +1,315 @@
# Creating and Sharing Model Cards

The `huggingface_hub` library provides a Python interface to create, share, and update Model Cards.
Visit [the dedicated documentation page](https://huggingface.co/docs/hub/models-cards)
for a deeper view of what Model Cards on the Hub are, and how they work under the hood.

## Loading a Model Card from the Hub

To load an existing card from the Hub, you can use the [`ModelCard.load`] function. Here, we'll load the card from [`nateraw/vit-base-beans`](https://huggingface.co/nateraw/vit-base-beans).

```python
from huggingface_hub import ModelCard

card = ModelCard.load('nateraw/vit-base-beans')
```

This card has some helpful attributes that you may want to access/leverage:
- `card.data`: Returns a [`ModelCardData`] instance with the model card's metadata. Call `.to_dict()` on this instance to get the representation as a dictionary.
- `card.text`: Returns the text of the card, *excluding the metadata header*.
- `card.content`: Returns the text content of the card, *including the metadata header*.
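
To make the relationship between these three attributes concrete, here is a minimal, stdlib-only sketch of how a card's full content splits into a metadata header and body text. It is an illustration, not the library's implementation: `split_card` and `HEADER_RE` are hypothetical names, and the naive `key: value` parsing stands in for real YAML parsing.

```python
import re

# Hypothetical pattern for illustration: a leading block fenced by '---' lines.
HEADER_RE = re.compile(r"\A\s*---\n(.*?)\n---\n?", re.DOTALL)


def split_card(content: str):
    """Return (metadata, text) for a card's full content."""
    match = HEADER_RE.match(content)
    if match is None:
        # No metadata header: everything is body text.
        return {}, content
    metadata = {}
    for line in match.group(1).splitlines():
        # Naive 'key: value' parsing; real cards need a YAML parser.
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    return metadata, content[match.end():]


content = "---\nlanguage: en\nlicense: mit\n---\n# My Model Card\n"
metadata, text = split_card(content)
print(metadata)  # plays the role of card.data.to_dict()
print(text)      # plays the role of card.text
```

In this sketch, `content` corresponds to `card.content`, while `metadata` and `text` correspond to `card.data` and `card.text`.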

## Creating Model Cards

### From Text

To initialize a Model Card from text, pass the card's text content to the `ModelCard` constructor.

```python
content = """
---
language: en
license: mit
---
# My Model Card
"""

card = ModelCard(content)
card.data.to_dict() == {'language': 'en', 'license': 'mit'} # True
```

Another way you might want to do this is with f-strings. In the following example, we:

- Use [`ModelCardData.to_yaml`] to convert the metadata we defined into YAML, which we then insert as the YAML block at the top of the model card.
- Show how you might use a template variable via Python f-strings.

```python
from huggingface_hub import ModelCard, ModelCardData

card_data = ModelCardData(language='en', license='mit', library='timm')

example_template_var = 'nateraw'
content = f"""
---
{ card_data.to_yaml() }
---
# My Model Card
This model was created by [@{example_template_var}](https://github.com/{example_template_var})
"""

card = ModelCard(content)
print(card)
```

The above example would leave us with a card that looks like this:

```
---
language: en
license: mit
library: timm
---
# My Model Card
This model was created by [@nateraw](https://github.com/nateraw)
```

### From a Jinja Template

If you have `Jinja2` installed, you can create Model Cards from a Jinja template file. Let's see a basic example:

```python
from pathlib import Path

from huggingface_hub import ModelCard, ModelCardData

# Define your jinja template
template_text = """
---
{{ card_data }}
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@{{ author }}](https://hf.co/{{author}}).
""".strip()

# Write the template to a file
Path('custom_template.md').write_text(template_text)

# Define card metadata
card_data = ModelCardData(language='en', license='mit', library_name='keras')

# Create card from template, passing it any jinja template variables you want.
# In our case, we'll pass author
card = ModelCard.from_template(card_data, template_path='custom_template.md', author='nateraw')
card.save('my_model_card_1.md')
print(card)
```

The resulting card's markdown looks like this:

```
---
language: en
license: mit
library_name: keras
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@nateraw](https://hf.co/nateraw).
```
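
If the substitution step itself is unfamiliar, the sketch below reproduces the same idea with the standard library's `string.Template` instead of Jinja2. This is a deliberately swapped-in technique for illustration only; it is not what `from_template` uses, and the `$name` placeholders and hand-written metadata string are assumptions of the sketch.

```python
from string import Template

# Stand-in template: '$name' placeholders play the role of Jinja's '{{ name }}'.
template = Template(
    "---\n$card_data\n---\n"
    "# Model Card for MyCoolModel\n"
    "This model was created by [@$author](https://hf.co/$author).\n"
)

# Metadata as YAML-style lines, written by hand here for simplicity.
card_data_yaml = "language: en\nlicense: mit\nlibrary_name: keras"

# Every placeholder is replaced by the matching keyword argument.
rendered = template.substitute(card_data=card_data_yaml, author="nateraw")
print(rendered)
```

As with the Jinja version, the metadata block and any extra variables are supplied separately and merged into the template at render time.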

If you update any attribute of `card.data`, the change is reflected in the card itself.

```python
card.data.library_name = 'timm'
card.data.language = 'fr'
card.data.license = 'apache-2.0'
print(card)
```

Now, as you can see, the metadata header has been updated:

```
---
language: fr
license: apache-2.0
library_name: timm
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@nateraw](https://hf.co/nateraw).
```
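
One way to picture why metadata edits show up in the printed card is to rebuild the full content from the metadata each time it is read. The sketch below does that with a hypothetical `SketchCard` class; it is a simplified stand-in written for this guide and makes no claim about how `ModelCard` is implemented internally.

```python
class SketchCard:
    """Hypothetical stand-in for a card: a metadata dict plus body text."""

    def __init__(self, data: dict, text: str):
        self.data = data
        self.text = text

    @property
    def content(self) -> str:
        # Rebuilt on every access, so edits to self.data always show up.
        header = "\n".join(f"{key}: {value}" for key, value in self.data.items())
        return f"---\n{header}\n---\n{self.text}"


card = SketchCard({"language": "en", "license": "mit"}, "# Model Card for MyCoolModel\n")
card.data["language"] = "fr"  # mutate the metadata in place
print(card.content)           # the header now shows 'language: fr'
```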

After updating the card data, you can check that the card is still valid by calling [`ModelCard.validate`]. This ensures that the card passes any validation rules set up on the Hugging Face Hub.

### From the Default Template

Instead of using your own template, you can also use the [default template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md), which is a fully featured model card with tons of sections you may want to fill out. Under the hood, it uses [Jinja2](https://jinja.palletsprojects.com/en/3.1.x/) to fill out a template file.

<Tip>

Note that you will have to have Jinja2 installed to use `from_template`. You can do so with `pip install Jinja2`.

</Tip>

```python
card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(
    card_data,
    model_id='my-cool-model',
    model_description="this model does this and that",
    developers="Nate Raw",
    more_resources="https://github.com/huggingface/huggingface_hub",
)
card.save('my_model_card_2.md')
print(card)
```

## Sharing Model Cards

If you're authenticated with the Hugging Face Hub (either by using `huggingface-cli login` or `huggingface_hub.notebook_login()`), you can push cards to the Hub by simply calling [`ModelCard.push_to_hub`]. Let's take a look at how to do that...

First, we'll create a new repo called 'hf-hub-modelcards-pr-test' under the authenticated user's namespace:

```python
from huggingface_hub import whoami, create_repo

user = whoami()['name']
repo_id = f'{user}/hf-hub-modelcards-pr-test'
url = create_repo(repo_id, exist_ok=True)
```

Then, we'll create a card from the default template (same as the one defined in the section above):

```python
card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(
    card_data,
    model_id='my-cool-model',
    model_description="this model does this and that",
    developers="Nate Raw",
    more_resources="https://github.com/huggingface/huggingface_hub",
)
```

Finally, we'll push the card to the Hub:

```python
card.push_to_hub(repo_id)
```

You can check out the resulting card [here](https://huggingface.co/nateraw/hf-hub-modelcards-pr-test/blob/main/README.md).

If you would rather push the card as a pull request, pass `create_pr=True` when calling `push_to_hub`:

```python
card.push_to_hub(repo_id, create_pr=True)
```

A resulting PR created from this command can be seen [here](https://huggingface.co/nateraw/hf-hub-modelcards-pr-test/discussions/3).

### Including Evaluation Results

To include evaluation results in the metadata `model-index`, you can pass an [`EvalResult`] or a list of `EvalResult` with your associated evaluation results. Under the hood it'll create the `model-index` when you call `card.data.to_dict()`. For more information on how this works, you can check out [this section of the Hub docs](https://huggingface.co/docs/hub/models-cards#evaluation-results).

<Tip>

Note that passing `eval_results` requires you to also set the `model_name` attribute in [`ModelCardData`].

</Tip>

```python
from huggingface_hub import EvalResult, ModelCard, ModelCardData

card_data = ModelCardData(
    language='en',
    license='mit',
    model_name='my-cool-model',
    eval_results=EvalResult(
        task_type='image-classification',
        dataset_type='beans',
        dataset_name='Beans',
        metric_type='accuracy',
        metric_value=0.7
    )
)

card = ModelCard.from_template(card_data)
print(card.data)
```

The resulting `card.data` should look like this:

```
language: en
license: mit
model-index:
- name: my-cool-model
  results:
  - task:
      type: image-classification
    dataset:
      name: Beans
      type: beans
    metrics:
    - type: accuracy
      value: 0.7
```

If you have more than one evaluation result you'd like to share, just pass a list of `EvalResult`:

```python
card_data = ModelCardData(
    language='en',
    license='mit',
    model_name='my-cool-model',
    eval_results=[
        EvalResult(
            task_type='image-classification',
            dataset_type='beans',
            dataset_name='Beans',
            metric_type='accuracy',
            metric_value=0.7
        ),
        EvalResult(
            task_type='image-classification',
            dataset_type='beans',
            dataset_name='Beans',
            metric_type='f1',
            metric_value=0.65
        )
    ]
)
card = ModelCard.from_template(card_data)
print(card.data)
```

Which should leave you with the following `card.data`:

```
language: en
license: mit
model-index:
- name: my-cool-model
  results:
  - task:
      type: image-classification
    dataset:
      name: Beans
      type: beans
    metrics:
    - type: accuracy
      value: 0.7
    - type: f1
      value: 0.65
```
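
Notice that the two results above share one task and dataset, so they end up as two metrics under a single `results` entry. The sketch below shows one plausible way to produce that nesting from flat records by grouping with a `defaultdict`. `SketchEvalResult` and `to_model_index` are hypothetical names invented for this illustration; the library's own conversion may differ in its details.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchEvalResult:
    """Hypothetical flat record mirroring the fields used above."""
    task_type: str
    dataset_type: str
    dataset_name: str
    metric_type: str
    metric_value: float


def to_model_index(model_name, results):
    # Group metrics by the (task, dataset) combination they were measured on.
    grouped = defaultdict(list)
    for r in results:
        key = (r.task_type, r.dataset_type, r.dataset_name)
        grouped[key].append({"type": r.metric_type, "value": r.metric_value})
    # Emit one entry per group, each carrying all of its metrics.
    entries = [
        {
            "task": {"type": task_type},
            "dataset": {"name": dataset_name, "type": dataset_type},
            "metrics": metrics,
        }
        for (task_type, dataset_type, dataset_name), metrics in grouped.items()
    ]
    return [{"name": model_name, "results": entries}]


results = [
    SketchEvalResult("image-classification", "beans", "Beans", "accuracy", 0.7),
    SketchEvalResult("image-classification", "beans", "Beans", "f1", 0.65),
]
model_index = to_model_index("my-cool-model", results)
print(model_index)
```

A result measured on a different task or dataset would get its own group, and therefore its own entry under `results`.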