Add additional repo card utils from modelcards repo (#940)
* 🚧 wip - add modelcard utils

* 🚧 use original regex, fix metadata update

* 🔥 remove identical_ok kwarg

* ✅ add repocard_data tests

* 🚧 wip - add default modelcard template

* ✅ Add modelcard tests

* 🚧 wip - add back repo_type

* 📝 update modelcard template

* [#940] Preserve newlines in existing card files (#949)

* Preserve newlines in the existing card files

* make style

* ✅ add test

* 💄 apply style

Co-authored-by: nateraw <[email protected]>

* [#940] Better inheritance for repo card utils (#956)

* 🚧 wip

* 🚧 wip

* 🚧 wip

* 💄 apply style

* 🚧 wip

* 💄 style

* ➕ add jinja dep to setup.py testing extras

* 🔥 remove unnecessary jinja import

* 💡 fix regex comment

* 🎨 reformatting

* 📝 fix doctests even though we aren't running them

* 📝 fix doctests even though we aren't using them

* ✅ update tests

* 📝 add some docs

* ✅ fix flake8 on logging

* 📝 update docstrings to use python code blocks

* 📝 add a docs page

* 📝 add api docs

* 📝 update repocard package reference in docs

* 📝 update ref to CardData in docs

* 📝 update docs

* 📝 update docs

* 📝 update docs

* 📝 docs

* 📝 update model cards guide

* 📝 update cards reference docs

* 📝 add note that local filepaths dont use repo_type

* 📝 update docstrings

* 📝 update docstrings

* ✅ add some dataset card/data tests

* 📝 update docstrings and dataset card template

* 💄 style

* 📝 add evalresult docstring

* 📝 update some docstrings

* 📝 fix link

* 🎨 use defaultdict instead of dict

* 📝 update model cards guide

* 📝 update model cards guide

* 💄 style

* 📝 update ref docs

* ✅ update tests

* ✅ update tests

* 📝 add docstring for repocard, update guide

* Apply suggestions from code review

Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Lucain <[email protected]>

* Apply suggestions from code review

Co-authored-by: Lucain <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>

* ➖ remove Jinja2 from install_requires

* 🚚 use internal testing repo for docstring example

* 🎨 kwargs only for ModelCardData/DatasetCardData

* 🥅 rename caught error to ValueError

* ✅ fix tests

* Update src/huggingface_hub/repocard.py

Co-authored-by: Lucain <[email protected]>

* 💄 style

* 📝 add typehints to repocard.py

* 📝 update model cards docs

* 📝 add docstring for _detect_line_ending helper fn

* Apply suggestions from code review

* Update src/huggingface_hub/repocard_data.py

* Update src/huggingface_hub/repocard_data.py

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Lucain <[email protected]>
4 people authored Sep 2, 2022
1 parent 0f2c286 commit 75a1ba5
Showing 23 changed files with 2,583 additions and 339 deletions.
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,2 @@
include src/huggingface_hub/templates/modelcard_template.md
include src/huggingface_hub/templates/datasetcard_template.md
4 changes: 4 additions & 0 deletions docs/source/_toctree.yml
@@ -19,6 +19,8 @@
title: Interact with Discussions and Pull Requests
- local: how-to-cache
title: Manage the Cache
- local: how-to-model-cards
title: Create and Share Model Cards
title: "Guides"
- sections:
- local: package_reference/repository
@@ -37,4 +39,6 @@
title: Discussions and Pull Requests
- local: package_reference/cache
title: Cache-system reference
- local: package_reference/cards
title: Repo Cards and Repo Card Data
title: "Reference"
315 changes: 315 additions & 0 deletions docs/source/how-to-model-cards.mdx
@@ -0,0 +1,315 @@
# Creating and Sharing Model Cards

The `huggingface_hub` library provides a Python interface to create, share, and update Model Cards.
Visit [the dedicated documentation page](https://huggingface.co/docs/hub/models-cards)
for a deeper view of what Model Cards on the Hub are, and how they work under the hood.

## Loading a Model Card from the Hub

To load an existing card from the Hub, you can use the [`ModelCard.load`] function. Here, we'll load the card from [`nateraw/vit-base-beans`](https://huggingface.co/nateraw/vit-base-beans).

```python
from huggingface_hub import ModelCard

card = ModelCard.load('nateraw/vit-base-beans')
```

This card has some helpful attributes that you may want to access:
- `card.data`: Returns a [`ModelCardData`] instance with the model card's metadata. Call `.to_dict()` on this instance to get the representation as a dictionary.
- `card.text`: Returns the text of the card, *excluding the metadata header*.
- `card.content`: Returns the text content of the card, *including the metadata header*.
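
To make the relationship between these three attributes concrete, here is a minimal, standard-library-only sketch of how a card's raw content can be split into a metadata header and body text. The `split_card` helper and its regex are illustrative assumptions, not the library's implementation, and the naive `key: value` parsing stands in for a real YAML parser.

```python
import re

# Hypothetical pattern: a leading '---' line, the metadata block, then a closing '---' line.
YAML_BLOCK = re.compile(r"^\s*---[\r\n]+(.*?)[\r\n]+---[\r\n]*", re.DOTALL)


def split_card(content: str):
    """Return (metadata, text) for a card-like string.

    Illustrative only: handles flat `key: value` pairs, not full YAML.
    """
    match = YAML_BLOCK.match(content)
    if match is None:
        # No metadata header: everything is body text.
        return {}, content
    metadata = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    # Everything after the closing '---' is the card text.
    text = content[match.end():]
    return metadata, text


content = "---\nlanguage: en\nlicense: mit\n---\n# My Model Card\n"
metadata, text = split_card(content)
print(metadata)  # analogous to card.data.to_dict()
print(text)      # analogous to card.text
```

In this picture, `content` plays the role of `card.content`, while the two return values correspond roughly to `card.data` and `card.text`.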

## Creating Model Cards

### From Text

To initialize a Model Card from text, pass the card's text content to the `ModelCard` constructor.

```python
content = """
---
language: en
license: mit
---
# My Model Card
"""

card = ModelCard(content)
card.data.to_dict() == {'language': 'en', 'license': 'mit'} # True
```

Another way you might want to do this is with f-strings. In the following example, we:

- Use [`ModelCardData.to_yaml`] to convert metadata we defined to YAML so we can use it to insert the YAML block in the model card.
- Show how you might use a template variable via Python f-strings.

```python
from huggingface_hub import ModelCard, ModelCardData

card_data = ModelCardData(language='en', license='mit', library='timm')

example_template_var = 'nateraw'
content = f"""
---
{ card_data.to_yaml() }
---
# My Model Card
This model was created by [@{example_template_var}](https://github.com/{example_template_var})
"""

card = ModelCard(content)
print(card)
```

The above example would leave us with a card that looks like this:

```
---
language: en
license: mit
library: timm
---
# My Model Card
This model was created by [@nateraw](https://github.com/nateraw)
```

### From a Jinja Template

If you have `Jinja2` installed, you can create Model Cards from a Jinja template file. Let's see a basic example:

```python
from pathlib import Path

from huggingface_hub import ModelCard, ModelCardData

# Define your jinja template
template_text = """
---
{{ card_data }}
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@{{ author }}](https://hf.co/{{author}}).
""".strip()

# Write the template to a file
Path('custom_template.md').write_text(template_text)

# Define card metadata
card_data = ModelCardData(language='en', license='mit', library_name='keras')

# Create card from template, passing it any jinja template variables you want.
# In our case, we'll pass author
card = ModelCard.from_template(card_data, template_path='custom_template.md', author='nateraw')
card.save('my_model_card_1.md')
print(card)
```

The resulting card's markdown looks like this:

```
---
language: en
license: mit
library_name: keras
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@nateraw](https://hf.co/nateraw).
```

If you update any attribute of `card.data`, the change is reflected in the card itself.

```python
card.data.library_name = 'timm'
card.data.language = 'fr'
card.data.license = 'apache-2.0'
print(card)
```

Now, as you can see, the metadata header has been updated:

```
---
language: fr
license: apache-2.0
library_name: timm
---
# Model Card for MyCoolModel
This model does this and that.
This model was created by [@nateraw](https://hf.co/nateraw).
```
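
One way to picture why edits to the metadata show up when the card is printed is a card whose full content is rebuilt from its data and text on every access. The sketch below is a simplified illustration of that idea, not the library's code; the `TinyCard` class is hypothetical and only handles flat string values.

```python
class TinyCard:
    """Hypothetical stand-in for a card: a metadata dict plus body text."""

    def __init__(self, data: dict, text: str):
        self.data = data
        self.text = text

    @property
    def content(self) -> str:
        # Rebuild the header from the current metadata on every access,
        # so edits to `data` appear without an explicit refresh step.
        header = "\n".join(f"{key}: {value}" for key, value in self.data.items())
        return f"---\n{header}\n---\n{self.text}"

    def __str__(self) -> str:
        return self.content


card_sketch = TinyCard({"language": "en", "license": "mit"}, "# Model Card for MyCoolModel\n")
card_sketch.data["language"] = "fr"
print(card_sketch)
```

Printing the object after the assignment shows `language: fr` in the header, mirroring the behavior described above.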

As you update the card data, you can check that the card is still valid by calling [`ModelCard.validate`]. This ensures that the card passes any validation rules set up on the Hugging Face Hub.

### From the Default Template

Instead of using your own template, you can also use the [default template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md), which is a fully featured model card with tons of sections you may want to fill out. Under the hood, it uses [Jinja2](https://jinja.palletsprojects.com/en/3.1.x/) to fill out a template file.

<Tip>

Note that Jinja2 must be installed to use `from_template`. You can install it with `pip install Jinja2`.

</Tip>

```python
card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(
    card_data,
    model_id='my-cool-model',
    model_description="this model does this and that",
    developers="Nate Raw",
    more_resources="https://github.com/huggingface/huggingface_hub",
)
card.save('my_model_card_2.md')
print(card)
```
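
Because template-based creation depends on Jinja2, a script may want to check for the package before attempting it. The following is a small sketch using only the standard library; the `jinja2_available` helper is a hypothetical name introduced here for illustration.

```python
import importlib.util


def jinja2_available() -> bool:
    """Return True if the Jinja2 package can be found in this environment."""
    return importlib.util.find_spec("jinja2") is not None


if jinja2_available():
    print("Jinja2 found: template-based card creation should work.")
else:
    print("Jinja2 missing: run `pip install Jinja2` first.")
```

Checking with `find_spec` avoids importing the package just to test for its presence.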

## Sharing Model Cards

If you're authenticated with the Hugging Face Hub (either by using `huggingface-cli login` or `huggingface_hub.notebook_login()`), you can push cards to the Hub by calling [`ModelCard.push_to_hub`]. Let's take a look at how to do that.

First, we'll create a new repo called 'hf-hub-modelcards-pr-test' under the authenticated user's namespace:

```python
from huggingface_hub import whoami, create_repo

user = whoami()['name']
repo_id = f'{user}/hf-hub-modelcards-pr-test'
url = create_repo(repo_id, exist_ok=True)
```

Then, we'll create a card from the default template (same as the one defined in the section above):

```python
card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(
    card_data,
    model_id='my-cool-model',
    model_description="this model does this and that",
    developers="Nate Raw",
    more_resources="https://github.com/huggingface/huggingface_hub",
)
```

Finally, we'll push that up to the Hub:

```python
card.push_to_hub(repo_id)
```

You can check out the resulting card [here](https://huggingface.co/nateraw/hf-hub-modelcards-pr-test/blob/main/README.md).

If you instead want to push a card as a pull request, pass `create_pr=True` when calling `push_to_hub`:

```python
card.push_to_hub(repo_id, create_pr=True)
```

A resulting PR created from this command can be seen [here](https://huggingface.co/nateraw/hf-hub-modelcards-pr-test/discussions/3).

### Including Evaluation Results

To include evaluation results in the metadata `model-index`, you can pass an [`EvalResult`] or a list of `EvalResult` with your associated evaluation results. Under the hood it'll create the `model-index` when you call `card.data.to_dict()`. For more information on how this works, you can check out [this section of the Hub docs](https://huggingface.co/docs/hub/models-cards#evaluation-results).

<Tip>

Note that passing `eval_results` requires you to also include the `model_name` attribute in [`ModelCardData`].

</Tip>

```python
from huggingface_hub import EvalResult, ModelCard, ModelCardData

card_data = ModelCardData(
    language='en',
    license='mit',
    model_name='my-cool-model',
    eval_results=EvalResult(
        task_type='image-classification',
        dataset_type='beans',
        dataset_name='Beans',
        metric_type='accuracy',
        metric_value=0.7
    )
)

card = ModelCard.from_template(card_data)
print(card.data)
```

The resulting `card.data` should look like this:

```
language: en
license: mit
model-index:
- name: my-cool-model
  results:
  - task:
      type: image-classification
    dataset:
      name: Beans
      type: beans
    metrics:
    - type: accuracy
      value: 0.7
```

If you have more than one evaluation result you'd like to share, just pass a list of `EvalResult`:

```python
card_data = ModelCardData(
    language='en',
    license='mit',
    model_name='my-cool-model',
    eval_results=[
        EvalResult(
            task_type='image-classification',
            dataset_type='beans',
            dataset_name='Beans',
            metric_type='accuracy',
            metric_value=0.7
        ),
        EvalResult(
            task_type='image-classification',
            dataset_type='beans',
            dataset_name='Beans',
            metric_type='f1',
            metric_value=0.65
        )
    ]
)
card = ModelCard.from_template(card_data)
print(card.data)
```

Which should leave you with the following `card.data`:

```
language: en
license: mit
model-index:
- name: my-cool-model
  results:
  - task:
      type: image-classification
    dataset:
      name: Beans
      type: beans
    metrics:
    - type: accuracy
      value: 0.7
    - type: f1
      value: 0.65
```
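
In the output above, the two results share one task and one dataset, so their metrics end up in a single `metrics` list. A rough sketch of that kind of grouping, using only the standard library, might look like the following. The `Result` dataclass and `build_model_index` function are hypothetical names, and the grouping key is an assumption inferred from the output shown, not the library's actual logic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical, trimmed-down analogue of an evaluation result."""
    task_type: str
    dataset_type: str
    dataset_name: str
    metric_type: str
    metric_value: float


def build_model_index(model_name, results):
    """Group metrics that share a task and dataset into one entry."""
    grouped = defaultdict(list)
    for r in results:
        key = (r.task_type, r.dataset_type, r.dataset_name)
        grouped[key].append({"type": r.metric_type, "value": r.metric_value})
    entries = [
        {
            "task": {"type": task},
            "dataset": {"name": ds_name, "type": ds_type},
            "metrics": metrics,
        }
        for (task, ds_type, ds_name), metrics in grouped.items()
    ]
    return [{"name": model_name, "results": entries}]


index = build_model_index(
    "my-cool-model",
    [
        Result("image-classification", "beans", "Beans", "accuracy", 0.7),
        Result("image-classification", "beans", "Beans", "f1", 0.65),
    ],
)
print(index)
```

Using a `defaultdict(list)` keeps the grouping loop short, since a new metrics list is created automatically the first time a task and dataset combination is seen.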