Add helper to generate an eval result model-index, with proper typing #382
Conversation
(CI failure is unrelated)
@@ -0,0 +1,83 @@
# Code generated by jtd-codegen for Python v0.3.1
Nice! Could it be useful to add a comment explaining how to update this file and propose a PR if/when the original schema changes?
Just one comment that led to another general question: shouldn't we publish the JTD schema in this repo as a static asset, to be used with the python jtd-codegen command that generates src/huggingface_hub/repocard_types.py?
At some point we can do it, but for now I think we can merge like this? Will wait for a few other reviewers' reviews before merging.
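For reference, regenerating `src/huggingface_hub/repocard_types.py` from an updated schema should roughly amount to running `jtd-codegen <path-to-model-index-schema>.jtd.json --python-out src/huggingface_hub/` and opening a PR with the regenerated file (the schema path here is a placeholder, since the schema currently lives in the internal repo).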
That looks very useful - thanks for adding it! Just one question: how would one go about adding multiple metrics? Would one call the metadata_eval_result helper multiple times?
No, in that case I would say you would create the underlying objects yourself. Or do you think it's common enough to warrant another helper?
The use case with several metrics is pretty common I would say (e.g. some flavour of accuracy/precision/recall/F1 for classification-type tasks such as sequence/token classification or QA).
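For illustration, one way to cover the multi-metric case with the helper as proposed would be to build the single-metric dict and then extend its metrics list by hand. This is only a sketch: the keyword names and the exact shape of the returned dict are assumptions, not something lifted from the diff.

```python
from huggingface_hub.repocard import metadata_eval_result

# Single-metric `model-index` dict produced by the helper
# (keyword names are assumed, not copied from the diff).
metadata = metadata_eval_result(
    model_pretty_name="my-ner-model",
    task_pretty_name="Token Classification",
    task_id="token-classification",
    metrics_pretty_name="F1",
    metrics_id="f1",
    metrics_value=0.91,
    dataset_pretty_name="CoNLL-2003",
    dataset_id="conll2003",
)

# Append further metrics to the same result entry by hand, assuming the
# returned dict follows the standard model-index layout.
metadata["model-index"][0]["results"][0]["metrics"].extend(
    [
        {"name": "Precision", "type": "precision", "value": 0.90},
        {"name": "Recall", "type": "recall", "value": 0.92},
    ]
)
```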
Thanks for working on this PR! I'm not entirely certain I understand the use case: I think the addition of the dataclasses ModelIndex, SingleMetric, etc. is useful, but I feel like metadata_eval_result is too limiting.
Results are a list, metrics are a list, but this only offers a single possibility for both results and metrics. I think it's a bit misleading and doesn't capture the array of possibilities available thanks to the modelcard format. This tool would be useful to build an initial modelcard with a single evaluation/single metric, but would not work to complete that same card with additional metrics or evaluation results, nor would it work for multiple evaluations or multiple metrics out of the box.
I would push for users to build their own SingleResult objects made up of their own SingleMetric objects, which would give them much more flexibility and understanding of what is possible with the current implementation.
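A rough sketch of what that could look like with the generated types; the class and field names below (SingleResultTask, SingleResultDataset, and the name/type/value fields) are my guess at what jtd-codegen emits in repocard_types.py, not something confirmed by this PR:

```python
from huggingface_hub.repocard_types import (
    ModelIndex,
    SingleMetric,
    SingleResult,
    SingleResultDataset,
    SingleResultTask,
)

# One evaluation result carrying several metrics, assembled by hand.
result = SingleResult(
    task=SingleResultTask(name="Question Answering", type="question-answering"),
    dataset=SingleResultDataset(name="SQuAD", type="squad"),
    metrics=[
        SingleMetric(name="Exact Match", type="exact_match", value=80.1),
        SingleMetric(name="F1", type="f1", value=88.3),
    ],
)
model_index = ModelIndex(name="my-qa-model", results=[result])
```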
Yeah, I think that's fair. Happy to rework and/or document this if it's not clear enough, or you can take over and push this PR over the line if you'd like. Thanks!
Agreed to have this as a first approach on which we can build in the future. Thanks, @julien-c, and thanks for implementing tests too.
Given all other reviewers are happy with this, I'm happy as well. Thanks for proposing, implementing and testing this!
I think this is helpful (and could replace some code in transformers) but let me know, it can probably be improved. The "types" themselves are generated from the JTD schema used for validation in our internal repo (but I didn't keep the to/from JSON serialization code as we're not using it).
See also internal context: https://huggingface.slack.com/archives/C01BWJU0YKW/p1632159653242300?thread_ts=1632158986.236100&cid=C01BWJU0YKW
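For context, a minimal usage sketch of the helper; the import path and keyword names are assumptions based on the discussion above rather than a verbatim copy of the diff:

```python
from huggingface_hub.repocard import metadata_eval_result

# Build `model-index` metadata for a single evaluation result; the resulting
# dict can then be merged into the model card's YAML front matter.
metadata = metadata_eval_result(
    model_pretty_name="RoBERTa fine-tuned on ReactionGIF",
    task_pretty_name="Text Classification",
    task_id="text-classification",
    metrics_pretty_name="Accuracy",
    metrics_id="accuracy",
    metrics_value=0.2662,
    dataset_pretty_name="ReactionGIF",
    dataset_id="julien-c/reactiongif",
)
# Expected shape (roughly):
# {"model-index": [{"name": ..., "results": [
#     {"task": {...}, "dataset": {...}, "metrics": [{...}]}]}]}
```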