-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Repo metadata load and save #339
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely useful, thanks a lot for adding this!
Thanks so much for this! This is a very needed upgrade! ❤️🔥 This will be very useful for all our 3rd party integrations if we ever need to modify all the repos. We should communicate about this PR once merged to all big libraries and orgs (e.g. Stanford NLP) |
(current status: spamming the CI to try and get all transient errors to go green 😅) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff!
Co-authored-by: Omar Sanseviero <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/huggingface_hub/constants.py
Outdated
@@ -13,6 +13,7 @@ | |||
TF_WEIGHTS_NAME = "model.ckpt" | |||
FLAX_WEIGHTS_NAME = "flax_model.msgpack" | |||
CONFIG_NAME = "config.json" | |||
MODELCARD_NAME = "README.md" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're setting MODELCARD_NAME
here but using the repocard
term elsewhere - what should it be? I would favor having all instances be REPOCARD_NAME
if you're looking for a repo-type-agnostic name
else: | ||
raise ValueError("repo card metadata block should be a dict") | ||
else: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably raise an error too, right? Running metadata_load
on any random file will return nothing while I would expect it to fail if it doesn't find any metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the expected behavior then I would also put it in a test to ensure that it doesn't switch to raising an error in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed offline we have a lot of repo cards (model cards, etc) that do not have a yaml block, and i think the expected behavior here is to return None. If need be, we might modify this in a subsequent PR
@LysandreJik attempted to solve merge conflict This should be ready to merge, depending on final maintainer acceptance (😘 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Will push an additional test and merge in a bit. |
Thank you for your work @julien-c |
Note about that issue I witnessed working on PRs : huggingface_hub and transformers CI often fails because of moon-staging 504 timeouts. If needed, I can work on strengthening this staging server's capacities |
Tentatively close #54
Example use case:
apache-2.0
– see Licenses for Helsinki-NLP models transformers#13328 (comment)