Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add delete_patterns option to upload_folder #1370

Merged
merged 5 commits into from
Mar 8, 2023

Conversation

Wauplin
Copy link
Contributor

@Wauplin Wauplin commented Mar 2, 2023

Step 1 for #1352: add a delete_patterns parameter to upload_folder. It makes sense in a lot of cases to delete remote files before uploading new ones.

The workflow is:

  1. if delete_patterns is provided, list remote files (can be optimized once https://github.com/huggingface/moon-landing/pull/5580 is deployed)
  2. apply the pattern filter => create a list of CommitOperationDelete (never delete ".gitattributes")
  3. if a file is both deleted and overwritten in the same commit, drop the deletion operation
  4. (not done) we could optimize this in case the user deletes an entire folder. At the moment it's 1 file = 1 delete operation. Seems error-prone to try to decode the patterns manually so let's do it later if we see it's a limitation.

Included in PR:

  • refactored _prepare_upload_folder_additions (fixed a bug) + implement _prepare_upload_folder_deletions
  • add mocked tests for upload_folder to check the CommitOperation list is correct
  • adapted ModelHubMixin + keras/pytorch/fastai integrations
  • adapted upload docs + some docstrings

Step 2 for #1352 will be adding the possibility for automatic "chained commits" in create_commit using a PR if the list of operations is too big (see description there).

@Wauplin Wauplin requested review from julien-c and LysandreJik March 2, 2023 14:52
@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Mar 2, 2023

The documentation is not available anymore as the PR was closed or merged.

@Wauplin Wauplin changed the title Add delete_patterns option to upload_folder + tests + adapt mixins Add delete_patterns option to upload_folder Mar 2, 2023
@Wauplin Wauplin changed the title Add delete_patterns option to upload_folder Add delete_patterns option to upload_folder Mar 2, 2023
@codecov
Copy link

codecov bot commented Mar 2, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.04 🎉

Comparison is base (3a051bd) 84.49% compared to head (526f28c) 84.54%.

❗ Current head 526f28c differs from pull request most recent head 546cc31. Consider uploading reports for the commit 546cc31 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1370      +/-   ##
==========================================
+ Coverage   84.49%   84.54%   +0.04%     
==========================================
  Files          48       48              
  Lines        4812     4814       +2     
==========================================
+ Hits         4066     4070       +4     
+ Misses        746      744       -2     
Impacted Files Coverage Δ
src/huggingface_hub/fastai_utils.py 80.61% <100.00%> (ø)
src/huggingface_hub/hf_api.py 90.10% <100.00%> (+0.09%) ⬆️
src/huggingface_hub/hub_mixin.py 94.18% <100.00%> (+0.13%) ⬆️
src/huggingface_hub/keras_mixin.py 93.84% <100.00%> (-0.32%) ⬇️
src/huggingface_hub/commands/user.py 0.00% <0.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

Comment on lines -66 to 81
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.upload_folder(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Depending on how you test this (or if you use doctests at all, which I don't think is the case), you may need to still redeclare the previous if you want to isolate each statement. Still clean this way so no need to change.

Copy link
Contributor Author

@Wauplin Wauplin Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not using doctests for now. I tend to prefer not redefining similar lines just to be more "straight to the point". At least when it's obvious what is api like here.
No strong opinion though. Might use doctests at some point but I have not seeing a big need for now.

src/huggingface_hub/hf_api.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_api.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_api.py Outdated Show resolved Hide resolved
@Wauplin
Copy link
Contributor Author

Wauplin commented Mar 8, 2023

Thanks for the careful review 👍
I have addressed the comments and will now merge it :)

@Wauplin Wauplin merged commit e3fa660 into main Mar 8, 2023
@Wauplin Wauplin deleted the 1352-upload-folder-with-delete-patterns branch March 8, 2023 08:42
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants