Git prune #450
Very nice new API and cool new tests, thanks a lot!
if result and self._post_method is not None:
    self._post_method()
Very smart!
I don't know the code too well, but it's a bit weird to me that a property that answers "is this command done?" now calls a "post" method. Wouldn't it be cleaner to separate the self._post_method() call from is_done?
In that case the user will need to manually check when an async git commit is finished to call the post method, which is what we are trying to avoid.
I understand what you mean, Patrick. Let me give you some context: we're using subprocess.Popen to launch the process in a non-blocking manner. This means we spawn a subprocess and keep track of it in the CommandInProgress utility.
The process cannot tell the Python runtime when it's done: it's up to the Python runtime to find out by checking the process's status.
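To make the polling point concrete, here is a minimal sketch. It is not the code from this PR: the child process is a hypothetical stand-in for a long-running git command, and only the standard subprocess.Popen.poll() call is used.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a long-running `git push`:
# a child Python process that just sleeps for half a second.
process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])

# poll() never blocks: it returns None while the child is still running.
status_while_running = process.poll()

# The child cannot notify the parent, so the parent has to keep asking.
while process.poll() is None:
    time.sleep(0.05)

# Once the child has exited, poll() returns its exit code.
exit_code = process.poll()
```
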
Therefore, here, we have three options:
1. Check the status regularly and, when it is done, run git lfs prune.
2. Let the user check the status and, when it is done, run git lfs prune.
3. Don't do any of that and let the user run git lfs prune when they feel like it.
We've chosen option 2, as that's the least intensive way of doing things (it doesn't need to check the status of a subprocess every second). Approach 1 might be better, though, because git lfs prune can take a while for larger files, and that currently blocks when the user checks is_done. For now I'm going to add a log message when it is pruning, but I'm open to switching from 2 to 1 if you feel strongly.
Option 3 is not a great user API.
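For readers following along, option 2 can be sketched roughly as below. This is a simplified illustration, not the actual CommandInProgress class: LazyCommand, its constructor arguments, and the run-once flag are all hypothetical names chosen for the example.

```python
from typing import Callable, Optional


class LazyCommand:
    """Hypothetical, simplified stand-in for CommandInProgress."""

    def __init__(
        self,
        is_done_method: Callable[[], bool],
        post_method: Optional[Callable[[], None]] = None,
    ):
        self._is_done_method = is_done_method
        self._post_method = post_method
        # Assumption for this sketch: the post hook should run at most once.
        self._post_ran = False

    @property
    def is_done(self) -> bool:
        result = self._is_done_method()
        # The hook runs in the caller's thread, the first time the
        # command is observed to be finished. It blocks until it returns.
        if result and self._post_method is not None and not self._post_ran:
            self._post_ran = True
            self._post_method()
        return result


# Usage: the hook fires only when the user checks is_done after completion.
calls = []
state = {"finished": False}
command = LazyCommand(lambda: state["finished"], post_method=lambda: calls.append("prune"))

first = command.is_done    # command still running, hook not called
state["finished"] = True
second = command.is_done   # command finished, hook called here
third = command.is_done    # hook is not called a second time
```

The trade-off matches the discussion above: nothing polls in the background, but the cost of the hook is paid inside the is_done check.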
Good for me!
@@ -735,6 +735,168 @@ def test_delete_tag(self):
        repo.delete_tag("v4.6.0", remote="origin")
        self.assertFalse(repo.tag_exists("v4.6.0", remote="origin"))

    def test_lfs_prune(self):
Great tests!
@@ -986,12 +1017,16 @@ def status_method():
                is_done_method=lambda: process.poll() is not None,
                status_method=status_method,
                process=process,
                post_method=self.lfs_prune if auto_lfs_prune else None,
Maybe only do this when blocking=False? When blocking=True, we can run the instruction at the end.
This is already the case: this whole block is under if not blocking!
huggingface_hub/src/huggingface_hub/repository.py
Lines 1006 to 1025 in 4152f15
if not blocking:

    def status_method():
        status = process.poll()
        if status is None:
            return -1
        else:
            return status

    command = CommandInProgress(
        "push",
        is_done_method=lambda: process.poll() is not None,
        status_method=status_method,
        process=process,
        post_method=self.lfs_prune if auto_lfs_prune else None,
    )

    self.command_queue.append(command)

    return self.git_head_commit_url(), command
Ah sorry, I didn't catch that :-)
Adds the option to prune local files to prevent the local folder from exploding in size.