Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git prune #450

Merged
merged 4 commits into from
Nov 2, 2021
Merged

Git prune #450

merged 4 commits into from
Nov 2, 2021

Conversation

LysandreJik
Copy link
Member

Adds the option to prune local files to prevent the local folder from exploding in size.

@LysandreJik LysandreJik requested a review from sgugger November 1, 2021 18:50
Copy link
Contributor

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice new API and cool new tests, thanks a lot!

Comment on lines +49 to +50
if result and self._post_method is not None:
self._post_method()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very smart!

Copy link
Contributor

@patrickvonplaten patrickvonplaten Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the code too well, but it's a bit weird to me that a property "is this command done?" now calls a "post" method. Wouldn't it be cleaner to separate the self._post_method() from is_done?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case the user will need to manually check when an async git commit is finished to call the post method, which is what we are trying to avoid.

Copy link
Member Author

@LysandreJik LysandreJik Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you mean Patrick, let me give you some context: we're leveraging the subprocess.Popen approach to launch the process in a non-blocking manner. This approach means we're spawning a subprocess and keeping track of it in the CommandInProgress utility.

The process does not have the ability to let the Python runtime know when it's done: it's up to the Python runtime to check when it is done by checking its status.

Therefore, here, we have three options:

  1. Check the status regularly and when it is done run the git lfs prune
  2. Let the user check the status, and when it is done run the git lfs prune
  3. Don't do any of that and let the user handle the git lfs prune when they feel like it

We've chosen to do 2., as that's the least intensive way of doing things (doesn't need to check every second the status of a subprocess); but approach 1. might be better as the git lfs prune might take a bit of time for larger files, which is currently blocking when the user checks for is_done. For now I'm going to add a log message when it is pruning, but I'm open to switching from 2. to 1. if you feel strongly.

  1. is not a great user API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for me!

@@ -735,6 +735,168 @@ def test_delete_tag(self):
repo.delete_tag("v4.6.0", remote="origin")
self.assertFalse(repo.tag_exists("v4.6.0", remote="origin"))

def test_lfs_prune(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great tests!

@@ -986,12 +1017,16 @@ def status_method():
is_done_method=lambda: process.poll() is not None,
status_method=status_method,
process=process,
post_method=self.lfs_prune if auto_lfs_prune else None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only do this when blocking=False? When blocking=True, we can run the instruction at the end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already the case, this whole block is under if not blocking!

if not blocking:
def status_method():
status = process.poll()
if status is None:
return -1
else:
return status
command = CommandInProgress(
"push",
is_done_method=lambda: process.poll() is not None,
status_method=status_method,
process=process,
post_method=self.lfs_prune if auto_lfs_prune else None,
)
self.command_queue.append(command)
return self.git_head_commit_url(), command

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I didn't catch that :-)

@LysandreJik LysandreJik merged commit 2bc9a05 into main Nov 2, 2021
@LysandreJik LysandreJik deleted the git-prune branch November 2, 2021 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants