Push to hub/commit with branches #282

Merged
merged 6 commits into from
Aug 17, 2021

LysandreJik
Member

Completes the revision-based API by adding branches management to push_to_hub and the commit context manager.

Examples of usage:

Using revision from Repository.__init__

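from huggingface_hub import Repository
from transformers import GPT2LMHeadModel

project_name = 'org/project'
step = 0  # placeholder for the current training step

# Clone the repo and check out the 'test-1' revision
hf_repo = Repository(project_name, clone_from=project_name, revision='test-1')

model = GPT2LMHeadModel.from_pretrained(project_name)
model.save_pretrained(project_name)

hf_repo.push_to_hub(commit_message=f'step {step}')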

Using git_checkout from Repository

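from huggingface_hub import Repository
from transformers import GPT2LMHeadModel

project_name = 'org/project'
step = 0  # placeholder for the current training step

# Clone the repo, then check out the branch explicitly
hf_repo = Repository(project_name, clone_from=project_name)
hf_repo.git_checkout("branch")

model = GPT2LMHeadModel.from_pretrained(project_name)
model.save_pretrained(project_name)

hf_repo.push_to_hub(commit_message=f'step {step}')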

Note that push_to_hub only works as intended when the steps happen in a specific order:

  • The branch must first be checked out
  • The files are then saved
  • Finally, push_to_hub is called
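
The ordering above can be sketched as a small helper. This is only an illustration, not part of the PR: save_and_push and save_fn are hypothetical names, and repo is assumed to be any object exposing git_checkout and push_to_hub with the signatures used in the examples above.

```python
from typing import Callable


def save_and_push(repo, branch: str, save_fn: Callable[[], None], commit_message: str) -> None:
    """Hypothetical helper: run the three steps in the required order."""
    # 1. Check out the target branch first, before any file is written.
    repo.git_checkout(branch)
    # 2. Save the files into the local clone (e.g. model.save_pretrained).
    save_fn()
    # 3. Only then commit and push the changes.
    repo.push_to_hub(commit_message=commit_message)
```

With the second example above, this would be called roughly as save_and_push(hf_repo, "branch", lambda: model.save_pretrained(project_name), f'step {step}').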

Using the Repository.commit context manager

Therefore, I would instead recommend using the Repository.commit context manager as I find it has a better API, as can be seen below. I think the two could eventually be merged into a single with Repository.push_to_hub. See details of the API with usage below:

from huggingface_hub import Repository

repo = Repository("local_directory", clone_from="Jikiwa/with-commit-1")

with repo.commit("Add files"):
    with open("file.txt", "w+") as f:
        f.write("Nice")

with repo.commit("Add files to brand-new-branch", branch="brand-new-branch"):
    with open("brand-new-file.txt", "w+") as f:
        f.write("Nice Nice")

repo.git_checkout("nice-branch", create_branch_ok=True)

# Uses the most recently checked-out branch
with repo.commit("Add files in nice-branch"):
    with open("file.txt", "w+") as f:
        f.write("Nice")

A caveat with git_checkout is that it may fail with a fatal error if files have been added or modified, because the checkout would overwrite those changes.
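
One way to guard against this is to check for uncommitted changes before switching branches. The sketch below is an assumption-laden illustration rather than anything in this PR: parse_porcelain, uncommitted_changes and safe_checkout are hypothetical helpers, the check shells out to git status --porcelain, and repo.local_dir is assumed to point at the local clone.

```python
import subprocess
from typing import List


def parse_porcelain(output: str) -> List[str]:
    """Return the paths listed in `git status --porcelain` (v1) output."""
    # Each line is two status characters, a space, then the path.
    return [line[3:] for line in output.splitlines() if line.strip()]


def uncommitted_changes(repo_dir: str) -> List[str]:
    """List added/modified/untracked paths in the working tree at repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)


def safe_checkout(repo, branch: str) -> None:
    """Hypothetical wrapper: refuse to check out when the tree is dirty."""
    changes = uncommitted_changes(repo.local_dir)  # local_dir: assumed attribute
    if changes:
        raise RuntimeError(f"Uncommitted changes would be lost: {changes}")
    repo.git_checkout(branch, create_branch_ok=True)
```

Raising early makes the failure explicit and lets the caller decide whether to commit or discard the pending changes first.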

@lvwerra
Member

lvwerra commented Aug 16, 2021

LGTM, thanks for improving the API!

Contributor

@osanseviero osanseviero left a comment

Overall seems good, but the tests are failing at the moment.

I was also wondering whether we should add some documentation in https://github.com/huggingface/huggingface_hub/tree/main/src/huggingface_hub#managing-a-repository-with-repository, but this could be a follow-up as well.

@NielsRogge
Contributor

LGTM, thank you for adding this!

@LysandreJik LysandreJik merged commit b4be074 into main Aug 17, 2021
@LysandreJik LysandreJik deleted the push-branching branch August 17, 2021 16:07
Member

@julien-c julien-c left a comment

nice implementation, and I like the thoroughness of the tests
