Push to hub save #15327

sgugger · 2022-01-25T15:02:35Z

What does this PR do?

This PR changes the documentation of the push_to_hub training argument to be in sync with the actual behavior, and also makes sure that when push_to_hub=True, a push is done every time the model is saved.

Fixes #15313

HuggingFaceDocBuilder · 2022-01-25T15:04:47Z

The documentation is not available anymore as the PR was closed or merged.

LysandreJik

This works for me! Something to keep in mind here (might be a bit complex), is that git is aware of what commits have been pushed to the remote only when it has done pushing that commit.

It means the following:

repo.git_commit(...)
repo.git_push(blocking=False)
...
repo.git_commit(...)
repo.git_push(blocking=False) <--- if the previous push was not finished, it will push the two commits instead of the last one, resulting in a 2x bigger (and slower) push

This means that if the pushes are very frequent, or with very big models, then it wouldn't be surprising to have each git_push push the previous commit(s) alongside it, resulting in each following invocation pushing bigger and bigger pushes.

One way to prevent that would be to ensure that all previous pushes have been finished before starting a new one. I'm not sure there is a perfect solution to this.

sgugger · 2022-01-25T16:50:12Z

I think you're missing the fact that, apart from a push asked explicitly with trainer.push_to_hub() (or now trainer.save_model()) all other pushes are only done if the previous one is completely finished (see here). So, there are no frequent pushes (except if the user asks for it by putting trainer.save_model() in a loop but I don't think that will happen).

LysandreJik · 2022-01-25T21:40:06Z

Then that's perfect :)

osanseviero · 2022-01-26T08:18:24Z

src/transformers/training_args.py

-            `output_dir` exists, it needs to be a local clone of the repository to which the [`Trainer`] will be
+            Whether or not to push the model to the Hub every time the model is saved. If this is activated,
+            `output_dir` will begin a git directory synced with the the repo (determined by `hub_model_id`) and the
+            content will be pushed each time a save is triggered (depneding on your `save_strategy`). Calling


Nit

Suggested change

content will be pushed each time a save is triggered (depneding on your `save_strategy`). Calling

content will be pushed each time a save is triggered (depending on your `save_strategy`). Calling

osanseviero · 2022-01-26T08:18:40Z

src/transformers/training_args.py

+            Whether or not to push the model to the Hub every time the model is saved. If this is activated,
+            `output_dir` will begin a git directory synced with the the repo (determined by `hub_model_id`) and the
+            content will be pushed each time a save is triggered (depneding on your `save_strategy`). Calling
+            [`~Trainer.save_model`] will also trigger a push


Nit

Suggested change

[`~Trainer.save_model`] will also trigger a push

[`~Trainer.save_model`] will also trigger a push.

osanseviero · 2022-01-26T08:39:59Z

src/transformers/trainer.py

@@ -2051,6 +2051,10 @@ def save_model(self, output_dir: Optional[str] = None):
        elif self.args.should_save:
            self._save(output_dir)

+        # Push to the Hub when `save_model` is called by the user.
+        if self.args.push_to_hub and not _internal_call:


I don't fully understand this. Don't we also want to push to the Hub automatically during training (which is an internal call)?

push_to_hub calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model which calls push_to_hub which calls save_model ...

This internal argument is here to avoid that ;-)

Thanks for explaining and for the PR!

Unfortunately I'm still getting the behaviour in which the model is getting saved during training but it's not getting pushed. https://colab.research.google.com/drive/1GAXf3egH2GDbk7M0btdKWbLerBqLoJPi?usp=sharing. I have a commit saving at step 500 but no push

Hi Omar! I just checked your colab, and ran it. You have to wait for a solid 15 minutes after the fact to see the weights on your repo as Colab is uploading at an excruciatingly slow speed (those pushes during training are asynchronous to avoid slowing down training).

trainer.push_in_progress give you the job that is pushing, you can check its stdout attribute to see the progress it makes, its is_done attribute to see if it's finished or not and its stderr attribute to check if there was an error or not.

You can see on this repo that I eventually got my weights pushed with your code :-)

Aha! Thanks @sgugger for the investigation. I was not aware about trainer.push_in_progress, this is something I'll definitively use next time. Thanks once again.

osanseviero

Thanks @sgugger for the PR and clarifications on the pushing behaviour 🚀 🥳

sgugger · 2022-01-27T14:01:58Z

Ugh, sorry @osanseviero I was convinced I had accepted your suggestions before merging (must have skipped one click), will make a new commit with you as Co-author...

sgugger requested review from osanseviero, patrickvonplaten and LysandreJik January 25, 2022 15:02

sgugger added 2 commits January 25, 2022 10:03

Adapt doc and push at every save

754fc57

style

13c0bb6

LysandreJik approved these changes Jan 25, 2022

View reviewed changes

osanseviero reviewed Jan 26, 2022

View reviewed changes

osanseviero approved these changes Jan 27, 2022

View reviewed changes

sgugger merged commit da5ef25 into master Jan 27, 2022

sgugger deleted the push_to_hub_save branch January 27, 2022 14:00

sgugger mentioned this pull request Jan 27, 2022

Implement fixes for TrainingArguments doc #15370

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Push to hub save #15327

Push to hub save #15327

sgugger commented Jan 25, 2022

HuggingFaceDocBuilder commented Jan 25, 2022 •

edited

Loading

LysandreJik left a comment •

edited

Loading

sgugger commented Jan 25, 2022

LysandreJik commented Jan 25, 2022

osanseviero Jan 26, 2022

osanseviero Jan 26, 2022

osanseviero Jan 26, 2022

sgugger Jan 26, 2022

osanseviero Jan 27, 2022

sgugger Jan 27, 2022 •

edited

Loading

osanseviero Jan 27, 2022

osanseviero left a comment

sgugger commented Jan 27, 2022

	content will be pushed each time a save is triggered (depneding on your `save_strategy`). Calling
	content will be pushed each time a save is triggered (depending on your `save_strategy`). Calling

	[`~Trainer.save_model`] will also trigger a push
	[`~Trainer.save_model`] will also trigger a push.

Push to hub save #15327

Push to hub save #15327

Conversation

sgugger commented Jan 25, 2022

What does this PR do?

HuggingFaceDocBuilder commented Jan 25, 2022 • edited Loading

LysandreJik left a comment • edited Loading

Choose a reason for hiding this comment

sgugger commented Jan 25, 2022

LysandreJik commented Jan 25, 2022

osanseviero Jan 26, 2022

Choose a reason for hiding this comment

osanseviero Jan 26, 2022

Choose a reason for hiding this comment

osanseviero Jan 26, 2022

Choose a reason for hiding this comment

sgugger Jan 26, 2022

Choose a reason for hiding this comment

osanseviero Jan 27, 2022

Choose a reason for hiding this comment

sgugger Jan 27, 2022 • edited Loading

Choose a reason for hiding this comment

osanseviero Jan 27, 2022

Choose a reason for hiding this comment

osanseviero left a comment

Choose a reason for hiding this comment

sgugger commented Jan 27, 2022

HuggingFaceDocBuilder commented Jan 25, 2022 •

edited

Loading

LysandreJik left a comment •

edited

Loading

sgugger Jan 27, 2022 •

edited

Loading