Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When on sagemaker use their env variables for saves #9876

Merged
merged 3 commits into from
Jan 29, 2021
Merged

Conversation

sgugger
Copy link
Collaborator

@sgugger sgugger commented Jan 28, 2021

What does this PR do?

When on SageMaker, the content of the env variable "SM_OUTPUT_DATA_DIR" should be used to save training artifacts (such as our checkpoint) so make it overwrite the output_dir (and make that argument optional so it doesn't need to be passed for sagemaker training).
Then the final model will be easy to deploy if it's also saved to the content of the env variable "SM_MODEL_DIR" so adding that as well.

@sgugger sgugger requested a review from LysandreJik January 28, 2021 21:11
Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, LGTM!

Comment on lines +1369 to +1370
# If on sagemaker and we are saving the main model (not a checkpoint so output_dir=None), save a copy to
# SM_MODEL_DIR for easy deployment.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comment

"`output_dir` is only optional if it can get inferred from the environment. Please set a value for "
"`output_dir`."
)
elif os.getenv("SM_OUTPUT_DATA_DIR") is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check is self.output_dir is not None before logging a warning that it's being overwritten?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good point!

@sgugger sgugger merged commit 7eadfe1 into master Jan 29, 2021
@sgugger sgugger deleted the sm_output_dir branch January 29, 2021 14:52
Qbiwan pushed a commit to Qbiwan/transformers that referenced this pull request Jan 31, 2021
* When on sagemaker use their env variables for saves

* Address review comments

* Quality
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants