-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix when TPU device check is ran #469
Conversation
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can maybe add an argument to the is_tpu_available
function to no check for a tpu device sometimes, but we can't remove the device test entirely as we added it for a reason.
try: | ||
# Will raise a RuntimeError if no XLA configuration is found | ||
_ = xm.xla_device() | ||
_tpu_available = True | ||
except RuntimeError: | ||
_tpu_available = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, removing this will break the other places we use is_tpu_available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added reason was specifically inside AcceleratorState
, at the proposed location. But having it as an argument instead works as well. Will refactor
src/accelerate/utils/imports.py
Outdated
@@ -56,8 +51,15 @@ def is_apex_available(): | |||
return importlib.util.find_spec("apex") is not None | |||
|
|||
|
|||
def is_tpu_available(): | |||
"Checks if `torch_xla` is installed and if a TPU is in the environment" | |||
def is_tpu_available(check_device=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be True
by default, and False
when we don't want to check for the device (before launching multiprocessing for instance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be False
by default, because otherwise it also does this on import checks that are scattered around the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or made into a separate function if we don't want False
behavior. (I know we're not fans of that, but this is one case where it should be False
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating!
This PR fixes an issue where
xm.xla_device()
can't be called outside ofxm.spawn
. As a result the current behavior foris_tpu_available
breaks the notebook launcher.The proposed fix is to check this in state directly outside the if chain so that checking if on a TPU device can be checked properly still.