Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable logging in with organization token #780

Merged
merged 11 commits into from
Mar 31, 2022
Merged

Conversation

merveenoyan
Copy link
Contributor

@merveenoyan merveenoyan commented Mar 18, 2022

Hello 👋 This PR solves this issue #658. I looked into code to see where the user might login and where tokens are checked. I've seen three cases:

  • When the token in the folder is retrieved (when user doesn't give it)
  • When user logs in through CLI or notebook (_login)
  • When user passes the token directly (which are checked inside repository related functions that are used in places like mixins) like so:
        if token is None:

            token = self._validate_or_retrieve_token()

       elif isinstance(token, str):

          if token.startswith("api_org"):
              raise ValueError("You must use your personal account token for login.")
          elif not self._is_valid_token(token):
              if self._is_valid_token(name):
                  warnings.warn(
                      "`create_repo` now takes `token` as an optional positional argument. "
                     ....

We also check inside validate_or_retrieve token in case the token pulled is organization token.
So I added checks there and wrote tests for those three cases too.
cc: @osanseviero

Copy link
Contributor

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @merveenoyan .

Now that you're touching these bits, I realize there's quite a few repetitions. Could you maybe refactor those bits by creating a single _get_token utility function which us used throughout the codebase instead of having the same logic multiple times?

I'm not attached to the _get_token function name though.

@merveenoyan
Copy link
Contributor Author

merveenoyan commented Mar 18, 2022

@adrinjalali that's what I thought tbh when I saw them 😅 sure!

@merveenoyan
Copy link
Contributor Author

I refactored & tested but will check if there's nits to do on Monday 🙂

token, name = name, token
if isinstance(token, str):
if token.startswith("api_org"):
raise ValueError("You must use your personal account token for login.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not just for login, since users might be passing the token as a param

Comment on lines 527 to 530
if name is not None:
return token, name
else:
return token
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can always return token, name. It would then return None for the second case for name which is fine and will make usage easier

Comment on lines 1085 to 1090
if token is None:
token = self._validate_or_retrieve_token()
elif not self._is_valid_token(token):
if self._is_valid_token(name):
warnings.warn(
"`create_repo` now takes `token` as an optional positional argument. "
"Be sure to adapt your code!",
FutureWarning,
)
token, name = name, token
else:
raise ValueError("Invalid token passed!")
token = self._validate_or_retrieve_token(function_name="create_repo")
elif isinstance(token, str):
token, name = self._validate_or_retrieve_token(
token, name, function_name="create_repo"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified even further by just doing this:

token, name = self._validate_or_retrieve_token(
                token, name, function_name="create_repo"
            )

_validate_or_retrieve_token already handles case in which token is None or string, so no need to do the check here. Same applies for other functions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move _validate_or_retrieve_token outside this class, move it to a utils module and use it also in the other file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali I recently pushed a new change but didn't do this yet, there's a repetition in user.py if we don't do this, I agree. Will push that soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali it doesn't make any sense to put _validate_or_retrieve_token under endpoint helpers, the function depends heavily on other classes and functions in hf_api.py, so instead I called it in user.py like hf_api._validate_or_retrieve_token.

)
else:
raise ValueError("Invalid token passed!")
token = self._validate_or_retrieve_token(function_name="upload_file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be updated since the logic below is in validate_or_retrieve_token?

@merveenoyan
Copy link
Contributor Author

I don't see anything else to refactor. I just have one question.

What is the reason behind this in upload_file? (below is initial version)

elif not self._is_valid_token(token):
 if self._is_valid_token(path_or_fileobj):

Do people pass tokens in paths or files? I refactored assuming that. (it didn't make much sense to me though) @osanseviero

@merveenoyan merveenoyan requested a review from adrinjalali March 25, 2022 11:46
@adrinjalali adrinjalali added this to the v0.4 milestone Mar 31, 2022
@adrinjalali
Copy link
Contributor

What is the reason behind this in upload_file? (below is initial version)

I believe this is there only for backward compatibility. At some point the order of args were changed, and therefore this makes sure the older order is still accepted.

Copy link
Contributor

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

src/huggingface_hub/hf_api.py Show resolved Hide resolved
src/huggingface_hub/hf_api.py Show resolved Hide resolved
src/huggingface_hub/hf_api.py Show resolved Hide resolved
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@merveenoyan merveenoyan requested a review from adrinjalali March 31, 2022 11:01
name (``str``, `optional`):
Name of the repository.
function_name (``str``, `optional`):
If called from a function, name of that function for deprecation warning.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Returns and Raises sections.

It would also be nice to explain why name and function_name are required and that they'd be removed in a future version (in v0.7 maybe?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ended up being quite a bit more than the original issue (it's probably my fault 🙈 ). But I really like the improvement here. Thanks @merveenoyan

@merveenoyan merveenoyan merged commit cc28c37 into main Mar 31, 2022
@merveenoyan merveenoyan deleted the disable_org_token branch March 31, 2022 12:07
@merveenoyan
Copy link
Contributor Author

@adrinjalali what would you rather have, I really wonder 🙂

@adrinjalali
Copy link
Contributor

One could argue that the refactoring could or should have been a separate PR. But I'm glad this is done now.

@osanseviero
Copy link
Contributor

Note that this PR reverts docstrings changes needed for the new documentation (e.g. using *optional*). You can see it in the diff

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants