Basic http auth + unicode = error #3662
Comments
Yeah, that looks like a bug. I think in this case the best fix is to allow the user to provide bytestrings for the username and password, and if they do that to simply use the bytestring directly rather than to try to encode. Are you interested in providing a test and patch for this?
@Lukasa Gladly, but I'm a bit overburdened at the moment. I'll have spare time in 2-3 weeks; if it's still open, I'll take a peek.
Ok cool, I'll mark this as contributor friendly, and if no one else has picked it up by the time you're free, you should take a swing at it.
Hello, @Lukasa! Your idea about bytestrings looks very good and fits the gaps in the spec. But there are two ways to implement your idea:
And last, I think 95% of people will write code like this:
    u = 'Дмитрий'  # my name in Russian
    p = 'password'
    r = requests.get(url, auth=(u.encode('utf-8'), p))
To my mind, this doesn't look 'for humans'.
    r = requests.get(url, auth=(u.encode('utf-8').decode('latin1'), p))
But we could change just one line of code:
    - b64encode(('%s:%s' % (username, password)).encode('latin1')).strip()
    + b64encode(('%s:%s' % (username, password)).encode('utf-8')).strip()
After that, the same code would look like this:
    r = requests.get(url, auth=(u, p))
That looks 'for Humans' :) What do you think about all this? Sorry for my grammar.
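For reference, the `encode('utf-8').decode('latin1')` form mentioned above works because every byte value maps to a latin-1 code point, so a later latin-1 encode returns the original UTF-8 bytes unchanged. A minimal sketch, assuming the library's only transformation is the latin-1 encode shown in the diff (variable names are illustrative):

```python
u = "Дмитрий"

# The bytes the caller actually wants the server to receive.
utf8_bytes = u.encode("utf-8")

# Reinterpret those bytes as latin-1 text; this never fails because
# latin-1 assigns a code point to every byte value 0-255.
smuggled = utf8_bytes.decode("latin1")

# The latin-1 encode step from the diff then restores the UTF-8 bytes.
on_wire = smuggled.encode("latin1")
```

The intermediate string is mojibake, which is why this reads poorly in user code even though the bytes sent are correct.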
@klimenko It does look better that way, but it's unfortunately just moving the problem. Now anyone whose server is expecting a non-UTF-8 encoded username is going to get tripped up, and so we'll have to re-open this issue when someone says "my server wanted Latin1 and now doesn't get it". It's better to use bytestrings because that way we avoid making a guess that is wrong. If the users still want the helpful automatic choice, they can pass a unicode string, but if they want to do something more specific we have an escape hatch for them.
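A rough sketch of the bytestring escape hatch described above, not the actual patch: `basic_auth_header` is a hypothetical helper, and the assumption is that text keeps the existing latin-1 default while bytes pass through untouched.

```python
from base64 import b64decode, b64encode


def basic_auth_header(username, password):
    """Build a Basic auth header value (hypothetical helper, not the requests API)."""
    # Text gets the existing latin-1 default; bytes are assumed to be
    # already encoded by the caller and are used as-is.
    if isinstance(username, str):
        username = username.encode("latin1")
    if isinstance(password, str):
        password = password.encode("latin1")
    token = b64encode(b":".join((username, password))).strip()
    return "Basic " + token.decode("ascii")


# The caller chooses the encoding explicitly by passing bytes.
header = basic_auth_header("Дмитрий".encode("utf-8"), "password")
decoded = b64decode(header[len("Basic "):])
```

With this shape, no guess is made for callers who pass bytes, and callers who pass text that latin-1 cannot represent still get a `UnicodeEncodeError`.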
Hi guys, I would like to take a crack at this.
@nateprewitt I will keep an eye on it, thanks.
Resolved by #3673.
Thanks, I got past it!
Description
It is not possible to use HTTP basic authentication with a username or password that contains Unicode data.
What happens
A UnicodeEncodeError is thrown. Traceback:
Expected behavior
The authentication is encoded as UTF-8 (at least if charset=utf-8 is provided in the header).
How to reproduce
Consider the following request:
I think that the culprit is this line (https://github.com/kennethreitz/requests/blob/master/requests/auth.py#L32), which assumes latin-1 encoding regardless of the charset header:
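A minimal sketch of the failure, assuming the encoding step matches the `latin1` line quoted in the diff in the comments. The username and password values are illustrative, and no request is sent:

```python
from base64 import b64encode

username, password = "Дмитрий", "password"
credentials = "%s:%s" % (username, password)

try:
    # Same encoding step as the line quoted in the diff.
    b64encode(credentials.encode("latin1")).strip()
    error = None
except UnicodeEncodeError as exc:
    # Cyrillic letters have no latin-1 code points.
    error = exc

# Encoding the same text as UTF-8 succeeds.
token = b64encode(credentials.encode("utf-8")).strip()
```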
Workaround
This seems to work:
Version info