Commit

HTTP basic auth: encode username and password as UTF-8 (mwclient#315)
As discussed upstream in
psf/requests#4564, HTTP basic auth
usernames and passwords sent to requests as Python text strings
are encoded as latin1. This of course makes it impossible to
log in with a username or password containing characters not
represented in latin1, as the reporter of mwclient#315 found out.

To work around this rather old-fashioned default, let's intercept
string usernames and passwords and encode them as utf-8 before
sending them to requests.

Anyone dealing with a really old server that can't handle utf-8,
or something like that, can encode the username and password
appropriately and provide them as bytestrings.

Signed-off-by: Adam Williamson <[email protected]>
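
For readers following along, the encoding problem described in the message can be reproduced with the standard library alone. This is a minimal sketch, not code from the commit; the variable names are illustrative and the sample username is borrowed from the test added below.

```python
# Sample username taken from the test case in this commit.
username = "我"

# latin-1 only covers code points U+0000..U+00FF, so encoding this
# string the way the commit message says requests does will fail.
try:
    username.encode("latin1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 can represent this character, which is what the workaround relies on.
utf8_bytes = username.encode("utf-8")

print(latin1_ok)
print(utf8_bytes)
```

The latin-1 attempt fails while the UTF-8 encoding yields a three-byte sequence, which is why the change encodes text credentials before handing them to requests.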
AdamWill committed Jan 27, 2024
1 parent 2cb2e32 commit 8e7422a
Showing 2 changed files with 20 additions and 2 deletions.
13 changes: 11 additions & 2 deletions mwclient/client.py
@@ -59,10 +59,13 @@ class Site:
         do_init (bool): Whether to automatically initialize the :py:class:`Site` on
             initialization. When set to `False`, the :py:class:`Site` must be initialized
             manually using the :py:meth:`.site_init` method. Defaults to `True`.
-        httpauth (Union[tuple[str, str], requests.auth.AuthBase]): An
+        httpauth (Union[tuple[basestring, basestring], requests.auth.AuthBase]): An
             authentication method to be used when making API requests. This can be either
             an authentication object as provided by the :py:mod:`requests` library, or a
-            tuple in the form `{username, password}`.
+            tuple in the form `{username, password}`. Usernames and passwords provided as
+            text strings are encoded as UTF-8. If dealing with a server that cannot
+            handle UTF-8, please provide the username and password already encoded with
+            the appropriate encoding.
         reqs (Dict[str, Any]): Additional arguments to be passed to the
             :py:meth:`requests.Session.request` method when performing API calls. If the
             `timeout` key is empty, a default timeout of 30 seconds is added.
@@ -109,6 +112,12 @@ def __init__(self, host, path='/w/', ext='.php', pool=None, retry_timeout=30,
         if consumer_token is not None:
             auth = OAuth1(consumer_token, consumer_secret, access_token, access_secret)
         elif isinstance(httpauth, (list, tuple)):
+            # workaround weird requests default to encode as latin-1
+            # https://github.com/mwclient/mwclient/issues/315
+            # https://github.com/psf/requests/issues/4564
+            httpauth = [
+                it.encode("utf-8") if isinstance(it, str) else it for it in httpauth
+            ]
             auth = HTTPBasicAuth(*httpauth)
         elif httpauth is None or isinstance(httpauth, (AuthBase,)):
             auth = httpauth
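
The effect of the new list comprehension can be seen in isolation. The sketch below wraps the same expression in a hypothetical helper, `encode_credentials` (not part of mwclient), and assumes a sample password chosen only because it encodes differently under latin-1 and UTF-8.

```python
def encode_credentials(httpauth):
    """Hypothetical helper mirroring the comprehension added in this hunk."""
    return [it.encode("utf-8") if isinstance(it, str) else it for it in httpauth]


# Text strings are encoded as UTF-8.
modern = encode_credentials(("user", "pässword"))

# Bytestrings pass through untouched, so a caller targeting a legacy
# server can pre-encode with whatever charset that server expects.
legacy = encode_credentials(("user".encode("latin1"), "pässword".encode("latin1")))

print(modern)
print(legacy)
```

Leaving bytestrings alone is what keeps the escape hatch from the commit message available: callers who need a different encoding do the encoding themselves and pass bytes.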
9 changes: 9 additions & 0 deletions test/test_client.py
@@ -180,6 +180,15 @@ def test_httpauth_defaults_to_basic_auth(self):

         assert isinstance(site.connection.auth, requests.auth.HTTPBasicAuth)

+    @responses.activate
+    def test_basic_auth_non_latin(self):
+
+        self.httpShouldReturn(self.metaResponseAsJson())
+
+        site = mwclient.Site('test.wikipedia.org', httpauth=('我', '非常秘密'))
+
+        assert isinstance(site.connection.auth, requests.auth.HTTPBasicAuth)
+
     @responses.activate
     def test_httpauth_raise_error_on_invalid_type(self):

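
The new test only checks the type of the auth object. As a possible complement, the sketch below shows what a UTF-8 Basic credential looks like on the wire, using only the standard library. `basic_auth_header` is a hypothetical helper following the Basic scheme (base64 of `username:password`); it is not requests' internal code and not part of this commit.

```python
import base64


def basic_auth_header(username: bytes, password: bytes) -> str:
    """Hypothetical helper: build a Basic Authorization header value."""
    # Basic scheme: base64 of the byte string "username:password".
    token = base64.b64encode(b":".join((username, password))).decode("ascii")
    return "Basic " + token


# Credentials from the test above, encoded as UTF-8 as the commit does.
header = basic_auth_header("我".encode("utf-8"), "非常秘密".encode("utf-8"))

# A server that decodes the payload as UTF-8 recovers the original text.
decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")

print(header)
print(decoded)
```

A stricter test along these lines would assert on the decoded header rather than only on the auth type, assuming the test harness exposes the outgoing request.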