Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert to using a copy of splituser from Python 3.8. #1670

Merged
merged 3 commits into from
Feb 3, 2019

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Feb 3, 2019

Using urllib.parse.urlparse is clumsy and causes problems as reported in #1663 and #1668. Alternative to #1499 and fixes #1668.

Summary of changes

In #1499, we learned that Python deprecated splituser here and that the recommended change was to use urllib.parse.urlparse. However, as we saw in the original patch and later in regressions, in both places where splituser was previously used, the use of urllib.parse.urlparse doesn't satisfy the needs that splituser did.

So this PR steals the latest implementation of splituser from the Python 3.8 source and uses it instead, largely reverting the changes from #1501 and #1666.

I note that this change restores a flaw in the logic for open_with_auth: if scheme is not http/s, address (formerly host) is never defined, but if auth is found in PyPIConfig, a NameError could occur on line 1072 because address isn't defined. Since this flaw existed in the old code and didn't cause any issues, I'm not interested in fixing it.

Closes #1668

Pull Request Checklist

  • Changes have tests
  • News fragment added in changelog.d. See documentation for details

@jaraco
Copy link
Member Author

jaraco commented Feb 3, 2019

As a sanity check, I ran a diff of package_index.py against v40.6.3. When I did, I realized there was yet another place where the port handling was likely broken in #1501 and in be840c2 I've reverted that change also. Now the diff against v40.6.3 looks like this:

diff --git a/setuptools/package_index.py b/setuptools/package_index.py
index 1608b91a..705a47cf 100644
--- a/setuptools/package_index.py
+++ b/setuptools/package_index.py
@@ -856,7 +856,7 @@ class PackageIndex(Environment):
             scheme, netloc, path, p, q, f = urllib.parse.urlparse(url)
             if not netloc and path.startswith('//') and '/' in path[2:]:
                 netloc, path = path[2:].split('/', 1)
-                auth, host = urllib.parse.splituser(netloc)
+                auth, host = _splituser(netloc)
                 if auth:
                     if ':' in auth:
                         user, pw = auth.split(':', 1)
@@ -1047,7 +1047,8 @@ class PyPIConfig(configparser.RawConfigParser):
 def open_with_auth(url, opener=urllib.request.urlopen):
     """Open a urllib2 request, handling HTTP authentication"""
 
-    scheme, netloc, path, params, query, frag = urllib.parse.urlparse(url)
+    parsed = urllib.parse.urlparse(url)
+    scheme, netloc, path, params, query, frag = parsed
 
     # Double scheme does not raise on Mac OS X as revealed by a
     # failing test. We would expect "nonnumeric port". Refs #20.
@@ -1055,7 +1056,7 @@ def open_with_auth(url, opener=urllib.request.urlopen):
         raise http_client.InvalidURL("nonnumeric port: ''")
 
     if scheme in ('http', 'https'):
-        auth, host = urllib.parse.splituser(netloc)
+        auth, address = _splituser(netloc)
     else:
         auth = None
 
@@ -1068,7 +1069,7 @@ def open_with_auth(url, opener=urllib.request.urlopen):
 
     if auth:
         auth = "Basic " + _encode_auth(auth)
-        parts = scheme, host, path, params, query, frag
+        parts = scheme, address, path, params, query, frag
         new_url = urllib.parse.urlunparse(parts)
         request = urllib.request.Request(new_url)
         request.add_header("Authorization", auth)
@@ -1082,13 +1083,20 @@ def open_with_auth(url, opener=urllib.request.urlopen):
         # Put authentication info back into request URL if same host,
         # so that links found on the page will work
         s2, h2, path2, param2, query2, frag2 = urllib.parse.urlparse(fp.url)
-        if s2 == scheme and h2 == host:
+        if s2 == scheme and h2 == address:
             parts = s2, netloc, path2, param2, query2, frag2
             fp.url = urllib.parse.urlunparse(parts)
 
     return fp
 
 
+# copy of urllib.parse._splituser from Python 3.8
+def _splituser(host):
+    """splituser('user[:passwd]@host[:port]') --> 'user[:passwd]', 'host[:port]'."""
+    user, delim, host = host.rpartition('@')
+    return (user if delim else None), host
+
+
 # adding a timeout to avoid freezing package_index
 open_with_auth = socket_timeout(_SOCKET_TIMEOUT)(open_with_auth)
 

Basically the only change is to adopt _splituser from Python to avoid the deprecation warning, and to rename host to address in open_with_auth to avoid the temptation to use parsed.hostname in its place.

@jaraco jaraco merged commit f03ef20 into 40.7-maintenance Feb 3, 2019
@jaraco jaraco deleted the bugfix/1668-adopt-splituser branch February 3, 2019 15:25
@cjerdonek
Copy link
Member

FYI, pip also has an implementation of this here (with unit tests elsewhere in the code base): https://github.com/pypa/pip/blob/f048eb7a76eab222a4dfcfc3fd01f1c9e992f49a/src/pip/_internal/utils/misc.py#L941
distlib does, too: https://bitbucket.org/pypa/distlib/issues/116/parse_credentials-should-percent-decode
You might want to check to see if you need to unquote at any point, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants