Revert "Restrict URL preparation to HTTP/HTTPS" #3738

Lukasa · 2016-11-30T21:22:10Z

This reverts commit 34af72c.

This commit was added by @tiran to try to avoid the IDNA-encoding logic that we added in v2.12.0. I believe that the other workarounds we merged for that are sufficient, and this change broke docker-py and probably broke others.

@tiran, can you confirm that your code continues to function with this change reverted? I'd like to be able to merge this and ship a v2.12.3 if at all possible.

Resolves #3735.

This reverts commit 34af72c.

nateprewitt · 2016-11-30T21:24:39Z

@Lukasa, I think it would be beneficial readding the test after this reversion. It still provides a useful check going forward and will pass regardless of the tightened scheme check.

Lukasa · 2016-11-30T21:33:38Z

Heh, I just noticed that that test never actually tested the relevant function. WHAT IDIOT REVIEWED THIS CODE THE LAST TIME? 😉

I'll write some new tests that actually codify our behaviour here so that we avoid regressing.

gvilarino · 2016-11-30T21:37:33Z

I lost 6 hours of my day troubleshooting this bug 😨 🔫

ghost · 2016-12-01T06:00:01Z

@Lukasa sorry for being a hound but do you have an eta on this 2.12.3 release? We just lost a day of dev / test trying to figure out what broke docker-py until we found this. Thx

MS

marcosnils · 2016-12-01T06:04:02Z

@mshahpalerra just pin your requests dep to an older 2.11.x version and you'll be ok. @Lukasa please release a new version of requests ASAP or this will be another left-pad.

ghost · 2016-12-01T06:06:58Z

@marcosnils - already being done but that has some other complications as we are changing this pinning on code that is pre prod so has to be retested (not your problem i know :) ) - thanks for the tip

Lukasa · 2016-12-01T08:02:34Z

@marcosnils Well that's a ludicrously hyperbolic thing to say.

The left-pad incident broke all users of the package. This breaks docker-py, and docker-py only. The vast majority of our users have not noticed a problem with v2.12.2. There are several bug reports both here and on docker-py, there is a good temporary workaround, so if it's all the same to you I'm going to take long enough over this to get it right.

It is much more important to me that we don't keep rushing out v2.12.x releases that require further panicked fixes than it is that we make it possible for you to avoid placing pins in your dependency tree. In this case, the patch that we're reverting was added for a reason, and I'd like to ensure that that reason is still covered rather than re-break someone else.

Lukasa · 2016-12-01T08:04:03Z

@mshahpalerra As to the ETA of v2.12.3, it will be as soon as I can get confirmation from @tiran that the problem he was originally trying to fix is still fixed in this branch. If it isn't, then it will be as soon as I can work with him to build a new patch that avoids this specific issue.

tiran · 2016-12-01T10:44:47Z

ACK

Custodia's test suite for my http+unix adapter is passing with @Lukasa 's branch.

Lukasa · 2016-12-01T10:45:23Z

Ok, let's ship a 2.12.3 then. Thanks @tiran!

lslebodn · 2016-12-22T14:03:52Z

Custodia's test suite for my http+unix adapter is passing with @Lukasa 's branch.

I it really amazing that http+unix t will work with python-requests-3.0. Unfortunately, python-requests-3.0 has not been released yet. But 2.12.3 is released and already in some distributions: fedora-26 and debian unstable and might get to debian testing in few days.

2.12.0 introduced IDNA-encoding logic which broke http+unix:// 2.12.3 with this patch reverted the fix due to #3735 but did not provide any alternative solution. And therefore http+unix:// is broken again.

Is there a reason why cannot be fixed in 2.12.* as well?

nateprewitt · 2016-12-22T15:05:36Z

@Islebodn could you provide an example http+unix:// URI/filepath that you're having issues with in 2.12.3? This should be fixed even without #3713 but perhaps we missed something.

Lukasa · 2016-12-22T15:09:58Z

Correct, this should be fixed in 2.12.4.

lslebodn · 2016-12-22T17:43:46Z

@Islebodn could you provide an example http+unix:// URI/filepath that you're having issues with in 2.12.3? This should be fixed even without #3713 but perhaps we missed something.
Correct, this should be fixed in 2.12.4.
Thank you very much for fast response. i really appreciate it.

I did some tests with more version of pyhton-requests and I found out that 2.12.3 and 2.14.1 does not work well with upper case letters in path. The url is

>>> from requests.compat import quote
>>> sock_path ='/tmp/UPPER/testing.socket'
>>> url = 'http+unix://' + quote(sock_path, safe='')
>>> print url
http+unix://%2Ftmp%2FUPPER%2Ftesting.socket

>>> import requests
>>> r = requests.Request('GET', url=url)
>>> p = r.prepare()
>>> p.url
'http+unix://%2ftmp%2fupper%2ftesting.socket/'

I've just tested and upper case letters works well in such situation with 2.10.0 and 2.11.1

nateprewitt · 2016-12-22T18:34:52Z

Hmm, so you're right. This is an issue in that this path will pass through IDNA without throwing an exception, but IDNA is forcing it to lowercase. This is valid behaviour for a hostname which should be case insensitive but doesn't work for paths (which really shouldn't be IDNA encoded in the first place). This is a bug, but the way unix sockets have worked in the past is kind of incorrect too. It supplies a path in place of the hostname we're expecting.

I don't have an immediate solution for this other than suggesting you store the socket file in a lower-cased path. We'll have to wait for @Lukasa's opinion on this and I'll take a deeper dive if it's deemed needed.

lslebodn · 2016-12-22T18:53:38Z

Hmm, so you're right. This is an issue in that this path will pass through IDNA without throwing an exception, but IDNA is forcing it to lowercase.
This is valid behaviour for a hostname which should be case insensitive

Agree; that's the reason why reverted patch tried to use IDNA only with http:// and https://

but doesn't work for paths (which really shouldn't be IDNA encoded in the first place). This is a bug, but the way unix sockets have worked in the past is kind of incorrect too. It supplies a path in place of the hostname we're expecting.

I am not sure whether it's incorrect. it is an URI with different protocol then http. The similar use case is with different protocols: ldap, ldaps which expect hostname/IP and ldapi which expect url encoded path to unix socket e.g. http://www.openldap.org/lists/openldap-devel/200201/msg00231.html
http://www.openldap.org/doc/admin24/runningslapd.html

nateprewitt · 2016-12-22T19:12:43Z

So, I did some more investigating and this actually is completely unrelated to IDNA.

urllib3/urllib3#911 introduced a forced lowercasing of all host names in urllib3 which was released in urllib3 1.17. The version of urllib3 used in Requests prior to 2.12.0 was 1.16 which is why we hadn't hit this before. urllib3's behaviour here is completely correct because host names should be able to be lowercased without affecting the URIs usability.

As I said above, passing a path has worked in the past but is technically not in line with how we're parsing the URI. Again, I think we'll have to wait for @Lukasa to weigh in here.

sigmavirus24 · 2016-12-22T19:23:26Z

urllib3's behaviour here is completely correct because host names should be able to be lowercased without affecting the URIs usability.

Right.

As I said above, passing a path has worked in the past but is technically not in line with how we're parsing the URI. Again, I think we'll have to wait for @Lukasa to weigh in here.

Since there is no specification around UNIX socket URIs, I would think we'd treat that as a path, not a hostname. shrug

nateprewitt · 2016-12-22T19:44:39Z

Since there is no specification around UNIX socket URIs, I would think we'd treat that as a path, not a hostname. shrug

Agreed, that seems like the sensible thing to do if Requests is going to properly support unix sockets. We'd need to expand both how urllib3 parses URIs, and our flow in prepare_url, rather than relying on our current workarounds. That approach would fix a lot of the IDNA issues at their core too, because our primary conflict has been passing non-hostnames (IPs and paths) into the encoder. Maybe something to consider in 3.0.0.

Lukasa · 2016-12-22T20:25:08Z

Hrm. This is not ideal. However, it's further evidence that Requests is really ill suited to handling these use-cases. It makes an enormous number of assumptions about URL structure that are simply not reasonable.

I have no interest in expanding urllib3's notion of a URL to encompass this use-case. I would, however, be willing to add an extension to allow its URL case normalisation to apply only to specific schemes.

nateprewitt · 2016-12-22T21:47:19Z

So I have an issue ready to throw up on urllib3 to catalog this appropriately, but have another question before we start down the scheme specific normalization route.

Storing something in the form /tmp/path/out.socket as the host and not path of the URI seems off. The authority component isn't required in the URI, so wouldn't it be more correct to detect a URI without a host and store the value in the path variable? I realize that we're adhering to the URL subset of URIs, but can't find anything specific on whether the authority requirement differs. Moving to using path would allow us to avoid this dance around values in the host param that aren't actually hosts.

Lukasa · 2016-12-23T09:33:24Z

@nateprewitt I don't think we do store /tmp/path/out.socket as the host. I think we only store it if its urlencoded when we receive it, which seems more reasonable to me. Though admittedly I'm not the URI expert (that'd be @sigmavirus24).

nateprewitt · 2016-12-23T16:45:08Z

@Lukasa, you're right, I was conflating behaviours in my explanation. %-encoding the path is a hack around prepare_url because Requests refuses URLs without a host at the model level, rather than adapter. The host is mandated for http/https schemes but becomes a grey area with the use of "http+".

If we want to support "http+{unix,docker,???}", which seems poorly defined but in use, we'd probably need to relax the hard stop at no host until we get to the adapter. This would allow someone to pass /tmp/path/out.socket and we can treat it correctly as a path, without needing a work around. Then we can avoid modifying correctly functioning code to accommodate the hack. If we don't intend to support "http+" in 3.0.0 then this is probably a moot point.

tiran · 2016-12-23T18:22:21Z

The % encoding of the path to the Unix socket file or abstract namespace is mandatory. How else would you distinguish between the file path and the HTTP path? Some implementations probe ever path segment until a socket file is found and then magically treat the remaining path segments as HTTP path. That's a dangerous hack. The socket path may include non-ASCII chars or a NULL byte for abstract namespace. Am 23. Dezember 2016 17:45:25 MEZ, schrieb Nate Prewitt <[email protected]>:

@Lukasa, you're right, I was conflating behaviours in my explanation. %-encoding the path is a hack around `prepare_url` because Requests refuses URLs without a host at the model level, rather than adapter. The host is mandated for http/https schemes but becomes a grey area with the use of "http+". If we want to support "http+{unix,docker,???}", which seems poorly defined but in use, we'd probably need to relax the hard stop at no host until we get to the adapter. This would allow someone to pass `/tmp/path/out.socket` and we can treat it correctly as a path, without needing a work around. Then we can avoid modifying correctly functioning code to accommodate the hack. If we don't intend to support "http+" in 3.0.0 then this is probably a moot point. -- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/kennethreitz/requests/pull/3738#issuecomment-269016139

Sent from my Android phone with K-9 Mail.

nateprewitt · 2016-12-23T18:27:37Z

@tiran thanks, for clarifying! I'll defer to those of you that are much more well versed in this space than I am :)

I'll open the issue in urllib3 with Lukasa's originally suggested fix.

tiran · 2016-12-23T21:04:40Z

One has to suffer a lot before one knows all pain points. :) Sent from my Android phone with K-9 Mail.

lslebodn · 2017-01-02T14:22:52Z

and a little bit broken python-requests 1.12.4 got to the debian testing
https://packages.debian.org/search?suite=stretch&searchon=names&keywords=python-requests
https://packages.debian.org/stretch/python-requests
and of course python-urllib3 was upgraded as well
https://packages.debian.org/search?suite=stretch&searchon=names&keywords=python-urllib3
https://packages.debian.org/stretch/python-urllib3

Revert "Restrict URL preparation to HTTP/HTTPS"

402a55b

This reverts commit 34af72c.

Tests for our URL handling.

6f659a4

marcosnils mentioned this pull request Dec 1, 2016

docker_container never starts containers ansible/ansible-modules-core#5774

Closed

graingert mentioned this pull request Dec 1, 2016

Requests v2.12.2 does not add parameters to URLs with schemes it does not understand. #3735

Closed

Lukasa merged commit 7d2dfa8 into psf:master Dec 1, 2016

Lukasa deleted the revert-exclusion-patch branch December 1, 2016 10:45

tiran mentioned this pull request Dec 1, 2016

Tests: accept more ssl error message styles latchset/custodia#92

Closed

pmcnett mentioned this pull request Dec 1, 2016

[2016.11.0] dockerng failing to pull images saltstack/salt#38010

Closed

nateprewitt mentioned this pull request Dec 23, 2016

restrict host normalization to http/https urllib3/urllib3#1080

Closed

nateprewitt mentioned this pull request Jan 11, 2017

only normalize http(s) urls urllib3/urllib3#1089

Merged

lslebodn mentioned this pull request Sep 11, 2017

ConnectionPool should not lowercase all hostnames urllib3/urllib3#1267

Closed

pbrezina mentioned this pull request Apr 28, 2020

Revert workaround in CI for bug in python-{request,urllib3} pbrezina/sssd-test#3271

Closed

sssd-bot mentioned this pull request May 2, 2020

Revert workaround in CI for bug in python-{request,urllib3} SSSD/sssd#4309

Closed

github-actions bot locked as resolved and limited conversation to collaborators Sep 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "Restrict URL preparation to HTTP/HTTPS" #3738

Revert "Restrict URL preparation to HTTP/HTTPS" #3738

Lukasa commented Nov 30, 2016

nateprewitt commented Nov 30, 2016

Lukasa commented Nov 30, 2016

gvilarino commented Nov 30, 2016

ghost commented Dec 1, 2016

marcosnils commented Dec 1, 2016 •

edited

Loading

ghost commented Dec 1, 2016

Lukasa commented Dec 1, 2016

Lukasa commented Dec 1, 2016

tiran commented Dec 1, 2016

Lukasa commented Dec 1, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

Lukasa commented Dec 22, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

sigmavirus24 commented Dec 22, 2016

nateprewitt commented Dec 22, 2016 •

edited

Loading

Lukasa commented Dec 22, 2016

nateprewitt commented Dec 22, 2016 •

edited

Loading

Lukasa commented Dec 23, 2016

nateprewitt commented Dec 23, 2016

tiran commented Dec 23, 2016 via email

nateprewitt commented Dec 23, 2016

tiran commented Dec 23, 2016 via email

lslebodn commented Jan 2, 2017 •

edited

Loading

Revert "Restrict URL preparation to HTTP/HTTPS" #3738

Revert "Restrict URL preparation to HTTP/HTTPS" #3738

Conversation

Lukasa commented Nov 30, 2016

nateprewitt commented Nov 30, 2016

Lukasa commented Nov 30, 2016

gvilarino commented Nov 30, 2016

ghost commented Dec 1, 2016

marcosnils commented Dec 1, 2016 • edited Loading

ghost commented Dec 1, 2016

Lukasa commented Dec 1, 2016

Lukasa commented Dec 1, 2016

tiran commented Dec 1, 2016

Lukasa commented Dec 1, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

Lukasa commented Dec 22, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

lslebodn commented Dec 22, 2016

nateprewitt commented Dec 22, 2016

sigmavirus24 commented Dec 22, 2016

nateprewitt commented Dec 22, 2016 • edited Loading

Lukasa commented Dec 22, 2016

nateprewitt commented Dec 22, 2016 • edited Loading

Lukasa commented Dec 23, 2016

nateprewitt commented Dec 23, 2016

tiran commented Dec 23, 2016 via email

nateprewitt commented Dec 23, 2016

tiran commented Dec 23, 2016 via email

lslebodn commented Jan 2, 2017 • edited Loading

marcosnils commented Dec 1, 2016 •

edited

Loading

nateprewitt commented Dec 22, 2016 •

edited

Loading

nateprewitt commented Dec 22, 2016 •

edited

Loading

lslebodn commented Jan 2, 2017 •

edited

Loading