-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(router) ensure transparent proxying of URIs #2519
Conversation
54000c0
to
abbb128
Compare
7754e78
to
cf9156a
Compare
a980e6f
to
3b6c05f
Compare
kong/core/handler.lua
Outdated
|
||
if uri_args then | ||
uri = uri .. "?" .. uri_args | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than the string concatentation here, how do we feel about defining the proxy_pass directive as
proxy_pass $upstream_scheme://kong_upstream$upstream_url$is_args$args;
i dont have strong feels one way or another, just wondering about idiomatic approach and performance (ngx.var access)
@@ -659,16 +660,12 @@ function _M.new(apis) | |||
return nil | |||
end | |||
|
|||
local new_uri | |||
local uri_root = uri == "/" | |||
local uri_root = request_uri == "/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of this logic feels ever-so-slightly hand wavey. some comments here about what's going on would be useful, particularly given the subtlety and complexity of the bug resolved here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these seem just to be variable renames, and that's it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i spose this comment was in reference to the desire to have some more commentary on this module in general. certainly not a merge blocker though :)
whups, wrong button, didnt mean to approve. LGTM overall though
This looks good to me (except that one |
What this will break are the plugins that call |
Well, this is just a revert back to the implementation we used all along 0.1 to 0.9, and that was only changed in 0.10, nothing groundbreaking. It just handles the URI arguments slightly better, which was one of the original concerns, but there was another big one. The reason I changed it in 0.10 was to avoid the Nginx variable, and benefit from a somewhat nicer encapsulation from I am not sure what the benefit of using multiple Nginx variables could be. Performance because of |
The
I'm pretty sure 2. is handled, but whatabout that 1.? Other than that, I agree, we don't need to have |
@bungle Good point, I will add tests for 1. and 2. The later is handled, the former, I am less sure. |
(relabelling given the notes above ^) |
Yep, thanks! Actually implementing those tests. |
3dcb7a6
to
d85843c
Compare
Pushed a new version with a test for 1. Tests for 2. already exist in the form of the request-transformer plugin. |
assert.equal("world", json.args.hello) | ||
end) | ||
|
||
it("does not proxy '$is_args' if URI does not contain arguments", function() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my only comment here is that stripping the ?
char from the query string when there are no args does not align with our desire to be as transparent as possible in proxying the request. we could build our own version of is_args
that examines the state of args
and the downstream query, and act appropriate. or we could not because i think we all just want this to go away ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah I did think about it too, and wanted to explain why not, but I forgot.
In my thinking, as much as we want transparent proxying to happen (and you know how much I pushed for this), I just cannot, for the life of me, think of a use case for why someone would want to send a request with an empty querystring /foo?
. Hence, I did not take care of this and simply judged the Nginx behavior to be acceptable in this case. If we ever receive such a requirement from a use-case, beer is on me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, i did find one case where someone noted that the presence of ?
was semantically significant (granted, not our project or even our lang): psf/requests#2912
In addition, RFC 3986 1.2.3 points out:
The generic syntax uses the slash ("/"), question mark ("?"), and number sign ("#")
characters to delimit components that are significant to the generic parser's hierarchical
interpretation of an identifier.
And from section 6.2.3:
Normalization should not remove delimiters when their associated
component is empty unless licensed to do so by the scheme
specification... the presence or absence of delimiters within a userinfo
subcomponent is usually significant to its interpretation.
If we ever receive such a requirement from a use-case, beer is on me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That 6.2.3 section is quite relevant indeed. Now I kind of want to take care of this use-case actually... Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This originally started because Nginx $uri
is a bad normalization semantics-wise. We changes to $request_uri
only to find out that decoding and encoding doesn't work 1:1. Thats why we will skip that with this patch, but I do think we can do better here. Following what Nginx does here doesn't feel right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bungle under "Normalizations that change semantics"
Removing the "?" when the query is empty. When the query is empty, there may be no need for the "?".
Example:
http://www.example.com/display? → http://www.example.com/display
The point here is that we want to avoid changing semantic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this patch resolves the issue from what I can tell (hastily whipped together): https://gist.github.com/p0pr0ck5/e742a478a8bfcaeabeefddafd54af947
However, the test in question fails, assumedly because of upstream normalization (both httpbin and mockbin strip ?
when no query args are present). tshark proof: https://gist.github.com/p0pr0ck5/d19fb99c88ed5e2b1efdc53707912bb1#file-gistfile1-txt-L375
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I do think we can do better here.
Yep, let's do it then. Updating the patch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@p0pr0ck5 The patch would work, but would also add extra calls to ngx.var
, so at this point, I think we should group the querystring in upstream_uri
as the original implementation of this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New patch is up!
d48fb66
to
7192697
Compare
|
||
local body = assert.res_status(200, res) | ||
local json = cjson.decode(body) | ||
assert.matches("/get%?$", json.vars.request_uri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We test this behavior with our custom template since both httpbin and mockbin strip the querystring delimiter if it is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, wouldnt mind seeing a few comments clarified but not a blocker for merge at this point
kong/core/handler.lua
Outdated
-- `/foo?` is to keep `$is_args` an empty string, hence effectively | ||
-- stripping the empty querystring. | ||
-- We overcome this behavior with our own logic, to prevent user | ||
-- desired semantics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"to prevent user desired semantics" doesnt quite make sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant "preserve"
kong/core/handler.lua
Outdated
-- `uri` is the URI with which to call upstream, as returned by the | ||
-- router, which might have truncated it (`strip_uri`). | ||
-- | ||
-- If set `host_header` is the original header to be preserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the alignment of the comments can be a bit confusing; might me sense to have
-- comment upstream_uri
var.upstream_uri = uri
-- comment about host_headr
var.upstream_host = ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like grouping multiple var
or ctx
assignments together, as is done multiple times in that function
* The router now returns a 4th value: the URI to use for the upstream call. * The `proxy_pass` directive now inlines a new `$upstream_uri` variable which contains a raw string with both the Nginx `$request_uri` and `$args` variables. * Update the router value return test * New test to ensure we proxy the request's querystring * New regression test to ensure we do not double percent-encode request URIs. Fix #2512
7192697
to
14f6297
Compare
Rebuilt the history since the refactor commit accidentally got some diff from the fix one. + clarified comments |
Summary
Since
ngx.req.set_uri
will make Nginx's proxying escape the upstream URI, we do not use this API anymore. As it was the case prior to Kong 0.10, we go back to inlining an Nginx variable containing the desired, plain URI, and thus ensure Nginx will not unconditionally percent-encode before proxying.Full Changelog
call.
proxy_pass
directive now inlines a new$upstream_uri
variablewhich contains a raw string with both the Nginx
$request_uri
and$args
variables.URIs.
Issues resolved
Fix #2512