Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UriCompliance is ignored for query string parsing #12750

Closed
Spikhalskiy opened this issue Jan 31, 2025 · 16 comments · Fixed by #12763
Closed

UriCompliance is ignored for query string parsing #12750

Spikhalskiy opened this issue Jan 31, 2025 · 16 comments · Fixed by #12763
Assignees
Labels
Bug For general bugs on Jetty side

Comments

@Spikhalskiy
Copy link

Spikhalskiy commented Jan 31, 2025

Jetty version(s)
Jetty 10, 11, 12.0.16

Jetty Environment
EE 10

Description

We have an application that was able to process URIs in the following format with Jetty 9:

http://localhost:8008/set?action=ADD&url=http://somedomain.com?us_options=%%US_OPTIONS%%

Jetty 10 and 11 enforced stricter requirements for URIs but exposed UriCompliance, specifically UriCompliance.LEGACY, which should help us deal with the legacy apps / clients.

But the query string's deserialization code path seemingly doesn't respect the UriCompliance.LEGACY or any other custom UriCompliance setting.

So the URL above that was able to be processed by jetty 9, in jetty 12 (and also 10/11) leads to the following exception:

java.lang.IllegalArgumentException: Not valid encoding '%%U'
	at org.eclipse.jetty.util.UrlEncoded.decodeHexByte(UrlEncoded.java:974)
	at org.eclipse.jetty.util.UrlEncoded.decodeUtf8To(UrlEncoded.java:380)
	at org.eclipse.jetty.util.UrlEncoded.decodeTo(UrlEncoded.java:226)
	at org.eclipse.jetty.server.Request.extractQueryParameters(Request.java:569)
	at org.eclipse.jetty.ee10.servlet.ServletApiRequest.extractQueryParameters(ServletApiRequest.java:1071)
	at org.eclipse.jetty.ee10.servlet.ServletApiRequest.getParameters(ServletApiRequest.java:960)
	at org.eclipse.jetty.ee10.servlet.ServletApiRequest.getParameterNames(ServletApiRequest.java:934)
	at jakarta.servlet.ServletRequestWrapper.getParameterNames(ServletRequestWrapper.java:166)
	at jakarta.servlet.ServletRequestWrapper.getParameterNames(ServletRequestWrapper.java:166)

Would it be possible to respect UriCompliance for query strings so we can accept the same URIs on Jetty 12 that were acceptable on Jetty 9?

@Spikhalskiy Spikhalskiy added the Bug For general bugs on Jetty side label Jan 31, 2025
@joakime
Copy link
Contributor

joakime commented Jan 31, 2025

Already fixed in PR #12671

Due out in Jetty 12.0.17

@Spikhalskiy
Copy link
Author

Closed by #12671

@Spikhalskiy
Copy link
Author

Spikhalskiy commented Feb 2, 2025

I tested Jetty 12.0.17-SNAPSHOT

UriCompliance.LEGACY is indeed partially fixed, as the servlets now get HttpServletRequest with "legacy" query strings, and they are not getting blocked.
But HttpServletRequest#getParameter doesn't work and returns null for all parameters (even correctly encoded) of the passed query string. I think an expected behavior for a Legacy mode would be for #getParameter to work in 'legacy' way too? And return the parameters correctly.

Image

@Spikhalskiy Spikhalskiy reopened this Feb 2, 2025
@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

@gregw want to take this one?
Looks like a good one to get in for 12.0.17 release.

@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

I tested Jetty 12.0.17-SNAPSHOT

Thanks for doing that!

UriCompliance.LEGACY is indeed partially fixed, as the servlets now get HttpServletRequest with "legacy" query strings, and they are not getting blocked. But HttpServletRequest#getParameter doesn't work and returns null for all parameters (even correctly encoded) of the passed query string. I think an expected behavior for a Legacy mode would be getParameter to work in 'legacy' way too? And return the parameters correctly.

Note, the LEGACY mode does NOT contain the BAD_UTF8_ENCODING violation. (an intentional decision at the moment).

Can you test with a custom UriCompliance ...

httpConfiguration.setUriCompliance(UriCompliance.from("LEGACY,BAD_UTF8_ENCODING"));

@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

@gregw we should review our decision to not have BAD_UTF8_ENCODING in LEGACY mode.

@Spikhalskiy
Copy link
Author

Spikhalskiy commented Feb 2, 2025

@joakime I tried UNSAFE - same behavior, request reaches the servlet, but #getParameter returns null for all.

@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

@joakime I tried UNSAFE - same behavior, request reaches the servlet, but #getParameter returns null for all.

Ah, yeah, UNSAFE does include BAD_UTF8_ENCODING, so now it's a confirmed bug.

@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

What environment were you testing in? (eg: ee11, ee9, etc)
Or were you testing with native Jetty Handlers directly?

@Spikhalskiy
Copy link
Author

Spikhalskiy commented Feb 2, 2025

It's Jetty 12.0.17-SNAPSHOT with EE10 with jakarta's HttpServletRequest request that is getting forwarded somewhere in our filters chain.

To simplify it for you guys and eliminate forwarding concerns, this is a debug state earlier in the filter chain with #getParameter called on org.eclipse.jetty.ee10.servlet.ServletApiRequest and is still returning null.

Image

@gregw
Copy link
Contributor

gregw commented Feb 2, 2025

I'm AFK until Tuesday

@joakime
Copy link
Contributor

joakime commented Feb 2, 2025

I'm AFK until Tuesday

I'll take this, triage it on Monday.

@joakime
Copy link
Contributor

joakime commented Feb 3, 2025

PR opened

joakime added a commit that referenced this issue Feb 7, 2025
Add BAD_UTF8_ENCODING to UriCompliance.LEGACY
joakime added a commit that referenced this issue Feb 7, 2025
)

* Issue #12750 - handle query double-pct-encoded BAD_UTF8_ENCODING

* Add BAD_UTF8_ENCODING to UriCompliance.LEGACY
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Jetty 12.0.17 - FROZEN Feb 7, 2025
sbordet pushed a commit that referenced this issue Feb 9, 2025
Add BAD_UTF8_ENCODING to UriCompliance.LEGACY
sbordet pushed a commit that referenced this issue Feb 9, 2025
)

* Issue #12750 - handle query double-pct-encoded BAD_UTF8_ENCODING

* Add BAD_UTF8_ENCODING to UriCompliance.LEGACY
@joakime
Copy link
Contributor

joakime commented Feb 13, 2025

@Spikhalskiy be aware that this feature is changing again before release.

After analyzing the Jetty 11 vs Jetty 12 behavior, the BAD_UTF8_ENCODING does not follow the Jetty 11 behavior.
We will be removing BAD_UTF8_ENCODING from UriCompliance.LEGACY.

See the following PR for test cases and such showing the differences.

The trend in that PR is to introduce a new UriCompliance option that fits the Jetty 11 behaviors more accurately.
The actual name isn't settled yet.

@Spikhalskiy
Copy link
Author

Spikhalskiy commented Feb 13, 2025

@joakime Thank you for letting me know. It's not very important, as we configure UNSAFE for our purposes. BUT.
UriCompliance.LEGACY was introduced as a legacy for Jetty 9 I believe, not Jetty 11. And I think that BAD_UTF8_ENCODING is aligned with what we had on Jetty 9. So it honestly makes more sense to me that BAD_UTF8_ENCODING should be included in Legacy, so I don't get the rationale for taking it off based on J11 vs J12 comparison.

@joakime
Copy link
Contributor

joakime commented Feb 13, 2025

@joakime Thank you for letting me know. It's not very important, as we configure UNSAFE for our purposes. BUT.
UriCompliance.LEGACY was introduced as a legacy for Jetty 9 I believe, not Jetty 11. And I think that BAD_UTF8_ENCODING is aligned with what we had on Jetty 9. So it honestly makes more sense to me that BAD_UTF8_ENCODING should be included in Legacy, so I don't get the rationale for taking it off based on J11 vs J12 comparison.

Actually, turns out it doesn't match Jetty 9 thru Jetty 11 behavior.
In Jetty 9 thru Jetty 11, the majority of bad UTF-8 encoding situations actually result in an Exception triggering a 400/500 response, using replacement characters (like what BAD_UTF8_ENCODING does) is actually the minority case.
Said another way ... There is a very small and narrow set of conditions where a replacement character is used in Jetty 9 thru Jetty 11, the rest of the cases are Exceptions / failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For general bugs on Jetty side
Projects
No open projects
Status: ✅ Done
Development

Successfully merging a pull request may close this issue.

3 participants