[#687] trim payload before try to parse it. #721
I agree, the checking for the leading { is a pain.
We have talked about this a few times as @lhazlewood has been working away on the "jwe" branch.
IIRC, we talked about using the cty header for this.
The cty header:
It will typically not be used by applications when the kind of object is already known. This parameter is ignored by JWS implementations; any processing of this parameter is performed by the JWS application. Use of this Header Parameter is OPTIONAL.
Per spec, the JWT payload could be anything, though it's usually JSON, so the default should assume JSON, and anything else should use a parseBytes() method.
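A minimal sketch of that default, assuming the header is available as a plain map: treat the payload as JSON unless a non-blank cty value is present. The class and method names here (PayloadKind, hasContentType, shouldParseAsJson) are hypothetical and are not part of JJWT's API.

```java
import java.util.Map;

// Hypothetical helper, not JJWT API: decides how to treat a payload
// based only on the presence of the "cty" header parameter.
final class PayloadKind {

    private PayloadKind() {
    }

    // True when the header carries a non-blank "cty" (content type) value.
    static boolean hasContentType(Map<String, ?> header) {
        Object cty = header == null ? null : header.get("cty");
        return cty instanceof String && !((String) cty).trim().isEmpty();
    }

    // Assumption for this sketch: no "cty" means "assume JSON";
    // any declared content type means "hand back the raw bytes".
    static boolean shouldParseAsJson(Map<String, ?> header) {
        return !hasContentType(header);
    }
}
```

With this rule, a header containing only alg would lead to JSON parsing, while a header with cty set to something like application/pdf would route the payload to a byte-oriented method.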
This came up recently too: this method isn't 100% correct. It should return a byte[] (well, and have its name tweaked):
Jwt<Header, String> parsePlaintextJwt(String plaintextJwt)
If interested in hacking on this, send a PR to the jwe branch 😄
Also, thanks @bmarwell!
The concept of 'plaintext' (as a String) will be going away for 1.0. JJWT's concept of String vs JSON body support was created when the JWT RFCs weren't yet finalized. Now that they are final, the reality is that a JWT can contain any type of payload, with JSON and JSON claims being the most common type of payload. This is very cool IMO, and not many folks are aware of this super cool capability. It means that JWTs can be used to represent images, PDF documents, etc - basically anything. JWT is now effectively a general-purpose messaging format, not something limited to just identity assertions.
As such, if it's not a commonly-expected JSON claims payload, the
As @bdemers quoted above, if the
jjwt/impl/src/main/java/io/jsonwebtoken/impl/DefaultJwtParser.java Lines 511 to 514 in 3ba22fe
(but without the raw
Now - with regards to trimming the resulting String before employing any heuristics: Because the payload can be anything - even a normal string - we can't alter the payload when trying to detect if it's JSON (because that leading whitespace could be an expected part of a non-JSON payload). If anything, the best thing to do is extract this logic out to a helper method, e.g.
private static boolean isLikelyJson(String payload, Header<?> header) {
    // 1. simple null/empty check:
    if (Strings.isEmpty(payload)) {
        return false;
    }
    // 2. if hasContentType(header), return `false`.
    // 3. Start at the beginning of the string and iterate character by character, stopping at the first non-whitespace character.
    //    If it is a `{`, continue; if not, return `false`.
    // 4. Start at the end of the string and iterate character by character, stopping at the first non-whitespace character.
    //    If it is a `}`, return `true`; if not, return `false`.
}
Something to that effect. This ensures we don't create another immutable String on the heap, and doesn't alter the intended payload.
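The four steps outlined above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: the header check is reduced to a hypothetical boolean parameter (hasContentType) so the example does not depend on any JJWT types, and "whitespace" is taken to mean the four characters the JSON grammar allows between tokens (space, tab, line feed, carriage return).

```java
// Hypothetical stand-alone version of the helper sketched above; not JJWT code.
final class JsonHeuristic {

    private JsonHeuristic() {
    }

    // JSON insignificant whitespace: space, tab, line feed, carriage return.
    private static boolean isJsonWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // hasContentType is an assumption of this sketch: it stands in for
    // "the header declares a cty value", however the caller determines that.
    static boolean isLikelyJson(String payload, boolean hasContentType) {
        // 1. simple null/empty check
        if (payload == null || payload.isEmpty()) {
            return false;
        }
        // 2. a declared content type means the payload is not implicit JSON
        if (hasContentType) {
            return false;
        }
        // 3. skip leading whitespace; the first real character must be '{'
        int start = 0;
        int end = payload.length() - 1;
        while (start <= end && isJsonWhitespace(payload.charAt(start))) {
            start++;
        }
        if (start > end || payload.charAt(start) != '{') {
            return false;
        }
        // 4. skip trailing whitespace; the last real character must be '}'
        while (end > start && isJsonWhitespace(payload.charAt(end))) {
            end--;
        }
        return end > start && payload.charAt(end) == '}';
    }
}
```

The scan only reads characters by index, so no trimmed copy of the payload is allocated and the payload itself is handed on unchanged.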
Even after reading this, I still don't get why just trying to parse it is not an option. Besides, my option does not alter the input... If you just try to parse the JSON and wrap that in a try/catch, the parser will take care of superfluous whitespace. Anyway, I was trying to get a quick fix into the current version. Your branch might take another month or so to evolve. So why not start with an 80% solution? Long story short:
It doesn't alter the input directly, but it does add a potentially-large temporary
Invoking a JSON parser and swallowing a subsequent exception when the payload isn't JSON is a heavyweight operation that could significantly impact performance - better to avoid it if possible. I've also never really been a fan of using try/catch scenarios for expected logic, because it's not 'exceptional' any longer. But that's less of a concern for me - performance is more important IMO. But even if we ignore the heap and potential performance impact, there's a bigger problem here: How do you distinguish between JSON-but-invalid-JSON and not-JSON-but-should-be-swallowed/ignored exceptions? They need to be handled differently, and there's no way to know really without inspecting the
We can do that today. The quick fix is to create the helper method This ensures there is zero heap or performance penalty, and retains current behavior semantics until we can enforce the
I agree we should be more tolerant here per the Robustness Principle, but in fairness, I wouldn't exactly call it serious. To the best of my knowledge, this is the first report I can remember in many years of leading whitespace being an issue, probably because practically every other JWT library doesn't add extraneous whitespace. That said, we do need to address this, and I think(?) the minimal 3-part Does that work?
Phew, I don’t really know… for several reasons.
That's a potential +1 for the Json parser, btw. ;-)
Is it? They can operate on streams and fail early, especially if you tell them to deserialize a That said, the more effort we put into heuristics or any other
I totally agree here.
At the moment we have an even bigger problem described in #678 by the original author: At least that is a bigger problem for me, because jjwt is already using JSON parsers which would solve the problem. The workaround would be for users to implement the callback parseJws/parseJwt methods where THEY do the try-catch logic. Not very user friendly when the library could do this for you. I fully agree with all the following cty stuff, but I am still looking for a fix for a minor version.
As mentioned above, there is a high potential that we are not giving back a ClaimJWT/JWS but instead a byte[] payload when the JWT contained perfectly valid JSON.
🤷🏻‍♂️ I mean, we could do a performance measurement.
I'm fairly confident that trying logic when not needed, throwing an exception, filling in a stack trace, propagating the call stack, etc, is most definitely slower and more cumbersome than just skipping whitespace in a loop, especially when 99.99% of payloads never have leading or trailing whitespace.
I'm not worried about this particular case because the fix is trivial: We'd be keeping logic that has been working for 7+ years, and just accounting for any leading or trailing whitespace that might occur. This is what the JWT RFC requires for JSON Claims anyway. Once that logic exists, I'm pretty sure it won't need to change for another 7 years (if ever). We can go straight to the 'if the So in summary, skipping leading and trailing whitespace in the payload before checking for a I'm really not trying to be difficult, I promise! I like the parse/try/catch approach, but I want to do the bare minimum necessary to resolve #687 without adding any additional risk whatsoever.
Ok, I'll try to come up with something new.
Closing this just as a matter of housekeeping: the corresponding logic for this is in jjwt/impl/src/main/java/io/jsonwebtoken/impl/DefaultJwtParser.java Lines 257 to 325 in fa1e32b
fixes #687
Hint: I really don't like your heuristics.
Imagine this scenario:
You have a binary payload which is by design not intended to be JSON, e.g. HOCON (Typesafe Config).
The parser will parse it neither as JSON nor as a plain payload, just because it starts and ends with { and }.
Better approach:
If you are interested in this approach, I can create another PR. :)
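To make the scenario above concrete, here is a small sketch of a brace-only check. The class and method names (BraceCheckDemo, startsAndEndsWithBraces) are made up for illustration and are not taken from JJWT; the point is only that a check on the first and last character cannot tell a HOCON document from a JSON object.

```java
// Hypothetical illustration, not JJWT code: a check that looks only at
// the first and last character of the payload.
final class BraceCheckDemo {

    private BraceCheckDemo() {
    }

    static boolean startsAndEndsWithBraces(String payload) {
        return payload != null
                && payload.length() >= 2
                && payload.charAt(0) == '{'
                && payload.charAt(payload.length() - 1) == '}';
    }
}
```

A HOCON payload such as { port = 8080 } passes this check even though it is not valid JSON, so a parser that relies on the check alone would go on to attempt JSON parsing and fail.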