-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
encoding/json: add Decoder.DisallowDuplicateFields #48298
Comments
This proposal has been added to the active column of the proposals project |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
Is this for #14750? |
Since duplicate names are UB, even Hyrum's Rule aside, can we treat duplicate names as blank unless equal? ISTM, if you're relying on getting the first name or the last name, you're in for trouble, so we should just return blank (but not an error unless DisallowDuplicateFields is set, due to the Hyrum's Law issue). |
Precisely due to Hyrum's Law, I don't see how we can change the default behavior without breaking existing usages. |
Currently encoding/json accepts duplicate fields in the json. Since golang/go#48298 got accepted, we should use the decoder interface with Decoder.DisallowDuplicateFields turned on when available. Its exact behavior will determine whether json.RawMessage's will be re-unmarshaled or will follow the byte reader path.
Looks like it's been a few months since there was any update. Is there anything I can do to help move a fix along? |
I would ask for patience; I know encoding/json is moving very slowly, but we are busy and there will be more to share soon. The blocker in this thread is certainly not a patch :) |
Would be happy to see this added finally. If you need an example of duplicate JSON fields causing troubles, e.g. to priority this more, here is one:
|
This comment was marked as duplicate.
This comment was marked as duplicate.
Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal. |
Just chiming in that another reason duplicate keys can be a liability is that it allows attackers to pad a json object with garbage in order to get past web application firewalls (WAF). AWS and Google WAFs have a 8kb limit, so if the payload is normally
and it will get past WAF with a malicious payload that normally wouldn't get past WAF. And it also gets past internal checks for unknown keys because the key is being masked. There's an existing workaround which is to limit incoming JSON payloads to 8kb |
The presence of duplicate fields in JSON input is almost always a bug from the sender and the behavior across various implementations is highly inconsistent. It's too late to switch the current behavior to always reject duplicate fields in the current package, but we can provide an option to enforce stricter checks. As such, I propose adding a
Decoder.DisallowDuplicateFields
option.Background
Per RFC 8259, section 4, the handling of duplicate names is left as undefined behavior. Rejecting such inputs is within the realm of valid behavior. Tim Bray, the author of RFC 8259, actually recommends going beyond RFC 8259 and that implementations should instead target compliance with RFC 7493. RFC 7493 is a fully compatible subset of RFC 8259, which makes strict decisions about behavior that RFC 8259 leaves undefined (including the rejection of duplicate names).
The lack of duplicate name rejection has correctness implications where roundtrip unmarshal/marshal does not result in semantically equivalent JSON, and surprising behavior for users when they accidentally send JSON objects with duplicate names. In such a case, the current behavior is actually somewhat inconsistent and difficult to explain.
The lack of duplicate name rejection may have security implications since it becomes difficult for a security tool to validate the semantic meaning of a JSON object since meaning is inherently undefined in the presence of duplicate names.
Implementation
A naive implementation can remember all seen names in a Go map. A more clever implementation can take advantage of the fact that we are almost always unmarshaling into a Go map or Go struct. In the case of a Go map, we can use the Go map itself as a means to detect duplicate names. In the case of a Go struct, we can convert a JSON name into an index (i.e., the field index in the Go struct), and then use a an efficient bitmap to detect whether we saw the name before.
In the common case, there would be no performance slow downs to enabling checks for duplicate names.
Aside: I'm not fond of the name
Fields
since JSON terminology calls this either a "name" or "member" (per RFC 8259, section 4). However, it is consistent with the existingDisallowUnknownFields
option.\cc @bradfitz @crawshaw @mvdan
The text was updated successfully, but these errors were encountered: