Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-125682: Python implementation of json.loads() accepts non-ascii digits #125687

Merged
merged 6 commits into from
Oct 18, 2024

Conversation

nineteendo
Copy link
Contributor

@nineteendo nineteendo commented Oct 18, 2024

@nineteendo nineteendo marked this pull request as ready for review October 18, 2024 09:51
Copy link
Contributor

@taleinat taleinat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is a backwards-incompatible change, so we should consider adding a minor note in "What's New", and I'm not sure that it should be backported.

@ZeroIntensity
Copy link
Member

Eh, I don't think a "What's New" entry is warranted. This only applies to the pure-Python version of JSON, which isn't the default, so I highly doubt anyone is relying on it. I think not backporting is probably reasonable, though.

@taleinat
Copy link
Contributor

Eh, I don't think a "What's New" entry is warranted. This only applies to the pure-Python version of JSON, which isn't the default, so I highly doubt anyone is relying on it. I think not backporting is probably reasonable, though.

Yes, upon further consideration I agree: This is inconsistent between the C and Python implementations of the same module, and seems not to have been intentional, since our docs directly reference RFC 7159 and ECMA-404 which both clearly specify what are "digits" in the context of JSON.

@nineteendo nineteendo changed the title gh-125682: Python implementation of json.loads() accepts unicode digits gh-125682: Python implementation of json.loads() accepts non-ascii digits Oct 18, 2024
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this looks good, thank you!

I'll let Tal have the final decision on whether to backport this and #125683. I'm personally -1 on backporting--Python implementations that are relying on this might get a breaking change between patches. This is small enough that it's fine for minor versions, but I'm not comfortable with putting this in a patch release.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

Alternatively we could use the re.ASCII flag. These are two equivalent solutions.

@serhiy-storchaka
Copy link
Member

I think that this should be backported.

@serhiy-storchaka serhiy-storchaka added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Oct 18, 2024
@serhiy-storchaka serhiy-storchaka merged commit d358425 into python:main Oct 18, 2024
40 checks passed
@miss-islington-app
Copy link

Thanks @nineteendo for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

@nineteendo nineteendo deleted the patch-2 branch October 18, 2024 12:26
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 18, 2024
… of JSON decoder (pythonGH-125687)

(cherry picked from commit d358425)

Co-authored-by: Nice Zombies <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Oct 18, 2024

GH-125692 is a backport of this pull request to the 3.13 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 18, 2024
… of JSON decoder (pythonGH-125687)

(cherry picked from commit d358425)

Co-authored-by: Nice Zombies <[email protected]>
@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Oct 18, 2024
@bedevere-app
Copy link

bedevere-app bot commented Oct 18, 2024

GH-125693 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Oct 18, 2024
@taleinat
Copy link
Contributor

I think that this should be backported.

Thanks @serhiy-storchaka. For future reference, I'd be interested in your reasoning regarding whether or not to backport this.

@serhiy-storchaka
Copy link
Member

I have not merged the backports yet.

My reasoning is that this is a clear bug. There are should be really good reasons to not backport it, for example if there is a known user code that depends on it or at least plausible scenario of this, or the fix has significant performance cost, or depends on feature that did not exist in older version, or the code was changed so much, that backporting requires significant efforts. This is not the case. There is an old and stable specification of JSON, and current code does not follow it. It only accepts numbers with non-ASCII digits if they start with an ASCII digit -- generating such representation even by mistake is improbably. And there is a difference between the C (used by default) and the Python implementations.

The origin of such bug is clear. The code was correct in Python 2, with 8-bit strings, and the bug was introduced when it was changed to parse Unicode strings.

Backporting bugfixes is the default procedure. Not backporting requires reasons.

@taleinat
Copy link
Contributor

Excellent, thank you for such a clear and thorough explanation!

serhiy-storchaka pushed a commit that referenced this pull request Oct 21, 2024
…n of JSON decoder (GH-125687) (GH-125693)

(cherry picked from commit d358425)

Co-authored-by: Nice Zombies <[email protected]>
serhiy-storchaka pushed a commit that referenced this pull request Oct 21, 2024
…n of JSON decoder (GH-125687) (GH-125692)

(cherry picked from commit d358425)

Co-authored-by: Nice Zombies <[email protected]>
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants