Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-129025: fix too wide source location for bytecodes emitted for except* #129026

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

iritkatriel
Copy link
Member

@iritkatriel iritkatriel commented Jan 19, 2025

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
I have one question about the choice of location.

@@ -2365,7 +2365,8 @@ codegen_try_except(compiler *c, stmt_ty s)
for (i = 0; i < n; i++) {
excepthandler_ty handler = (excepthandler_ty)asdl_seq_GET(
s->v.Try.handlers, i);
location loc = LOC(handler);
/* Set location to the entire line of the except keyword */
location loc = LOCATION(handler->lineno, handler->lineno, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the entire line? Would the location of the type would be better?

Is this legal syntax?

except *\
     ValueError:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the location of the type would be better?

The type has the location of the expression that defines it. This is for exceptions raised from the except* itself. If you look at the test - the exception is from split() on the exception group, which is called by except*.

Why the entire line?

We don't have accurate information in the AST about the location of the except*, so this is next best. If it is multiline, the first line is where the except* is, which is what we want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the CHECK_EG_MATCH instruction, IIUC. Since it is matching the type, wouldn't the location of the type be a reasonable location?

If the AST had all the location information for everything, what location would you give the CHECK_EG_MATCH instruction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CHECK_EG_MATCH should have the location of except*, which is applying the match check. The location of calculating the type should be the location of the type expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't generally attach locations to syntax like except *.
For example, the location of the calls to __enter__ and __exit__ in a with statement is the context expression, not the with keyword.

Maybe we should attach the locations to the keywords? Though, as you say, it will need changes to the AST.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this issue to add locations to the AST #129157.

But we need to do something here that we can backport.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the CHECK_EG_MATCH instruction, IIUC. Since it is matching the type, wouldn't the location of the type be a reasonable location?

If we go with the type, then we have the analogous situation with CHECK_EXC_MATCH for old-style except, where there may not be a type. We probably can't get an exception raised there, but do we mind if the location info is different in the code object is different for except and except*?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also valid (PEP 8 notwithstanding :-):

>>> try:
...     foo
... except\
... *\
... ValueError\
... :
...     pass
... 

I agree with Irit that the location we want to show in the traceback is the except keyword, which in practice will be on the same line as the * and most likely also on the type.

(Sorry for the delay Irit. Today I am attempting to catch up with a lot of email I've been procrastinating on.)

Copy link
Member Author

@iritkatriel iritkatriel Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the discussion with @pablogsal on #129162 I think there are two different things here: (1) the location we put for the instruction in the code object and (2) what we display in the traceback.

For the traceback, I now think we should ideally hilight the whole except or except* statement, but place carets under the specific type (if there is one and it's relevant). If there is more than one type, we want to point to the one that cause the exception. This is similar to the case of a with statement with multiple context managers - we want to point to the with keyword, but also indicate which of the context managers was involved.

I think Pablo's PR should just add the missing location info to the AST, and then we can take it from there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did find an example for plain except where putting the traceback on the expression might make more sense -- the CHECK_EXC_MATCH instruction has only one possible failure, which is when the expression returns something that's not an exception, and then we get

TypeError: catching classes that do not inherit from BaseException is not allowed

I'd like to see that attached to the expression that gave a non-exception rather than on except (if they are on different lines).

For except* it is more complicated (many failure points, not all related to the given type) and I think attaching to the except is more acceptable there.

@iritkatriel iritkatriel changed the title gh-129025: fix too wide source location for bytecodes emitted for and gh-129025: fix too wide source location for bytecodes emitted for except* Jan 20, 2025
f' ~~~~~~~~~~~~~~~~~~~~~^^\n'
f' File "{__file__}", line {exc.__code__.co_firstlineno + 7}, in exc\n'
f' except* ValueError:\n'
f' \n'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blank line is weird. I'd expect either an arrow and/or squiggles pointing at except or at the first character of the line or at the whole line, OR no arrows/squiggles and no blank line.

The old way showed the entire except block, but no extra blank line.

@@ -2365,7 +2365,8 @@ codegen_try_except(compiler *c, stmt_ty s)
for (i = 0; i < n; i++) {
excepthandler_ty handler = (excepthandler_ty)asdl_seq_GET(
s->v.Try.handlers, i);
location loc = LOC(handler);
/* Set location to the entire line of the except keyword */
location loc = LOCATION(handler->lineno, handler->lineno, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the extra blank line I noticed in the test output is because we use zeros for the column positions?

So maybe this needs to be more careful and figure out the position of the last character on the line, or just put it only on the except keyword? (Which would require figuring out the position of the except keyword, which we might not have.

I guess if it's hard to do that right, I'm okay with the blank line.

@@ -2365,7 +2365,8 @@ codegen_try_except(compiler *c, stmt_ty s)
for (i = 0; i < n; i++) {
excepthandler_ty handler = (excepthandler_ty)asdl_seq_GET(
s->v.Try.handlers, i);
location loc = LOC(handler);
/* Set location to the entire line of the except keyword */
location loc = LOCATION(handler->lineno, handler->lineno, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did find an example for plain except where putting the traceback on the expression might make more sense -- the CHECK_EXC_MATCH instruction has only one possible failure, which is when the expression returns something that's not an exception, and then we get

TypeError: catching classes that do not inherit from BaseException is not allowed

I'd like to see that attached to the expression that gave a non-exception rather than on except (if they are on different lines).

For except* it is more complicated (many failure points, not all related to the given type) and I think attaching to the except is more acceptable there.

@@ -2546,7 +2547,8 @@ codegen_try_star_except(compiler *c, stmt_ty s)
for (Py_ssize_t i = 0; i < n; i++) {
excepthandler_ty handler = (excepthandler_ty)asdl_seq_GET(
s->v.TryStar.handlers, i);
location loc = LOC(handler);
/* Set location to the entire line of the except keyword */
location loc = LOCATION(handler->lineno, handler->lineno, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion here:

Suggested change
location loc = LOCATION(handler->lineno, handler->lineno, 0, 0);
location loc = LOCATION(handler->lineno, handler->lineno, handler->col_offset, handler->col_offset + 6); // len("except")

This highlights the except keyword.

For plain except you could do the same, or dig the coordinates out of the type expression (a bit harder because there are a lot of possibilities there).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too wide location info for CHECK_EG_MATCH
3 participants