Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Resolve references in eager nested scopes eagerly #16079

Merged
merged 53 commits into from
Feb 19, 2025

Conversation

dcreager
Copy link
Member

@dcreager dcreager commented Feb 10, 2025

We now resolve references in "eager" scopes correctly — using the bindings and declarations that are visible at the point where the eager scope is created, not the "public" type of the symbol (typically the bindings visible at the end of the scope).

Co-authored by @AlexWaygood

@dcreager dcreager added the red-knot Multi-file analysis & type inference label Feb 10, 2025
@dcreager dcreager self-assigned this Feb 10, 2025
* main: (991 commits)
  [red-knot] Resolve `Options` to `Settings` (#16000)
  Bump version to 0.9.6 (#16074)
  Revert tailwindcss v4 update (#16075)
  Improve migration document (#16072)
  Fix reference definition labels for backtick-quoted shortcut links (#16035)
  RUF009 should behave similar to B008 and ignore attributes with immutable types (#16048)
  [`pylint`] Also report when the object isn't a literal (`PLE1310`) (#15985)
  Update Rust crate rustc-hash to v2.1.1 (#16060)
  Root exclusions in the server to project root (#16043)
  Directly include `Settings` struct for the server (#16042)
  Update Rust crate clap to v4.5.28 (#16059)
  Update Rust crate strum_macros to 0.27.0 (#16065)
  Update NPM Development dependencies (#16067)
  Update Rust crate uuid to v1.13.1 (#16066)
  Update Rust crate strum to 0.27.0 (#16064)
  Update pre-commit dependencies (#16063)
  Update dependency ruff to v0.9.5 (#16062)
  Update Rust crate toml to v0.8.20 (#16061)
  [`flake8-builtins`] Make strict module name comparison optional (`A005`) (#15951)
  [`ruff`] Indented form feeds (`RUF054`) (#16049)
  ...
* main:
  add diagnostic `Span` (couples `File` and `TextRange`) (#16101)
  Remove `Hash` and `Eq` from `AstNodeRef` for types not implementing `Eq` or `Hash` (#16100)
  Fix release build warning about unused todo type message (#16102)
  [`pydocstyle`] Handle arguments with the same names as sections (`D417`) (#16011)
  [red-knot] Reduce usage of `From<Type>` implementations when working with `Symbol`s (#16076)
  Transition to salsa coarse-grained tracked structs (#15763)
  [`pyupgrade`] Handle micro version numbers correctly (`UP036`) (#16091)
  [red-knot] `T | object == object` (#16088)
  [`ruff`] Skip singleton starred expressions for `incorrectly-parenthesized-tuple-in-subscript` (`RUF031`) (#16083)
  Delete left-over `verbosity.rs (#16081)
  [red-knot] User-level configuration (#16021)
  Add `user_configuration_directory` to `System` (#16020)
@dcreager dcreager marked this pull request as ready for review February 11, 2025 22:30
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. I haven't reviewed the semantic changes but I do like it a lot that we moved the state to the UseDefMap

@dcreager
Copy link
Member Author

Probably the best solution here is to add a new query that gets the type of the global symbol at the point the eager scope was created, rather than to add a new parameter to the global_scope() query.

It might actually be okay to push this issue to a followup PR! But I'd prefer to at least add some failing tests, since at the moment the behaviour here is a little inconsistent.

Added some failing tests for this. I think I can get the fix onto this PR as well

@dcreager
Copy link
Member Author

Added some failing tests for this. I think I can get the fix onto this PR as well

This is fixed! This fix also affected the deferred expressions tests that @carljm suggested. (We were always showing lazy results because we weren't hitting the global scope, not because of how we were handling deferred expressions)

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big fan of this implementation! Really nice and clean, avoids duplicating eager-lookup logic between semantic index and type inference. Thank you!!

// which bindings reach each of the uses in the scope. Loop through each enclosing scope,
// looking for any that bind each symbol.
for enclosing_scope_info in self.scope_stack.iter().rev() {
let enclosing_scope_id = enclosing_scope_info.file_scope_id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this locally and all tests passed. I think it's correct and could save on recording some class-scope bindings? Though only in relatively unusual cases, when a class or comprehension is nested directly in a class scope. So I suppose it's possible the cost of doing this check isn't actually worth it? One reason to do it regardless is just that it makes the comment in TypeInferenceBuilder::infer_name_load that "the semantic index builder takes care of only registering eager bindings for nested scopes that are actually eager, and for enclosing scopes that actually contain bindings that we should use when resolving the reference" more fully true. (Although it works even without being fully true because infer_name_load skips class scopes before checking for an eager binding.)

Suggested change
let enclosing_scope_id = enclosing_scope_info.file_scope_id;
let enclosing_scope_id = enclosing_scope_info.file_scope_id;
// Names bound in class scopes are never visible to nested scopes, so we never need to
// save eager scope bindings in a class scope.
if matches!(self.scopes[enclosing_scope_id].kind(), ScopeKind::Class) {
continue;
}

(If we actually do this we should probably make it a bit nicer with an is_class() method on ScopeKind, and perhaps consolidate the two different self.scopes[enclosing_scope_id] lookups, here and below where we check eager-ness.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent -- thank you!

Comment on lines +78 to +79
Generator expressions don't necessarily run eagerly, but in practice usually they do, so assuming
they do is the better default.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it might be good to explicitly include an example of a generator-expression scope that does not run eagerly? It might be good to document that yes, this causes some incorrect assumptions from us in some edge cases, but that this is a cost we accept because (as you say) generator expressions nearly always run eagerly in practice. One example might be something like this: we report the revealed type of z is Literal[42]:

from typing_extensions import reveal_type

def _(flag: bool):
    y = (0,)
    z = 42
    gen = (x + reveal_type(z) for x in y)
    if flag:
        z = 56
    print(next(gen))

but at runtime, the print() call reveals that the runtime value of z can actually be 56 when the scope of gen() is actually evaluated:

>>> from typing import reveal_type
... 
... def _(flag: bool):
...     y = (0,)
...     z = 42
...     gen = (x + reveal_type(z) for x in y)
...     if flag:
...         z = 56
...     print(next(gen))
...     
>>> _(True)
Runtime type is 'int'
56

(Again, I'm not saying we should try to account for this -- generator expressions are almost always evaluated eagerly, and it would be hard to detect cases like this when they're not! Just suggesting we could add a test to explicitly document the shortcoming.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I also added a test for how the "first iterable" is evaluated eagerly, even if the generator itself is not used immediately.

This example tells me that we'd want to handle generators the same as functions, if we ever start tracking where they're called as part of the public types work (#16079 (review)). After all, that's precisely the weirdness that's happening here — the generator includes a __next__ method that might be called at any arbitrary place, just like the f function in your linked comment might be called at any arbitrary place. If we start tracking that, we should use the same mechanism for both.

* main:
  [red-knot] Allow any `Ranged` argument for `report_lint` and `report_diagnostic` (#16252)
  [pycodestyle] Exempt `site.addsitedir(...)` calls (E402) (#16251)
  red_knot_python_semantic: improve diagnostic message for "invalid argument type"
  ruff_db: add "secondary" messages to `Diagnostic` trait
  ruff_db: refactor snippet rendering
  red_knot_python_semantic: remove `Ranged` impl for `TypeCheckDiagnostic`
  [red-knot] Refactor `infer_chained_boolean_types` to have access to `TypeInferenceBuilder` (#16222)
  Add `red_knot/README.md` (#16230)
  [airflow] move class attributed related cases to AIR302_class_attribute (AIR302) (#16226)
  [red-knot] Update tests for attributes inferred from parameters (#16208)
  [red-knot] update TODO comment in mdtest (#16242)
  [`refurb`] Correctly handle lengths of literal strings in `slice-to-remove-prefix-or-suffix` (`FURB188`) (#16237)
  Pass `ast::PythonVersion` to `type_hint_resolves_to_any` (#16236)
  Use `ast::PythonVersion` internally in the formatter and linter (#16170)
  Add `SECURITY.md` (#16224)
Comment on lines 95 to 125
But that does lead to incorrect results when the generator expression isn't run immediately:

```py
def evaluated_later():
x = 1

# revealed: Literal[1]
y = (reveal_type(x) for a in range(1))

x = 2

# The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
# our inferred type.
print(next(y))
```

Though note that “the iterable expression in the leftmost for clause is immediately evaluated”:

```py
def iterable_evaluated_eagerly():
x = 1

# revealed: Literal[1]
y = (a for a in [reveal_type(x)])

x = 2

# Even though the generator isn't evaluated until here, the first iterable was evaluated
# immediately, so our inferred type is correct.
print(next(y))
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are excellent, thank you!

@dcreager dcreager merged commit cfc6941 into main Feb 19, 2025
21 checks passed
@dcreager dcreager deleted the alex/eager-scopes branch February 19, 2025 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants