Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Port type inference tests to new test framework #13719

Merged
merged 19 commits into from
Oct 15, 2024

Conversation

Lexxxzy
Copy link
Contributor

@Lexxxzy Lexxxzy commented Oct 11, 2024

Summary

Porting infer tests to new markdown tests framework.

Link to the corresponding issue: #13696

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 11, 2024

I've changed parser's CODE_RE regex, cause without that fix it was not correctly parsing code snippets with empty files, like:

```py path=package/__init__.py
```

Of course feel free to comment on this fix...

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, thank you for taking this on!

I'm going to wait for removal of the extraneous quadruple-backtick markdown fences before reviewing the tests in detail, because removing those will make them a lot easier to read :)

Comment on lines 7 to 16
````markdown
```py path=a.py
from b import C as D; E = D
reveal_type(E) # revealed: Literal[C]
```

```py path=b.py
class C: pass
```
````
Copy link
Contributor

@carljm carljm Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the outer markdown fenced code block is just how the mdtest README handles showing an example markdown document within a markdown README, it's not needed in the tests and should be removed here and in all other cases:

Suggested change
````markdown
```py path=a.py
from b import C as D; E = D
reveal_type(E) # revealed: Literal[C]
```
```py path=b.py
class C: pass
```
````
```py path=a.py
from b import C as D; E = D
reveal_type(E) # revealed: Literal[C]
```
```py path=b.py
class C: pass
```

It's funny that it still works either way, because of our naive regex parsing :)

I'll look at updating the README to address this potential confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, my bad. Once again - thanks for clarifying

We can follow import to class:

````markdown
```py path=a.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it's probably better style in general if we don't specify the path on the "main" test file, if it isn't imported by another file and its location/name don't actually matter to the test. (This wasn't possible in the previous test format.) I don't care too much about this, though, not a blocking issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it

@carljm
Copy link
Contributor

carljm commented Oct 11, 2024

I've changed parser's CODE_RE regex

This fix looks good, thank you!

@carljm
Copy link
Contributor

carljm commented Oct 11, 2024

Also it looks like the mdformat pre-commit check is failing, you can check the ruff CONTRIBUTING doc for info on how to run pre-commit locally so you can catch those issues without having to push and wait for CI.

Copy link
Contributor

github-actions bot commented Oct 11, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 11, 2024

wait for removal of the extraneous quadruple-backtick

Oh, ok, understood, I will redo that, thanks for clarifing

Also it looks like the mdformat pre-commit check is failing, you can check the ruff CONTRIBUTING doc for info on how to run pre-commit locally so you can catch those issues without having to push and wait for CI.

Got it, totally forgot about it

@carljm carljm added the red-knot Multi-file analysis & type inference label Oct 11, 2024
carljm added a commit that referenced this pull request Oct 11, 2024
Address a potential point of confusion that bit a contributor in
#13719

Also remove a no-longer-accurate line about bare `error: ` assertions
(which are no longer allowed) and clarify another point about which
kinds of error assertions to use.
@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 12, 2024

Rebased on the current main branch. Changed format of tests after new clarifications from @carljm. Also added tests for booleans infer.

I have problems with porting few tests so far:

  1. not_literal_string, multiplied_string, multiplied_literal_string, truncated_string_literals_become_literal_string, adding_string_literals_and_literal_string, ...
    In Rust's test there are formatting for variable in tests file like y = "a".repeat(TypeInferenceBuilder::MAX_STRING_LITERAL_SIZE + 1). The question is: how do I correctly insert variable in tests in mdtest the same way?

  2. from_import_with_no_module_name and exception_handler_with_invalid_syntax

NOTE: Parsing diagnostic's error is not resolving, also already tried using full message
```py
from import bar   # error: "Expected a module name"
reveal_type(bar)  # revealed: Unknown
```
  1. All comprehension tests (and some functions), because variables should be infered inside it's scope. Don't know how to do it in mdtest.

@Lexxxzy Lexxxzy marked this pull request as draft October 12, 2024 10:46
@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 13, 2024

Most of the work is done. Now I’m just waiting for feedback on what’s finished and looking for some help with the issues I’m facing before I can keep going.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much, this is a lot of good work! Excellent translation of the tests to the new assertion style.

A few general comments, which I commented partway through but not everywhere they occur:

  • Avoid TODO in test titles, and keep TODO comments minimal in length and as close as possible to the specific line/assertion that isn't quite right yet.
  • Avoid boilerplate phrases like "This test ensures that" or "Check that" or "We can infer the type when ..." or anything similar -- minimize the words that are not adding specific value to the test in question.
  • Break up into more smaller files; I commented on this throughout to try to give a clear idea of an organization that I think would work well.

I appreciate your quick work on this PR and would like to get it landed ASAP to minimize conflicts as new tests are added in other PRs! If you won't have time to update it in the next day-ish, please just let me know, I'm also happy to make the updates and get it landed!

Thanks again.

Comment on lines 23 to 37
### TODO: Not function

Unknown should not be part of the type of typing.reveal_type

```py
from typing import reveal_type

def f():
return 1

a = not f
b = not reveal_type

reveal_type(a) # revealed: Literal[False]
# reveal_type(b) # TODO: revealed: Literal[False]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consolidate the stuff that's about the TODO to just the one line where it's relevant, that's not the main point of this test.

Suggested change
### TODO: Not function
Unknown should not be part of the type of typing.reveal_type
```py
from typing import reveal_type
def f():
return 1
a = not f
b = not reveal_type
reveal_type(a) # revealed: Literal[False]
# reveal_type(b) # TODO: revealed: Literal[False]
### Not function
```py
from typing import reveal_type
def f():
return 1
a = not f
b = not reveal_type
reveal_type(a) # revealed: Literal[False]
# TODO Unknown should not be part of the type of typing.reveal_type
# reveal_type(b) # revealed: Literal[False]

reveal_type(f) # revealed: Literal[False]
```

## Comparison
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to break up this large file a bit, lets move out this whole section into a set of files, e.g. comparison/integers.md, comparison/non_boolean_returns.md, comparison/strings.md, comparison/unsupported.md. There's probably more that we could break out as well, but that's probably good enough for now.


## Cyclical class definition

Python supports classes that can reference themselves in their base class definitions. Although it may seem unusual, such a structure is not uncommon, particularly in type hinting systems like `typeshed`, where base classes can be self-referential: `class str(Sequence[str]): ...`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's generally hard-wrap lines at 80 columns in text, and also edit this text a bit for clarity/conciseness:

Suggested change
Python supports classes that can reference themselves in their base class definitions. Although it may seem unusual, such a structure is not uncommon, particularly in type hinting systems like `typeshed`, where base classes can be self-referential: `class str(Sequence[str]): ...`.
In type stubs, classes can reference themselves in their base class definitions. For example, in `typeshed`, we have `class str(Sequence[str]): ...`.

Comment on lines 246 to 259
NOTE: `j = "ab" < "ab_cd"` is a very cornercase test ensuring we're not comparing the interned salsa symbols, which compare by order of declaration.

```py
def str_instance() -> str: ...
a = "abc" == "abc"
b = "ab_cd" <= "ab_ce"
c = "abc" in "ab cd"
d = "" not in "hello"
e = "--" is "--"
f = "A" is "B"
g = "--" is not "--"
h = "A" is not "B"
i = str_instance() < "..."
j = "ab" < "ab_cd"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd rather put notes like this as Python comments right next to the relevant line

Suggested change
NOTE: `j = "ab" < "ab_cd"` is a very cornercase test ensuring we're not comparing the interned salsa symbols, which compare by order of declaration.
```py
def str_instance() -> str: ...
a = "abc" == "abc"
b = "ab_cd" <= "ab_ce"
c = "abc" in "ab cd"
d = "" not in "hello"
e = "--" is "--"
f = "A" is "B"
g = "--" is not "--"
h = "A" is not "B"
i = str_instance() < "..."
j = "ab" < "ab_cd"
```py
def str_instance() -> str: ...
a = "abc" == "abc"
b = "ab_cd" <= "ab_ce"
c = "abc" in "ab cd"
d = "" not in "hello"
e = "--" is "--"
f = "A" is "B"
g = "--" is not "--"
h = "A" is not "B"
i = str_instance() < "..."
# ensure we're not comparing the interned salsa symbols, which compare by order of declaration.
j = "ab" < "ab_cd"

Comment on lines 275 to 286
TODO: `d = 5 < object()` should be `Unknown` but we don't check if __lt__ signature is valid for right operand type.

```py
a = 1 in 7 # error: "Operator `in` is not supported for types `Literal[1]` and `Literal[7]`"
b = 0 not in 10 # error: "Operator `not in` is not supported for types `Literal[0]` and `Literal[10]`"
c = object() < 5 # error: "Operator `<` is not supported for types `object` and `Literal[5]`"
d = 5 < object()

reveal_type(a) # revealed: bool
reveal_type(b) # revealed: bool
reveal_type(c) # revealed: Unknown
reveal_type(d) # revealed: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here:

Suggested change
TODO: `d = 5 < object()` should be `Unknown` but we don't check if __lt__ signature is valid for right operand type.
```py
a = 1 in 7 # error: "Operator `in` is not supported for types `Literal[1]` and `Literal[7]`"
b = 0 not in 10 # error: "Operator `not in` is not supported for types `Literal[0]` and `Literal[10]`"
c = object() < 5 # error: "Operator `<` is not supported for types `object` and `Literal[5]`"
d = 5 < object()
reveal_type(a) # revealed: bool
reveal_type(b) # revealed: bool
reveal_type(c) # revealed: Unknown
reveal_type(d) # revealed: bool
```py
a = 1 in 7 # error: "Operator `in` is not supported for types `Literal[1]` and `Literal[7]`"
b = 0 not in 10 # error: "Operator `not in` is not supported for types `Literal[0]` and `Literal[10]`"
c = object() < 5 # error: "Operator `<` is not supported for types `object` and `Literal[5]`"
d = 5 < object()
reveal_type(a) # revealed: bool
reveal_type(b) # revealed: bool
reveal_type(c) # revealed: Unknown
# TODO should be `Unknown` but we don't yet check if `__lt__` signature is valid for right operand
reveal_type(d) # revealed: bool

reveal_type(a) # revealed: str
```

## Subscript
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These belong in subscript/string.md

reveal_type(a) # revealed: str
```

## Bytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bytes are a separate type from strings, this should go in bytes.md or literal/bytes.md

reveal_type(x) # revealed: Unbound
```

### Annotation only transparent to local inference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the tests from here onward should go in shadowing.md along with some other tests I commented above.

reveal_type(x) # revealed: Literal[1, 2]
```

## Assignment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could go in assignment.md

@@ -0,0 +1,134 @@
# Variables

## Union resolution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can go with some others in conditional/if_statement.md

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 14, 2024

would like to get it landed ASAP

All clear. Will try to push fixes as soon as possible

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 15, 2024

@carljm

A few general comments, which I commented partway through but not everywhere they occur

I tried to follow the style you described. Please take a look at the tests with the new structure. Sorry for all the issues!

@Lexxxzy Lexxxzy requested a review from carljm October 15, 2024 11:45
sharkdp added a commit that referenced this pull request Oct 15, 2024
…ype guards (#13758)

## Summary

- Fix a bug with `… is not …` type guards.
 
  Previously, in an example like
  ```py
  x = [1]
  y = [1]
  
  if x is not y:
      reveal_type(x)
  ```
  we would infer a type of `list[int] & ~list[int] == Never` for `x`
  inside the conditional (instead of `list[int]`), since we built a
  (negative) intersection with the type of the right hand side (`y`).
  However, as this example shows, this assumption can only be made for
  singleton types (types with a single inhabitant) such as `None`.
- Add support for `… is …` type guards.

closes #13715

## Test Plan

Moved existing `narrow_…` tests to Markdown-based tests and added new
ones (including a regression test for the bug described above). Note
that will create some conflicts with
#13719. I tried to establish the
correct organizational structure as proposed in
#13719 (comment)
@sharkdp
Copy link
Contributor

sharkdp commented Oct 15, 2024

@Lexxxzy I already ported the two narrowing-related tests to Markdown while fixing a bug in the narrowing logic (#13758). Let me know if you need help resolving the conflicts here.

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 15, 2024

@sharkdp Thanks for letting me know! I rebased on the current main, but somehow subscript/string.md - Subscript on strings - Function return test started to fail. Now it says that revealed type is @Todo, and not str. Can somebody check what went wrong? Maybe it is because of #5fa82f (Sync vendored typeshed stubs) commit?

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 15, 2024

Can somebody check what went wrong?

Yep, I checked, actually on @sharkdp pr merge commit (74bf4b) test is running fine, but on 5fa82f it started to fail

@AlexWaygood
Copy link
Member

Can somebody check what went wrong?

Yep, I checked, actually on @sharkdp pr merge commit (74bf4b) test is running fine, but on 5fa82f it started to fail

Yes -- unfortunately I had to change the assertion the test was making as part of that PR (#13753). This is because typeshed now declares str.__getitem__ to be an overloaded function, and we don't understand overloaded functions yet.

@AlexWaygood
Copy link
Member

AlexWaygood commented Oct 15, 2024

This is the specific commit in that PR where I tweaked the test @Lexxxzy: 8ec3b30. You'll need to change the assertion to @Todo in your mdtest port of the test as well.

Sorry for the bother!

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 15, 2024

You'll need to change the assertion to @todo in your mdtest port of the test as well.

Ok, i see. Thanks for the info!

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!! Thank you so much for the quick and thorough work. I'm going to make a few minor cosmetic tweaks, push those, wait for CI, and then merge.

@carljm carljm merged commit d774807 into astral-sh:main Oct 15, 2024
20 checks passed
@AlexWaygood
Copy link
Member

Thanks so much for taking this on @Lexxxzy! A big first contribution, and a really helpful one for us!!

@Lexxxzy
Copy link
Contributor Author

Lexxxzy commented Oct 15, 2024

Big thanks @carljm for reviewing my first PR to ruff! Learned a lot about that project while porting tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants