stubgen: unify C extension and pure python stub generators with object oriented design #15770
@@ -59,7 +64,68 @@ def __eq__(self, other: Any) -> bool:
 class FunctionSig(NamedTuple):
     name: str
     args: list[ArgSig]
-    ret_type: str
+    ret_type: str | None
`ret_type` is now optional to match `ArgSig.type`. When either of these is `None`, `FunctionSig.format_sig` will omit the type from the signature.
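To make the behaviour described above concrete, here is a minimal sketch. It uses simplified stand-ins for `ArgSig` and `FunctionSig` (the real classes in `mypy/stubdoc.py` carry more fields, and the real `format_sig` also takes an `any_val` parameter that is left out here), so treat it as an illustration of the idea rather than the PR's implementation.

```python
from __future__ import annotations

from typing import NamedTuple


class ArgSig(NamedTuple):
    # Simplified stand-in: only a name and an optional type.
    name: str
    type: str | None = None


class FunctionSig(NamedTuple):
    # Simplified stand-in mirroring the fields shown in the diff.
    name: str
    args: list[ArgSig]
    ret_type: str | None

    def format_sig(self, suffix: str = ": ...") -> str:
        # An argument without a type is emitted as a bare name.
        args = [a.name if a.type is None else f"{a.name}: {a.type}" for a in self.args]
        # A missing return type drops the "-> ..." part entirely.
        ret = "" if self.ret_type is None else f" -> {self.ret_type}"
        return f"def {self.name}({', '.join(args)}){ret}{suffix}"


untyped = FunctionSig("f", [ArgSig("x", "int"), ArgSig("y")], None)
typed = FunctionSig("f", [ArgSig("x", "int"), ArgSig("y")], "str")
print(untyped.format_sig())
print(typed.format_sig())
```

Both optional fields follow the same rule: `None` means "say nothing" rather than "emit a placeholder".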
def __hash__(self) -> int: ...
def __index__(self) -> int: ...
def __int__(self) -> int: ...
def __ne__(self, other: object) -> bool: ...
def __setstate__(self, state: int) -> None: ...
The AST-based stub generator specifically excludes `__getstate__`/`__setstate__`, so I think it makes sense for the inspection-based generator to also exclude these. I can't think of a circumstance where exposing these within stubs would be useful, but if we decide that it is, then I think it makes sense to do it for both types of stub generators (which is sort of the theme of this PR).
) -> None:
    """Ensure that objects which implement old-style iteration via __getitem__
    are considered iterable.
    """
TODO: add a test for this
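For readers unfamiliar with old-style iteration, the sketch below shows why it needs special handling: a class that defines only `__getitem__` works with `iter()` but fails the `collections.abc.Iterable` check. The `Countdown` class and the `is_iterable` helper are hypothetical names made up for this illustration; they are not part of the PR.

```python
from collections.abc import Iterable


class Countdown:
    """Iterable only through the legacy __getitem__ protocol (no __iter__)."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __getitem__(self, index: int) -> int:
        # iter() calls this with 0, 1, 2, ... until IndexError is raised.
        if index >= self.start:
            raise IndexError(index)
        return self.start - index


def is_iterable(obj: object) -> bool:
    # Hypothetical helper: ask the interpreter instead of the ABC,
    # because the ABC only looks for __iter__.
    try:
        iter(obj)
    except TypeError:
        return False
    return True


c = Countdown(3)
print(list(c), isinstance(c, Iterable), is_iterable(c))
```

A test along these lines could assert that the generator treats such an object as iterable even though the ABC check says otherwise.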
mypy/stubdoc.py
        """Return if this signature is the catchall identity: (*args, **kwargs) -> Any"""
        return self.has_catchall_args() and self.ret_type in (None, "Any", "typing.Any")

    def format_sig(self, any_val: str | None = None, suffix: str = ": ...") -> str:
Previously, the two types of stub generators each had their own code for formatting signatures. Now they both use `FunctionSig.format_sig`.
@@ -94,6 +94,7 @@
 from mypy.visitor import NodeVisitor


+@trait
 @mypyc_attr(allow_interpreted_subclasses=True)
 class TraverserVisitor(NodeVisitor[None]):
I made this a trait because `ASTStubGenerator` inherits from both `BaseStubGenerator` and this.
@@ -0,0 +1 @@
+from . import basics as basics
This change is a result of the improved inspection capabilities, and it is accurate. As you can see the sub-module is imported by default:
>>> import pybind11_mypy_demo
>>> dir(pybind11_mypy_demo)
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'basics']
I'm wondering why the content of the `__init__.pyi` actually differs for the two generated output folders:
- `stubgen-include-docs/pybind11_mypy_demo/__init__.pyi` (this one is empty)
- `stubgen/pybind11_mypy_demo/__init__.pyi` (this one is not empty)

Is that expected / on purpose?
@ikonst @hauntsaninja @JelleZijlstra Can someone give this PR a look, please! It's totally awesome, I promise.
@@ -112,7 +112,6 @@ def run(self):
     "stubtest.py",
     "stubgenc.py",
     "stubdoc.py",
-    "stubutil.py",
One remaining question is "which stubgen modules to compile?"
Currently, stubutil contains the base classes and utilities for stubgen and stubgenc. However, it's a bit asymmetrical: of the two leaf modules, only stubgen is compiled with mypyc.
        stubutil.so*
        /          \
stubgenc.py      stubgen.so*
The reason for skipping compilation of these is provided in the comment above: "Skip these to reduce the size of the build".
Now that many of the supporting classes have moved into stubutil, which is now compiled, I'm tempted to skip compilation of stubgen, to strike a balance between size and performance:
        stubutil.so*
        /          \
stubgenc.py      stubgen.py
Is there any concern about hurting performance of `stubgen` if I do this?
I think it's probably fine to skip compiling stubgen. stubtest is already non-compiled, and in my experience it's plenty fast enough. Moreover, for stubgen, performance isn't nearly as much of an issue as with mypy proper or stubtest. Unlike with mypy proper or stubtest, you're unlikely to be running stubgen repeatedly in CI -- you generally generate stubs once, and then you're done.
Those were my thoughts as well. stubgen does perform an analysis of pure python source files (if not disabled), but that will use the compiled mypy modules.
Thanks for working on this, and your previous work on stubgen! I probably won't be able to review this soon: this is a big PR and I'm not that familiar with stubgen. However, if someone else takes a look and feels confident, I'd be happy to take a secondary look, just tag me.
Hi all, we're already getting merge conflicts. Can someone have a look at this, please! Thanks in advance.
Overall this looks good and I'm happy to merge it once my small comments are addressed and CI is green.
Two bigger points:
- It seems we don't really handle the case where a name we want to use is locally shadowed, e.g. we want to emit `type[X]` from builtins but `type` is something else in this module. I assume that's out of scope, though.
- There aren't any test cases for the pyc support as far as I can see. It's probably a pain to do in general, but maybe we can add a test case where we generate a pyc from some Python code, then generate a stub from that pyc.
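The second point could be approached roughly as follows. This is only a sketch of the "compile to pyc, delete the source, then inspect" idea using the standard library; `load_pyc_only_module` and `DEMO_SOURCE` are hypothetical names, and the sketch does not use mypy's actual test harness or call stubgen itself.

```python
import importlib.util
import inspect
import py_compile
import tempfile
from importlib.machinery import SourcelessFileLoader
from pathlib import Path

DEMO_SOURCE = "def add(a: int, b: int = 1) -> int:\n    return a + b\n"


def load_pyc_only_module(source: str, name: str = "pyc_demo"):
    """Compile `source` to a .pyc, remove the .py, and import from bytecode only."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / f"{name}.py"
        pyc = Path(tmp) / f"{name}.pyc"
        src.write_text(source)
        py_compile.compile(str(src), cfile=str(pyc), doraise=True)
        src.unlink()  # from here on, only the bytecode exists
        loader = SourcelessFileLoader(name, str(pyc))
        spec = importlib.util.spec_from_loader(name, loader)
        module = importlib.util.module_from_spec(spec)
        loader.exec_module(module)
        return module


mod = load_pyc_only_module(DEMO_SOURCE)
# No source is available, but runtime inspection still sees the signature.
print(list(inspect.signature(mod.add).parameters))
```

A real test would then feed such a module to the inspection-based generator and compare the emitted stub against an expected `.pyi`.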
PI: float
__version__: str
Seems like we previously excluded this on purpose
I support this change. I think I've seen several typeshed PRs where adding `__version__` to the stubs fixed mypy false positives on user code (according to mypy_primer). I don't think there's a good reason to exclude `__version__` from the generated stubs.
@JelleZijlstra @AlexWaygood Sorry for the delay, I recently changed jobs. I've rebased the changes and addressed the notes. Hopefully we can get this merged. Thanks for your help!
Thanks! Some minor changes in the docs; I'll apply that change and then land this.
`isbuiltin` is `isinstance(object, types.BuiltinFunctionType)`, and `type(ord)` is `BuiltinFunctionType`, so these conditions are the same.

Co-authored-by: Jelle Zijlstra <[email protected]>
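The equivalence claimed in this commit message can be checked directly with the standard library. The helper name `is_builtin_function` below is made up for the comparison and is not taken from the PR.

```python
import inspect
import types


def is_builtin_function(obj: object) -> bool:
    # Hypothetical helper spelling out the isinstance form of the check.
    return isinstance(obj, types.BuiltinFunctionType)


samples = [ord, len, print, [].append, (lambda: None), int, "text"]
for obj in samples:
    # Both predicates agree on every sample.
    assert inspect.isbuiltin(obj) == is_builtin_function(obj)

print(type(ord) is types.BuiltinFunctionType)
```

Since both forms agree, keeping only one of the two conditions loses nothing.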
I've noticed that mypyc-38-macos is flaky, so I rebased and pushed.
Thanks, was just going to retry it after the other jobs finish.
I poked around looking for an option to retry a single failed job. Is there a way to do that?
If you're a maintainer, after all CI has finished there is a button for it. But I don't think you can retry until all other jobs are done.
Yes, but only maintainers can do it, and the "retry" button only appears after all jobs have finished
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Thanks!
This addresses several regressions identified in #16486.

The primary regression from #15770 is that pybind11 properties with docstrings were erroneously assigned `_typeshed.Incomplete`. The reason for the regression is that as of the introduction of the `--include-docstrings` feature (#13284, not my PR, ftr), `./misc/test-stubgenc.sh` began always reporting success. That has been fixed.

It was also pointed out that `--include-docstrings` does not work for C-extensions. This was not actually a regression, as it turns out this feature was never implemented for C-extensions (though the tests suggested it had been), but luckily my efforts to unify the pure-python and C-extension code-paths made fixing this super easy (barely an inconvenience)! So that is working now.

I added back the extended list of `typing` objects that generate implicit imports for the inspection-based stub generator. I originally removed these because I encountered an issue generating stubs for `PySide2` (and another internal library) where there was an object with the same name as one of the `typing` objects and the auto-import created broken stubs. I felt somewhat justified in this decision as there was a straightforward solution -- e.g. use `list` or `typing.List` instead of `List`. That said, I recognize that the problem that I encountered is more niche than the general desire to add import statements for typing objects, so I've changed the behavior back for now, with the intention to eventually add a flag to control this behavior.
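To illustrate the name collision described above, here is a small sketch of one way a generator could avoid it: fall back to a qualified `typing.X` reference whenever the module already defines an object called `X`. The function `resolve_typing_name` is hypothetical and does not describe what stubgen does; per the comment, stubgen currently emits the implicit imports, with a flag only planned.

```python
from __future__ import annotations


def resolve_typing_name(name: str, defined_names: set[str]) -> tuple[str, str]:
    """Return (reference to use in the stub, import line it requires).

    Hypothetical helper: `defined_names` is assumed to hold the names the
    module itself defines at top level.
    """
    if name in defined_names:
        # A bare `from typing import List` would clash with the module's own
        # `List`, so use the qualified form instead.
        return f"typing.{name}", "import typing"
    return name, f"from typing import {name}"


print(resolve_typing_name("List", {"List", "Widget"}))
print(resolve_typing_name("List", {"Widget"}))
```

The trade-off is the one the comment names: qualified references are always safe but noisier, while bare imports read better and break only in the niche shadowing case.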
This MR is a major overhaul to `stubgen`. It has been tested extensively in the process of creating stubs for multiple large and varied libraries (detailed below).

User story

The impetus of this change is as follows: as a maintainer of third-party stubs I do not want to use `stubgen` as a starting point for hand-editing stub files, I want a framework to regenerate stubs against upstream changes to a library.

Summary of Changes

- `--include-private` and `--export-less` to c-extensions (inspection-based generation).
- `--inspect-mode` flag. Useful for packages that employ dynamic function or class factories. Also makes it possible to generate stubs for pyc-only modules (yes, this is a real use case)
- `--no-analysis` for `--parse-only` to clarify the purpose of this option.
- `__version__` attribute from modules: I've encountered a number of cases in real-world code that utilize this attribute.

Below I've compiled some basic information about each stub library that I've created using my changes, and a link to the specialized code for procedurally generating the stubs.
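As background for the `--inspect-mode` item in the summary, the sketch below shows why dynamic function factories defeat source parsing but not runtime inspection. The module source and both helper functions are invented for the illustration and do not come from stubgen.

```python
import ast
import types

# A made-up module that creates its public functions at import time.
SOURCE = '''
def _make_getter(field):
    def getter(record):
        return record[field]
    getter.__name__ = f"get_{field}"
    return getter

for _field in ("name", "size"):
    globals()[f"get_{_field}"] = _make_getter(_field)
'''


def names_from_ast(source: str) -> list[str]:
    # Parsing only sees functions that are written out with `def` at top level.
    tree = ast.parse(source)
    return sorted(n.name for n in tree.body if isinstance(n, ast.FunctionDef))


def names_from_inspection(source: str) -> list[str]:
    # Executing the module reveals what actually exists at runtime.
    module = types.ModuleType("factory_demo")
    exec(source, module.__dict__)
    return sorted(
        name
        for name, obj in vars(module).items()
        if isinstance(obj, types.FunctionType) and not name.startswith("_")
    )


print(names_from_ast(SOURCE))
print(names_from_inspection(SOURCE))
```

The parser finds only the private factory, while inspection finds the two generated getters, which is the gap an inspection-based mode is meant to close for pure-Python packages.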
I know that this is a pretty big PR, and I know it's a lot to go through, but I've spent a huge amount of time on it and I believe this makes mypy's stubgen tool the absolute best available. If it helps, I also have 13 merged mypy PRs under my belt and I'll be around to fix any issues if they come up.