Type annotation erasure at compile time #400
Comments
If we get an agreement, I'm going to PEPify it and implement it. |
I think I am +1 on this, but I would prefer to not make this a default behaviour in 3.7 (at least for one release cycle or as it was done for PEP 479) but rather use a flag or Also I noticed you changed title to "... at parse time", but note that some syntax errors are detected in |
I agree it should be in compile.c. |
I agree that something like this would be better than the current situation. There are bunch of cases where this wouldn't help as the type is used outside an annotation. Examples (there may be others):
For these we could continue to use the existing workarounds, such as |
Yes, for some cases we would need to keep using the current syntax but for most things it could go away. As for aliases, you'd put them in
Solves our problem and looks more elegant. |
As discussed in python/mypy#2869, generic base classes would also not support this syntax, and there we also can't use string literal escaping. We could continue to write |
But what about using them in base classes? I think some people still might want to write:
class UserList(List[X]):
    ...
class MyTuple(Tuple[int, str]):
    ...
etc.
@ambv
T: TypeVar = (str, int)  # for variable with type restrictions
T: TypeVar = Bound[SomeType]
T: NewType = OldType
etc. |
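A minimal sketch of why base classes fall outside any scheme that only touches annotations: the base-class list is an ordinary expression that runs when the class statement executes. `NotYetDefined` is a made-up name used only to trigger the failure, and `Broken` is hypothetical.

```python
from typing import List, Tuple, TypeVar

X = TypeVar("X")


# Generic bases are subscripted eagerly, when the class statement runs.
class UserList(List[X]):
    ...


class MyTuple(Tuple[int, str]):
    ...


# A name that does not exist yet cannot appear in a base-class expression.
try:
    class Broken(List[NotYetDefined]):  # hypothetical, undefined name
        ...
except NameError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```

So whatever happens to annotations, a forward reference in a base class would still need the referenced name to exist first.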
@JukkaL |
I didn't propose to downright deprecate List, and I also just wanted to signal that there are ways in which we can solve TypeVars. Defining new collection types is the outlier situation that would need to exercise the current style. Otherwise we can put most things either in But before we dive deep into followups from this idea, I'd like to make sure I get the "go ahead" from @gvanrossum first :) |
My understanding was that access to the types at runtime (like reflection) was a large motivator for resolving the annotations rather than leaving them as strings. But the need to wrap some types in strings comes almost entirely from these names being resolved at definition time rather than at runtime. Python could eliminate the need to string wrap forward references while increasing the types available at runtime if it parsed the annotations, but kept them as expressions (probably behind some kind of getter) that could be evaluated at runtime. |
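A rough sketch of that idea, done by hand with lambdas since no such built-in mechanism is described in this thread. The `resolve` helper, `link`, and `Node` are hypothetical names for illustration only.

```python
def resolve(func):
    """Hypothetical getter: call each deferred annotation to get the real object."""
    return {name: thunk() for name, thunk in func.__annotations__.items()}


# Node is not defined yet; wrapping it in a lambda postpones the name lookup
# until the lambda is called.
def link(a: (lambda: Node), b: (lambda: Node)) -> (lambda: bool):
    return a.next is b


class Node:
    next = None


# By now Node exists, so the deferred expressions evaluate without error.
hints = resolve(link)
print(hints)
```

The annotations stay real expressions rather than strings, but nothing is looked up until the getter runs.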
Just another minor (and probably obvious) idea: we could add a decorator |
I need to think more about this but right now I really am not keen on this. It really is a big hack and not very principled. EDIT: I am pondering it but still have a visceral reaction against the idea. Let's not panic just because some users didn't read the docs. And mypy will now complain if you forget to import List. |
@gvanrossum, it sounds like most of all you dislike the idea to support fake generics on builtins. Since mypy already removed support for them, that cat is out of the bag. But another important point of this Do you disagree with this as well? |
No, I don't think that's it. What gets all my hackles up is that there's no precedent for a syntactic construct (apart from the identifiers in It would be a complicated change to a Python parser to recover the source code for a specific expression (the heroics that I don't recall if you are proposing that this should eventually (e.g. in Python 3.8) become the default and only behavior. If you're not, a PS. I think you got your "cat out of the bag" idiom backwards -- https://en.wikipedia.org/wiki/Letting_the_cat_out_of_the_bag |
It would be enough to essentially "unparse" the AST. It would be semantically equivalent (compiles the same) but doesn't have to be syntactically equivalent (whitespace/parentheses/commas not necessarily the same). I can't come up with any use for verbatim preservation of the original string. Yes, the intent would be to make this the default in 3.8. The other option (a new kind of -O) is not applicable in this case because it's global and would create code which might not work at runtime depending on the args given to the interpreter. PS. Oh, TIL about the idiom. I have been using it wrong all along. You're the first to point it out, or to notice! |
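To illustrate "semantically equivalent but not syntactically equivalent", here is a small sketch using `ast.unparse` from the standard library (Python 3.9+, so it postdates this discussion). The annotation text is made up, and this is not necessarily how the compiler would do it.

```python
import ast

# Deliberately odd spacing in the annotation source.
source = "Dict[ str ,List[int] ]"

tree = ast.parse(source, mode="eval")
unparsed = ast.unparse(tree)  # regenerate source text from the AST

# The regenerated text is not the original text...
text_changed = unparsed != source

# ...but parsing it again yields the same AST (positions are not compared).
same_ast = ast.dump(ast.parse(unparsed, mode="eval")) == ast.dump(tree)

print(repr(unparsed), text_changed, same_ast)
```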
Oh, and I noticed you saying: "at runtime". I'd like this to happen at compile time so the .pyc would preserve the transformation. The unparsing cost would be paid once, and likely offset by cheaper instantiation of strings in subsequent runs. |
What are the advantages of the unparsed AST rather than the AST itself? You still get forward references, and then the type checker doesn't have to reparse. |
If you're saying that the generated byte code should just contain the string object, yes, that's what I imagined you wanted. However nothing brought up so far has managed to soften me up about this. To the contrary I am now envisioning people abusing this proposal for all sorts of nefarious purposes (like defaults that are evaluated at call time rather than at run time, using some clever decorator). |
You likely have the best calibrated intuition in the matter. It would be crazy not to trust that. I'd like to understand the risk involved here, e.g. how this addition would make the language worse. Let me try again. I get your argument about shoving strings into The argument that "nefarious purposes" invalidate the idea is new to me. What happened to "consenting adults"? Most Python features can be abused and yet we don't limit people from overloading operators, import hooks, source file codecs, accessing "private" members (even across stack frames), monkey patching, etc. etc. The community has consistently kept the insanity at bay by avoiding abuse of the features available. Am I missing something? Is there anything else I'm not seeing? |
I'll have to do more reflection to explain my reaction. Here is a partial response. One thing that comes to mind is that it's a very random change to the language. It might be useful to have a more compact way to indicate deferred execution of expressions (using less syntax than Also, I notice that the first two bullets of your original motivation are no longer valid, at least not with mypy master -- they're solved by python/mypy#2863 and python/mypy#2869 respectively. So we're left with:
My response to these:
But I think my reasoning is more related to the nature of the proposed change than to its use cases. There are no other places in Python where an expression is stringified like this -- you'd have to add a significant amount of new logic to the AST to implement it. (Come to think of it, this part of my discomfort would go away if you were to change the proposal to just turn all annotations into code objects or lambdas, though I'd still be unhappy with the implicitness.) |
I know I am jumping on this late, and am probably the least qualified here to boot, but I wanted to add that in many discussions at PyCon, I have heard that people do not like the current state of forward declarations. I think that not requiring the string syntax is needed. I like the idea of compile time type erasure, but on the other hand, I understand Guido's discomfort in changing the language like this.
I don't think that annotations are "important" enough to change the language. I think the argument could be made that they are different enough to merit a change, but I suppose that depends on what type annotations truly are. In my mind, an annotation is a comment, and so making a comment a string is a completely valid decision. However, currently they are not just comments. On the other hand it would really bother me to make them not objects, since essentially everything evaluates to an object in Python. I suppose with a method of deferred resolution this might work; however, I am not sure that would simplify anything, as it too would complicate Python internally. I suppose at this point I am asking myself: is generalized deferred resolution needed elsewhere? Apologies if this is off point or confusing. |
Totally agree. Even more so, annotations should be, to the Python application/library developer, first class citizens that can express not only expected types but relationships between types (as seems to be the use case for TypeVar) in such a way that they remain easy to understand by people and can be used by IDEs to provide better code completion, refactoring, and symbol usage validation. Type hints should appear in source code as fully integrated into the language, not as strings (which makes them look like an afterthought, a patch). The application/library programmer should not have to resort to TypeVar (its use should be for those who implement the type hinting system). Type hinting will be big in Python; it deserves new syntax that remains in the spirit of Python (clean and simple and expressive). However, many of these thoughts may not be relevant to this thread, so I have created a gist as suggested by EthanHS. I will edit it so it stands on its own and post a link on gitter/mypy. |
I think parts of this are related to this discussion, but a fair amount is not. The suggestion to change the syntax will likely not happen, as the function annotation syntax was part of Python 3000 and has been tested well. I also think that discussing the syntax of PEP 484 naming is out of scope for this issue, as Łukasz is not intending that change. So TypeVar and the current naming scheme are here to stay. I think the main thing you raise in your comment is that you think annotations should be a special type. You don't really talk much about that, however. This would likely mean more syntax to specify that a forward declaration is made, which is an entirely valid decision. I wanted to make it clear when I talk about object vs string, I don't mean require people to say |
I'm going to ignore the long slightly off-topic proposal for now.
Responding to Ethan and Łukasz, maybe we could add a "future import" like this to Python 3.7:
from __future__ import stringify_annotations  # Name can be bikeshedded
after which all annotations (argument, return and variable) are turned into strings that are stored in
Annotations must still be syntactically valid, and at the end of the containing scope, they should be semantically valid, i.e. evaluating the string as an expression in that scope should not raise an exception. This latter requirement is to prevent abuse -- unquoted annotations may be used in place of forward references, but the resulting string must still abide by the rules for forward references.
Annotations cannot be ignored entirely -- they must still end up (as strings) in
I am happy for the stringification to happen at compile time (when the bytecode is generated), but I don't insist on it. Stringification may alter whitespace as long as the AST resulting from parsing the string is the same. It may not add/remove redundant parentheses.
The "future" referenced by the magical import won't happen until Python 4.0 (and even then maybe we'll end up doing something else).
This proposal has to be a separate PEP (it can't be an update to PEP 484). The PEP doesn't have to be much longer than what I wrote in this comment.
Generalized deferred resolution sounds like a topic for a totally different PEP and out of scope for this tracker. |
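For readers arriving later, a short sketch of what this looks like in practice. It assumes the spelling that PEP 563 eventually settled on, `from __future__ import annotations`, rather than the placeholder name above; `link` and `Node` are made-up names.

```python
from __future__ import annotations  # PEP 563 spelling, not the placeholder name

import typing


def link(a: Node, b: Node) -> bool:  # Node is only defined further down
    return a.next is b


class Node:
    next: typing.Optional[Node] = None


# The annotations are stored as strings instead of evaluated objects.
raw = link.__annotations__

# Evaluating the strings on demand resolves the forward references.
resolved = typing.get_type_hints(link)

print(raw)
print(resolved)
```

No quoting is needed in the source, and the strings still have to evaluate cleanly once `typing.get_type_hints()` is called on them.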
IIUC this is exactly the same requirement for current forward references: they should be valid when evaluated by
I think that "stringification" is a much better solution than any kind of deferred resolution. First, the former is very easy to implement in CPython, second, |
This is now PEP 563: python/peps@454c889 |
PEP 563 is now accepted and implemented, so I think this can be closed now. We can open separate issues for any new usability improvements. |
Problem definition
Current usage of type hints shows the following patterns:
On top of this, I noticed the following tricky typing situation (generalization of actual code at FB):
You cannot simply write SomeType or CycleType because it would fail at runtime. So you wrap it in strings. But when you wrap, linters start reporting the imports as unused, or shadowing previous unused imports. So you need to additionally add silencing comments. The resulting code is pretty hideous.
Solution
I'm proposing revisiting the PEP 484 suggestion to make all annotations evaluate to strings at runtime with Python 3.7.
Rationale:
if TYPE_CHECKING: block;
Details:
typing.get_type_hints();
static_annotations (2 characters longer than absolute_imports, 4 characters than with_statement and print_function)
Summing up, I think this greatly improves the user experience.