Implement support for returning TypedDict for dataclasses.asdict #8583
I had previously made a "draft" PR at #8339. This supersedes that. This should be ready for review now.
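For context, here is a minimal sketch of the behaviour this PR targets. At runtime, dataclasses.asdict returns an ordinary dict; the PR would have mypy infer an anonymous TypedDict for the call rather than a plain Dict[str, Any]. The Person class and the hand-written PersonDict are hypothetical names used only for illustration; the plugin does not generate a named class like this.

```python
from dataclasses import asdict, dataclass
from typing import TypedDict


@dataclass
class Person:  # hypothetical example class
    name: str
    age: int


class PersonDict(TypedDict):  # hand-written equivalent, for illustration only
    name: str
    age: int


person = Person(name="Ada", age=36)
as_dict = asdict(person)  # at runtime this is an ordinary dict

# With a TypedDict inferred for the call, a checker could flag a misspelled
# key such as as_dict["nmae"]; with Dict[str, Any] it cannot.
typed: PersonDict = {"name": as_dict["name"], "age": as_dict["age"]}
```

The runtime value is unchanged by the PR; only the statically inferred type of the call differs.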
Thanks! The general approach looks reasonable; I have some comments.
reveal_type(result['staff']) # N: Revealed type is 'Any'

[typing fixtures/typing-full.pyi]
[builtins fixtures/tuple.pyi]
I would add a couple of fine-grained (daemon) test cases. The tricky part is that when the type of a dataclass attribute changes, the call to asdict() should be reprocessed (put it in a different module). I am not 100% sure we already have enough extra dependencies added by the plugin to guarantee this, so such tests will be really useful.
I added one. It revealed that there was no dependency on the fallback type for TypedDict. I tried to add that. Since which TypedDict fallback to use depends on the Python version (I think?), I had to add a dependency on all modules (e.g. typing/typing_extensions/mypy_extensions) to the default plugin. That doesn't seem very nice, but I'm not sure of the right way to do it. Do you have any ideas?
Do you have any ideas for more fine-grained tests to add?
Thanks for the review @ilevkivskyi. I've addressed your comments. Also, I'm not sure if it makes sense to make the TypedDicts total or not (I've left them non-total because it's more lenient).
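To make the total versus non-total question concrete, here is a small sketch using hand-written TypedDicts (TotalPerson and LenientPerson are hypothetical names, not types produced by the PR). A type checker accepts a dict with a missing key only for the non-total variant, which is the leniency mentioned above.

```python
from typing import TypedDict


class TotalPerson(TypedDict):  # total=True is the default
    name: str
    age: int


class LenientPerson(TypedDict, total=False):
    name: str
    age: int


# A type checker accepts a missing key only for the non-total variant.
partial: LenientPerson = {"name": "Ada"}

# Runtime introspection of the same distinction (Python 3.9+).
required = TotalPerson.__required_keys__
optional = LenientPerson.__optional_keys__
```

Since asdict emits every field at runtime, either choice describes the returned value; the difference lies in what the checker accepts.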
Oops, looks like the tests got messed up. I'll have a look.
I decided to make I also changed the Pending your re-review of course, as far as I can see, the only thing missing is some more fine-grained tests (though I'm unsure what makes sense to test there). Please let me know if there are any other issues :)
Thanks for the updates! This looks almost ready; here are a few more comments.
mypy/plugins/default.py
def get_additional_deps(self, file: MypyFile) -> List[Tuple[int, str, int]]:
    if self.python_version >= (3, 8):
        # Add module needed for anonymous TypedDict (used to support dataclasses.asdict)
        return [(10, "typing", -1)]
Why is this needed? I think the typing module is always loaded. It may be enough to add a dependency only on typing_extensions, which is available on all versions.
It seems adding that dependency on typing_extensions means a lot of test fixtures need updating (it's the tuple builtin missing, which I believe is transitively imported from typing_extensions).
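As a rough illustration of the shape of the value under discussion, the sketch below mirrors the return format of the get_additional_deps plugin hook, a list of (priority, module name, line number) tuples, as a standalone function so it runs without mypy installed. The function name additional_deps is hypothetical, and the typing_extensions fallback branch is an assumption based on the modules mentioned in this thread, not the code in the PR.

```python
from typing import List, Tuple

PRIORITY = 10  # copied from the literal used in the diff under review


def additional_deps(python_version: Tuple[int, int]) -> List[Tuple[int, str, int]]:
    """Return (priority, module name, line number) tuples.

    A line number of -1 means there is no real source line for the import.
    """
    if python_version >= (3, 8):
        # TypedDict lives in typing from Python 3.8 onwards.
        return [(PRIORITY, "typing", -1)]
    # Assumption: older versions would need the backport module instead.
    return [(PRIORITY, "typing_extensions", -1)]


deps_38 = additional_deps((3, 8))
deps_36 = additional_deps((3, 6))
```

The reviewer's suggestion amounts to dropping the version check and always returning the typing_extensions entry, at the cost of the fixture updates described above.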
Thanks @ilevkivskyi for the re-review. I tried to address your comments (thanks, I now think I understand how the fine-grained test works). I have no idea if the code copied from SemanticAnalyzer regarding I could not get the new fine-grained test of
If something doesn't get reported properly in fine-grained incremental mode, this is often related to missing fine-grained dependencies. We have test cases in
A couple more comments here. For the fine-grained tests, see what Jukka said.
mypy/checker.py
@@ -4538,6 +4538,21 @@ def named_type(self, name: str) -> Instance:
        any_type = AnyType(TypeOfAny.from_omitted_generics)
        return Instance(node, [any_type] * len(node.defn.type_vars))

    def named_type_or_none(self, qualified_name: str,
                           args: Optional[List[Type]] = None) -> Optional[Instance]:
        sym = self.lookup_qualified(qualified_name)
This one is no better than it was. You need to copy/duplicate the important logic. In particular, my whole idea was to not use local lookup functions, but a global lookup like lookup_fully_qualified_or_none() in the semantic analyzer.
I guess I misunderstood your comment. Not really knowing how the checker or semantic analyzer work internally, I don't really think I am qualified to make this change. However, I will try.
I have copied the lookup_fully_qualified_or_none function to checker (and only modified it to raise KeyError if the result is None) and used it in named_type_or_none.
I am not sure if this makes sense and whether the docstring/TODO comments should be altered/removed.
Oops, late-night coding mistake :) I removed the part about raising KeyError; it now returns None if it can't find the name, of course.
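The idea of a global lookup that returns None instead of raising can be sketched with plain dictionaries standing in for mypy's module and symbol tables. This is a toy model only: toy_lookup_fully_qualified_or_none and the nested-dict layout are hypothetical, and mypy's real function works on its own node types.

```python
from typing import Any, Dict, Optional


def toy_lookup_fully_qualified_or_none(
    modules: Dict[str, Dict[str, Any]], fullname: str
) -> Optional[Any]:
    """Resolve a dotted name against all known modules, or return None."""
    parts = fullname.split(".")
    # Try the longest module prefix first, e.g. "a.b" before "a".
    for split in range(len(parts) - 1, 0, -1):
        module_name = ".".join(parts[:split])
        if module_name in modules:
            current: Any = modules[module_name]
            for part in parts[split:]:
                if not isinstance(current, dict) or part not in current:
                    return None  # name is missing: no KeyError is raised
                current = current[part]
            return current
    return None


modules = {
    "typing": {"TypedDict": "TypedDict-node"},
    "a.b": {"C": {"attr": "attr-node"}},
}
found = toy_lookup_fully_qualified_or_none(modules, "typing.TypedDict")
nested = toy_lookup_fully_qualified_or_none(modules, "a.b.C.attr")
missing = toy_lookup_fully_qualified_or_none(modules, "typing.Missing")
```

The lookup does not depend on any local scope, which is the property the reviewer asked for.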
Thanks for all your time reviewing -- I'm going to try to pick this up again. @JukkaL wrote:
Thanks for the pointer. I am not sure which module it makes sense to output dependencies for. I tried with this example:
[case testDataclassAsdictDeps]
# flags: --python-version 3.8
from dataclasses import dataclass
from b import AttributeInOtherModule
@dataclass
class MyDataclass:
    attr: AttributeInOtherModule
my_var: MyDataclass
[file b.py]
AttributeInOtherModule = str
[typing fixtures/typing-typeddict.pyi]
[builtins fixtures/fine_grained.pyi]
[out]
Output is:
Shouldn't b.AttributeInOtherModule affect m.MyDataclass? Or am I misinterpreting how this should work?
I made another experiment with outputting deps.
[case testDataclassesAsdictDeps]
# flags: --python-version 3.8
from dataclasses import asdict
from a import my_var
x = my_var
x['attr'] + "foo"
[file a.py]
from dataclasses import dataclass
from b import AttributeInOtherModule
@dataclass
class MyDataclass:
    attr: AttributeInOtherModule
my_var: MyDataclass
[file b.py]
AttributeInOtherModule = str
[typing fixtures/typing-typeddict.pyi]
[builtins fixtures/fine_grained.pyi]
[out]
Output:
Here, I think it makes sense that But when changing the code to x = asdict(my_var) the output is:
After trying to add an My guess is that the dataclass attributes are loaded from the cache, and the failing test exercises incremental checking when there is no cache, which of course fails because the cache is empty. However, I'm not sure what to do about it. Populating the dataclass attribute metadata is done during semantic analysis by the dataclass plugin as part of a class decorator hook. I would like to somehow add a dependency to get that phase to run again. I tried adding a dependency on both the dataclass itself and a wildcard trigger on the module defining the dataclass, but that does not seem to help. I feel like I am getting very close to getting this working.
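The role of fine-grained dependencies in this discussion can be illustrated with a toy model: a map from a trigger to the things that must be reprocessed when it fires. The function stale_targets, the dependency edges, and the module name "m" are all hypothetical; they only mimic the angle-bracket trigger notation and do not reproduce mypy's actual dependency output for the test cases above.

```python
from collections import deque
from typing import Dict, Set


def stale_targets(deps: Dict[str, Set[str]], changed: str) -> Set[str]:
    """Collect everything reachable from a changed trigger."""
    stale: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for target in deps.get(current, set()):
            if target not in stale:
                stale.add(target)
                queue.append(target)
    return stale


# Hypothetical edges: the alias feeds the dataclass, which feeds my_var,
# which is used by the module "m" that calls asdict().
complete = {
    "<b.AttributeInOtherModule>": {"<a.MyDataclass>"},
    "<a.MyDataclass>": {"<a.my_var>"},
    "<a.my_var>": {"m"},
}
# Drop one edge to model a dependency the plugin forgot to register.
incomplete = {k: v for k, v in complete.items() if k != "<a.MyDataclass>"}

with_edge = stale_targets(complete, "<b.AttributeInOtherModule>")
without_edge = stale_targets(incomplete, "<b.AttributeInOtherModule>")
```

With the edge missing, the module that calls asdict() is never marked stale, which is the kind of symptom a missing fine-grained dependency produces.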
…s in proper_plugin
…ove recursion error).
…in get_anonymous_typeddict_type.
# Conflicts: # mypy/plugin.py # mypy/plugins/common.py # mypy/plugins/dataclasses.py # mypy/plugins/default.py # mypy/semanal_typeddict.py # test-data/unit/check-dataclasses.test
I am not super happy with the copying of all that lookup-related code. But this is kind of an existing problem, so I will not block this PR on this issue. If there are no objections from other maintainers, they can merge it.
Thanks for working on this!
Oh, btw, it looks like there are now some merge conflicts that need to be fixed (again).
I fixed the test errors caused by
Diff from mypy_primer, showing the effect of this PR on open source code:
poetry (https://github.com/python-poetry/poetry)
+ src/poetry/config/source.py:15: error: Incompatible return value type (got "TypedDict({'name': str, 'url': str, 'default': bool, 'secondary': bool})", expected "Dict[str, Union[str, bool]]")
rotki (https://github.com/rotki/rotki)
+ rotkehlchen/chain/ethereum/structures.py:71: error: Incompatible return value type (got "TypedDict({'event_type': Union[Literal['deposit'], Literal['withdrawal'], Literal['interest'], Literal['borrow'], Literal['repay'], Literal['liquidation']], 'block_number': int, 'timestamp': Timestamp, 'tx_hash': str, 'log_index': int})", expected "Dict[str, Any]")
zulip (https://github.com/zulip/zulip)
+ zerver/tornado/event_queue.py:976: error: TypedDict key "user_id" cannot be deleted [misc]
+ zerver/tornado/event_queue.py:977: error: "mentioned_user_group_id" is not a valid TypedDict key; expected one of ("user_id", "online_push_enabled", "pm_email_notify", "pm_push_notify", "mention_email_notify", ...) [misc]
core (https://github.com/home-assistant/core)
+ homeassistant/config_entries.py:1406: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({'host': str, 'port': Optional[int], 'hostname': str, 'type': str, 'name': str, 'properties': Dict[str, Any]})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/config_entries.py:1412: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({'topic': str, 'payload': Union[str, bytes], 'qos': int, 'retain': bool, 'subscribed_topic': str, 'timestamp': datetime})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/config_entries.py:1418: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/config_entries.py:1424: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({'host': str, 'port': Optional[int], 'hostname': str, 'type': str, 'name': str, 'properties': Dict[str, Any]})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/config_entries.py:1430: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({'ip': str, 'hostname': str, 'macaddress': str})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/config_entries.py:1436: error: Argument 1 to "async_step_discovery" of "ConfigFlow" has incompatible type "TypedDict({'device': str, 'vid': str, 'pid': str, 'serial_number': Optional[str], 'manufacturer': Optional[str], 'description': Optional[str]})"; expected "Dict[str, Any]" [arg-type]
+ homeassistant/components/recorder/statistics.py:177: error: Incompatible return value type (got "TypedDict({'type': str, 'data': Optional[Dict[str, Optional[str]]]})", expected "Dict[Any, Any]") [return-value]
+ homeassistant/components/energy/validate.py:70: error: Incompatible return value type (got "TypedDict({'energy_sources': List[List[TypedDict({'type': str, 'identifier': str, 'value': Optional[Any]})]], 'device_consumption': List[List[TypedDict({'type': str, 'identifier': str, 'value': Optional[Any]})]]})", expected "Dict[Any, Any]") [return-value]
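Most of the new errors above come from returning or passing the result of asdict where a Dict[str, Any] is expected. PEP 589 makes a TypedDict consistent with Mapping[str, object] but not with Dict, so one possible way for affected code to adapt is sketched below. The Source class and both helper functions are hypothetical, and whether the second variant type-checks under this PR is an assumption rather than something verified here.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Mapping


@dataclass
class Source:  # hypothetical example class
    name: str
    url: str


def to_mapping(source: Source) -> Mapping[str, Any]:
    # A read-only Mapping annotation is compatible with a TypedDict value.
    return asdict(source)


def to_plain_dict(source: Source) -> Dict[str, Any]:
    # Copying into a new dict gives a value that is no longer a TypedDict.
    return dict(asdict(source))


source = Source(name="pypi", url="https://example.org/simple")
as_mapping = to_mapping(source)
as_plain = to_plain_dict(source)
```

Both helpers return the same runtime value; they differ only in the declared return type.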
The incompatible cases with
@ilevkivskyi Thank you for the review 👍 The stack trace in the mypy primer result mentions the key Regarding false positives: Despite providing more strictness, this PR introduces some false-positives to existing code, which either uses an overly-precise type
Could you please add a minimal reproduction if possible?
Yeah, it's a typical solution.
I think it's a good feature for transforming one dataclass to another with simple line P.S. In my case, I would like to tell DataclassTwo constructor to ignore unknown fields from DataclassOne. I'll try to make another feature request for that.
There are a couple of limitations that would be nice to remove, but I am hoping this is enough for an initial version.
Subclasses of list/dict/tuple are transformed into the base class rather than a new version of the subclass.
This is because the transformation happens in the get_function_hook plugin hook, which only has access to the CheckerPluginInterface and so can't (without hacking) add things to the symbol table. Besides that, there could be violations of variance/constraints if the new type is just constructed without being checked.
Also, NamedTuples found within dataclasses are transformed into Any.
This is because the new namedtuple needs a partial_fallback generated/added to the symbol table (i.e. the same problem as above).
Supporting generating new NamedTuples properly would probably require refactoring NamedTupleAnalyzer.build_namedtuple_typeinfo to make it reusable.
As far as I know, I was looking for some way of somehow adding new types to the symbol table from CheckerPluginInterface and then hoping that I can trigger a new type-checking pass so that variance/constraints are checked on the newly added types. But I'm sure there's a better way to do it.
Relates to #5152
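To show what these limitations mean in practice, the sketch below runs asdict on a dataclass holding a list subclass and a NamedTuple. At runtime the standard library rebuilds both containers with their original types; per the description above, the statically inferred type would use the base list type for the first field and Any for the second. Tags, Point, and Marker are hypothetical names chosen for this example.

```python
from dataclasses import asdict, dataclass
from typing import NamedTuple


class Tags(list):  # hypothetical list subclass
    pass


class Point(NamedTuple):  # hypothetical NamedTuple
    x: int
    y: int


@dataclass
class Marker:
    tags: Tags
    position: Point


result = asdict(Marker(tags=Tags(["a", "b"]), position=Point(1, 2)))

# Runtime types are preserved, even though the inferred static types are
# less precise under the limitations described above.
tags_type = type(result["tags"])
position_type = type(result["position"])
```

The gap is therefore between the runtime value and its inferred type; the value itself keeps its subclass.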