Improve inference for context sensitive functions within reverse mapped types #54029

Draft
wants to merge 9 commits into
base: main

Conversation

Andarist
Contributor

it's an extension of the algorithm introduced in #48538
closes #53018
fixes #48798

@typescript-bot typescript-bot added the For Backlog Bug PRs that fix a backlog bug label Apr 26, 2023
Comment on lines 24114 to 24115
const sourceValueDeclaration = sourceType.symbol?.valueDeclaration;
if (sourceValueDeclaration) {
Contributor Author

I expect that this part is not fully correct. What I'd like to do here is to only get sourceValueDeclaration if it's an anonymous/fresh/similar declaration within the argument position. I'm not sure how to check for this.

Member

Use findAncestor to check for argument position membership, use ObjectFlags.FreshLiteral to check for freshness on the type.
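A rough sketch of what an ancestor walk for "argument position membership" could look like. This is a toy model, not the compiler's code: `ToyNode`, `findAncestorToy`, and `isWithinCallArgument` are hypothetical names, and the node shape is heavily simplified compared to the real AST. The callback contract (return `true` to stop with a match, `"quit"` to abort) is assumed to mirror the compiler's findAncestor helper.

```typescript
// Toy stand-in for compiler nodes; not the real ts.Node shape.
interface ToyNode {
  kind: "call" | "object" | "property" | "function" | "other";
  parent?: ToyNode;
  args?: ToyNode[]; // only meaningful on "call" nodes
}

// Walks up the parent chain, starting at `node` itself, and returns the
// first node for which the callback returns true. "quit" aborts the walk.
function findAncestorToy(
  node: ToyNode | undefined,
  callback: (n: ToyNode) => boolean | "quit",
): ToyNode | undefined {
  while (node) {
    const result = callback(node);
    if (result === "quit") return undefined;
    if (result) return node;
    node = node.parent;
  }
  return undefined;
}

// True when `node` is, or sits inside, one of the arguments of a call.
function isWithinCallArgument(node: ToyNode): boolean {
  const match = findAncestorToy(node, n => {
    const p = n.parent;
    return !!p && p.kind === "call" && !!p.args && p.args.indexOf(n) >= 0;
  });
  return match !== undefined;
}

// A call with one object-literal argument that has a single property.
const call: ToyNode = { kind: "call", args: [] };
const objectArg: ToyNode = { kind: "object", parent: call };
call.args!.push(objectArg);
const prop: ToyNode = { kind: "property", parent: objectArg };
const detached: ToyNode = { kind: "other" };
```

Here `isWithinCallArgument(prop)` holds because the walk reaches `objectArg`, which is listed in its parent's `args`, while `detached` has no parent chain at all.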

Contributor Author

I added .valueDeclaration to reverse mapped properties. This grants me clean access to the node I'm looking for and fixes an issue with array literals used as values of object reverse mapped types (test case). I think that this gives me now exactly what I'm looking for when combined with the freshness check and I don't need to do any findAncestor traversal.

@@ -24121,6 +24111,19 @@ export function createTypeChecker(host: TypeCheckerHost): TypeChecker {
const templateType = getTemplateTypeFromMappedType(target);
const inference = createInferenceInfo(typeParameter);
inferTypes([inference], sourceType, templateType);
const sourceValueDeclaration = sourceType.symbol?.valueDeclaration;
if (sourceValueDeclaration) {
const intraExpressionInferenceSites = getInferenceContext(sourceValueDeclaration)?.intraExpressionInferenceSites?.filter(site => isNodeDescendantOf(site.node, sourceValueDeclaration));
Contributor Author

The .filter here is unfortunate but my current goal is to make it correct and only after that I can try to make it fast 😉

The goal here is to only try to infer from the intra expression inference sites relevant for this specific reverse mapped symbol. Perhaps the simplest and good enough solution would be to get an appropriate trailing slice of all intraExpressionInferenceSites.

They are "aggregated" throughout the call to checkExpressionWithContextualType so when encountering a first context sensitive function it's a single-element array and when encountering a second one it's a two-element array and so on. inferFromIntraExpressionSites happens while pushing items into intraExpressionInferenceSites so that's why the producer has to always come first before the consumer (TS playground):

declare function f<T>(arg: {
  produce: (n: string) => T;
  consume: (x: T) => void;
}): void;

f({
  produce: (n) => n,
  consume: (x) => x.toLowerCase(), // ok, `x` is inferred
});

f({
  consume: (x) => x.toLowerCase(), // doesn't work, 'x' is of type 'unknown'.(18046)
  produce: (n) => n,
});

And thus, by extension... if we looked for "relevant" nodes from the end, we could slice off the trailing elements until we meet one that is not relevant. Because intra-expression inference sites depend on source order, this should work just fine: such a trailing slice should be guaranteed to contain all the current "relevant" nodes and nothing else.
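The trailing-slice idea could be sketched roughly like this. It is only an illustration under simplifying assumptions: `ToySite` and `takeTrailingRelevant` are hypothetical names, and a string path stands in for the real `site.node` and the `isNodeDescendantOf` check.

```typescript
// Hypothetical, simplified inference site; the real one holds a node and a type.
interface ToySite {
  path: string; // e.g. "a.produce"; stands in for site.node
}

// Scans from the end of the list while sites remain relevant and stops at
// the first irrelevant one, returning only that trailing run.
function takeTrailingRelevant<T>(
  sites: readonly T[],
  isRelevant: (site: T) => boolean,
): T[] {
  let start = sites.length;
  while (start > 0 && isRelevant(sites[start - 1])) {
    start--;
  }
  return sites.slice(start);
}

const sites: ToySite[] = [
  { path: "a.produce" },
  { path: "a.consume" },
  { path: "b.produce" },
  { path: "b.consume" },
];

// Sites gathered for property `b` are the most recent ones, so they form the tail.
const forB = takeTrailingRelevant(sites, s => s.path.indexOf("b.") === 0);
```

Unlike a full `.filter`, the scan stops early, but it only finds relevant sites when they sit at the very end of the list, which is the source-order assumption described above.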

Member

But doesn't this all just imply we should store intra-expression inference sites in a per-property map or something?

Contributor Author

do u mean properties of the reverse mapped type? IIRC we might not know if a property is going to end up as property of such - or I don't know how to check that. So I don't know how to aggregate such a structure since I don't know when it should be created in the first place.

Contributor Author

Since I opened this PR I've gotten a much better understanding of how the whole intra-expression inference works. What is in this PR right now isn't ideal at a high level since it comes with a perf penalty. I'm trying to figure out how to best store those sites to accommodate the needs of this change and I'm hitting a wall.

Some things to understand about this inference type:

  1. those sites are more or less continuously gathered in objects/arrays, and they are cleared when something "pulls" the information from them
  2. this is OK because the inference actually infers into all type parameters and not only into the one that got "pulled"
  3. given the recursive nature of the algorithm sites are gathered in the reverse order - ancestor nodes might only be gathered after descendants because we need to exit the inner work to get the type of an ancestor
  4. there is quite some repetition in the inference here because if we end up inferring from the ancestor that will also include inferring from its descendants and those descendants might also be in the gathered sites. I was experimenting with some changes to the algorithm to "replace" inner sites with outer ones to skip over some redundant work here. That's only a partial solution though because we can always end up with a sequence like this: push a.b.c > replace with a.b > pull > push a. In a situation like this, we might still infer from the whole a even though we already inferred from a.b and there is no mechanism to "skip over" this slice of the object. So I didn't end up pursuing this optimization as it seems that it's only a partial one.
  5. In the most common situations though we usually just enter a different part of the node after pulling from those sites so the redundancy problem is likely not that big. Often we just push something, pull from it, clear the sites, push again info unrelated to what we already processed and the cycle continues. So what this PR does right now is that it doesn't clear those sites. This isn't incorrect - it's just redundant since all sites are left behind and each pull infers from all the ones that were gathered up so far.
  6. Reverse mapped type inference happens at different stages of the overall algorithm, it's a little bit "tacked on" the standard inference thing - it doesn't exactly play by the same rules etc. A special type/symbol is set up as an inference candidate for it and when its members are accessed then we infer from those. This is nice because the whole thing doesn't have to be processed immediately, it happens "on-demand"
  7. So what happens today on main is that those intra-expression inference sites are cleared before the reverse mapped type has a chance to infer from it.
  8. What we can deduce from all of that is that we still need to clear up those sites for the "regular inference" but somehow "keep them around" for the reverse mapped type inference.
  9. Ideally, we should also have a way of "scoping" those sites within any given object/array property so the reverse mapped type inference could only try to infer from the sites relevant to that slice of an object instead of trying to infer from all of the ones that we were able to gather.

So I'm looking for some way to store/read/clear those sites without hitting any of the mentioned drawbacks, and for two different (if similar-ish) purposes, but I can't figure this out so far. cc @jakebailey
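To make requirements 8 and 9 concrete, here is a toy model of the two access patterns. It is not a proposed design and does not mirror checker.ts: `ToySiteStore`, `push`, `pull`, and `sitesFor` are hypothetical names, sites are plain strings, and the model assumes the owning property is known at push time, which the discussion above suggests may not hold in the checker.

```typescript
// Toy model only; names are hypothetical and do not mirror checker.ts.
class ToySiteStore {
  // Sites waiting for the next "regular" pull; cleared on every pull.
  private pending: string[] = [];
  // Sites kept per property so a deferred consumer can still read them.
  private byProperty: { [property: string]: string[] } = Object.create(null);

  push(property: string, site: string): void {
    this.pending.push(site);
    const bucket = this.byProperty[property] || (this.byProperty[property] = []);
    bucket.push(site);
  }

  // Regular inference: take everything gathered so far, then clear it.
  pull(): string[] {
    const taken = this.pending;
    this.pending = [];
    return taken;
  }

  // Deferred (reverse-mapped-like) consumer: read only the sites for one property.
  sitesFor(property: string): string[] {
    return this.byProperty[property] || [];
  }
}
```

In this model a `pull` never sees the same site twice, while `sitesFor("a")` still returns everything pushed for `a` even after the pending list was cleared. The sketch ignores when the retained buckets could be released, which is part of the open problem.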

if (intraExpressionInferenceSites?.length) {
const templateType = (getApparentTypeOfContextualType(sourceValueDeclaration.parent.parent as Expression, ContextFlags.NoConstraints) as MappedType).templateType;
if (templateType) {
Debug.assert(isExpressionNode(sourceValueDeclaration));
Contributor Author

I'm not fond of the cast to Expression in the line below this. That's why I've added this Debug.assert here. If I reason about this correctly... with the correct checks before this line and all it should be 100% guaranteed that sourceValueDeclaration is an expression. Otherwise, we wouldn't find any "relevant" intra expression inference sites.

I'm not sure what's the codebase policy around stuff like that so if you have any suggestions on how this should be done here I'm all ears

Member

I think you're looking for Debug.assertNode?

Contributor Author

@Andarist Andarist Apr 26, 2023

Thanks for the tip, I pushed out a change to use it. (I also tried changing isExpressionNode to be an asserts function, but it can't be made one here because it would start discarding things incorrectly through CFA.)
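For readers unfamiliar with the pattern, the idea of pairing a type-guard predicate with an assertion function can be sketched as follows. This is a self-contained toy, not the compiler's Debug.assertNode: `ToyNode`, `ToyExpression`, `isToyExpression`, `assertToyNode`, and `textOf` are hypothetical names.

```typescript
interface ToyNode {
  kind: string;
}

interface ToyExpression extends ToyNode {
  kind: "expression";
  text: string;
}

// An ordinary type guard: narrows only inside the branch that checks it.
function isToyExpression(node: ToyNode): node is ToyExpression {
  return node.kind === "expression";
}

// An assertion function: throws unless the guard holds, and narrows `node`
// for the remainder of the caller's scope.
function assertToyNode<T extends ToyNode, U extends T>(
  node: T,
  test: (n: T) => n is U,
  message = "Unexpected node.",
): asserts node is U {
  if (!test(node)) {
    throw new Error(message);
  }
}

function textOf(node: ToyNode): string {
  assertToyNode(node, isToyExpression);
  // `node` is narrowed to ToyExpression from here on, so no cast is needed.
  return node.text;
}
```

Keeping the predicate as a plain guard and doing the assertion in a separate helper means the guard can still be used in ordinary `if` checks without affecting control flow analysis elsewhere.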

Comment on lines 24121 to 24123
pushContextualType(sourceValueDeclaration as any as Expression, templateType, /*isCache*/ false);
inferFromIntraExpressionSites([inference], intraExpressionInferenceSites);
popContextualType();
Contributor Author

At first this felt quite hacky, but now I'm leaning towards saying that it's the easiest and cleanest solution to this problem. What this solves (together with the getApparentTypeOfContextualType(sourceValueDeclaration.parent.parent) call a few lines back) is that it avoids instantiating the property name. Essentially, it allows me to use the bare templateType without substituting indexed mapped types, so inferFromIntraExpressionSites can see a target type like (x: string) => T[K] instead of (x: string) => T["a"], and thus it can infer things for T[K].

I wasn't sure what value I should pass as isCache here though. I'm pretty sure that @ahejlsberg would know this and could advise here

Comment on lines 1 to 4
tests/cases/compiler/mappedTypeContextualTypesApplied.ts(18,10): error TS7023: 'foo' implicitly has return type 'any' because it does not have a return type annotation and is referenced directly or indirectly in one of its return expressions.
tests/cases/compiler/mappedTypeContextualTypesApplied.ts(18,15): error TS7006: Parameter 's' implicitly has an 'any' type.
tests/cases/compiler/mappedTypeContextualTypesApplied.ts(19,10): error TS7023: 'foo' implicitly has return type 'any' because it does not have a return type annotation and is referenced directly or indirectly in one of its return expressions.
tests/cases/compiler/mappedTypeContextualTypesApplied.ts(19,15): error TS7006: Parameter 's' implicitly has an 'any' type.
Contributor Author

Those are not expected. I'm almost certain that all of those would get fixed if this PR would get reviewed and merged: #52095 . This currently fails exactly because the contextual type involves an intersection and getTypeOfPropertyOfContextualType doesn't handle such cases appropriately.

Comment on lines 1 to 6
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(67,21): error TS18046: 'x' is of type 'unknown'.
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(71,21): error TS18046: 'x' is of type 'unknown'.
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(80,21): error TS18046: 'x' is of type 'unknown'.
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(86,21): error TS18046: 'x' is of type 'unknown'.
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(95,21): error TS18046: 'x' is of type 'unknown'.
tests/cases/conformance/types/typeRelationships/typeInference/intraExpressionInferencesReverseMappedTypes.ts(101,21): error TS18046: 'x' is of type 'unknown'.
Contributor Author

All of those relate to the tuple-based tests that I added and those are not expected.

It doesn't work because in tuples both T and T[K] are not reverse mapped symbols/types and thus the inferReverseMappedType isn't called from within checkExpressionWithContextualType (and only within that call intraExpressionInferenceSites and "pushed inference contexts" are available).

With objects, createReverseMappedType only creates a ReverseMapped type and doesn't do much else (inferReverseMappedType gets called for properties later, from getTypeOfReverseMappedSymbol). However, with tuples it calls inferReverseMappedType eagerly:
https://github.dev/microsoft/TypeScript/blob/eb014a26522dd809ae4d0e85634a62eabda2755a/src/compiler/checker.ts#L24090-L24101

I could try to fix this in this PR but I also feel like this would require changes that could be moved to a separate followup PR. This PR could focus on the base mechanism for this improvement and on object-based cases.

@@ -7539,13 +7539,14 @@ export function getCheckFlags(symbol: Symbol): CheckFlags {

/** @internal */
export function getDeclarationModifierFlagsFromSymbol(s: Symbol, isWrite = false): ModifierFlags {
if (s.valueDeclaration) {
const checkFlags = getCheckFlags(s);
if (!(checkFlags & CheckFlags.ReverseMapped) && s.valueDeclaration) {
Contributor Author

Since I added .valueDeclaration to reverse mapped properties, we need to ignore this here - otherwise, we break stripping of readonly modifiers that is required by #12589

Contributor Author

Basically, this change just maintains compatibility with the old behavior as 0 was always returned from here in the implicit case of checkFlags & CheckFlags.ReverseMapped
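The compatibility argument can be illustrated with a heavily reduced model of the guard. The flag constants, `ToySymbol`, and `getModifierFlagsToy` are hypothetical, and the real function handles more cases than shown here; the sketch only covers the branch discussed in this thread.

```typescript
// Hypothetical flag values for illustration; the real enums live in the compiler.
const CHECK_REVERSE_MAPPED = 1 << 0;
const MODIFIER_NONE = 0;
const MODIFIER_READONLY = 1 << 0;

interface ToySymbol {
  checkFlags: number;
  valueDeclaration?: { modifierFlags: number };
}

// Reads modifiers from the declaration unless the symbol is reverse mapped.
// Reverse mapped symbols fall through to "no modifiers", as they did before
// they carried a valueDeclaration.
function getModifierFlagsToy(s: ToySymbol): number {
  if (!(s.checkFlags & CHECK_REVERSE_MAPPED) && s.valueDeclaration) {
    return s.valueDeclaration.modifierFlags;
  }
  return MODIFIER_NONE;
}

const regular: ToySymbol = {
  checkFlags: 0,
  valueDeclaration: { modifierFlags: MODIFIER_READONLY },
};
const reverseMapped: ToySymbol = {
  checkFlags: CHECK_REVERSE_MAPPED,
  valueDeclaration: { modifierFlags: MODIFIER_READONLY },
};
```

With the guard in place, `regular` reports its readonly modifier while `reverseMapped` reports none, even though both now point at a declaration that is marked readonly.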

Labels
For Backlog Bug PRs that fix a backlog bug
Projects
Status: Waiting on reviewers
5 participants