Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incr.comp.: Span fingerprinting cannot afford inaccuracies when caching query results #46303

Closed
michaelwoerister opened this issue Nov 27, 2017 · 1 comment
Labels
A-incr-comp Area: Incremental compilation C-cleanup Category: PRs that clean code up or issues documenting cleanup.

Comments

@michaelwoerister
Copy link
Member

In the current implementation of the incremental compilation change tracking system, the compiler will sometimes explicitly ignore source-span-related information (e.g. when hashing type definitions). We do this in order to avoid false positives that spring from conflating span and HIR information. For example, when the compiler generates debuginfo for a data type, it will access the HIR of the type's definition in order to extract certain information. Thus, even though we explicitly do not include the source location of a type definition in the corresponding debuginfo, changing the source location would still cause the debuginfo to be considered as having changed and the whole object file would need to be compiled.

Explicitly ignoring some span information during change detection was never pretty but it was feasible as long as we only had to consider the consequences this had for cached machine code. Now, that we are starting to cache arbitrary query results I think it is a bad idea to try and keep following this strategy. It seems likely that having such manual exceptions would break every few weeks due to largely unrelated changes to some query.

Also, up until now the compiler makes the assumption that source location information is only relevant when generating debuginfo. This too was only true as long as we just cache machine code. This assumption actually already doesn't hold anymore since we started caching error messages a while ago. These contain source locations but are independent of whether debuginfo is generated or not.

I'm pretty convinced that we should always hash all span information. The question is how to best avoid the negative consequences this will have on change detection accuracy. I think that splitting span information out of HIR and into a side-table will be one of the things we'll have to do. Spans would then be accessed by the NodeId of the corresponding HIR node. This would allow for simply not storing span information in most query results (e.g. MIR) and for avoiding the conflation of source location and other HIR information.

cc @nikomatsakis

@michaelwoerister michaelwoerister added the A-incr-comp Area: Incremental compilation label Nov 27, 2017
@TimNN TimNN added the C-cleanup Category: PRs that clean code up or issues documenting cleanup. label Nov 28, 2017
@michaelwoerister
Copy link
Member Author

We are hashing spans unconditionally since #46556.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation C-cleanup Category: PRs that clean code up or issues documenting cleanup.
Projects
None yet
Development

No branches or pull requests

2 participants