Dirty the dependents of uncacheable nodes #9015

stuhood · 2020-01-26T04:33:21Z

Problem

The Rust-level Node::cacheable flag is currently only used to mark @goal_rules as uncacheable (because they are allowed to operate on @sideeffecting types, such as the Console and the Workspace). But because the implementation of cacheable did not allow it to operate deeply in the Graph, we additionally needed to mark their parent Select nodes uncacheable, and could not use the flag in more positions.

Via #7350, #8495, #8347, and #8974, it has become clear that we would like to safely allow nodes deeper in the graph to be uncacheable, as this allows for the re-execution of non-deterministic processes, or re-consumption of un-trackable state, such as:

a process receiving stdin from a user
an intrinsic rule that pokes an un-watched file on the filesystem
reading from a stateful process like git

Note that these would all be intrinsic Nodes: it's not clear that we want to expose this facility to @rules directly.

Solution

Finish adding support for uncacheable nodes. Fixes #6598.

When an uncacheable node completes, it will now keep the value it completed with (in order to correctly compute a Generation value), but it will re-compute the value once per Session. The accurate Generation value for the uncacheable node allows its dependents to "clean" themselves and not re-run unless the uncacheable node produced a different value than it had before.

Result

The Node::cacheable flag may be safely used deeper in the graph, with the semantics that requests for any of an uncacheable node's dependents will cause it to re-run once per Session. The dependents will not re-run unless the value of the uncacheable node changes (regardless of the Session).
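As a rough illustration of these semantics, here is a minimal sketch. It is not the engine's implementation: `UncacheableEntry`, `get`, and `dependent_can_be_cleaned` are hypothetical names, and `SessionId` and `Generation` are simplified stand-ins for the real types.

```rust
/// Hypothetical, simplified stand-ins for the engine's Session and Generation concepts.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SessionId(pub u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Generation(pub u64);

/// A completed uncacheable node. It keeps its last value so that a re-run can
/// tell whether anything actually changed.
pub struct UncacheableEntry<T> {
    value: T,
    generation: Generation,
    ran_in: SessionId,
}

impl<T: PartialEq> UncacheableEntry<T> {
    pub fn new(value: T, session: SessionId) -> Self {
        UncacheableEntry { value, generation: Generation(0), ran_in: session }
    }

    /// Re-runs `compute` only if this entry has not yet run in `session`.
    /// The generation advances only when the new value differs from the old one.
    pub fn get(&mut self, session: SessionId, compute: impl FnOnce() -> T) -> (&T, Generation) {
        if self.ran_in != session {
            let new_value = compute();
            if new_value != self.value {
                self.value = new_value;
                self.generation = Generation(self.generation.0 + 1);
            }
            self.ran_in = session;
        }
        (&self.value, self.generation)
    }
}

/// A dependent can be "cleaned" (reused without re-running) if the generation
/// it observed previously still matches the current one.
pub fn dependent_can_be_cleaned(observed: Generation, current: Generation) -> bool {
    observed == current
}
```

In this sketch, a second request in the same session never calls `compute`, while a request from a new session always does; the dependents only need to re-run when the returned generation has moved.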

stuhood · 2020-01-26T04:34:39Z

The first commit removes some unnecessary generic parameters that I encountered while adding the Context parameter to cacheable, but does not change any logic; the second contains the described change.

stuhood · 2020-01-26T20:59:13Z

Ok, reviewable!

illicitonion

Looks good, thanks! One open question about uncacheable nodes having dependencies we should resolve, then I'm happy to approve :)

illicitonion · 2020-01-27T14:46:59Z

src/rust/engine/graph/src/entry.rs

    }
  }

+  /// Iff the value is Clean, mark it Dirty.


Feels like either this should happen also if Uncacheable, or Uncacheable nodes shouldn't be allowed to have dependencies?

Soooo, possibly? dirty means: "might run again in any session", uncacheable means: "will only run again in a new session". So the gap between them is that if something uncacheable is dirtied during a session, it's unclear whether to re-run it.

I think that because we're asserting that these are not side-effecting operations, re-running them (in or out of a single session) is safe. Not clear... will take a look at what it might do to the data model.

Yeah, I suspect if we're not going to dirty them, we shouldn't allow deps; I'm imagining something like (I'm aware we don't support non-cacheable @rules now, but we should make the framework force us to think about it if we add them):

    @rule(cacheable=False)
    def foo():
        executable = yield Get[Snapshot](some_path_globs)
        do_sideeffecting_thing_with(executable)

where the side-effecting rule has a dep, someone edits the snapshotted file, we re-snapshot because that was invalidated, but we don't re-run the rule because it's side-effecting. Erroring out (ideally on the first get from the side-effecting rule, rather than on invalidation), rather than silently not running the rule, feels safer/less surprising.

Just ftr: this is not supposed to represent side-effecting things. Those are still allowed only at the root.

Oh, apologies. I just realized that your example is still relevant in a @goal_rule (marked uncacheable), meaning that sideeffects are in fact relevant here.

But I'm not sure I understand why your example motivates 1) erroring, or 2) re-running the uncacheable rule. If anything, it feels like that motivates not dirtying it, and keeping the "exactly once per session" semantics, which would have the effect that the rule ran once to completion per session, even if things below it and above it in the graph re-ran.

A specific (admittedly, somewhat contrived) example I can think of is:

Some checked-in file contains a "git branch name which should be used to fetch golden test files"

Some uncacheable rule will run git to get the contents of a file at that sha. It depends on reading that file, to construct the command line to run. Something like:

    git_branch_name = read(some_path_in_git)
    golden_value = execute("git", "show", f"{git_branch_name}:golden_file")
    use_golden_value()

As I believe the current PR stands, if this rule is marked as once-per-session, and the contents of some_path_in_git changes, we won't re-run this rule.

It's unclear whether we should.
I can argue that we shouldn't re-run (because in intent, "once-per-session" isn't actually an "exactly-once" guarantee, it's an "at-least once" guarantee - the point of the feature is to say "Yeah, this otherwise would be a cache hit, but we're going to pretend it wasn't because we know it reads external state").
I can argue that it should (because something regular which it depends on was invalidated, and that's how invalidation generally works).

Not allowing deps feels like a safe thing to do, because forcing people to decompose their rules to be one of "uncacheable, but no deps" or "if any of its deps change, it will rerun" seems reasonable. (Except ugh then goal_rules still need to be special).
Re-running feels like a safe thing to do, because we're saying these things shouldn't be side-effecting, so it's safe for us to run these things as often as we want.
Allowing deps, but not re-running, feels like it opens up a chance for stale results and surprising behaviour.
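To make the three options concrete, here is a minimal sketch of how each would treat an uncacheable node. All names (`DepPolicy`, `Decision`, `decide`) are hypothetical and exist only to illustrate the trade-off; nothing like this is in the PR.

```rust
/// The three options above for an uncacheable node (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DepPolicy {
    /// "uncacheable, but no deps": having a dependency is an error.
    DisallowDeps,
    /// Re-run whenever a dependency is invalidated ("at least once" per session).
    RerunWhenDepsChange,
    /// Keep "exactly once per session", even if a dependency changed.
    AllowDepsWithoutRerun,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Run,
    ReusePrevious,
    Error(&'static str),
}

pub fn decide(
    policy: DepPolicy,
    has_deps: bool,
    ran_this_session: bool,
    dep_changed: bool,
) -> Decision {
    match policy {
        DepPolicy::DisallowDeps if has_deps => {
            Decision::Error("uncacheable nodes may not have dependencies")
        }
        // Every policy runs the node the first time it is requested in a session.
        _ if !ran_this_session => Decision::Run,
        DepPolicy::RerunWhenDepsChange if dep_changed => Decision::Run,
        // Under AllowDepsWithoutRerun this is where a stale result can be returned.
        _ => Decision::ReusePrevious,
    }
}
```

With the git example, `decide(DepPolicy::AllowDepsWithoutRerun, true, true, true)` reuses the previous value even though some_path_in_git changed, which is the stale-result case.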

> Except ugh then goal_rules still need to be special

Right... this was my realization above. And it's less about whether we want side-effecting things deeper in the graph (we definitely do not), and more about whether we want multiple types of classifications of nodes (IMO, we probably do not any time soon).

I think the way I see this is that because cacheable rules are deterministic-ish and not side-effecting, automatically retrying them to attempt to improve the observed transactionality of rules is good and healthy, and that's what we do when we dirty nodes while they run. But, it's not strictly required... it's best effort. Given that, not making an effort for uncacheable nodes (and not attempting to differentiate them from side-effecting nodes) seems reasonable.

illicitonion · 2020-01-27T15:11:43Z

src/rust/engine/graph/src/lib.rs

+          // independent of matching Generation values. This is to allow for the behaviour that an
+          // uncacheable Node should always have dirty dependents, transitively.
+          if !entry.node().cacheable(context) || !entry.is_clean(context) {
+            complete_as_dirty = true;


complete_as_dirty is ignored if !cacheable - should this maybe be a tri-state enum we can match on, rather than a function call which we'll repeat and a bool?

> rather than a function call which we'll repeat and a bool?

We don't repeat the function call... the calls to cacheable(..) here are for the dependencies, while the call inside Entry::complete is for the node itself. Given that, I don't think the enum makes sense.
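A simplified sketch of the two separate checks, assuming a hypothetical `Dep` struct in place of real entries. Only the boolean condition is taken from the diff above; `stored_as_dirty` is an illustrative stand-in for what happens at completion, not the signature of Entry::complete.

```rust
/// Hypothetical, flattened view of one dependency of the node being completed.
pub struct Dep {
    pub cacheable: bool,
    pub clean: bool,
}

/// Mirrors the condition in the diff: any dependency that is uncacheable, or
/// that is not clean, means the dependent should complete as dirty.
pub fn complete_as_dirty(deps: &[Dep]) -> bool {
    deps.iter().any(|dep| !dep.cacheable || !dep.clean)
}

/// The node's own cacheable flag is consulted separately at completion time.
/// Returns None when the node itself is uncacheable, because the dirty flag
/// is ignored in that case.
pub fn stored_as_dirty(node_cacheable: bool, complete_as_dirty: bool) -> Option<bool> {
    if node_cacheable {
        Some(complete_as_dirty)
    } else {
        None
    }
}
```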

On re-read, I think I see what you are suggesting... can try that.

So, this isn't straightforward: the complete_as_dirty boolean comes into play for most of the states of the Entry::complete(.., result) parameter, which means that replacing those two parameters with an enum would require a five-state enum, I think. I tried it and ran into a roadblock.

Interesting... Do you know what the names of those 5 enum variants would be? I'm curious as to whether that enum would be more clear (even if it's more verbose), given that my read of this code was that there would be 3 enum variants, it seems like I mis-understood the behaviour as it's currently written...

The relevant properties are:

node was/wasn't cleaned - It's possible for a node that depends on an uncacheable node to be "cleaned", in which case our result value is None and completing the node skips cloning its value.

complete as dirty/clean - This applies to both "node was/wasn't cleaned" states: a node that was cleaned without re-running uses its previous value, but stays dirty. It's not possible for the caller of Entry::complete to know this though, because the previous value is stored in the Entry (without shenanigans).

(un)cacheable - When a node is uncacheable, the properties above aren't relevant; otherwise they are.
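One way to count the combinations of these three properties and arrive at five states is sketched below. The variant names are hypothetical, chosen only for illustration; they are not names used in the engine.

```rust
use std::collections::HashSet;

/// Hypothetical names for the five combinations of the properties above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Completion {
    /// The node is uncacheable, so the other two properties are not relevant.
    Uncacheable,
    /// The node re-ran, and none of its dependencies force it to be dirty.
    RanClean,
    /// The node re-ran, but must complete as dirty.
    RanDirty,
    /// The node was cleaned (no new result value) and is clean.
    CleanedClean,
    /// The node was cleaned (no new result value) but stays dirty.
    CleanedDirty,
}

pub fn classify(cacheable: bool, was_cleaned: bool, complete_as_dirty: bool) -> Completion {
    match (cacheable, was_cleaned, complete_as_dirty) {
        (false, _, _) => Completion::Uncacheable,
        (true, false, false) => Completion::RanClean,
        (true, false, true) => Completion::RanDirty,
        (true, true, false) => Completion::CleanedClean,
        (true, true, true) => Completion::CleanedDirty,
    }
}

/// Counts the distinct states reachable from all eight boolean combinations.
pub fn distinct_states() -> usize {
    let mut seen = HashSet::new();
    for cacheable in [false, true] {
        for was_cleaned in [false, true] {
            for complete_as_dirty in [false, true] {
                seen.insert(classify(cacheable, was_cleaned, complete_as_dirty));
            }
        }
    }
    seen.len()
}
```

The four uncacheable combinations collapse into one variant, which is where the count of five comes from under this reading.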

It's possible that the closure that "cleans" a Node (here) could clone the previous value, and then Graph::complete would be able to compute the final result value to pass to Entry::complete... but I think that would require cloning the previous result value in order to clean things, which is not cheap.

In short: I don't see a good way to do this right now.

Perhaps it would clarify things to rename "complete_as_dirty" to "has_dirty_dependencies", which makes it less verby, and clarifies that not using that value in some codepaths is "a-ok"?

gshuflin

This looks good to me - I don't have any comments other than the ones @illicitonion already mentioned.

stuhood · 2020-02-10T06:30:18Z

Rebased and added a commit that uses the SessionId in Node::peek, which required propagating it to quite a few places, but fixed tests which were failing because they could no longer view errors rendered in traces.
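A minimal sketch of what session-aware peeking could look like, assuming that an uncacheable value should only be visible to the session that computed it. The `Stored` enum and this free-standing `peek` function are hypothetical; only the `Clone + Debug + Eq` bound on the session id type is taken from the PR.

```rust
use std::fmt::Debug;

/// Hypothetical representation of a completed value.
pub enum Stored<V, S> {
    /// A cacheable value is visible from any session.
    Cacheable(V),
    /// An uncacheable value remembers the session that computed it.
    Uncacheable(V, S),
}

/// Returns a clone of the value only if it is visible from `session`.
pub fn peek<V: Clone, S: Clone + Debug + Eq>(stored: &Stored<V, S>, session: &S) -> Option<V> {
    match stored {
        Stored::Cacheable(v) => Some(v.clone()),
        Stored::Uncacheable(v, ran_in) if ran_in == session => Some(v.clone()),
        Stored::Uncacheable(..) => None,
    }
}
```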

stuhood · 2020-02-13T05:40:28Z

@illicitonion, @cosmicexplorer: Aside from figuring out what order to land this in relative to #8858, I think it is ready to go.

cosmicexplorer

Awesome!! As mentioned in #8858 this PR is cleared to land first!

cosmicexplorer · 2020-02-13T06:35:54Z

src/rust/engine/graph/src/lib.rs

-  fn reachable_digest_count(&self, roots: &[N]) -> usize {
+  fn reachable_digest_count(&self, roots: &[N], context: &N::Context) -> usize {
+    // TODO: This is a surprisingly expensive method, because it will clone all reachable values by
+    // calling `peek` on them.


Good comment!

cosmicexplorer · 2020-02-14T01:50:00Z

src/rust/engine/graph/src/node.rs

+  /// have Session-specific semantics. More than one context object might be associated with a
+  /// single caller "session".
+  ///
+  type SessionId: Clone + Debug + Eq;


I like this!

cosmicexplorer · 2020-02-14T01:52:26Z

src/rust/engine/src/scheduler.rs

@@ -267,6 +277,14 @@ impl Scheduler {
    m
  }

+  ///
+  /// Return all Digests currently in memory in this Scheduler.


Possibly add a note on what this is useful for?

stuhood · 2020-02-15T22:12:10Z

Renamed complete_as_dirty to has_dirty_dependencies, and gambled that that wouldn't break CI.

Thanks for the reviews!

(#9271)

### Problem

`StoreGCService` was expecting a `SchedulerSession`, but getting a `Scheduler`. This meant that if `pantsd` ran long enough (the default value for "long enough" was too long to ever be caught in tests), it would crash with a failure to call `lease_files_in_graph` due to the method signature change in #9015.

### Solution

Although just adding enough type hints here would hypothetically be enough for mypy to catch this (?), additionally added a coverage test for `StoreGCService`.

### Result

`pantsd` stays up as long as it should.

stuhood requested review from illicitonion, blorente and gshuflin January 26, 2020 04:33

stuhood removed request for illicitonion, blorente and gshuflin January 26, 2020 18:01

stuhood force-pushed the stuhood/dirty-the-dependents-of-uncacheable-nodes branch from 2aead47 to e6c3662 Compare January 26, 2020 20:55

stuhood requested review from illicitonion, blorente and gshuflin January 26, 2020 20:58

stuhood force-pushed the stuhood/dirty-the-dependents-of-uncacheable-nodes branch from e6c3662 to b3dfed9 Compare January 26, 2020 21:05

illicitonion reviewed Jan 27, 2020

stuhood force-pushed the stuhood/dirty-the-dependents-of-uncacheable-nodes branch from b3dfed9 to 550b650 Compare January 27, 2020 19:04

gshuflin reviewed Feb 4, 2020

stuhood force-pushed the stuhood/dirty-the-dependents-of-uncacheable-nodes branch from 550b650 to 4c02bc1 Compare February 10, 2020 06:28

stuhood requested a review from cosmicexplorer February 13, 2020 04:41

stuhood mentioned this pull request Feb 13, 2020

don't cache errored engine node results #8858

Closed

cosmicexplorer approved these changes Feb 14, 2020

stuhood added 4 commits February 15, 2020 14:04

Remove excess generic params from Graph methods.

04302cd

Dirty the dependents of uncacheable nodes, and use a session id to ensure that an uncacheable node runs (no more than once) per session.

cb4a046

Use the SessionId in peek, to ensure that values are only visible in the appropriate Sessions.

1f82583

Rename complete_as_dirty to has_dirty_dependencies to make it less verby.

aa82b43

stuhood force-pushed the stuhood/dirty-the-dependents-of-uncacheable-nodes branch from 4c02bc1 to aa82b43 Compare February 15, 2020 22:10

stuhood merged commit da5992d into pantsbuild:master Feb 15, 2020

stuhood deleted the stuhood/dirty-the-dependents-of-uncacheable-nodes branch February 15, 2020 22:11

stuhood mentioned this pull request Feb 16, 2020

Optionally execute processes exclusively in the foreground #8974

Closed

stuhood mentioned this pull request Mar 11, 2020

Add a coverage test for pantsd garbage collection, and fix type error #9271

Merged

stuhood mentioned this pull request Apr 11, 2020

Add a --query option for unifying target selection #7350

Closed

stuhood mentioned this pull request Sep 19, 2020

Clean nodes with uncacheable dependencies once per session #10814

Merged
