Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-cachable products for @console_rules #6146

Closed
kwlzn opened this issue Jul 16, 2018 · 3 comments
Closed

Non-cachable products for @console_rules #6146

kwlzn opened this issue Jul 16, 2018 · 3 comments
Assignees

Comments

@kwlzn
Copy link
Member

kwlzn commented Jul 16, 2018

A @console_rule is a normal @rule that results in a synthetic product. But since the execution of all @rules are memoized/cached, @console_rules currently only actually execute the first time they are requested (think ./pants list with pantsd only outputting something the first time you run it).

To fix this, we should not cache the result of running a @console_rule.

Not caching the result of running a @console_rule might look like adding a new fn cacheable method to the Node trait, and then implementing fn cacheable for NodeKey::Task using the existing (but currently unused) cacheable flag.

Passing the cacheable flag down to Task from python will involve editing the tasks_task_begin function definition on the python side: https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/native.py#L171 and on the rust side: https://github.com/pantsbuild/pants/blob/master/src/rust/engine/src/lib.rs#L418-L423 ... and then fixing any runtime/compile errors that result! The @rule created by @console_rule should pass down cacheable=False, and everything else should pass True.

Finally (if there is time left! if not: can open a followup ticket for this portion), the last step is to use Node::cacheable in graph::Entry::complete to:

  1. send the result value to all waiters (similar to the else case in that block)
  2. transition the Node to NotStarted with previous_result: None (to definitely not store the result value).

To integration test this, it should be possible to modify this test to run the --v2 ./pants list command twice in a row.


slack discussion re-paste for posterity:

...oh! actually, this is straightforward.
forget what i said about `clear()`, heh
instead, `Graph::create` is probably the area to focus on.
a trivial implementation of "don't cache this root, but do cache its dependencies" would be to not actually create the root
`Graph::create` creates a root in the graph: it assigns an `EntryId` for the node, and then runs it
`Graph::get` is used by nodes to request their dependencies
(via `Context::get`)
so `get` creates an edge between one node and a dependency, creating the dependency if it doesn't exist (edited)
and `create` just creates a node
...i'm just dumping info that should be in the docs (but isn't) at this point
so, to run a `@console_rule` i see two options:
1) don't actually create a Node in the Graph for it at all, and instead run it with a `Context` that uses `Graph::create` (rather than `get`) to get dependencies
...which (because there is no Node in the graph) would definitely not cache the `@console_rule`, but would cache the things it depends on
or
2) _do_ create a Node in the graph for the `@console_rule`, but add explicit support to `Graph` for not recording a value for a Node
...which might look like "never actually holding a `Future` on Entry for that node type" (edited)
(2) is probably a bit better long term, because it is more general. but unfortunately it would definitely collide with what i'm doing in https://github.com/pantsbuild/pants/pull/6013 and https://github.com/pantsbuild/pants/pull/6059
...but (2) might also actually be easier, so it might still be the way to go.
i think what it would look like would be adding a method to `trait Node` that is approximately `fn memoize(&self) -> bool`
and in `graph/src/lib.rs` `Entry::get`, if `!memoize`, don't ever actually store a Future for the Node. (edited)
... so yea. i think 2 is probably pretty straightforward. depending whether https://github.com/pantsbuild/pants/pull/6013 or your change lands first, one of us would need to do a little bit of rebasing, but it's small enough that i don't think it would be that bad.
...
> What will keep the `Future` to drive execution, though? Dropped Future == dropped execution, right?
would still be spawned on the executor, and would still have a Future returned from `Scheduler::execute`... we would just compute it... pre-dirtied, maybe?```
...oddly enough, we already have a `cacheable` flag on the rust task struct, that i think is just completely unused right now: https://github.com/pantsbuild/pants/blob/master/src/rust/engine/src/tasks.rs#L17
@kwlzn kwlzn self-assigned this Jul 16, 2018
@stuhood
Copy link
Member

stuhood commented Jul 17, 2018

Sounds reasonable. Regarding 2: I take back this comment:

we would just compute it... pre-dirtied, maybe?

Computing it pre-dirtied wouldn't actually work. What we're doing here is creating it "pre-cleared"... do the Node/Entry would just stay in the NotStarted state.

@illicitonion
Copy link
Contributor

@blorente Updated https://github.com/illicitonion/pants/tree/blorente/6146/no-cache-console-rules with some new commits - may be reviewable (possibly with some new code comments and possibly some new tests)

@stuhood
Copy link
Member

stuhood commented Oct 5, 2018

@illicitonion, @blorente : That looks good! Please add a TODO next to the !cacheable Select change that points to #6598.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants