feat: Ability to explain an executed request #1188
Conversation
Force-pushed from d67024c to 8cd844e.
Codecov Report

```diff
@@            Coverage Diff            @@
##           develop    #1188    +/-  ##
===========================================
- Coverage    70.17%   70.13%   -0.05%
===========================================
  Files          184      184
  Lines        17392    17700    +308
===========================================
+ Hits         12205    12414    +209
- Misses        4251     4340     +89
- Partials       936      946     +10
```
IMO we can add that/those in later if we want, no need to complicate things straight off.
Force-pushed from 02eaf3c to b5f8ed1.
Going into this task I had a different structure in mind, but after looking at the concrete implementation, much like Andy had mentioned, there are some unknown issues lurking in other designs that would've been trying too hard to be smart and fancy about collecting runtime metrics.

I will say that the goal of doing #970 (metrics) first was to use them here, instead of just the raw "counters" you're using now. You mentioned in another comment that you didn't want the otel stuff leaking around, but the goal of the metrics lib is to act as the abstracted interface between the underlying metrics provider/collector and the metrics types (gauges, counters, histograms, etc.). That would be a much more powerful primitive to use, as you could do more interesting collection of metrics beyond a simple increment.

However, going through the various metrics packages, there are some issues with trying to track metrics on a "per request" basis rather than the traditional app-wide aggregates, so we will likely need to look further into that. There is likely some combination of tracing and metrics that will be ideal, as tracing is great for per-request tracking of info.

There's probably more to be said, but a lot of it is useless until we come up with a more long-term design that protects the prod code a bit more. But I recognize the difficulties here.

At the moment, I think this is a good enough solution, short of the noted todos and suggestions.
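To illustrate what "the metrics lib as an abstracted interface" could mean here, a minimal sketch follows. All names are hypothetical and are not the actual #970 package:

```go
// Minimal sketch of an abstracted metrics layer. All names are illustrative;
// the real metrics package may look quite different.
package metrics

// Counter is a monotonically increasing value (e.g. iterations of a node).
type Counter interface {
	Inc()
}

// Histogram records individual observations (e.g. per-fetch latency),
// allowing richer collection than a simple increment.
type Histogram interface {
	Record(value float64)
}

// Provider hides the concrete backend (otel, prometheus, a no-op for tests)
// so that planner nodes depend only on these small interfaces.
type Provider interface {
	Counter(name string) Counter
	Histogram(name string) Histogram
}
```

A planner node would then hold a `Counter` or `Histogram` instead of a raw integer field, and the backend could be swapped without touching planner code.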
todo: The execution explain should technically also include the info from the simple explain, in addition to the runtime execution metrics, so you wouldn't need to run them both separately. This can technically be left as is, and we can just run two, but it feels nicer to have Execute include both.
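For illustration only, the merge itself could be as simple as the sketch below; the helper name and its placement are assumptions, not the actual planner code:

```go
// Illustrative sketch only; not the actual planner code.
package explainsketch

// mergeExplain folds the static "simple" attributes into the execute explain
// output, so a single execute explain request can return both.
func mergeExplain(simple, execute map[string]any) map[string]any {
	merged := make(map[string]any, len(simple)+len(execute))
	for k, v := range simple {
		merged[k] = v
	}
	// Runtime datapoints overwrite static attributes on any key collision.
	for k, v := range execute {
		merged[k] = v
	}
	return merged
}
```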
I thought about that, and I was thinking to have it as an option: either under `verbose: false` to hide it if someone doesn't want to see the simple attributes, or a separate flag, maybe something like `showSimpleAttributes: true`, to show/hide them. I did not want to group the somewhat "static" simple attributes with the execution datapoints. Whichever approach we take, it will be outside the scope of this PR.
Force-pushed from 36af49c to ad1f99c.
praise: I appreciated the clean commits! I found they made it easier to review.

suggestion: You do seem to be doing what I used to do when I first started using git like this though - some of the commits are too small for reviewers. For example, by splitting the new tests from the commit that introduces the feature you create a disconnect that reduces the context the reviewer has. It can make life easier for sure whilst developing, but in the future I would suggest squashing them before opening the PR so that any individual commit contains both the production code changes and the tests for that change.

Overall the change looks good - my brain is getting a bit foggy though, so I'll have another look in the morning (I covered the core code, but mostly skimmed over the node-specific metrics and the tests today).
```diff
@@ -100,6 +109,17 @@ func (n *averageNode) SetPlan(p planNode) { n.plan = p }

 // Explain method returns a map containing all attributes of this node that
 // are to be explained, subscribes / opts-in this node to be an explainablePlanNode.
-func (n *averageNode) Explain() (map[string]any, error) {
-	return map[string]any{}, nil
+func (n *averageNode) Explain(explainType request.ExplainType) (map[string]any, error) {
```
suggestion: It feels a bit wasteful, both in terms of execution and maintenance, to have Explain take `explainType` as a param and then perform the same switch in each node. It might be nicer to instead have two parameterless functions on planNode, `ExplainSimple()` and `ExplainExecute()`, and just do the switch once in explain.go.
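For clarity, a rough sketch of the suggested shape, using stand-in types rather than the real planner/request packages:

```go
// Purely illustrative: each node exposes two parameterless explain methods,
// and the switch on explain type happens once at the explain.go level.
package explainsketch

import "fmt"

type explainType string

const (
	simpleExplain  explainType = "simple"
	executeExplain explainType = "execute"
)

// planNode is a stand-in for the real planNode interface.
type planNode interface{}

type explainablePlanNode interface {
	planNode
	ExplainSimple() (map[string]any, error)
	ExplainExecute() (map[string]any, error)
}

// explainNode performs the type switch once, so individual nodes no longer
// branch on the explain type themselves.
func explainNode(n explainablePlanNode, t explainType) (map[string]any, error) {
	switch t {
	case simpleExplain:
		return n.ExplainSimple()
	case executeExplain:
		return n.ExplainExecute()
	default:
		return nil, fmt.Errorf("unknown explain type: %s", t)
	}
}
```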
It's a good point. I think I went this way because I didn't want to add another function to this interface:

```go
type explainablePlanNode interface {
	planNode
	Explain(explainType request.ExplainType) (map[string]any, error)
}
```
Which would then result in another public function on all the explainable nodes:
averageNode
countNode
createNode
dagScanNode
deleteNode
groupNode
limitNode
orderNode
scanNode
selectNode
selectTopNode
sumNode
topLevelNode
typeIndexJoin
updateNode
How strong of a preference do you have for this?
Nothing too strong, as it doesn't really affect anything besides the explain code - I just think it would be slightly nicer. If you prefer it as is, for sure keep it as is - you'll almost certainly be working with it more than me :)
Actually, the potential upside of making it two explicit methods instead of one is controlling which nodes actually need a "custom" explain implementation, and which can just skate by with either no explain, or a basic `wrappedExplain` that only tracks some basic info common across all nodes (like number of invocations, results, etc.).
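As a loose sketch of that idea (the type, fields, and method names below are hypothetical, just to illustrate the "common info" wrapper):

```go
// Hypothetical sketch of a shared execute-explain helper that tracks info
// common to all nodes, so only some nodes need a custom implementation.
package explainsketch

type wrappedExplain struct {
	iterations uint64 // number of Next() calls made on the node
	results    uint64 // number of results the node produced
}

// onNext would be called from the node's Next() to record one invocation.
func (w *wrappedExplain) onNext(hasResult bool) {
	w.iterations++
	if hasResult {
		w.results++
	}
}

// explainExecute returns the basic datapoints shared by every node.
func (w *wrappedExplain) explainExecute() map[string]any {
	return map[string]any{
		"iterations": w.iterations,
		"results":    w.results,
	}
}
```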
All good points; I will keep in mind to split these outside this PR when implementing `debug` or `predict` explain. Keeping as is for now, unless someone has a really strong preference.

EDIT: I had a merge conflict that was due to the rename of `plan` to `planNode`. The extra line diff you see is due to that resolution.
Force-pushed from 24d25ae to 491c86a.
I'm late to reviewing this PR as I had my own struggles this week with the document delete stuff, but I just took the time to go through it and it's nice work Shahzad! Andy has already approved, and John will certainly do so shortly, so I don't feel like I need to add mine.

I like that the testing covers a wide range of possibilities and that it includes a more complex set of schemas. I also appreciate the effort to get this through. 🤘
Looks amazing! 2 minuscule suggestions. Approving!
Really great job on all the tests, and going through my various refactors 👍
Force-pushed from 491c86a to 5d8eb56.
Relevant issue(s)

Resolves #326

Description

Adds the ability to return datapoints / information gathered at every planner step. The information is stored during execution, and gathered post execution.

Usage

Add `@explain(type: execute)` after the `query` or `mutation` operation.

Execute explain request for a `query` operation - example:

```graphql
query @explain(type: execute) {
  Address(groupBy: [country]) {
    country
    _group {
      city
    }
  }
}
```

Execute explain request for a `mutation` operation - example:

```graphql
mutation @explain(type: execute) {
  update_address(
    ids: ["bae-c8448e47-6cd1-571f-90bd-364acb80da7b"],
    data: "{\"country\": \"USA\"}"
  ) {
    country
    city
  }
}
```

For Reviewers

Note:
- `TotalElapsedTime` datapoint for the request, however, it will need to wait until the explain testing framework is integrated properly (Integrate the new explain test setup into the new test action system #1243) and we can control how we want to test the varying time datapoint.
- `typeIndexJoin`, as most of the join execution happens under `typeJoinMany` and `typeJoinOne`. In the future perhaps make them explainable nodes (to avoid the hacky explaining like we did for `simple` explain).
- Perhaps split the `Explain()` interface function into separate `SimpleExplain()` and `ExecuteExplain()` functions.

Need Feedback:
- Wondering if we should have a `verbose = false/true` option to add the ability to hide some results like the actual document results? Resolved: if needed, can be added in a later PR.

Tasks

How has this been tested?

CI running the integration tests.

Specify the platform(s) on which this was tested: