Add task exclude filter #844

ebadyano · 2019-12-16T21:06:20Z

Closes #496

ebadyano · 2019-12-16T21:07:55Z

tests/track/loader_test.py

+
+        filtered = loader.filter_tasks(full_track, [track.TaskNameFilter("index-3")], exclude=True)
+
+        schedule = filtered.challenges[0].schedule


If we exclude one of the parallel tasks, then nothing in that group runs. Should I correct that behaviour, or does it make sense as it is?

I think if we exclude a single task in a parallel element, only that single one should be excluded but all others should still be included.
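To make the expected behaviour concrete, here is a minimal, self-contained sketch. It is not Rally's implementation: `Task`, `Parallel` and `exclude_by_name` are hypothetical stand-ins, and the task names other than `index-3` are made up for illustration. It only shows the idea that excluding one task inside a parallel element removes that task and keeps its siblings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    # Hypothetical stand-in for a single task in a schedule.
    name: str


@dataclass
class Parallel:
    # Hypothetical stand-in for a parallel element that groups several tasks.
    tasks: List[Task] = field(default_factory=list)


def exclude_by_name(schedule, excluded_names):
    """Return a new schedule without the tasks whose names are excluded."""
    result = []
    for element in schedule:
        if isinstance(element, Parallel):
            # Filter inside the parallel element instead of dropping it as a whole.
            remaining = [t for t in element.tasks if t.name not in excluded_names]
            if remaining:
                result.append(Parallel(remaining))
        elif element.name not in excluded_names:
            result.append(element)
    return result


schedule = [
    Task("create-index"),
    Parallel([Task("index-1"), Task("index-2"), Task("index-3")]),
    Task("force-merge"),
]
filtered = exclude_by_name(schedule, {"index-3"})
```

In this sketch the parallel element survives with `index-1` and `index-2`; it is only dropped entirely when every task inside it is excluded.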

I had another look, and I meant something different from what is implemented at the moment. I have visualized the challenge index-and-query-logs-fixed-daily-volume from the eventdata track here, which contains such a parallel element (I've simplified it to only simulate a single day):

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 5 parallel tasks (12 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)
   3.2 current-kibana-traffic-country-dashboard_60m-querying-100%-utilization
   3.3 current-kibana-discover_30m-querying-100%-utilization
   3.4 current-kibana-traffic-dashboard_30m-querying-100%-utilization
   3.5 current-kibana-content_issues-dashboard_30m-querying-100%-utilization

If we specify --exclude-tasks="type:search", I expect the following schedule:

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 1 parallel task (8 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)

but instead we get:

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 5 parallel tasks (12 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)
   3.2 current-kibana-traffic-country-dashboard_60m-querying-100%-utilization
   3.3 current-kibana-discover_30m-querying-100%-utilization
   3.4 current-kibana-traffic-dashboard_30m-querying-100%-utilization
   3.5 current-kibana-content_issues-dashboard_30m-querying-100%-utilization

Can you please have another look?

FYI, I've created this output with a new info subcommand that I've added to Rally. It already includes support for your change, but as it is not merged yet, I've only pushed it as draft PR #850.

Scratch that. As discussed offline, the operation type is kibana, not search, so the filter is actually working fine. Sorry for the confusion. :)
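For anyone reading along later, the confusion above can be illustrated with a small sketch. This is not Rally's code: `Task`, `parse_filter` and `exclude` are hypothetical helpers, and the operation type `bulk` for the indexing task is an assumption. The only detail taken from this thread is that the querying tasks have the operation type kibana, so a `type:search` filter does not match them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical stand-in: a task name plus the type of its operation.
    name: str
    operation_type: str


def parse_filter(spec):
    """Turn 'type:<t>' into a type predicate and anything else into a name predicate."""
    if spec.startswith("type:"):
        wanted = spec[len("type:"):]
        return lambda task: task.operation_type == wanted
    return lambda task: task.name == spec


def exclude(tasks, specs):
    """Drop every task matched by at least one filter in the comma-separated specs."""
    predicates = [parse_filter(s.strip()) for s in specs.split(",")]
    return [t for t in tasks if not any(p(t) for p in predicates)]


parallel_tasks = [
    # The operation type "bulk" is assumed for illustration only.
    Task("bulk-index-logs-100%-utilization", "bulk"),
    Task("current-kibana-discover_30m-querying-100%-utilization", "kibana"),
]

unchanged = exclude(parallel_tasks, "type:search")   # no task has type "search"
only_bulk = exclude(parallel_tasks, "type:kibana")   # removes the kibana task
```

With `type:search` nothing is removed because no task has that operation type, which matches the schedule shown above; with `type:kibana` only the bulk indexing task would remain.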

danielmitterdorfer

I did an initial pass and it looks quite good already. Can you please also add the new flag to the command line reference documentation?

danielmitterdorfer · 2019-12-17T07:00:46Z

tests/track/loader_test.py

@@ -1257,21 +1257,21 @@ def test_sets_absolute_path(self, path_exists):

 class TrackFilterTests(TestCase):


Can you please change the test case method names? They sometimes reference the term "included" which is outdated now.

danielmitterdorfer · 2019-12-17T07:06:49Z

tests/track/loader_test.py

+
+        filtered = loader.filter_tasks(full_track, [track.TaskNameFilter("index-3")], exclude=True)
+
+        schedule = filtered.challenges[0].schedule


I think if we exclude a single task in a parallel element, only that single one should be excluded but all others should still be included.

danielmitterdorfer

Thanks for iterating. I left a few more comments.

danielmitterdorfer · 2019-12-18T07:32:17Z

tests/track/loader_test.py

+
+        filtered = loader.filter_tasks(full_track, [track.TaskNameFilter("index-3")], exclude=True)
+
+        schedule = filtered.challenges[0].schedule


I had another look, and I meant something different from what is implemented at the moment. I have visualized the challenge index-and-query-logs-fixed-daily-volume from the eventdata track here, which contains such a parallel element (I've simplified it to only simulate a single day):

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 5 parallel tasks (12 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)
   3.2 current-kibana-traffic-country-dashboard_60m-querying-100%-utilization
   3.3 current-kibana-discover_30m-querying-100%-utilization
   3.4 current-kibana-traffic-dashboard_30m-querying-100%-utilization
   3.5 current-kibana-content_issues-dashboard_30m-querying-100%-utilization

If we specify --exclude-tasks="type:search", I expect the following schedule:

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 1 parallel task (8 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)

but instead we get:

1. measure-maximum-utilization (8 clients)
2. delete-measurement-index
3. 5 parallel tasks (12 clients):
   3.1 bulk-index-logs-100%-utilization (8 clients)
   3.2 current-kibana-traffic-country-dashboard_60m-querying-100%-utilization
   3.3 current-kibana-discover_30m-querying-100%-utilization
   3.4 current-kibana-traffic-dashboard_30m-querying-100%-utilization
   3.5 current-kibana-content_issues-dashboard_30m-querying-100%-utilization

Can you please have another look?

danielmitterdorfer · 2019-12-18T07:32:42Z

docs/command_line_reference.rst

+``exclude-tasks``
+~~~~~~~~~~~~~~~~~
+
+Similarly to :ref:`include-tasks <clr_include_tasks>` when challenge consists of one or more tasks you might be interested in excluding a single operations but include the rest.


nit: when a challenge.

danielmitterdorfer · 2019-12-18T07:34:19Z

docs/adding_tracks.rst

@@ -305,7 +305,7 @@ To specify different workloads in the same track you can use so-called challenge

 When should you use challenges? Challenges are useful when you want to run completely different workloads based on the same track but for the majority of cases you should get away without using challenges:

-* To run only a subset of the tasks, you can use :ref:`task filtering <clr_include_tasks>`, e.g. ``--include-tasks="create-index,bulk"`` will only run these two tasks in the track above.
+* To run only a subset of the tasks, you can use :ref:`task filtering <clr_include_tasks>`, e.g. ``--include-tasks="create-index,bulk"`` will only run these two tasks in the track above or ``--exclude-tasks="bulk"`` will run all tasks except for `bulk`


Nit: missing period.

You need to use double-backticks so it is

``bulk``

instead of

`bulk`

danielmitterdorfer · 2019-12-18T07:34:56Z

docs/command_line_reference.rst

+
+**Examples**:
+
+* Do not execute any tasks with the name ``index`` and ``term``: ``--exclude-tasks="index,term"``


Suggestion: Use "Skip" instead of "Do not execute"?

danielmitterdorfer

Thanks! LGTM

Add task exclude filter

ad98efe

Closes elastic#496

ebadyano added the enhancement, :Usability and :Load Driver labels Dec 16, 2019

ebadyano added this to the 1.4.0 milestone Dec 16, 2019

ebadyano requested a review from danielmitterdorfer December 16, 2019 21:06

ebadyano commented Dec 16, 2019

danielmitterdorfer reviewed Dec 17, 2019

Address Daniel's comments

704340e

danielmitterdorfer reviewed Dec 18, 2019

danielmitterdorfer mentioned this pull request Dec 18, 2019

Remove frozen-data-generation challenge elastic/rally-eventdata-track#66

Open

Fix docs

87a5b07

ebadyano requested a review from danielmitterdorfer December 18, 2019 16:35

danielmitterdorfer approved these changes Dec 19, 2019

Merge remote-tracking branch 'origin/master' into master-filter

460db6d

ebadyano merged commit 15948fb into elastic:master Dec 19, 2019

ebadyano added a commit to ebadyano/rally that referenced this pull request Jan 10, 2020

Add support for excluded tasks in chart_generator

c376fde

Relates: elastic#844

ebadyano mentioned this pull request Jan 10, 2020

Add support for excluded tasks in chart_generator #862

Merged

ebadyano added a commit that referenced this pull request Jan 10, 2020

Add support for excluded tasks in chart_generator (#862)

7a9722b

Relates: #844

ebadyano deleted the master-filter branch December 16, 2022 15:16
