
Add an asyncio-based load generator #935

Merged: 52 commits, Mar 29, 2020

Conversation

@danielmitterdorfer (Member) commented on Mar 20, 2020:

With this commit we add a new experimental subcommand race-async to Rally. It
allows specifying significantly more clients than the current race subcommand.
The reason is that under the hood, race-async uses asyncio and runs
all clients in a single event loop. By contrast, race uses an actor
system under the hood and maps each client to one process.

As the new subcommand is very experimental and not yet meant to be used broadly,
there is no accompanying user documentation in this PR. Instead, we plan to
build on top of this PR and expand the load generator to take advantage of
multiple cores before we consider this usable in production (it will likely keep
its experimental status though).
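To illustrate the core idea (this is a minimal sketch, not Rally's actual implementation): many clients can run as coroutines sharing a single asyncio event loop, whereas one OS process per client becomes prohibitively heavy-weight at the same scale.

```python
import asyncio

async def client(client_id: int, num_requests: int) -> int:
    completed = 0
    for _ in range(num_requests):
        await asyncio.sleep(0)  # stand-in for an awaited HTTP request to Elasticsearch
        completed += 1
    return completed

async def run_clients(num_clients: int, num_requests: int) -> int:
    # all clients run concurrently on one event loop in one process
    results = await asyncio.gather(
        *(client(i, num_requests) for i in range(num_clients))
    )
    return sum(results)

total = asyncio.run(run_clients(num_clients=1000, num_requests=10))
print(total)  # 10000
```

Spawning 1000 processes for the same simulation would exhaust most machines; 1000 coroutines are cheap.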

In this PR we also implement a compatibility layer in the current load
generator so that both now work with asyncio internally. Consequently, we have
already adapted all Rally tracks with a backwards-compatibility layer (see
elastic/rally-tracks#97 and elastic/rally-eventdata-track#80).

Closes #852
Relates #916

With this commit we add an async load generator implementation. This
implementation is work in progress, extremely incomplete and hacky. We
also implement an async compatibility layer into the previous load
generator which allows us to compare both load generator implementations
in realistic scenarios.
With this commit we bump the minimum required Python version to Python
3.6 (thus dropping support for Python 3.5). Python 3.5 will be end of
life on September 13, 2020 (source: [1]). We also intend to use several
features that require at least Python 3.6 in future versions of Rally;
thus we drop support for Python 3.5 now.

[1] https://devguide.python.org/#status-of-python-branches
With this commit we change Rally's internal implementation to always use
the async code path so runner implementations stay the same.
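A sync-to-async compatibility shim of the kind described above could look like this (a hypothetical sketch, not Rally's actual code; `bulk_index` and `as_async` are illustrative names): legacy synchronous runner functions keep their signature, and a wrapper turns them into coroutines so the driver can await every runner uniformly.

```python
import asyncio
import inspect

def bulk_index(es, params):
    # a legacy, synchronous runner (hypothetical example)
    return {"weight": params["bulk-size"], "unit": "docs"}

def as_async(runner):
    """Return the runner unchanged if it is already a coroutine function,
    otherwise wrap it so the async driver can await it."""
    if inspect.iscoroutinefunction(runner):
        return runner
    async def wrapper(es, params):
        return runner(es, params)
    return wrapper

result = asyncio.run(as_async(bulk_index)(None, {"bulk-size": 500}))
print(result)  # {'weight': 500, 'unit': 'docs'}
```

With such a shim, existing runner implementations do not need to change while the driver moves to a single async code path.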
@danielmitterdorfer (Member, Author) commented:
@dliappis In my benchmarks I have noticed that the new load generator added a small overhead (1-2 ms) which was noticeable in queries with service times in the single-digit millisecond range. It turns out that this was caused by parsing the JSON response (although parsing is already event-based and lazy). Switching the parser backend from Python to C was (complex and) ineffective; thus I have added a parameter to skip response parsing entirely (skipping is enabled by default). Can you please take another look?
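The effect described here can be sketched as follows (hypothetical code, not Rally's implementation; `process_response` is an illustrative name): when detailed results are disabled, only the HTTP status is inspected and the response body is never parsed, so the per-request parsing cost disappears.

```python
import json

def process_response(status: int, body: bytes, detailed_results: bool) -> dict:
    meta = {"success": status < 400}
    if detailed_results:
        # parsing cost grows with the response size; for queries with
        # single-digit-ms service times this overhead is measurable
        parsed = json.loads(body)
        meta["took"] = parsed.get("took")
        meta["timed_out"] = parsed.get("timed_out")
    return meta

body = b'{"took": 3, "timed_out": false, "hits": {"total": 42}}'
fast = process_response(200, body, detailed_results=False)
print(fast)  # {'success': True}
```

The trade-off: with parsing skipped, fields such as `took` and `timed_out` are no longer available in the recorded meta-data.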

@dliappis (Contributor) left a comment:


I reviewed the latest changes. I left a comment about a contradiction regarding the default value of detailed-results between different places in the docs.

docs/track.rst Outdated
@@ -402,6 +402,7 @@ With the operation type ``search`` you can execute `request body searches <http:
2. Rally will not attempt to serialize the parameters and pass them as is. Always use "true" / "false" strings for boolean parameters (see example below).

* ``body`` (mandatory): The query body.
* ``detailed-results`` (optional, defaults to ``true``): Records more detailed meta-data about queries. As it analyzes the corresponding response in more detail, this might incur additional overhead which can skew measurement results. Set this value to ``false`` for queries that return within single digit milliseconds to increase measurement accuracy. This flag is ineffective for scroll queries.
@dliappis (Contributor) commented on Mar 26, 2020:
Perhaps we should mention some important fields like ``took`` and ``hits`` from Elasticsearch that won't be captured when ``detailed-results`` is ``false``; it might save some head scratching for people who disable this and wonder what they'll be losing?

I see it is included in the migration docs but it could be mentioned here for reference?

@danielmitterdorfer (Member, Author) replied:

I pushed 5ab9523.

* ``timed_out``
* ``took``

If you still want to retrieve them (risking skewed results due to additional overhead), set the new property ``detailed-results`` to ``true`` for any operation of type ``search``.
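For reference, a track operation that re-enables detailed results might look like this (a hypothetical example; the operation name and query body are illustrative, not taken from an actual track):

```json
{
  "operation": {
    "name": "term-query",
    "operation-type": "search",
    "detailed-results": true,
    "body": {
      "query": {
        "term": {
          "status": "published"
        }
      }
    }
  }
}
```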
@dliappis (Contributor) commented:

This contradicts what we are saying in 4fcaf9d#diff-a4e117fd9659940c6f1756d76f5dfe5cR405, i.e. that at least for the search operation the default is true?

@danielmitterdorfer (Member, Author) replied:

I changed this to false already in 4c2e78a?

@dliappis (Contributor) replied:

Ah! Ok then.

@dliappis dliappis self-requested a review March 26, 2020 10:39
@dliappis (Contributor) left a comment:

LGTM

@danielmitterdorfer danielmitterdorfer merged commit 3b5eee2 into elastic:master Mar 29, 2020
danielmitterdorfer added a commit to danielmitterdorfer/rally-tracks that referenced this pull request Mar 30, 2020
With this commit we change some of our tracks to make sure nightly
Elasticsearch benchmarks continue to run smoothly with the new asyncio
load generator (see elastic/rally#935):

* We lower the target throughput of some queries
* We disable HTTP response compression for PMC scroll queries

Relates elastic/rally#935
Relates elastic/rally#941
danielmitterdorfer added a commit to elastic/rally-tracks that referenced this pull request Mar 30, 2020
With this commit we change some of our tracks to make sure nightly
Elasticsearch benchmarks continue to run smoothly with the new asyncio
load generator (see elastic/rally#935):

* We lower the target throughput of some queries
* We disable HTTP response compression for PMC scroll queries

Relates elastic/rally#935
Relates elastic/rally#941
danielmitterdorfer added a commit that referenced this pull request Mar 30, 2020
With this commit we improve response processing speed by taking the following
measures:

1. We use the raw bytes for JSON parsing instead of first converting them into a
string.
2. We expose an additional option for scroll queries to disable HTTP response
compression as we have seen that this can cause significant overhead for very
large responses (large documents). The default is (still) to have response
compression enabled.

Our experiments have shown the following changes to the 50th percentile service
time for the PMC track:

* Baseline: 3203 ms
* With measure 1: 3014 ms
* With measure 1 and measure 2: 774 ms

Relates #935
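Measure 1 relies on the fact that since Python 3.6, `json.loads` accepts `bytes` directly, so the response body no longer needs to be decoded into a `str` first. A minimal sketch (not Rally's actual code):

```python
import json

raw = b'{"_scroll_id": "abc", "hits": {"total": 2}}'

# before: decode the body into a str, then parse (an extra full copy of the body)
parsed_via_str = json.loads(raw.decode("utf-8"))

# after: parse the raw bytes directly (supported since Python 3.6)
parsed_via_bytes = json.loads(raw)

print(parsed_via_bytes == parsed_via_str)  # True
```

For multi-megabyte scroll responses, avoiding the intermediate string saves both an allocation and a full copy of the body per request.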
danielmitterdorfer added a commit that referenced this pull request Apr 6, 2020
With this commit we change how clients are assigned to worker processes in the
load generator. While historically Rally has assigned one worker (process) to
one client, with the changes done in #935 we can assign multiple clients to a
single worker process now and run the clients in an asyncio event loop. This
allows us to simulate many more clients than is possible with the process-based
approach, which is very heavy-weight.

By default Rally will only create as many workers as there are CPUs available on
the system. This choice can be overridden in Rally's config file with the
configuration setting `available.cores` in the section `system` to allocate more
or fewer workers. If multiple load generator machines are used, note that we
assume that all of them have the same hardware specifications and we only take
the coordinating machine's CPU count into account.
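The assignment described above can be sketched as follows (hypothetical code, not Rally's implementation; `assign_clients` is an illustrative name): clients are spread round-robin across one worker per available CPU core, and each worker then runs its clients on its own asyncio event loop.

```python
import os

def assign_clients(num_clients: int, num_workers: int) -> list:
    """Round-robin client IDs across workers."""
    workers = [[] for _ in range(num_workers)]
    for client_id in range(num_clients):
        workers[client_id % num_workers].append(client_id)
    return workers

# by default, one worker per CPU core on the coordinating machine
num_workers = os.cpu_count() or 1

print(assign_clients(10, 4))  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

This keeps the per-process cost bounded by the core count while the number of simulated clients per worker can grow much larger.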
danielmitterdorfer added a commit to danielmitterdorfer/rally that referenced this pull request Aug 5, 2020
In elastic#935 we added a new metric `processing time` to the command line
report that can be used to determine Rally's internal overhead. While
the regular command line report shows this metric if configured in
`rally.ini`, the comparison report did not. With this commit we show
`processing time` in the comparison report as well.

Relates elastic#935
danielmitterdorfer added a commit that referenced this pull request Aug 13, 2020
In #935 we added a new metric `processing time` to the command line
report that can be used to determine Rally's internal overhead. While
the regular command line report shows this metric if configured in
`rally.ini`, the comparison report did not. With this commit we show
`processing time` in the comparison report as well.

Relates #935
@danielmitterdorfer danielmitterdorfer deleted the async branch December 3, 2020 12:26
Labels
* enhancement: Improves the status quo
* highlight: A substantial improvement that is worth mentioning separately in release notes
* :Load Driver: Changes that affect the core of the load driver such as scheduling, the measurement approach etc.
Development

Successfully merging this pull request may close these issues.

An actorless load generator
2 participants