
Add an asyncio-based load generator #935

Merged: 52 commits, Mar 29, 2020

Conversation

@danielmitterdorfer (Member) commented on Mar 20, 2020:

With this commit we add a new experimental subcommand race-async to Rally. It
allows specifying significantly more clients than the current race subcommand.
The reason is that under the hood, race-async uses asyncio and runs
all clients in a single event loop. By contrast, race uses an actor
system under the hood and maps each client to one process.

As the new subcommand is very experimental and not yet meant to be used broadly,
there is no accompanying user documentation in this PR. Instead, we plan to
build on top of this PR and expand the load generator to take advantage of
multiple cores before we consider this usable in production (it will likely keep
its experimental status though).
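To illustrate the core idea (this is a minimal sketch, not Rally's actual implementation): many clients can run as coroutines sharing a single asyncio event loop, whereas one OS process per client becomes prohibitively heavy-weight at the same scale.

```python
import asyncio

async def client(client_id: int, num_requests: int) -> int:
    completed = 0
    for _ in range(num_requests):
        await asyncio.sleep(0)  # stand-in for an awaited HTTP request to Elasticsearch
        completed += 1
    return completed

async def run_clients(num_clients: int, num_requests: int) -> int:
    # all clients run concurrently on one event loop in one process
    results = await asyncio.gather(
        *(client(i, num_requests) for i in range(num_clients))
    )
    return sum(results)

total = asyncio.run(run_clients(num_clients=1000, num_requests=10))
print(total)  # 10000
```

Spawning 1000 processes for the same simulation would exhaust most machines; 1000 coroutines are cheap.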

In this PR we also implement a compatibility layer in the current load
generator so that both now work with asyncio internally. Consequently, we have
already adapted all Rally tracks with a backwards-compatibility layer (see
elastic/rally-tracks#97 and elastic/rally-eventdata-track#80).

Closes #852
Relates #916

With this commit we add an async load generator implementation. This
implementation is work in progress, extremely incomplete and hacky. We
also implement an async compatibility layer into the previous load
generator which allows us to compare both load generator implementations
in realistic scenarios.
With this commit we bump the minimum required Python version to Python
3.6 (thus dropping support for Python 3.5). Python 3.5 will be end of
life on September 13, 2020 (source: [1]). We also intend to use several
features that require at least Python 3.6 in future versions of Rally;
thus we drop support for Python 3.5 now.

[1] https://devguide.python.org/#status-of-python-branches
With this commit we change Rally's internal implementation to always use
the async code path so runner implementations stay the same.
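A sync-to-async compatibility shim of the kind described above could look like this (a hypothetical sketch, not Rally's actual code; `bulk_index` and `as_async` are illustrative names): legacy synchronous runner functions keep their signature, and a wrapper turns them into coroutines so the driver can await every runner uniformly.

```python
import asyncio
import inspect

def bulk_index(es, params):
    # a legacy, synchronous runner (hypothetical example)
    return {"weight": params["bulk-size"], "unit": "docs"}

def as_async(runner):
    """Return the runner unchanged if it is already a coroutine function,
    otherwise wrap it so the async driver can await it."""
    if inspect.iscoroutinefunction(runner):
        return runner
    async def wrapper(es, params):
        return runner(es, params)
    return wrapper

result = asyncio.run(as_async(bulk_index)(None, {"bulk-size": 500}))
print(result)  # {'weight': 500, 'unit': 'docs'}
```

With such a shim, existing runner implementations do not need to change while the driver moves to a single async code path.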
@danielmitterdorfer (Member, Author) commented:
@dliappis In my benchmarks I have noticed that the new load generator added a small overhead (1-2 ms) which was noticeable in queries with service times in the single-digit millisecond range. It turns out that this was caused by parsing the JSON response (although parsing is already event-based and lazy). Switching the parser backend from Python to C was (complex and) ineffective; thus I have added a parameter to skip response parsing entirely (skipping is enabled by default). Can you please take another look?
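The effect described here can be sketched as follows (hypothetical code, not Rally's implementation; `process_response` is an illustrative name): when detailed results are disabled, only the HTTP status is inspected and the response body is never parsed, so the per-request parsing cost disappears.

```python
import json

def process_response(status: int, body: bytes, detailed_results: bool) -> dict:
    meta = {"success": status < 400}
    if detailed_results:
        # parsing cost grows with the response size; for queries with
        # single-digit-ms service times this overhead is measurable
        parsed = json.loads(body)
        meta["took"] = parsed.get("took")
        meta["timed_out"] = parsed.get("timed_out")
    return meta

body = b'{"took": 3, "timed_out": false, "hits": {"total": 42}}'
fast = process_response(200, body, detailed_results=False)
print(fast)  # {'success': True}
```

The trade-off: with parsing skipped, fields such as `took` and `timed_out` are no longer available in the recorded meta-data.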

@dliappis (Contributor) left a comment:


I reviewed the latest changes. I left a comment about a contradiction regarding the default value of detailed-results between different places in the docs.

docs/track.rst Outdated
@@ -402,6 +402,7 @@ With the operation type ``search`` you can execute `request body searches <http:
2. Rally will not attempt to serialize the parameters and pass them as is. Always use "true" / "false" strings for boolean parameters (see example below).

* ``body`` (mandatory): The query body.
* ``detailed-results`` (optional, defaults to ``true``): Records more detailed meta-data about queries. As it analyzes the corresponding response in more detail, this might incur additional overhead which can skew measurement results. Set this value to ``false`` for queries that return within single digit milliseconds to increase measurement accuracy. This flag is ineffective for scroll queries.
@dliappis (Contributor) commented on Mar 26, 2020:
Perhaps we should mention some important fields like ``took`` and ``hits`` from Elasticsearch that won't be captured when ``detailed-results`` is ``false``; it might save some head scratching for people who disable this and wonder what they'll be losing?

I see it is included in the migration docs but it could be mentioned here for reference?

@danielmitterdorfer (Member, Author) replied:

I pushed 5ab9523.

* ``timed_out``
* ``took``

If you still want to retrieve them (risking skewed results due to additional overhead), set the new property ``detailed-results`` to ``true`` for any operation of type ``search``.
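For reference, a track operation that re-enables detailed results might look like this (a hypothetical example; the operation name and query body are illustrative, not taken from an actual track):

```json
{
  "operation": {
    "name": "term-query",
    "operation-type": "search",
    "detailed-results": true,
    "body": {
      "query": {
        "term": {
          "status": "published"
        }
      }
    }
  }
}
```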
@dliappis (Contributor) commented:

This contradicts what we are saying in 4fcaf9d#diff-a4e117fd9659940c6f1756d76f5dfe5cR405, i.e. that at least for the search operation the default is true?

@danielmitterdorfer (Member, Author) replied:

I changed this to false already in 4c2e78a?

@dliappis (Contributor) replied:

Ah! Ok then.

@dliappis dliappis self-requested a review March 26, 2020 10:39
@dliappis (Contributor) left a comment:

LGTM

@danielmitterdorfer danielmitterdorfer merged commit 3b5eee2 into elastic:master Mar 29, 2020
danielmitterdorfer added a commit to danielmitterdorfer/rally-tracks that referenced this pull request Mar 30, 2020
With this commit we change some of our tracks to make sure nightly
Elasticsearch benchmarks continue to run smoothly with the new asyncio
load generator (see elastic/rally#935):

* We lower the target throughput of some queries
* We disable HTTP response compression for PMC scroll queries

Relates elastic/rally#935
Relates elastic/rally#941
danielmitterdorfer added a commit to elastic/rally-tracks that referenced this pull request Mar 30, 2020
With this commit we change some of our tracks to make sure nightly
Elasticsearch benchmarks continue to run smoothly with the new asyncio
load generator (see elastic/rally#935):

* We lower the target throughput of some queries
* We disable HTTP response compression for PMC scroll queries

Relates elastic/rally#935
Relates elastic/rally#941
danielmitterdorfer added a commit that referenced this pull request Mar 30, 2020
With this commit we improve response processing speed by taking the following
measures:

1. We use the raw bytes for JSON parsing instead of first converting them into a
string.
2. We expose an additional option for scroll queries to disable HTTP response
compression as we have seen that this can cause significant overhead for very
large responses (large documents). The default is (still) to have response
compression enabled.

Our experiments have shown the following changes to the 50th percentile service
time for the PMC track:

* Baseline: 3203 ms
* With measure 1: 3014 ms
* With measure 1 and measure 2: 774 ms

Relates #935
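Measure 1 relies on the fact that since Python 3.6, `json.loads` accepts `bytes` directly, so the response body no longer needs to be decoded into a `str` first. A minimal sketch (not Rally's actual code):

```python
import json

raw = b'{"_scroll_id": "abc", "hits": {"total": 2}}'

# before: decode the body into a str, then parse (an extra full copy of the body)
parsed_via_str = json.loads(raw.decode("utf-8"))

# after: parse the raw bytes directly (supported since Python 3.6)
parsed_via_bytes = json.loads(raw)

print(parsed_via_bytes == parsed_via_str)  # True
```

For multi-megabyte scroll responses, avoiding the intermediate string saves both an allocation and a full copy of the body per request.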
danielmitterdorfer added a commit that referenced this pull request Apr 6, 2020
With this commit we change how clients are assigned to worker processes in the
load generator. While historically Rally has assigned one worker (process) to
one client, with the changes done in #935 we can assign multiple clients to a
single worker process now and run the clients in an asyncio event loop. This
allows us to simulate many more clients than is possible with the process-based
approach, which is very heavy-weight.

By default Rally will only create as many workers as there are CPUs available on
the system. This choice can be overridden in Rally's config file with the
configuration setting `available.cores` in the section `system` to allocate more
or fewer workers. If multiple load generator machines are used, note that we
assume that all of them have the same hardware specifications and we only take
the coordinating machine's CPU count into account.
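The assignment described above can be sketched as follows (hypothetical code, not Rally's implementation; `assign_clients` is an illustrative name): clients are spread round-robin across one worker per available CPU core, and each worker then runs its clients on its own asyncio event loop.

```python
import os

def assign_clients(num_clients: int, num_workers: int) -> list:
    """Round-robin client IDs across workers."""
    workers = [[] for _ in range(num_workers)]
    for client_id in range(num_clients):
        workers[client_id % num_workers].append(client_id)
    return workers

# by default, one worker per CPU core on the coordinating machine
num_workers = os.cpu_count() or 1

print(assign_clients(10, 4))  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

This keeps the per-process cost bounded by the core count while the number of simulated clients per worker can grow much larger.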
danielmitterdorfer added a commit to danielmitterdorfer/rally that referenced this pull request Aug 5, 2020
In elastic#935 we added a new metric `processing time` to the command line
report that can be used to determine Rally's internal overhead. While
the regular command line report shows this metric if configured in
`rally.ini`, the comparison report did not. With this commit we show
`processing time` in the comparison report as well.

Relates elastic#935
danielmitterdorfer added a commit that referenced this pull request Aug 13, 2020
In #935 we added a new metric `processing time` to the command line
report that can be used to determine Rally's internal overhead. While
the regular command line report shows this metric if configured in
`rally.ini`, the comparison report did not. With this commit we show
`processing time` in the comparison report as well.

Relates #935
@danielmitterdorfer danielmitterdorfer deleted the async branch December 3, 2020 12:26
Labels
* enhancement: Improves the status quo
* highlight: A substantial improvement that is worth mentioning separately in release notes
* :Load Driver: Changes that affect the core of the load driver such as scheduling, the measurement approach etc.
Development

Successfully merging this pull request may close these issues.

An actorless load generator
2 participants