Decouple operation scheduling from execution in load generator #108

Closed
10 tasks done
danielmitterdorfer opened this issue Jun 29, 2016 · 3 comments
Labels
meta A high-level issue of a larger topic which requires more fine-grained issues / PRs
Milestone

danielmitterdorfer commented Jun 29, 2016

Currently, we couple scheduling and execution rather tightly in the load generator. This leads to multiple issues (see #58 and #64 for examples) and also constrains load generation to a single node. As Rally should eventually be able to benchmark multi-node clusters (#71), a single load generator machine will at some point no longer be able to generate the necessary load.

Therefore, we need to decouple scheduling from execution.
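
To illustrate the core idea, here is a minimal sketch (illustrative only, not Rally's actual implementation; all names below are made up). The main thread computes, from the target throughput, when each operation is due and hands execution off to a pool of workers. Since the scheduler never waits for a response, a slow operation cannot delay subsequent ones, and measuring latency against the *scheduled* start time is what eliminates coordinated omission:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def execute(op, expected_start):
    # The worker measures two things: service time (from the actual request
    # start) and latency (from the *scheduled* start, i.e. including any
    # time the operation spent waiting because the system fell behind).
    actual_start = time.perf_counter()
    op()  # e.g. one bulk indexing request against Elasticsearch
    end = time.perf_counter()
    return end - actual_start, end - expected_start

def run_schedule(ops, target_throughput, executor):
    # The main thread only decides *when* each operation is due. It never
    # waits for responses, so a slow operation cannot skew the schedule.
    interval = 1.0 / target_throughput
    start = time.perf_counter()
    futures = []
    for i, op in enumerate(ops):
        expected_start = start + i * interval
        while time.perf_counter() < expected_start:
            time.sleep(0.0005)
        futures.append(executor.submit(execute, op, expected_start))
    return [f.result() for f in futures]

with ThreadPoolExecutor(max_workers=8) as executor:
    # 100 no-op "operations" at a constant 1000 ops/s
    results = run_schedule([lambda: None] * 100, 1000, executor)
```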

Sub tasks:

  • Abstract execution of operations (no more specific IndexBenchmark or LatencyBenchmark). For each operation, we'll gather throughput and latency metrics.
  • Adapt metrics store and reporter to abstract execution format
  • Adapt race store and tournament reporter to abstract execution format (needed for tournament mode)
  • Do scheduling on the main thread and offload execution to a dedicated executor (this will finally eliminate coordinated omission as we can achieve constant throughput)
  • Introduce a "role" concept in Rally: For now we'll have coordinator and load-generator nodes (later, for multi-node benchmarks we'll also have provisioner nodes). This role concept will not be exposed to the user, it is just needed internally.
  • Implement a communication mechanism between the coordinator process and the load-generator processes (most likely some kind of RPC, so we can use it later for multi-node benchmarks); see the actor sketch after this list.
  • Spawn a subprocess for each client defined in the track specification and have the coordinating Rally process communicate with the load-generators.
  • Introduce global coordination points. E.g., we may want all clients to finish indexing before they start the search phase. Note that whether we need a global coordination point depends on the track specification (see the next subtask).
  • Introduce the notion of parallel tasks (e.g. index with n clients and search with m clients in parallel). At a minimum, we need the possibility to restore the old behavior of running all queries within an iteration instead of running all iterations per query.
  • Introduce the notion of traffic mixes (e.g. we want to issue 30% scroll queries and 70% phrase queries). (Taken out of scope of this ticket; tracked in the separate issue Mixed workload benchmark #119, targeting the release after.)
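
To make the coordinator / load-generator split and the global coordination points more concrete, here is a rough sketch on top of Thespian's actor API (Rally will build on Thespian, see below; the message classes and the client count are hypothetical). The coordinator spawns one load-generator actor per client, starts a phase, and only reports completion once every client has checked back in, which is exactly a global coordination point:

```python
from thespian.actors import Actor, ActorSystem

class StartPhase:
    def __init__(self, phase):
        self.phase = phase

class PhaseComplete:
    pass

class LoadGenerator(Actor):
    def receiveMessage(self, msg, sender):
        if isinstance(msg, StartPhase):
            # ... run the operations assigned to this client for the phase ...
            self.send(sender, PhaseComplete())

class Coordinator(Actor):
    def receiveMessage(self, msg, sender):
        if msg == "benchmark":
            self.caller = sender
            self.pending = 8  # one load-generator actor per client
            for _ in range(self.pending):
                generator = self.createActor(LoadGenerator)
                self.send(generator, StartPhase("index"))
        elif isinstance(msg, PhaseComplete):
            self.pending -= 1
            if self.pending == 0:
                # all clients finished indexing; the next phase may begin
                self.send(self.caller, "done")

system = ActorSystem()
coordinator = system.createActor(Coordinator)
print(system.ask(coordinator, "benchmark", 10))
system.shutdown()
```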
danielmitterdorfer added the meta label Jun 29, 2016
danielmitterdorfer added this to the 0.4.0 milestone Jun 29, 2016
danielmitterdorfer self-assigned this Jun 29, 2016
This was referenced Jun 30, 2016
This was referenced Jul 13, 2016
danielmitterdorfer modified the milestones: 0.4.0, 0.3.1 Jul 27, 2016
danielmitterdorfer commented:

We're nearing completion of this ticket. Here are a few numbers:

| Metric | Rally 0.3.2 (without #108) | Rally 0.4.0.dev (with #108) |
| --- | --- | --- |
| Min indexing throughput [docs/s] | 35701 | 35703 |
| Median indexing throughput [docs/s] | 37864 | 38195 |
| Max indexing throughput [docs/s] | 39104 | 39124 |

In both cases we ran `esrally --pipeline=benchmark-only --target-hosts="192.168.2.2:9200"`. Both machines were connected via a direct 1 GBit network, and in both cases we used 8 clients to index documents.

I also ran a stress test of the load test driver where I stubbed out Elasticsearch with mock responses returned from nginx.
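
A hypothetical nginx stub along these lines is sufficient (shown for illustration; not necessarily the exact config used): every request gets an immediate canned response, so the load test driver itself becomes the only bottleneck:

```nginx
# Answer every request with a canned Elasticsearch-style bulk response
# so that no real indexing work limits the measured throughput.
server {
    listen 9200;
    location / {
        default_type application/json;
        return 200 '{"took": 1, "errors": false, "items": []}';
    }
}
```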

| Metric | Rally 0.3.2 (without #108) | Rally 0.4.0.dev (with #108) |
| --- | --- | --- |
| Min indexing throughput [docs/s] | 129094 | 274425 |
| Median indexing throughput [docs/s] | 132230 | 338586 |
| Max indexing throughput [docs/s] | 135557 | 424786 |

Although the variance is higher, we can clearly see that the new load test driver can achieve significantly higher throughput rates.

danielmitterdorfer added the blocked label Aug 11, 2016
danielmitterdorfer commented:

Waiting on the 3.0.3 release of Thespian to avoid initialization problems in certain edge cases.

danielmitterdorfer removed the blocked label Aug 12, 2016
danielmitterdorfer commented:

The release is out. We can move on.
