Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🕷️ Rework integration tests to allow parallel test execution #257

Merged
merged 1 commit into from
May 4, 2023

Conversation

acfoltzer
Copy link
Contributor

@acfoltzer acfoltzer commented May 3, 2023

No more TEST_LOCK! The integration tests now take less than 5 seconds to execute on my workstation, vs 79 seconds previously.

This required two main pieces of work.

💽 Run the Viceroy test server on an ephemeral port

The server spun up to run incoming requests through Wasm now is bound to an ephemeral port rather than being hardcoded to port 7878, and the test requests have their URLs modified to include the ephemeral port assigned to the test.

This change technically means the Test::against*() methods can no longer be used to send requests to destinations other than the Viceroy that's been spun up for this test, but that's the only way they're currently used anyway.

The main concrete change here for test authors is that you no longer specify localhost:7878 in your test requests, you instead just provide a request path.

🤵 Redesign test backend servers to also use ephemeral ports

This is a broader-reaching change in a few ways:

  • Backends now must have a service function defined, or test setup will panic. This was already how the integration tests behaved semantically, but this does change some of the syntax around their definitions.

  • Test servers for backends are spun up once per Test, rather than per invocation of Test::against*(). The existing suite of test backends was already mostly stateless, so again not a huge impact on test authors.

  • A full URL can no longer be specified for backends. Instead, a path may be provided in order to exercise the backend path prefix behavior of Viceroy.

  • Tests that must know dynamically how to reach a backend server now must explicitly start the backend servers before getting a URI suitable for the purpose. This only shows up in dynamic backend tests.

🛍️ Grab bag

  • The level of parallelism made possible by this change meant that I had to tweak the default number of instances in the Wasmtime pooling allocator down from its default of 1000. I don't expect a limit of 100 to cause anyone problems, but this also raised the possibility of switching away from the pooling allocator for most use cases. A followup issue is available at Use the wasmtime on-demand allocator #255.

  • The additional parallelism also caused a massive CPU bottleneck when JIT compiling the test fixture Wasm modules. The main workspace now enables opt-level 1 for debug builds, substantially improving runtime performance of the tests without heavily impacting compile speed.

  • The trap-test was using a #[path = "..."] mechanism to reuse code from the integration test suite. I didn't know this was a mechanism that existed, and it was cool to learn about it. But it also was incompatible with common now having a child module, so I inlined the very small handful of definitions that test was reusing.

  • I added a couple extra functions to support cases where a test might want to define a backend config separately from its service function, for example if starting a Test configuration from a fastly.toml file. These currently go unused because none of the existing fastly.toml-using tests define any backends.

  • The new TestBackends interfaces are awfully panicky. I figure this is OK for test infrastructure, but am open to suggestions if folks feel this is unwise. I tried to at least document the various expectations, and make the panic messages relevant beyond just a line number.

  • The graceful shutdown logic is no longer strictly necessary unless we write enough tests with enough backends to run out of ports. But because it seems to work fine and is a tidy way to handle things, I left it in.

@acfoltzer acfoltzer added the maintenance Maintenance and gardening label May 3, 2023
@acfoltzer acfoltzer self-assigned this May 3, 2023
@acfoltzer acfoltzer force-pushed the acf/parallel-tests branch 2 times, most recently from a471a89 to 7f6a17c Compare May 3, 2023 23:51
No more `TEST_LOCK`! The integration tests now take less than 5 seconds to execute on my
workstation, vs 79 seconds previously.

This required two main pieces of work.

💽 Run the Viceroy test server on an ephemeral port
===================================================

The server spun up to run incoming requests through Wasm now is bound to an
ephemeral port rather than being hardcoded to port 7878, and the test requests
have their URLs modified to include the ephemeral port assigned to the test.

This change technically means the `Test::against*()` methods can no longer be
used to send requests to destinations other than the Viceroy that's been spun
up for this test, but that's the only way they're currently used anyway.

The main concrete change here for test authors is that you no longer specify
`localhost:7878` in your test requests, you instead just provide a request path.

🤵 Redesign test backend servers to also use ephemeral ports
============================================================

This is a broader-reaching change in a few ways:

- Backends now _must_ have a service function defined, or test setup will
  panic. This was already how the integration tests behaved semantically, but
  this does change some of the syntax around their definitions.

- Test servers for backends are spun up once per `Test`, rather than per
  invocation of `Test::against*()`. The existing suite of test backends was
  already mostly stateless, so again not a huge impact on test authors.

- A full URL can no longer be specified for backends. Instead, a path may be
  provided in order to exercise the backend path prefix behavior of Viceroy.

- Tests that must know dynamically how to reach a backend server now must
  explicitly start the backend servers before getting a URI suitable for the
  purpose. This only shows up in dynamic backend tests.

🛍️ Grab bag
===========

- The level of parallelism made possible by this change meant that I had to
  tweak the default number of instances in the Wasmtime pooling allocator down
  from its default of 1000. I don't expect a limit of 100 to cause anyone
  problems, but this also raised the possibility of switching away from the
  pooling allocator for most use cases. A followup issue is available at
  #255.

- The additional parallelism also caused a massive CPU bottleneck when JIT
  compiling the test fixture Wasm modules. The main workspace now enables
  opt-level 1 for debug builds, substantially improving runtime performance of
  the tests without heavily impacting compile speed.

- The `trap-test` was using a `#[path = "..."]` mechanism to reuse code from the
  integration test suite. I didn't know this was a mechanism that existed, and
  it was cool to learn about it. But it also was incompatible with `common` now
  having a child module, so I inlined the very small handful of definitions that
  test was reusing.

- I added a couple extra functions to support cases where a test might want to
  define a backend config separately from its service function, for example if
  starting a `Test` configuration from a `fastly.toml` file. These currently go
  unused because none of the existing `fastly.toml`-using tests define any
  backends.

- The new `TestBackends` interfaces are awfully panicky. I figure this is OK for
  test infrastructure, but am open to suggestions if folks feel this is
  unwise. I tried to at least document the various expectations, and make the
  panic messages relevant beyond just a line number.

- The graceful shutdown logic is no longer strictly necessary unless we write
  enough tests with enough backends to run out of ports. But because it seems to
  work fine and is a tidy way to handle things, I left it in.
@acfoltzer acfoltzer force-pushed the acf/parallel-tests branch from 7f6a17c to 8fddb0d Compare May 3, 2023 23:51
@acfoltzer acfoltzer marked this pull request as ready for review May 4, 2023 00:11
@acfoltzer
Copy link
Contributor Author

I'm happy to report this also takes our fully-cached CI time from 15m30s to around 5m30s. I suspect the bulk of that is from the opt-level change which took maybe 1% of the overall time I spent on this PR, but still we do love to see it.

Copy link
Contributor

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏀 🧺 😤 slam dunk

this is such a lovely change, i'm delighted.

@acfoltzer acfoltzer merged commit a39aca0 into main May 4, 2023
@acfoltzer acfoltzer deleted the acf/parallel-tests branch May 4, 2023 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Maintenance and gardening
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants