Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Parallelize package tests in high-cost packages. #7126

Merged

Conversation

philipaconrad
Copy link
Contributor

@philipaconrad philipaconrad commented Oct 18, 2024

What changed?

This commit adds t.Parallel() calls to the beginning of many tests across several Go packages in OPA. Most of the slowest packages (those taking ~10s or more) have been instrumented where possible as a proof-of-concept. On a machine with many cores, the tests now will complete as fast as the slowest test per package, instead of the sum of all the tests in a particular package.

Motivation (Why these changes?)

  • (Hopefully!) Slightly faster CI times for OPA PRs
  • Faster make test locally, especially on multicore systems
  • Stronger guarantees about test isolation
    • Related: Tests that cannot be run in parallel are now clearly marked in server, topdown/*, plugins/*, download, etc.

Background Notes on t.Parallel

Go runs tests across different packages in parallel, but will run all tests within a package sequentially. By adding t.Parallel() to tests that support it, we are telling the test-runner that it is safe to run that test in parallel with any other tests.

Adding t.Parallel at the top of a test marks that test as safe to run in parallel with other tests, but if it has sub-tests, they are run sequentially by default. In some places, I've added the t.Parallel annotation to sub-tests as well, which allows table-driven tests to benefit from available parallelism as well.

TODOs

Some remaining targets for parallelizing:

  • topdown/ :: Original run time: ~1 minute. ➡️ New run time: ~30s on a multicore machine.
  • tester/ :: Run time: ~1 minute.
    • Cannot be parallelized effectively, as most of the time cost is from running benchmarks as part of the tests.
  • storage/disk/ :: Original run time: ~25-30s. ➡️ New run time: ~5-7s.
  • rego/ takes about 10 seconds.

@philipaconrad philipaconrad added performance go Pull requests that update Go code labels Oct 18, 2024
@philipaconrad philipaconrad self-assigned this Oct 18, 2024
@philipaconrad philipaconrad changed the title test: Parallelize package level tests in high-cost packages. test: Parallelize package tests in high-cost packages. Oct 18, 2024
@philipaconrad
Copy link
Contributor Author

Looks like there's a data race occurring under plugins/bundle. I'll investigate that soon, and will either remove the t.Parallel() annotations for those tests, or will figure out what the underlying shared resource is.

@philipaconrad
Copy link
Contributor Author

ℹ️ Found the culprits! In both cases, the offending tests with race conditions were cases where the tests were modifying package variables of some sort.

The two sets of cases were:

  • server/server_test.go :: 3x tests were removed from the t.Parallel group, because they modified the internal/version package variables directly.
  • plugins/bundle/plugin_test.go :: 2x tests were removed from the t.Parallel group because they modified the goos package variable from their parent plugins/bundle package directly.

@philipaconrad philipaconrad marked this pull request as ready for review November 5, 2024 22:19
@philipaconrad philipaconrad force-pushed the philip/parallel-tests branch 3 times, most recently from dd0f3ac to 673cefc Compare November 6, 2024 21:49
@philipaconrad
Copy link
Contributor Author

philipaconrad commented Nov 6, 2024

ℹ️ Looks like I've got a few more intermittent failures in the topdown/http_test.go and rego/rego_wasmtarget_test.go tests to wrangle. Once those are addressed, I think this PR should be good to ship. 😅

EDIT: I think we're stable once more. There was a timeout value that I'd lowered, and the slow GH Actions runner was causing the test to occasionally time out unexpectedly.

Copy link
Member

@ashutosh-narkar ashutosh-narkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @philipaconrad! The comments about why tests are excluded from parallel execution are helpful. The tests are green so this LGTM.

This commit adds `t.Parallel()` calls to the beginning of many tests
across several Go packages in OPA. The slowest packages (taking ~10s or
more) have been instrumented where possible as a proof-of-concept. On a
machine with many cores, the tests now will complete as fast as the
slowest test per package, instead of the sum of all the tests in a
particular package.

Background: Go runs tests across different packages in parallel, but
will run all tests *within* a package sequentially. By adding
`t.Parallel()` to tests that support it, we are telling the test-runner
that it is safe to run that test in parallel with any other tests.

Signed-off-by: Philip Conrad <[email protected]>
This commit fixes a data race that could occur in the `server` package
tests, because 3x tests were modifying package variables under
`internal/version`. These tests now run sequentially, and are not
included in the parallel test set.

Signed-off-by: Philip Conrad <[email protected]>
Two tests in this package modified a package variable directly, and as
such cannot be safely run in parallel with each other or any other tests
in the package.

Signed-off-by: Philip Conrad <[email protected]>
This commit wraps up a large batch of fairly mechanical refactorings to
add t.Parallel() annotations to almost every test under `topdown`. The
tests that could not be safely parallelized now have explicit warning
comments on them describing why they are not safe to run in parallel.

Signed-off-by: Philip Conrad <[email protected]>
This commit bundles up test parallelization changes for the
`storage/disk` package, dramatically reducing its execution time.

Signed-off-by: Philip Conrad <[email protected]>
This commit includes a bundle of t.Parallel refactoring changes for the
`rego` package, including a timer-related bugfix, and a slight change on
a cancellation test to reduce its overall cost during test runs (the
logic is preserved, but the mandatory timeouts are lower now).

Signed-off-by: Philip Conrad <[email protected]>
@philipaconrad
Copy link
Contributor Author

Rebasing through the Github UI, and then will merge when green.

@philipaconrad philipaconrad merged commit d2c0459 into open-policy-agent:main Nov 12, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go Pull requests that update Go code performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants