What is the process for running benchmarks on repos other than apache/arrow? #45

Closed
andygrove opened this issue Feb 5, 2022 · 11 comments

Comments

@andygrove

I would like to donate compute resources to run benchmarks for apache/arrow-rs and apache/arrow-datafusion, but according to the docs this doesn't seem to be possible. Is it correct that benchmarks can only be run against apache/arrow?

@ElenaHenderson
Contributor

ElenaHenderson commented Feb 6, 2022

Hello @andygrove,

Glad to hear you are interested in running benchmarks using arrow-benchmarks-ci.

I added the ability to run benchmarks for multiple repos today after I got your comment.

PR #48 shows how a repo (e.g., https://github.com/ElenaHenderson/benchmarkable-repo) can be added to be benchmarked. This repo contains both code and benchmarks.

You can find results in Conbench now: https://conbench.ursa.dev/

[Screenshot, 2022-02-05]

Here are the benchmark results for the last two commits in the repo, compared to each other: https://conbench.ursa.dev/compare/runs/106c24eda7db4776aca487dda93a37ee...7f204c78e687406d98f4514e828cfeb2/
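
For anyone who wants to build such comparison links programmatically, here is a minimal sketch based only on the shape of the link above (two run IDs joined by `...` under `compare/runs/`). The function name `compare_runs_url` and its parameters are hypothetical and not part of Conbench or arrow-benchmarks-ci, and the order of the two run IDs is simply copied from the link rather than taken from any documentation.

```python
# Hypothetical helper: mirrors the URL pattern of the compare link in this thread.
CONBENCH_URL = "https://conbench.ursa.dev/"


def compare_runs_url(first_run_id: str, second_run_id: str,
                     base_url: str = CONBENCH_URL) -> str:
    """Return a compare URL of the form <base>/compare/runs/<first>...<second>/."""
    # Strip any trailing slash so the result never contains a double slash.
    return f"{base_url.rstrip('/')}/compare/runs/{first_run_id}...{second_run_id}/"


if __name__ == "__main__":
    # Run IDs taken from the link posted in this comment.
    print(compare_runs_url(
        "106c24eda7db4776aca487dda93a37ee",
        "7f204c78e687406d98f4514e828cfeb2",
    ))
```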

Here is the Buildkite pipeline with benchmark builds: https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2

If the apache/arrow-rs and apache/arrow-datafusion repos contain their own benchmarks, you can use benchmarkable-repo as an example.
If the benchmarks for apache/arrow-rs and apache/arrow-datafusion will live in a different repo, you can use apache/arrow (with its benchmarks in https://github.com/ursacomputing/benchmarks) as an example.
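
To make the two layouts concrete, here is a small illustrative sketch. It is not the actual arrow-benchmarks-ci configuration format: the `BenchmarkableRepo` class and its fields are hypothetical and only model the distinction described above, using the repo names mentioned in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenchmarkableRepo:
    """Hypothetical model of a repo to be benchmarked (not the real config schema)."""
    repo: str
    # None means the benchmarks live in `repo` itself.
    benchmarks_repo: Optional[str] = None

    @property
    def benchmarks_location(self) -> str:
        # Fall back to the code repo when no separate benchmarks repo is given.
        return self.benchmarks_repo or self.repo


# Layout 1: code and benchmarks in the same repo.
same_repo = BenchmarkableRepo("ElenaHenderson/benchmarkable-repo")
# Layout 2: benchmarks kept in a separate repo.
separate_repo = BenchmarkableRepo("apache/arrow", benchmarks_repo="ursacomputing/benchmarks")

if __name__ == "__main__":
    for r in (same_repo, separate_repo):
        print(f"{r.repo}: benchmarks in {r.benchmarks_location}")
```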

I will be documenting the process for adding a new repo to be benchmarked next week. Let me know if you need it sooner.

I can also answer questions if you want to proceed before the document is ready.

Have a great weekend!

@ElenaHenderson
Contributor

Please let me know if you are interested in the abilities below and I will figure out how to make them work for multiple repos.

  • Benchmark results can be posted into a Slack channel like this:

[Screenshot, 2022-02-05]

  • Benchmark results can be posted as PR comments after a commit is merged into master:

apache/arrow#12275 (comment)

@dianaclarke
Contributor

@andygrove I no longer work on Arrow benchmarks, but you might be able to make some use of this initial arrow-datafusion & arrow-rust benchmarking spike:

https://github.com/ursacomputing/benchmarks/pull/79/files

IIUC, you would need to do something similar, but in the arrow-datafusion & arrow-rust repos rather than the ursacomputing/benchmarks repo.

@andygrove
Author

Thank you @ElenaHenderson and @dianaclarke for the responses. I am putting time aside to work on this over the coming week and will let you know if I have more questions.

@dianaclarke
Contributor

@andygrove
Author

Hi @ElenaHenderson. Both the arrow-rs and arrow-datafusion repos now have conbench benchmarks checked in.

Could you point me to the relevant documentation for the next step of adding these to a build pipeline?

@ElenaHenderson
Contributor

@andygrove Working on the docs now. Sorry for not getting it done sooner.

@ElenaHenderson
Contributor

ElenaHenderson commented Feb 28, 2022

Hello @andygrove,

The doc for adding a new benchmarkable repo: https://github.com/ursacomputing/arrow-benchmarks-ci/blob/main/docs/how-to-add-new-benchmarkable-repo.md

The doc for adding a new benchmark machine (once the repo is added): https://github.com/ursacomputing/arrow-benchmarks-ci/blob/main/docs/how-to-add-new-benchmark-machine.md

Note that I tested adding the arrow-rs repo this morning and ran its benchmarks on one of the machines (Ubuntu 20.04) where the apache-arrow benchmarks are run, and everything worked. Here are the results of the arrow-rs benchmarks on Conbench:

https://conbench.ursa.dev/runs/acb47dec7d3b460da79d55da1ae9db19/
[Screenshot, 2022-02-28]

Ping me if you need anything.

Note that I have since removed all the code I added to test adding the arrow-rs repo.

@dianaclarke
Contributor

Nice, thanks @ElenaHenderson!!!

@andygrove I think I've done the next step in this PR: #57

I think that leaves this final step for you: https://github.com/ursacomputing/arrow-benchmarks-ci/blob/main/docs/how-to-add-new-benchmark-machine.md

@ElenaHenderson
Contributor

#57 is merged. @dianaclarke Thank you!

@ElenaHenderson
Contributor

I am closing this issue as done, since arrow-benchmarks-ci now supports adding other repos to be benchmarked.

See doc: https://github.com/ursacomputing/arrow-benchmarks-ci/blob/main/docs/how-to-add-new-benchmarkable-repo.md
