
Execution benchmarks now only take execution into account #431

Merged · 1 commit merged from new_benches into master on Jun 7, 2020

Conversation

@Razican (Member) commented May 31, 2020

This Pull Request changes the execution benchmarks so that they only measure the actual execution, keeping realm creation, lexing and parsing out of the timed code. This should give us much more useful information about execution performance.

To be determined:

  • Should we rename the benchmarks so that the benchmarks web page doesn't show a huge drop in execution time?
  • Should we also keep the old Realm + execution benchmarks and just duplicate them?

I can also add benchmarks for #427, #429 and #430 if you'd like.
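
For context, the general shape of an execution-only benchmark is to keep realm creation, lexing and parsing outside the timed closure. Below is a minimal criterion sketch of that structure, using toy `parse`/`execute` helpers as stand-ins rather than Boa's actual API; it illustrates the pattern only, not this PR's code.

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Stand-in source; the real benchmarks use JS snippets (Fibonacci, for loop, Symbols).
static SRC: &str = "1 + 2 + 3";

// Toy "lex + parse" step, standing in for the setup cost (realm creation,
// lexing, parsing) that the old execution benchmarks were also measuring.
fn parse(src: &str) -> Vec<i64> {
    src.split('+').map(|t| t.trim().parse().unwrap()).collect()
}

// Toy "execute" step: the only thing the new benchmarks should time.
fn execute(ast: &[i64]) -> i64 {
    ast.iter().sum()
}

fn execution_only(c: &mut Criterion) {
    // All setup happens once, outside the measured closure.
    let ast = parse(SRC);

    c.bench_function("Expression (Execution)", |b| {
        // Only execution of the already-prepared input is measured.
        b.iter(|| execute(black_box(&ast)))
    });
}

criterion_group!(benches, execution_only);
criterion_main!(benches);
```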

@Razican added the enhancement and benchmark labels May 31, 2020
@Razican added this to the v0.9.0 milestone May 31, 2020
@HalidOdat (Member) left a comment


The benchmarks are failing because we define the same variables on every run; maybe putting the code in a block scope would help.
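
As an illustration of that suggestion (hypothetical sources, not necessarily the ones used in this PR): a top-level `let` gets redeclared when the same snippet is executed repeatedly in one context, whereas wrapping the source in a block gives each run a fresh scope.

```rust
// Hypothetical benchmark sources, not the real ones from this PR.

// Would fail after the first iteration if re-run in the same context:
// `symbol` gets redeclared in the same (global) scope.
static SYMBOL_CREATION_FLAT: &str = r#"
let symbol = Symbol("Boa");
symbol.description;
"#;

// Wrapping the code in a block creates a fresh lexical scope on every run,
// so repeated execution does not redeclare `symbol`.
static SYMBOL_CREATION_SCOPED: &str = r#"{
    let symbol = Symbol("Boa");
    symbol.description;
}"#;

fn main() {
    // In practice these strings would be fed to the engine's execution benchmark.
    println!("{}\n{}", SYMBOL_CREATION_FLAT, SYMBOL_CREATION_SCOPED);
}
```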

@jasonwilliams (Member)

> Should we rename the benchmarks so that the benchmarks web page doesn't show a huge drop in execution time?
> Should we also keep the old Realm + execution benchmarks and just duplicate them?

I'm personally OK with having the drop; there are enough of us with context as to what happened. When we merge to master, we can mention why in the commit message, and it will show on the benchmarks page when you click through the commit hash.

@github-actions bot commented Jun 1, 2020

Benchmark for 141db24

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 437.6±16.46µs | 419.3±21.85µs | 104% |
| Expression (Lexer) | 2.1±0.07µs | 2.1±0.09µs | 101% |
| Expression (Parser) | 4.9±0.23µs | 5.1±0.26µs | 96% |
| Fibonacci (Execution) | 2.0±0.05ms | 2.6±0.10ms | 73% |
| For loop (Execution) | 63.0±1.74µs | 453.7±28.11µs | -520% |
| For loop (Lexer) | 5.4±0.16µs | 5.4±0.21µs | 100% |
| For loop (Parser) | 13.3±0.55µs | 13.2±0.51µs | 101% |
| Hello World (Lexer) | 977.2±28.46ns | 969.5±34.86ns | 101% |
| Hello World (Parser) | 2.2±0.11µs | 2.2±0.09µs | 99% |
| Symbols (Execution) | 17.7±0.65µs | 452.8±26.87µs | -2365% |

@Razican (Member, Author) commented Jun 1, 2020

This is ready to go! :D

@Razican requested a review from @HalidOdat June 1, 2020 09:30
@HalidOdat (Member)

The Symbol benchmark -2365% 🤣

@github-actions bot commented Jun 2, 2020

Benchmark for 141db24

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 421.1±26.77µs | 417.4±21.31µs | 101% |
| Expression (Lexer) | 2.0±0.12µs | 2.1±0.22µs | 95% |
| Expression (Parser) | 4.6±0.24µs | 4.9±0.35µs | 93% |
| Fibonacci (Execution) | 2.2±0.10ms | 2.7±0.18ms | 74% |
| For loop (Execution) | 64.9±4.06µs | 444.5±19.02µs | -485% |
| For loop (Lexer) | 5.2±0.29µs | 5.3±0.37µs | 99% |
| For loop (Parser) | 13.5±1.32µs | 14.5±1.22µs | 93% |
| Hello World (Lexer) | 973.6±59.79ns | 973.4±48.98ns | 100% |
| Hello World (Parser) | 2.2±0.14µs | 2.3±0.18µs | 93% |
| Symbols (Execution) | 17.9±0.89µs | 456.5±25.08µs | -2350% |

@jasonwilliams (Member)

Do you think we need some documentation here?
Like a readme in this directory

@Razican (Member, Author) commented Jun 2, 2020

> Do you think we need some documentation here?
> Like a readme in this directory

I will add it.

Do you think we should retroactively fix older benchmarks? This could be done by subtracting the Realm creation, the lexing and the parsing from the execution benchmarks programmatically.

@codecov bot commented Jun 2, 2020

Codecov Report

Merging #431 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

```
@@           Coverage Diff           @@
##           master     #431   +/-   ##
=======================================
  Coverage   66.00%   66.00%
=======================================
  Files         146      146
  Lines        9180     9180
=======================================
  Hits         6059     6059
  Misses       3121     3121
```

Continue to review full report at Codecov.

Powered by Codecov. Last update 691c0d4...a2fd882. Read the comment docs.

@github-actions bot commented Jun 2, 2020

Benchmark for 7f0e4d9

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 436.2±15.39µs | 442.8±14.63µs | 98% |
| Expression (Lexer) | 2.1±0.08µs | 2.1±0.08µs | 101% |
| Expression (Parser) | 5.0±0.19µs | 5.0±0.19µs | 100% |
| Fibonacci (Execution) | 2.1±0.06ms | 2.6±0.06ms | 73% |
| For loop (Execution) | 63.7±1.47µs | 464.6±15.60µs | -529% |
| For loop (Lexer) | 5.5±0.17µs | 5.5±0.14µs | 99% |
| For loop (Parser) | 13.6±0.39µs | 13.6±0.47µs | 100% |
| Hello World (Lexer) | 1007.4±46.89ns | 999.0±31.71ns | 101% |
| Hello World (Parser) | 2.3±0.07µs | 2.3±0.06µs | 100% |
| Symbols (Execution) | 17.4±1.02µs | 471.6±11.96µs | -2514% |

@github-actions bot commented Jun 2, 2020

Benchmark for f2496d8

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 472.7±23.61µs | 481.6±15.66µs | 98% |
| Expression (Lexer) | 2.3±0.08µs | 2.3±0.08µs | 99% |
| Expression (Parser) | 5.3±0.27µs | 5.5±0.23µs | 97% |
| Fibonacci (Execution) | 2.4±0.13ms | 3.0±0.10ms | 74% |
| For loop (Execution) | 73.6±3.14µs | 514.5±25.28µs | -499% |
| For loop (Lexer) | 5.8±0.22µs | 6.0±2.29µs | 97% |
| For loop (Parser) | 15.2±1.09µs | 15.3±0.57µs | 99% |
| Hello World (Lexer) | 1091.0±73.18ns | 1076.5±35.99ns | 101% |
| Hello World (Parser) | 2.4±0.16µs | 2.5±0.07µs | 99% |
| Symbols (Execution) | 19.6±1.15µs | 519.9±46.52µs | -2457% |

@HalidOdat (Member)

> Do you think we should retroactively fix older benchmarks? This could be done by subtracting the Realm creation, the lexing and the parsing from the execution benchmarks programmatically.

You mean the benchmark graphs, so we don't get a misleading drop in benchmarks, right? If so, I think we should if it's possible.

@Razican (Member, Author) commented Jun 2, 2020

These benchmarks should be using the latest changes we merged yesterday into the master branch of criterion compare, right?

> Do you think we should retroactively fix older benchmarks? This could be done by subtracting the Realm creation, the lexing and the parsing from the execution benchmarks programmatically.
>
> You mean the benchmark graphs, so we don't get a misleading drop in benchmarks, right? If so, I think we should if it's possible.

Yes, I meant that. I think it would be nicer. We could use the opportunity to normalize lexer + parser benchmarks.

@HalidOdat (Member) commented Jun 2, 2020

> These benchmarks should be using the latest changes we merged yesterday into the master branch of criterion compare, right?

Yes. I was thinking about that too; I re-ran the benchmarks and the same thing happened. They should have been updated. 😕

Edit: maybe there is some kind of caching?

@jasonwilliams (Member)

> Do you think we need some documentation here?
> Like a readme in this directory
>
> I will add it.
>
> Do you think we should retroactively fix older benchmarks? This could be done by subtracting the Realm creation, the lexing and the parsing from the execution benchmarks programmatically.

Yeah, if we can, we should. Do you have a plan for how to do this?

@github-actions bot commented Jun 2, 2020

Benchmark for f2496d8

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 432.4±12.47µs | 459.3±38.17µs | 94% |
| Expression (Lexer) | 2.1±0.14µs | 2.1±0.09µs | 99% |
| Expression (Parser) | 4.9±0.24µs | 5.1±0.39µs | 97% |
| Fibonacci (Execution) | 2.1±0.07ms | 2.6±0.15ms | 76% |
| For loop (Execution) | 62.6±1.45µs | 449.2±12.74µs | -517% |
| For loop (Lexer) | 5.3±0.23µs | 5.5±0.24µs | 97% |
| For loop (Parser) | 13.2±0.42µs | 13.2±0.55µs | 100% |
| Hello World (Lexer) | 992.0±41.72ns | 970.7±38.87ns | 102% |
| Hello World (Parser) | 2.2±0.06µs | 2.2±0.05µs | 100% |
| Symbols (Execution) | 17.3±0.72µs | 465.0±13.13µs | -2480% |

@github-actions bot commented Jun 2, 2020

Benchmark for f2496d8

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 438.1±29.92µs | 431.8±36.80µs | 101% |
| Expression (Lexer) | 1960.5±156.62ns | 1993.1±133.38ns | 98% |
| Expression (Parser) | 4.9±0.33µs | 4.7±0.36µs | 103% |
| Fibonacci (Execution) | 2.2±0.14ms | 2.6±0.17ms | 81% |
| For loop (Execution) | 66.4±5.01µs | 444.7±30.95µs | -470% |
| For loop (Lexer) | 4.8±0.40µs | 5.1±0.38µs | 93% |
| For loop (Parser) | 14.2±0.90µs | 13.2±1.11µs | 108% |
| Hello World (Lexer) | 895.5±89.57ns | 966.5±85.03ns | 92% |
| Hello World (Parser) | 2.1±0.13µs | 2.2±0.17µs | 97% |
| Symbols (Execution) | 17.8±1.24µs | 446.2±25.45µs | -2307% |

@Razican (Member, Author) commented Jun 3, 2020

> Do you think we need some documentation here?
> Like a readme in this directory
>
> I will add it.
>
> Do you think we should retroactively fix older benchmarks? This could be done by subtracting the Realm creation, the lexing and the parsing from the execution benchmarks programmatically.
>
> Yeah, if we can, we should. Do you have a plan for how to do this?

Yep, I'm writing a script to calculate the real execution benchmarks by subtracting realm creation + parsing + lexing where available; where they aren't, those two can be approximated.
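
For illustration only, one possible shape of such a script; the struct, field names, fallback values and example numbers below are assumptions, not the actual script from this PR.

```rust
// Hypothetical sketch of the retro-fix; not the actual script from this PR.

/// One historical benchmark sample, in microseconds.
struct OldRun {
    exec_us: f64,          // old "execution" figure (realm + lex + parse + exec)
    realm_us: Option<f64>, // "Create Realm" from the same run, if recorded
    lex_us: Option<f64>,   // lexer benchmark for the same source, if recorded
    parse_us: Option<f64>, // parser benchmark for the same source, if recorded
}

/// Estimate the pure execution time by subtracting the setup costs.
/// Components that were never benchmarked fall back to rough approximations.
fn pure_execution(run: &OldRun, approx_lex_us: f64, approx_parse_us: f64) -> f64 {
    let realm = run.realm_us.unwrap_or(0.0);
    let lex = run.lex_us.unwrap_or(approx_lex_us);
    let parse = run.parse_us.unwrap_or(approx_parse_us);
    (run.exec_us - realm - lex - parse).max(0.0)
}

fn main() {
    // Illustrative numbers taken from the Symbols rows above; Symbols has no
    // lexer/parser benchmark, so those components are approximated.
    let symbols = OldRun {
        exec_us: 452.8,        // old Symbols (Execution) on master
        realm_us: Some(419.3), // Create Realm on master
        lex_us: None,
        parse_us: None,
    };
    // Prints an estimate in the same ballpark as the ~17.7µs measured by this PR.
    println!("{:.1}µs", pure_execution(&symbols, 5.0, 13.0));
}
```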

@Razican (Member, Author) commented Jun 6, 2020

I think this can be merged, right, @jasonwilliams? Probably before #458.

@github-actions bot commented Jun 6, 2020

Benchmark for f3a1ebe

| Test | PR Benchmark | Master Benchmark | % |
|------|--------------|------------------|---|
| Create Realm | 440.0±12.37µs | 434.2±13.47µs | 101% |
| Expression (Lexer) | 1959.9±43.20ns | 1975.5±66.43ns | 99% |
| Expression (Parser) | 4.8±0.11µs | 4.8±0.10µs | 100% |
| Fibonacci (Execution) | 2.0±0.08ms | 2.5±0.05ms | 78% |
| For loop (Execution) | 61.1±1.50µs | 456.7±13.11µs | -547% |
| For loop (Lexer) | 5.1±0.19µs | 5.0±0.14µs | 101% |
| For loop (Parser) | 13.3±0.31µs | 13.1±0.59µs | 101% |
| Hello World (Lexer) | 905.4±18.17ns | 914.0±23.50ns | 99% |
| Hello World (Parser) | 2.2±0.08µs | 2.2±0.10µs | 98% |
| Symbols (Execution) | 16.5±0.77µs | 461.2±10.53µs | -2597% |

@Razican mentioned this pull request Jun 6, 2020
@Razican changed the title from "Execution benchmarks only take execution into account" to "Execution benchmarks now only take execution into account" Jun 7, 2020
@Razican merged commit d970cf9 into master Jun 7, 2020
@Razican deleted the new_benches branch June 10, 2020 10:21