Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(reflow): eliminate most WordWrapper allocations #1239

Merged
merged 1 commit into from
Aug 7, 2024

Conversation

SUPERCILEX
Copy link
Contributor

@SUPERCILEX SUPERCILEX commented Jul 26, 2024

On large paragraphs (~1MB), this saves hundreds of thousands of allocations.

TL;DR: reuse as much memory as possible across next_line calls. Instead of allocating new buffers each time, allocate the buffers once and clear them before reuse.

Copy link

codecov bot commented Jul 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.3%. Comparing base (36fa3c1) to head (8fb110e).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #1239   +/-   ##
=====================================
  Coverage   94.3%   94.3%           
=====================================
  Files         65      65           
  Lines      15552   15565   +13     
=====================================
+ Hits       14676   14689   +13     
  Misses       876     876           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@SUPERCILEX SUPERCILEX changed the title feat(reflow): eliminate most WordWrapper allocations perf(reflow): eliminate most WordWrapper allocations Jul 26, 2024
Copy link
Member

@joshka joshka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments about the algorithm used to make it easier to maintain this. In particular what the change was that helps. Think of this as a way to help future readers understand why this code does what it does. One way to do this is to update the first comment on the PR, which often ends up in the changelog verbatim (and at the very least is the first thing people see when clicking through to find more details on the change).

Also I'd like to see some of the allocation measurements you've done and benchmark comparisons if you can show these.

Pinging @tranzystorekk as they were the last person to touch this and have more context on a review than I do.

General approval for this to go ahead assuming it does what it says on the commit message, but definitely want a second review.

src/widgets/reflow.rs Outdated Show resolved Hide resolved
src/widgets/reflow.rs Outdated Show resolved Hide resolved
@joshka joshka added Status: Review Needed PR needs a review / Issue needs buy-in from other maintainers / users Status: Revision Needed PR changes requested labels Aug 6, 2024
Copy link
Contributor

github-actions bot commented Aug 7, 2024

🐰Bencher

ReportWed, August 7, 2024 at 03:30:38 UTC
ProjectRatatui
Branch1239/merge
Testbedubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!

  • Latency (latency)

Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds CLI flag.

Click to view all benchmark results
BenchmarkLatencyLatency Results
nanoseconds (ns)
barchart/render/2048➖ (view plot)188,300.00
barchart/render/256➖ (view plot)121,390.00
barchart/render/64➖ (view plot)79,168.00
barchart/render_grouped/2048➖ (view plot)333,070.00
barchart/render_grouped/256➖ (view plot)132,090.00
barchart/render_grouped/64➖ (view plot)120,860.00
barchart/render_horizontal/2048➖ (view plot)161,080.00
barchart/render_horizontal/256➖ (view plot)79,948.00
barchart/render_horizontal/64➖ (view plot)74,059.00
block/render_all_feature/100x50➖ (view plot)10,057.00
block/render_all_feature/200x50➖ (view plot)17,936.00
block/render_all_feature/256x256➖ (view plot)84,454.00
block/render_empty/100x50➖ (view plot)5,626.20
block/render_empty/200x50➖ (view plot)11,020.00
block/render_empty/256x256➖ (view plot)71,178.00
line_render/Center/0➖ (view plot)3.71
line_render/Center/10➖ (view plot)429.17
line_render/Center/3➖ (view plot)224.90
line_render/Center/4➖ (view plot)253.01
line_render/Center/42➖ (view plot)547.79
line_render/Center/6➖ (view plot)268.71
line_render/Center/7➖ (view plot)298.31
line_render/Left/0➖ (view plot)3.71
line_render/Left/10➖ (view plot)386.19
line_render/Left/3➖ (view plot)152.47
line_render/Left/4➖ (view plot)165.97
line_render/Left/42➖ (view plot)547.40
line_render/Left/6➖ (view plot)253.96
line_render/Left/7➖ (view plot)266.72
line_render/Right/0➖ (view plot)3.71
line_render/Right/10➖ (view plot)393.68
line_render/Right/3➖ (view plot)224.62
line_render/Right/4➖ (view plot)266.28
line_render/Right/42➖ (view plot)546.58
line_render/Right/6➖ (view plot)343.11
line_render/Right/7➖ (view plot)382.30
list/render/16384➖ (view plot)1,143,500.00
list/render/2048➖ (view plot)261,960.00
list/render/64➖ (view plot)139,420.00
list/render_scroll_half/16384➖ (view plot)1,152,200.00
list/render_scroll_half/2048➖ (view plot)269,980.00
list/render_scroll_half/64➖ (view plot)97,257.00
paragraph/new/2048➖ (view plot)252,240.00
paragraph/new/64➖ (view plot)6,762.00
paragraph/new/65535➖ (view plot)8,050,200.00
paragraph/render/2048➖ (view plot)444,530.00
paragraph/render/64➖ (view plot)402,310.00
paragraph/render/65535➖ (view plot)1,536,900.00
paragraph/render_scroll_full/2048➖ (view plot)390,160.00
paragraph/render_scroll_full/64➖ (view plot)415,660.00
paragraph/render_scroll_full/65535➖ (view plot)1,467,100.00
paragraph/render_scroll_half/2048➖ (view plot)391,310.00
paragraph/render_scroll_half/64➖ (view plot)424,190.00
paragraph/render_scroll_half/65535➖ (view plot)1,465,900.00
paragraph/render_wrap/2048➖ (view plot)220,950.00
paragraph/render_wrap/64➖ (view plot)189,120.00
paragraph/render_wrap/65535➖ (view plot)1,330,600.00
paragraph/render_wrap_scroll_full/2048➖ (view plot)220,230.00
paragraph/render_wrap_scroll_full/64➖ (view plot)188,870.00
paragraph/render_wrap_scroll_full/65535➖ (view plot)1,325,400.00
rect_rows/rows/1024➖ (view plot)482.88
rect_rows/rows/16➖ (view plot)7.43
rect_rows/rows/65535➖ (view plot)30,454.00
sparkline/render/2048➖ (view plot)117,800.00
sparkline/render/256➖ (view plot)118,170.00
sparkline/render/64➖ (view plot)36,686.00

Bencher - Continuous Benchmarking
View Public Perf Page
Docs | Repo | Chat | Help

@SUPERCILEX
Copy link
Contributor Author

Benchmarks:

paragraph/render_wrap/64
                        time:   [203.71 µs 204.68 µs 205.74 µs]
                        change: [-14.266% -11.921% -9.5791%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe
paragraph/render_wrap_scroll_full/64
                        time:   [202.86 µs 203.71 µs 204.62 µs]
                        change: [-5.6208% -4.4032% -3.2639%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
paragraph/render_wrap/2048
                        time:   [229.06 µs 229.88 µs 230.80 µs]
                        change: [-8.6312% -8.1080% -7.6043%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
paragraph/render_wrap_scroll_full/2048
                        time:   [229.72 µs 230.77 µs 231.92 µs]
                        change: [-7.7813% -7.2111% -6.6529%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe
paragraph/render_wrap/65535
                        time:   [1.6269 ms 1.6399 ms 1.6593 ms]
                        change: [+4.9704% +6.1672% +7.6439%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
paragraph/render_wrap_scroll_full/65535
                        time:   [1.4048 ms 1.4114 ms 1.4190 ms]
                        change: [-30.278% -26.925% -23.530%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe

@SUPERCILEX SUPERCILEX requested a review from a team as a code owner August 7, 2024 03:15
@SUPERCILEX
Copy link
Contributor Author

The semver stuff seems unrelated?

@joshka
Copy link
Member

joshka commented Aug 7, 2024

The semver stuff seems unrelated?

Yep - we're experimenting with a new CI tool that tells us when the're breaking changes, but haven't fully tweaked it to understand already breaking changes in main.

Copy link
Member

@joshka joshka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks for the changes

@joshka joshka merged commit 4753b72 into ratatui:main Aug 7, 2024
38 of 39 checks passed
joshka pushed a commit to erak/ratatui that referenced this pull request Oct 14, 2024
On large paragraphs (~1MB), this saves hundreds of thousands of
allocations.

TL;DR: reuse as much memory as possible across `next_line` calls.
Instead of allocating new buffers each time, allocate the buffers once
and clear them before reuse.

Signed-off-by: Alex Saveau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Review Needed PR needs a review / Issue needs buy-in from other maintainers / users Status: Revision Needed PR changes requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants