perf(reflow): eliminate most WordWrapper allocations #1239

SUPERCILEX · 2024-07-26T09:39:54Z

On large paragraphs (~1MB), this saves hundreds of thousands of allocations.

TL;DR: reuse as much memory as possible across next_line calls. Instead of allocating new buffers each time, allocate the buffers once and clear them before reuse.

codecov · 2024-07-26T09:43:05Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.3%. Comparing base (36fa3c1) to head (8fb110e).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #1239   +/-   ##
=====================================
  Coverage   94.3%   94.3%           
=====================================
  Files         65      65           
  Lines      15552   15565   +13     
=====================================
+ Hits       14676   14689   +13     
  Misses       876     876

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

joshka

Can you add some comments about the algorithm used to make it easier to maintain this. In particular what the change was that helps. Think of this as a way to help future readers understand why this code does what it does. One way to do this is to update the first comment on the PR, which often ends up in the changelog verbatim (and at the very least is the first thing people see when clicking through to find more details on the change).

Also I'd like to see some of the allocation measurements you've done and benchmark comparisons if you can show these.

Pinging @tranzystorekk as they were the last person to touch this and have more context on a review than I do.

General approval for this to go ahead assuming it does what it says on the commit message, but definitely want a second review.

src/widgets/reflow.rs

github-actions · 2024-08-07T02:34:44Z

Bencher

Report	Wed, August 7, 2024 at 03:30:38 UTC
Project	Ratatui
Branch	1239/merge
Testbed	ubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!
Latency (latency)
Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds CLI flag.

Click to view all benchmark results

Benchmark	Latency	Latency Results nanoseconds (ns)
barchart/render/2048	➖ (view plot)	188,300.00
barchart/render/256	➖ (view plot)	121,390.00
barchart/render/64	➖ (view plot)	79,168.00
barchart/render_grouped/2048	➖ (view plot)	333,070.00
barchart/render_grouped/256	➖ (view plot)	132,090.00
barchart/render_grouped/64	➖ (view plot)	120,860.00
barchart/render_horizontal/2048	➖ (view plot)	161,080.00
barchart/render_horizontal/256	➖ (view plot)	79,948.00
barchart/render_horizontal/64	➖ (view plot)	74,059.00
block/render_all_feature/100x50	➖ (view plot)	10,057.00
block/render_all_feature/200x50	➖ (view plot)	17,936.00
block/render_all_feature/256x256	➖ (view plot)	84,454.00
block/render_empty/100x50	➖ (view plot)	5,626.20
block/render_empty/200x50	➖ (view plot)	11,020.00
block/render_empty/256x256	➖ (view plot)	71,178.00
line_render/Center/0	➖ (view plot)	3.71
line_render/Center/10	➖ (view plot)	429.17
line_render/Center/3	➖ (view plot)	224.90
line_render/Center/4	➖ (view plot)	253.01
line_render/Center/42	➖ (view plot)	547.79
line_render/Center/6	➖ (view plot)	268.71
line_render/Center/7	➖ (view plot)	298.31
line_render/Left/0	➖ (view plot)	3.71
line_render/Left/10	➖ (view plot)	386.19
line_render/Left/3	➖ (view plot)	152.47
line_render/Left/4	➖ (view plot)	165.97
line_render/Left/42	➖ (view plot)	547.40
line_render/Left/6	➖ (view plot)	253.96
line_render/Left/7	➖ (view plot)	266.72
line_render/Right/0	➖ (view plot)	3.71
line_render/Right/10	➖ (view plot)	393.68
line_render/Right/3	➖ (view plot)	224.62
line_render/Right/4	➖ (view plot)	266.28
line_render/Right/42	➖ (view plot)	546.58
line_render/Right/6	➖ (view plot)	343.11
line_render/Right/7	➖ (view plot)	382.30
list/render/16384	➖ (view plot)	1,143,500.00
list/render/2048	➖ (view plot)	261,960.00
list/render/64	➖ (view plot)	139,420.00
list/render_scroll_half/16384	➖ (view plot)	1,152,200.00
list/render_scroll_half/2048	➖ (view plot)	269,980.00
list/render_scroll_half/64	➖ (view plot)	97,257.00
paragraph/new/2048	➖ (view plot)	252,240.00
paragraph/new/64	➖ (view plot)	6,762.00
paragraph/new/65535	➖ (view plot)	8,050,200.00
paragraph/render/2048	➖ (view plot)	444,530.00
paragraph/render/64	➖ (view plot)	402,310.00
paragraph/render/65535	➖ (view plot)	1,536,900.00
paragraph/render_scroll_full/2048	➖ (view plot)	390,160.00
paragraph/render_scroll_full/64	➖ (view plot)	415,660.00
paragraph/render_scroll_full/65535	➖ (view plot)	1,467,100.00
paragraph/render_scroll_half/2048	➖ (view plot)	391,310.00
paragraph/render_scroll_half/64	➖ (view plot)	424,190.00
paragraph/render_scroll_half/65535	➖ (view plot)	1,465,900.00
paragraph/render_wrap/2048	➖ (view plot)	220,950.00
paragraph/render_wrap/64	➖ (view plot)	189,120.00
paragraph/render_wrap/65535	➖ (view plot)	1,330,600.00
paragraph/render_wrap_scroll_full/2048	➖ (view plot)	220,230.00
paragraph/render_wrap_scroll_full/64	➖ (view plot)	188,870.00
paragraph/render_wrap_scroll_full/65535	➖ (view plot)	1,325,400.00
rect_rows/rows/1024	➖ (view plot)	482.88
rect_rows/rows/16	➖ (view plot)	7.43
rect_rows/rows/65535	➖ (view plot)	30,454.00
sparkline/render/2048	➖ (view plot)	117,800.00
sparkline/render/256	➖ (view plot)	118,170.00
sparkline/render/64	➖ (view plot)	36,686.00

Bencher - Continuous Benchmarking
View Public Perf Page
Docs | Repo | Chat | Help

Signed-off-by: Alex Saveau <[email protected]>

SUPERCILEX · 2024-08-07T03:15:25Z

Benchmarks:

paragraph/render_wrap/64
                        time:   [203.71 µs 204.68 µs 205.74 µs]
                        change: [-14.266% -11.921% -9.5791%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe
paragraph/render_wrap_scroll_full/64
                        time:   [202.86 µs 203.71 µs 204.62 µs]
                        change: [-5.6208% -4.4032% -3.2639%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
paragraph/render_wrap/2048
                        time:   [229.06 µs 229.88 µs 230.80 µs]
                        change: [-8.6312% -8.1080% -7.6043%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
paragraph/render_wrap_scroll_full/2048
                        time:   [229.72 µs 230.77 µs 231.92 µs]
                        change: [-7.7813% -7.2111% -6.6529%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe
paragraph/render_wrap/65535
                        time:   [1.6269 ms 1.6399 ms 1.6593 ms]
                        change: [+4.9704% +6.1672% +7.6439%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
paragraph/render_wrap_scroll_full/65535
                        time:   [1.4048 ms 1.4114 ms 1.4190 ms]
                        change: [-30.278% -26.925% -23.530%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe

SUPERCILEX · 2024-08-07T03:19:43Z

The semver stuff seems unrelated?

joshka · 2024-08-07T03:45:20Z

The semver stuff seems unrelated?

Yep - we're experimenting with a new CI tool that tells us when the're breaking changes, but haven't fully tweaked it to understand already breaking changes in main.

joshka

LGTM - thanks for the changes

On large paragraphs (~1MB), this saves hundreds of thousands of allocations. TL;DR: reuse as much memory as possible across `next_line` calls. Instead of allocating new buffers each time, allocate the buffers once and clear them before reuse. Signed-off-by: Alex Saveau <[email protected]>

SUPERCILEX requested review from orhun, joshka, kdheepak, Valentin271 and EdJoPaTo as code owners July 26, 2024 09:39

SUPERCILEX force-pushed the paragraph-alloc branch from f8e4649 to 3de8fe1 Compare July 26, 2024 09:46

SUPERCILEX changed the title ~~feat(reflow): eliminate most WordWrapper allocations~~ perf(reflow): eliminate most WordWrapper allocations Jul 26, 2024

SUPERCILEX force-pushed the paragraph-alloc branch from 3de8fe1 to 317cfb0 Compare July 26, 2024 19:33

joshka approved these changes Aug 6, 2024

View reviewed changes

src/widgets/reflow.rs Outdated Show resolved Hide resolved

src/widgets/reflow.rs Outdated Show resolved Hide resolved

joshka added Status: Review Needed PR needs a review / Issue needs buy-in from other maintainers / users Status: Revision Needed PR changes requested labels Aug 6, 2024

perf(reflow): eliminate most WordWrapper allocations

8fb110e

Signed-off-by: Alex Saveau <[email protected]>

SUPERCILEX force-pushed the paragraph-alloc branch from 317cfb0 to 8fb110e Compare August 7, 2024 03:15

SUPERCILEX requested a review from a team as a code owner August 7, 2024 03:15

joshka approved these changes Aug 7, 2024

View reviewed changes

joshka merged commit 4753b72 into ratatui:main Aug 7, 2024
38 of 39 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(reflow): eliminate most WordWrapper allocations #1239

perf(reflow): eliminate most WordWrapper allocations #1239

SUPERCILEX commented Jul 26, 2024 •

edited

Loading

codecov bot commented Jul 26, 2024 •

edited

Loading

joshka left a comment

github-actions bot commented Aug 7, 2024 •

edited

Loading

SUPERCILEX commented Aug 7, 2024

SUPERCILEX commented Aug 7, 2024

joshka commented Aug 7, 2024

joshka left a comment

perf(reflow): eliminate most WordWrapper allocations #1239

perf(reflow): eliminate most WordWrapper allocations #1239

Conversation

SUPERCILEX commented Jul 26, 2024 • edited Loading

codecov bot commented Jul 26, 2024 • edited Loading

Codecov Report

joshka left a comment

Choose a reason for hiding this comment

github-actions bot commented Aug 7, 2024 • edited Loading

Bencher

SUPERCILEX commented Aug 7, 2024

SUPERCILEX commented Aug 7, 2024

joshka commented Aug 7, 2024

joshka left a comment

Choose a reason for hiding this comment

SUPERCILEX commented Jul 26, 2024 •

edited

Loading

codecov bot commented Jul 26, 2024 •

edited

Loading

github-actions bot commented Aug 7, 2024 •

edited

Loading