Outline panicking code for `LocalKey::with` #135224

wyfo · 2025-01-07T23:31:27Z

See #115491 for prior related modifications.

https://godbolt.org/z/MTsz87jGj shows a reduction of the code size for TLS accesses.
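
For readers who want something concrete, here is a minimal sketch of the kind of TLS access whose generated code is discussed in this PR. The `COUNTER` static and the `bump` / `try_bump` helpers are hypothetical names chosen for illustration. `LocalKey::with` can panic if the key is accessed while or after its destructor runs, whereas `LocalKey::try_with` returns an `AccessError` instead.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical per-thread counter, used only for illustration.
    static COUNTER: Cell<u32> = const { Cell::new(0) };
}

/// Increments the thread-local counter and returns the new value.
/// Goes through `LocalKey::with`, so it carries the panicking path.
fn bump() -> u32 {
    COUNTER.with(|c| {
        let next = c.get() + 1;
        c.set(next);
        next
    })
}

/// Same access through `LocalKey::try_with`, which reports an
/// `AccessError` instead of panicking; here it is mapped to `None`.
fn try_bump() -> Option<u32> {
    COUNTER
        .try_with(|c| {
            let next = c.get() + 1;
            c.set(next);
            next
        })
        .ok()
}
```

Every inlined call to `bump` contains the check that `with` performs before running the closure, which is why the size of the failure path matters.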

rustbot · 2025-01-07T23:31:35Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cuviper (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

@rustbot author: the review is finished, PR author should check the comments and take action accordingly
@rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

wyfo · 2025-01-07T23:44:00Z

I've seen that some small PRs have recently been rejected, so to dispel any doubt, here is the context of this PR:
I looked at the assembly generated for TLS access while writing the last line of an issue. I was already aware of the trick used in #115491, so I reproduced it here.
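
The trick mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the actual change to `library/std/src/thread/local.rs`: `with_slot` and `panic_access_error` are hypothetical names here, and `Option<&T>` stands in for the result of the real TLS lookup. The idea is to move the panic and its message into a separate `#[cold]`, `#[inline(never)]` function, so each inlined call site only carries a branch and a call.

```rust
/// Hypothetical outlined failure path (an assumption for illustration).
/// `#[cold]` hints that calls are unlikely; `#[inline(never)]` keeps the
/// panic machinery out of every caller.
#[cold]
#[inline(never)]
fn panic_access_error() -> ! {
    panic!("cannot access a thread-local value during or after destruction");
}

/// Hypothetical accessor: `slot` stands in for the TLS lookup result.
/// The hot path runs the closure; the failure path is a single call.
#[inline]
fn with_slot<T, R>(slot: Option<&T>, f: impl FnOnce(&T) -> R) -> R {
    match slot {
        Some(value) => f(value),
        None => panic_access_error(),
    }
}
```

Whether this shrinks the generated code in a given program depends on the optimizer and the optimization level, as the discussion below shows.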

workingjubilee · 2025-01-08T00:01:00Z

@wyfo can you construct the diff view in Godbolt? you can diff different sources against each other, though maybe it won't be useful if they have different BB orders... here's a link: https://godbolt.org/z/sn5ncezMT

also -O is equivalent to -Copt-level=2 whereas cargo build --release gives you -Copt-level=3, and that does seem to (very slightly!) affect code size in your example

wyfo · 2025-01-08T01:20:11Z

I didn't know this godbolt feature... Here is a link: https://godbolt.org/z/9G1sYvT7K.
Indeed, -Copt-level=3 inlines a jump, which increases the code size more than slightly: 20 instructions instead of 17. That is still better than the current 24 instructions, but the gain is smaller. Do you think it's still worth it?

I wonder why the compiler removes the jump here. Maybe there is a #[cold] missing in the lazy initialization path, which leads the compiler to think a code size increase is worth it to avoid a jump. Actually, https://github.com/rust-lang/rust/blob/master/library/std/src/sys/thread_local/os.rs#L89 should be marked as #[cold], no? I'm trying to find out how I can generate assembly with my own std code to check whether it makes things better.

workingjubilee · 2025-01-08T01:41:23Z

well, you can rebuild the stdlib, and then compile the compiler with the new stdlib, and then compile the code with the new stdlib, but that's kind of a slow iteration cycle.

maybe you can rebuild the stdlib from scratch atop core using #![no_std], but thread locals are weird kind-of-compiler-feature things. nonetheless, might suffice?

wyfo · 2025-01-08T02:14:01Z

My bad, I was looking at the wrong place: https://github.com/rust-lang/rust/blob/master/library/std/src/sys/thread_local/native/lazy.rs#L61 is already #[cold].
Do you think that a decrease of 4 assembly lines is still worth a PR?
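
As a rough sketch of why `#[cold]` on a lazy initialization path matters: `LazySlot` below is a hypothetical type built on `std::cell::OnceCell`, not the code in `lazy.rs`. The fast path only checks whether the value exists; the one-time initialization sits in a separate `#[cold]` method, which hints to the optimizer that it is rarely called.

```rust
use std::cell::OnceCell;

/// Hypothetical lazily initialized slot (an assumption for illustration).
struct LazySlot<T> {
    cell: OnceCell<T>,
    init: fn() -> T,
}

impl<T> LazySlot<T> {
    fn new(init: fn() -> T) -> Self {
        LazySlot { cell: OnceCell::new(), init }
    }

    /// Hot path: return the value if it is already initialized.
    #[inline]
    fn get(&self) -> &T {
        match self.cell.get() {
            Some(value) => value,
            None => self.initialize(),
        }
    }

    /// Slow path, taken at most once per slot; marked `#[cold]` so the
    /// optimizer treats the branch leading here as unlikely.
    #[cold]
    fn initialize(&self) -> &T {
        self.cell.get_or_init(self.init)
    }
}
```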

workingjubilee · 2025-01-08T19:32:53Z

hm? I think it's worth a PR fine, yeah. I'll let the reviewer decide whether to accept this tho'!

library/std/src/thread/local.rs

cuviper · 2025-01-08T20:54:32Z

Let's see if we can measure any difference:

@bors try @rust-timer queue

bors · 2025-01-08T20:56:05Z

⌛ Trying commit 8ec7bae with merge fc39c05...

bors · 2025-01-08T22:42:57Z

☀️ Try build successful - checks-actions
Build commit: fc39c05 (fc39c05b09715d2dcb59937726f5719d09c1b0e6)

rust-timer · 2025-01-08T23:58:34Z

Finished benchmarking commit (fc39c05): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.6%, 0.6%] | 1 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 1 |
| Improvements ✅ (primary) | -0.8% | [-0.8%, -0.8%] | 1 |
| Improvements ✅ (secondary) | -0.5% | [-0.5%, -0.5%] | 1 |
| All ❌✅ (primary) | -0.1% | [-0.8%, 0.6%] | 2 |

Max RSS (memory usage)

Results (primary -3.6%, secondary -1.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.6% | [2.3%, 2.9%] | 2 |
| Regressions ❌ (secondary) | 1.6% | [1.6%, 1.6%] | 2 |
| Improvements ✅ (primary) | -9.7% | [-14.7%, -4.7%] | 2 |
| Improvements ✅ (secondary) | -2.3% | [-3.2%, -0.9%] | 5 |
| All ❌✅ (primary) | -3.6% | [-14.7%, 2.9%] | 4 |

Cycles

Results (primary -0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.5% | [1.5%, 1.5%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.6% | [-2.1%, -1.1%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.5% | [-2.1%, 1.5%] | 3 |

Binary size

Results (primary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 1.0%] | 4 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.0%] | 31 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.1%, 1.0%] | 35 |

Bootstrap: 763.382s -> 763.181s (-0.03%)
Artifact size: 325.74 MiB -> 325.68 MiB (-0.02%)

wyfo · 2025-01-11T21:45:32Z

I'm not really able to interpret those results. Do you think the regression is significant? What would be the next step?

cuviper · 2025-01-21T00:28:13Z

The perf change doesn't look significant to me -- let's do it!

@bors r+

bors · 2025-01-21T00:28:15Z

📌 Commit 8ec7bae has been approved by cuviper

It is now in the queue for this repository.

bors · 2025-01-21T02:23:19Z

⌛ Testing commit 8ec7bae with merge b605c65...

bors · 2025-01-21T05:09:52Z

☀️ Test successful - checks-actions
Approved by: cuviper
Pushing b605c65 to master...

rust-timer · 2025-01-21T06:27:34Z

Finished benchmarking commit (b605c65): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.6%, 0.6%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.4%, -0.2%] | 3 |
| All ❌✅ (primary) | 0.6% | [0.6%, 0.6%] | 1 |

Max RSS (memory usage)

Results (primary -1.3%, secondary 1.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.0% | [2.9%, 7.0%] | 2 |
| Regressions ❌ (secondary) | 3.5% | [2.6%, 5.7%] | 4 |
| Improvements ✅ (primary) | -4.4% | [-6.8%, -2.4%] | 4 |
| Improvements ✅ (secondary) | -2.7% | [-2.8%, -2.6%] | 2 |
| All ❌✅ (primary) | -1.3% | [-6.8%, 7.0%] | 6 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.4%] | 12 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.6%, -0.0%] | 28 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.0% | [-0.6%, 0.4%] | 40 |

Bootstrap: 765.476s -> 764.449s (-0.13%)
Artifact size: 326.00 MiB -> 326.06 MiB (0.02%)

rustbot assigned cuviper Jan 7, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 7, 2025

cuviper reviewed Jan 8, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 8, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 8, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 21, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 21, 2025

bors merged commit b605c65 into rust-lang:master Jan 21, 2025
7 checks passed

rustbot added this to the 1.86.0 milestone Jan 21, 2025
