compiler-rt: memmove optimisation #22606

Merged (4 commits) on Feb 22, 2025

dweiller (Contributor) commented Jan 25, 2025

This PR seeks to improve memmove performance and fix some issues with generated code size of the current compiler-rt memmove.

I haven't yet benchmarked this implementation, though I expect the impact to be similar to #18912.

Here is a table of code sizes for ReleaseFast (targets chosen somewhat randomly; feel free to suggest additions to or removals from the list):

| target | cpu | master (B) | 3294ef7 (B) |
| --- | --- | --- | --- |
| thumb-freestanding-eabihf | cortex_m3 | 16362 | 438 |
| thumb-freestanding-eabihf | cortex_m4 | 16362 | 438 |
| thumb-freestanding-eabihf | cortex_m33 | 16362 | 438 |
| thumb-freestanding-eabihf | cortex_m52 | 2644 | 420 |
| aarch64-linux | cortex_a53 | 1472 | 380 |
| aarch64-linux | cortex_a75 | 832 | 568 |
| aarch64-linux | cortex_x1 | 836 | 584 |
| aarch64-linux | cortex_x4 | 832 | 584 |
| x86_64-linux | x86_64 | 1402 | 564 |
| x86_64-linux | x86_64_v2 | 1402 | 564 |
| x86_64-linux | x86_64_v3 | 1348 | 826 |
| x86_64-linux | x86_64_v4 | 1348 | 826 |
| loongarch64-linux | loongarch64 | 14408 | 304 |

I've marked this as ready for review because I'm not sure when I'll get to benchmarking in earnest, and I think this should be merged before 0.14. I see no problem with merging this as-is (modulo any reviews) and doing the following todos in a follow-up after 0.14 if I don't get them done beforehand.

Resolves #22603 (at least for the target discussed there, but presumably for any others as well).

Todo:

  • benchmark memmove implementation
  • investigate sharing parts of implementation with memcpy
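
For readers unfamiliar with what the routine has to guarantee, here is a minimal byte-wise sketch in C of the overlap handling that any memmove must provide. It is not the implementation in this PR (which lives in Zig's compiler-rt and is tuned for speed and code size); the name `sketch_memmove` is hypothetical and used for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not the compiler-rt code from this PR.
 * Copies len bytes from src to dest, allowing the two ranges to overlap. */
void *sketch_memmove(void *dest, const void *src, size_t len) {
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    /* Unsigned subtraction wraps around when d is below s, so this single
     * comparison is true both when dest precedes src and when dest starts
     * at or past the end of src. A forward copy is safe in both cases. */
    if ((uintptr_t)d - (uintptr_t)s >= len) {
        for (size_t i = 0; i < len; i++) {
            d[i] = s[i];
        }
    } else {
        /* dest begins inside [src, src + len): copy backwards so source
         * bytes are read before they are overwritten. */
        for (size_t i = len; i > 0; i--) {
            d[i - 1] = s[i - 1];
        }
    }
    return dest;
}
```

An optimised version would typically replace the byte loops with wider loads and stores, which is where the trade-off between speed and generated code size shown in the table above comes from.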

dweiller changed the title from "Memmove opt" to "compiler-rt: memmove optimisation" on Jan 25, 2025
alexrp (Member) commented Jan 29, 2025

Are you aiming to get this one in for 0.14.0?

dweiller (Contributor, Author) commented Jan 29, 2025

> Are you aiming to get this one in for 0.14.0?

Yes, I'd say it's basically mergeable as is (there are one or two small things I can think of that I'd change first), which would fix the code size issue we currently have. The thing that will take more time is benchmarking and fine-tuning based on the results; that might leave things a bit close to the release date. Benchmarking could always be spun off into follow-up work if we're happy to merge without proper benchmarking.

alexrp added this to the 0.14.0 milestone Jan 29, 2025
dweiller marked this pull request as ready for review January 30, 2025 09:27
dweiller marked this pull request as draft February 11, 2025 01:57
dweiller marked this pull request as ready for review February 11, 2025 12:20
andrewrk merged commit 61ee9f9 into ziglang:master Feb 22, 2025
10 checks passed
dweiller deleted the memmove-opt branch February 24, 2025 01:47

Successfully merging this pull request may close these issues.

STM32 embedded debug binaries much larger with 0.14.0-dev.2851+b074fb7dd