kernel: support dynamic thread stack allocation #44379

cfriedt · 2022-03-30T15:17:31Z

Opening this up for review. It's a biggie, and has some wide-reaching implications.

TL;DR - we should be able to support automatic pthread stack allocation with this. It's in the Roadmap (#51211).

However, at least with GCC, this does a lot more though. With compliant POSIX threads <pthread.h>, we also get

ISO C11 <threads.h>
ISO C++11 <thread> (<mutex>, <condition_variable>, <semaphore>), etc.

Later, we can optimize-out the POSIX layer of course, which I would like to take great care in doing with Keith and Stephanos.

Taking a step back, excluding coroutines, that represents the de-facto threading model for approximately the last 25 years, and probably the de-facto threading model for most other programming languages for the foreseeable future.

MicroPython threads
Rust threads
Lua threads
Zig threads
...

The API supports both safety-critical and non-safety critical because it uses the same interface for both pool-based and heap-based thread stack allocations.

Covered in my talks at EOSS 2023.

Thanks for the help everyone 🙏 ❤️
@andrewboie , @pfalcon, @andyross , @peter-mitsis , @nashif , @dcpleung , @ceolin , @stephanosio , @keith-packard ... I really could keep going.

Fixes #26999

There is some icing on the top added by @ceolin for k_object management in #59773

BELOW FOR POSTERITY

Just a WIP at the moment. Working fine for cortex-m3.

$ west build -p auto -b qemu_cortex_m3 -t run tests/kernel/threads/dynamic_thread_stack
...
*** Booting Zephyr OS build zephyr-v3.0.0-2000-g3ccf6cd86f23  ***
Running test suite dynamic_thread_stack
===================================================================
START - test_dynamic_thread_stack
Hello, dynamic world!
 PASS - test_dynamic_thread_stack in 0.1 seconds
===================================================================
Test suite dynamic_thread_stack succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

drivers/mm/Kconfig

cfriedt · 2022-03-31T02:51:46Z

@dcpleung - would it make sense to separate the sys_mm_* - arch_mm_* redirection layer for into its own PR?

dcpleung · 2022-03-31T16:56:57Z

Architectures have arch_mem_map() and arch_mem_unmap(). Adding a bunch of arch_mm_drv_*() is going to confuse developers. Maybe add a system mm driver to utilize arch_mem_*() instead?

cfriedt · 2022-04-06T10:08:40Z

Architectures have arch_mem_map() and arch_mem_unmap(). Adding a bunch of arch_mm_drv_*() is going to confuse developers. Maybe add a system mm driver to utilize arch_mem_*() instead?

I didn't bother creating a sys interface, it seemed to be overkill for now.

cfriedt · 2023-06-30T17:00:07Z

some yaml fixups... hopefully I didn't break anything this time 🤣

cfriedt · 2023-06-30T19:45:33Z

Flavio & I will likely merge our PRs tomorrow, so no rush at the moment.

outdated

Add support for dynamic thread stack allocation Signed-off-by: Christopher Friedt <[email protected]>

cfriedt · 2023-07-11T22:07:43Z

@ceolin & I discussed it, and we opted to merge this PR first, and subsequently merge #59773 afterward. The second PR fine-tunes userspace, mmu, and mpu support.

This will open a few interesting doors.

Test that automatic thread stack allocation works for both user and kernel threads. Signed-off-by: Christopher Friedt <[email protected]>

cfriedt · 2023-07-11T23:46:05Z

reformat tests/kernel/threads/dynamic_thread_stack/testcase.yaml in whatever way yamllint is cool with 🤷‍♂️

keith-packard

I think you should always allocate stacks from the heap; reduces code complexity, avoids issues with the pre-allocated stacks being the wrong size and doesn't consume any more memory.

cfriedt · 2023-07-12T11:49:05Z

I think you should always allocate stacks from the heap; reduces code complexity, avoids issues with the pre-allocated stacks being the wrong size and doesn't consume any more memory.

@keith-packard - I think that makes sense for a large majority of use cases, in particular userspace code, and we actually can get that behaviour via Kconfig (by specifying the pool size to be zero and enabling heap allocation).

The pool option is really there to support high reliability / safety critical if they choose to use it.

andyross · 2023-07-12T15:06:19Z

The pool option is really there to support high reliability / safety critical if they choose to use it.

Even fragmentation is an issue in an RTOS. Stacks are big, memories are small, uptimes are (sometimes very) long. A dynamic spawned thread that works fine in testing in legacy code might suddenly die after two years in the field. A core Zephyr paradigm is that the system heap should be 100% optional. Everything that can be should be allocatable statically.

I'm actually dealing with a bunch of issues like this currently while fuzzing SOF, which is very heap-centric. The allocation errors can be handled correctly (and are), but the resulting runtime failures are still failures, and prevent reaching further states. The impact there is actually paradoxical: the random input tends to leak a lot of allocations and run out of space quickly, so in fact fuzz coverage drops like a rock after the first few thousand commands. We're seeing failures come back from the (extremely parallel) oss-fuzz rig that I never saw over days and days of running locally.

ceolin · 2023-07-12T18:42:40Z

I think you should always allocate stacks from the heap; reduces code complexity, avoids issues with the pre-allocated stacks being the wrong size and doesn't consume any more memory.

@keith-packard - I think that makes sense for a large majority of use cases, in particular userspace code, and we actually can get that behaviour via Kconfig (by specifying the pool size to be zero and enabling heap allocation).

The pool option is really there to support high reliability / safety critical if they choose to use it.

yep ... One thing though that I am thinking now is that having two separated API is probably better. The reason is that whe you are allocating from the pool the stack size is defined at build time by CONFIG_DYNAMIC_THREAD_STACK_SIZE (so the given parameter is not really needed and is confusing) while allocating from the heap we don't have this constraint. Also, K_USER does not make sense IMHO, since if you have userspace enabled they stacks declared in the pool are already user capable.

keith-packard · 2023-07-12T18:59:15Z

yep ... One thing though that I am thinking now is that having two separated API is probably better.

Oh, so just exposing the stack_alloc_pool and stack_alloc_dyn apis directly instead of wrapping them in another function. That avoids problems with applications expecting pool allocations and getting surprised with dyn stacks.

One more request -- can we have the stack automatically freed when the thread exits?

cfriedt · 2023-07-13T00:15:50Z

yep ... One thing though that I am thinking now is that having two separated API is probably better.

Oh, so just exposing the stack_alloc_pool and stack_alloc_dyn apis directly instead of wrapping them in another function. That avoids problems with applications expecting pool allocations and getting surprised with dyn stacks.

If that's the general consensus, it would probably be better to not even expose pool allocation this way - just have a separate K_THREAD_POOL() that only ever allocates at build time.

It would also make the pthread use case more complicated, which I was hoping to avoid, since that was the entire reason for doing this in a way to silently support both safety and non-safety in the same way.

@romkell, @andyross - do you have any preferences?

One more request -- can we have the stack automatically freed when the thread exits?

That was tried a long time ago in a couple of different ways and it tended to be problematic and a bit racey.

Additionally, there was a vito from the safety side, because invariably then the malloc-ing code would be intermixed with the non-mallocing code.

romkell · 2023-07-13T07:26:40Z

@romkell, @andyross - do you have any preferences?

My major point concerning safety is that we retain the API where we instantiate every thing statically on the appl. side (and pass to init function etc.).
To have a stack pool to separate the memory for stacks from other heap based mem allocation may bring some advantages. I still would want to avoid dyn. stacks for safety application.
When exposing both APIs for pool and heap for the threads, the appl. developer can do any kind of mixed (pool and heap) dynamic stack allocation in the appl. Do we want to support that?
When hiding underneath one API function the application developer decides whether to use the pool or heap approach via KConfig. What is the use case to allow both pool and heap stacks in one application?
For simplicity I would prefer one API, but can perfectly live with both exposed, and leave responsibility to the appl. developer.

cfriedt · 2023-07-13T14:09:59Z

yep ... One thing though that I am thinking now is that having two separated API is probably better. The reason is that whe you are allocating from the pool the stack size is defined at build time by CONFIG_DYNAMIC_THREAD_STACK_SIZE (so the given parameter is not really needed and is confusing) while allocating from the heap we don't have this constraint.

Technically, that is a maximum size for pooled stacks (maybe the Kconfig could be named better). The size parameter can still provide non-redundant information when size < max (especially in the case where an MMU or Stack Canaries are present).

In any case though, given that both the pure-heap, pure-pool, and mixed cases are realizable with a single API, I think it would eliminate a non-trivial amount of code duplication and maybe some future bugs with virtually zero overhead (since kconfig-controlled code paths are dead-stripped by the linker in the current implementation).

To me that's a big selling feature that makes supporting the standard threading model in C, C++, pthreads (not to mention other languages) dramatically easier.

Just at a glance, it does not seem possible to support statically allocated C11 or C++11 thread stacks at all, externally to the implementation.

https://en.cppreference.com/w/c/thread

https://en.cppreference.com/w/cpp/thread/thread/thread

I think it may be possible in Rust to specify a statically allocated thread stack.

So there is a significant probability we would be forced to duplicate all of that heap / pool logic for several language runtimes.

Also, K_USER does not make sense IMHO, since if you have userspace enabled they stacks declared in the pool are already user capable.

K_USER does make sense for heap-allocated stacks though.

cfriedt · 2023-07-13T14:20:36Z

Mentioned offline as well, but it would likely make sense to support flags for pool or heap as well

ceolin · 2023-07-13T16:03:05Z

that the system heap should be 100% optional. Everything that can be should be allocatable statically.

I think that was well ... just the K_USER that option that I was talking about to be clear. Particularly because of the different behavior that this can lead on heap/pool allocation. Though we can choose the preferred way, if it fails the fallback is using the other one.

ceolin · 2023-07-13T16:05:02Z

yep ... One thing though that I am thinking now is that having two separated API is probably better.

Oh, so just exposing the stack_alloc_pool and stack_alloc_dyn apis directly instead of wrapping them in another function. That avoids problems with applications expecting pool allocations and getting surprised with dyn stacks.

If that's the general consensus, it would probably be better to not even expose pool allocation this way - just have a separate K_THREAD_POOL() that only ever allocates at build time.

It would also make the pthread use case more complicated, which I was hoping to avoid, since that was the entire reason for doing this in a way to silently support both safety and non-safety in the same way.

@romkell, @andyross - do you have any preferences?

One more request -- can we have the stack automatically freed when the thread exits?

That was tried a long time ago in a couple of different ways and it tended to be problematic and a bit racey.

Additionally, there was a vito from the safety side, because invariably then the malloc-ing code would be intermixed with the non-mallocing code.

I agree, that is error prone too, someone may try to re-use the allocated stack in a different thread ...

andyross

Should drop a +1 to move this along. Really I think this is very reasonable. It's a big area with lots of edges (especially once we start talking about userspace!), and no API is going to meet everyone's needs. This seems straightforward and useful enough for common cases, and apps with complicated needs can use it as a reference if necessary.

dcpleung · 2023-07-13T17:19:02Z

kernel/Kconfig

+config DYNAMIC_THREAD_STACK_SIZE
+	int "Size of each pre-allocated thread stack"
+	default 1024 if !64BIT
+	default 2048 if 64BIT


default MMU_PAGE_SIZE if MMU as stack needs to use the full page for access control (especially for userspace).

userspace is coming in my pr ... do not worry about it now please.

cfriedt · 2023-07-13T17:43:41Z

Mentioned offline as well, but it would likely make sense to support flags for pool or heap as well

So if it's desirable, k_thread_stack_alloc() could accept flags like

K_THREAD_STACK_FROM_HEAP, or K_THREAD_STACK_FROM_POOL.

to indicate a suitable allocation strategy as well (a default of zero could imply either).

cfriedt · 2023-07-13T17:45:40Z

Should drop a +1 to move this along. Really I think this is very reasonable. It's a big area with

Amen to that. Lets get Flavio's changes in as well so people can mix things up and develop organically.

Better to do it early in the release cycle.

ceolin · 2023-07-13T20:55:43Z

Mentioned offline as well, but it would likely make sense to support flags for pool or heap as well

So if it's desirable, k_thread_stack_alloc() could accept flags like

K_THREAD_STACK_FROM_HEAP, or K_THREAD_STACK_FROM_POOL.

to indicate a suitable allocation strategy as well (a default of zero could imply either).

That is what I was thinking ... this way we can remove the build symbols about the allocation preferences and remove all fallbacks ... that is way is clear for the user from where he is getting the memory. Note that is not a blocker for me, we can do it incrementally.

cfriedt requested review from andyross and dcpleung March 30, 2022 15:17

github-actions bot added area: API Changes to public APIs area: Kernel area: Tests Issues related to a particular existing or missing test labels Mar 30, 2022

cfriedt requested a review from peter-mitsis March 30, 2022 15:19

cfriedt commented Mar 30, 2022

View reviewed changes

drivers/mm/Kconfig Outdated Show resolved Hide resolved

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch 7 times, most recently from b5b819c to f58ea3a Compare March 31, 2022 02:51

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch 3 times, most recently from 53f2ccd to 2993fbf Compare March 31, 2022 11:07

cfriedt added the dev-review To be discussed in dev-review meeting label Mar 31, 2022

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch from 2993fbf to 03fbcb4 Compare March 31, 2022 11:26

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch 2 times, most recently from 12621e6 to 8ec3552 Compare April 6, 2022 10:06

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch 6 times, most recently from 774f93b to b87c6de Compare April 6, 2022 12:57

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch from d32fbd4 to b97445a Compare June 30, 2023 16:59

cfriedt mentioned this pull request Jul 2, 2023

POSIX Roadmap for LTSv3 #51211

Closed

kernel: support dynamic thread stack allocation

d65bf8a

Add support for dynamic thread stack allocation Signed-off-by: Christopher Friedt <[email protected]>

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch from b97445a to 2a012bb Compare July 11, 2023 22:05

tests: kernel: threads: add a testcase for dynamic thread stacks

71ca732

Test that automatic thread stack allocation works for both user and kernel threads. Signed-off-by: Christopher Friedt <[email protected]>

cfriedt force-pushed the issue/26999/k-thread-create-dynamic-stack branch from 2a012bb to 71ca732 Compare July 11, 2023 23:44

keith-packard reviewed Jul 12, 2023

View reviewed changes

andyross approved these changes Jul 13, 2023

View reviewed changes

dcpleung reviewed Jul 13, 2023

View reviewed changes

ceolin approved these changes Jul 13, 2023

View reviewed changes

cfriedt merged commit 1323b1a into zephyrproject-rtos:main Jul 13, 2023

cfriedt deleted the issue/26999/k-thread-create-dynamic-stack branch October 7, 2023 00:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kernel: support dynamic thread stack allocation #44379

kernel: support dynamic thread stack allocation #44379

cfriedt commented Mar 30, 2022 •

edited

Loading

cfriedt commented Mar 31, 2022 •

edited

Loading

dcpleung commented Mar 31, 2022

cfriedt commented Apr 6, 2022 •

edited

Loading

cfriedt commented Jun 30, 2023

cfriedt commented Jun 30, 2023

cfriedt commented Jul 11, 2023 •

edited

Loading

cfriedt commented Jul 11, 2023

keith-packard left a comment

cfriedt commented Jul 12, 2023

andyross commented Jul 12, 2023

ceolin commented Jul 12, 2023

keith-packard commented Jul 12, 2023

cfriedt commented Jul 13, 2023

romkell commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023

ceolin commented Jul 13, 2023

ceolin commented Jul 13, 2023

andyross left a comment

dcpleung Jul 13, 2023

ceolin Jul 13, 2023

cfriedt commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023 •

edited

Loading

ceolin commented Jul 13, 2023

kernel: support dynamic thread stack allocation #44379

kernel: support dynamic thread stack allocation #44379

Conversation

cfriedt commented Mar 30, 2022 • edited Loading

cfriedt commented Mar 31, 2022 • edited Loading

dcpleung commented Mar 31, 2022

cfriedt commented Apr 6, 2022 • edited Loading

cfriedt commented Jun 30, 2023

cfriedt commented Jun 30, 2023

cfriedt commented Jul 11, 2023 • edited Loading

cfriedt commented Jul 11, 2023

keith-packard left a comment

Choose a reason for hiding this comment

cfriedt commented Jul 12, 2023

andyross commented Jul 12, 2023

ceolin commented Jul 12, 2023

keith-packard commented Jul 12, 2023

cfriedt commented Jul 13, 2023

romkell commented Jul 13, 2023 • edited Loading

cfriedt commented Jul 13, 2023 • edited Loading

cfriedt commented Jul 13, 2023

ceolin commented Jul 13, 2023

ceolin commented Jul 13, 2023

andyross left a comment

Choose a reason for hiding this comment

dcpleung Jul 13, 2023

Choose a reason for hiding this comment

ceolin Jul 13, 2023

Choose a reason for hiding this comment

cfriedt commented Jul 13, 2023 • edited Loading

cfriedt commented Jul 13, 2023 • edited Loading

ceolin commented Jul 13, 2023

cfriedt commented Mar 30, 2022 •

edited

Loading

cfriedt commented Mar 31, 2022 •

edited

Loading

cfriedt commented Apr 6, 2022 •

edited

Loading

cfriedt commented Jul 11, 2023 •

edited

Loading

romkell commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023 •

edited

Loading

cfriedt commented Jul 13, 2023 •

edited

Loading