Dual-core option? 🤔 #1133

Closed
stnolting opened this issue Dec 29, 2024 · 3 comments · Fixed by #1135
Labels
enhancement New feature or request experimental Experimental feature

Comments

@stnolting
Owner

stnolting commented Dec 29, 2024

I think it would be cool to have a dual-core option - even if I don't know if it's really useful... Anyway, @NikLeberg has already created an SMP version of the NEORV32: https://github.com/NikLeberg/neorv32_soc

Hardware Requirements

The address space for a single peripheral device has been increased to 64kB in #1126 - enough to map a RISC-V compatible multi-hart CLINT (core-local interruptor). This has also been added in #1130.
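
To illustrate why 64kB is enough, here is a minimal sketch of the register offsets of a multi-hart CLINT. It assumes the conventional SiFive-style layout (MSIP at 0x0000, MTIMECMP at 0x4000, MTIME at 0xBFF8); the macro and function names are hypothetical and not taken from the NEORV32 sources.

```c
#include <stdint.h>

/* Assumed SiFive-style CLINT offsets (an assumption, not NEORV32 code). */
#define CLINT_MSIP_OFFSET     0x0000u /* one 32-bit software-interrupt register per hart */
#define CLINT_MTIMECMP_OFFSET 0x4000u /* one 64-bit timer-compare register per hart */
#define CLINT_MTIME_OFFSET    0xBFF8u /* single 64-bit timer shared by all harts */

/* Byte offset of the MSIP register of the given hart. */
static uint32_t clint_msip_offset(uint32_t hartid) {
  return CLINT_MSIP_OFFSET + 4u * hartid;
}

/* Byte offset of the MTIMECMP register of the given hart. */
static uint32_t clint_mtimecmp_offset(uint32_t hartid) {
  return CLINT_MTIMECMP_OFFSET + 8u * hartid;
}

/* Total number of bytes covered by this layout (end of MTIME). */
static uint32_t clint_span_bytes(void) {
  return CLINT_MTIME_OFFSET + 8u;
}
```

Under this assumed layout the registers span 0xC000 bytes (48kB), which does not fit into a smaller power-of-two window but fits into 64kB.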

Multi-core support for the on-chip debugger is a little bit complex. But thanks to @NikLeberg's previous work, a (hopefully) operational multi-core DM is under development in #1132.

The bus infrastructure of the processor already provides a simple mux so two cores can access the same bus system. This mux has been improved to provide a round-robin option that might be more suitable for two cores sharing the same address space:

entity neorv32_bus_switch is
generic (
ROUND_ROBIN_EN : boolean := false; -- enable round-robin scheduling
PORT_A_READ_ONLY : boolean := false; -- set if port A is read-only
PORT_B_READ_ONLY : boolean := false -- set if port B is read-only
);
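
As a rough illustration of what round-robin scheduling means for two requesters, the following C sketch models the arbitration decision only. It is a behavioral model under my own assumptions, not the logic of `neorv32_bus_switch`; all type and function names are hypothetical.

```c
#include <stdbool.h>

/* Which port gets the bus in this cycle. */
typedef enum { GRANT_NONE = -1, GRANT_A = 0, GRANT_B = 1 } grant_t;

/* Arbiter state: the port that was granted most recently. */
typedef struct {
  grant_t last;
} rr_arbiter_t;

/* Decide the next grant. If both ports request at once, the port that was
 * NOT served last wins, so neither port can starve the other. */
static grant_t rr_arbitrate(rr_arbiter_t *arb, bool req_a, bool req_b) {
  grant_t g;
  if (req_a && req_b) {
    g = (arb->last == GRANT_A) ? GRANT_B : GRANT_A;
  } else if (req_a) {
    g = GRANT_A;
  } else if (req_b) {
    g = GRANT_B;
  } else {
    return GRANT_NONE; /* bus idle: keep the grant history unchanged */
  }
  arb->last = g;
  return g;
}
```

With both ports requesting continuously, the grants alternate between A and B. A fixed-priority mux would instead keep serving the same port for as long as it keeps requesting.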

So I think we have (almost) all parts ready for a dual-core system. Some housekeeping logic is still missing (some platform-level mechanism to determine the number of cores; -> #1134) and of course the actual core complex (including caches) has to be replicated.

Software Requirements

I'm a little unsure about the software... Both cores use the same physical address space. How can we assign a unique stack, heap, etc. for each core? And how can we do that in a simple and easy-to-use way? What kind of example application could be showcased (FreeRTOS?!)?

I think I'll need to do some homework before.

As always, any feedback or ideas are highly welcome.

@stnolting stnolting added enhancement New feature or request experimental Experimental feature labels Dec 29, 2024
@stnolting stnolting linked a pull request Dec 30, 2024 that will close this issue
9 tasks
@NikLeberg
Collaborator

Wow, you just implemented pretty much every hardware requirement for dual-core over the course of a weekend. Not bad. 😅

For the software part: Have a look at my (crude) changes to crt0 in commit 468f233:

sw/crt0: handle boot of secondary smp harts
- primary hart is the one with `hartid = 0`
- secondary harts are the ones with `hartid != 0`
- secondary harts will be kept in a `wfi` loop with enabled `msi`
  interrupt and special trap handler installed
- primary hart can trigger the `msi` interrupt aka ipi of any secondary
  core (see CLINT vhdl entity)
- this causes the secondary hart to wake up and start execution of main

Additionally it modifies the linker script slightly to give each hart its own stack.
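
One way to give each hart its own stack is to carve fixed-size slices out of a shared stack region and select a slice by hart ID. The sketch below shows only the address arithmetic; it is not the code from the commit above, and the per-hart stack size and the function name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stack size reserved for each hart. */
#define STACK_BYTES_PER_HART 1024u

/* Return the initial stack pointer for a hart. Stacks grow downwards:
 * hart 0 starts at the end of the region, hart 1 one slice below, etc. */
static uintptr_t hart_stack_top(uintptr_t stack_region_end, uint32_t hartid) {
  uintptr_t top = stack_region_end - (uintptr_t)hartid * STACK_BYTES_PER_HART;
  /* The standard RISC-V calling convention keeps sp 16-byte aligned. */
  return top & ~(uintptr_t)0xF;
}
```

In a real startup file the same computation would be done in assembly after reading `mhartid`, with the region end provided by a linker script symbol.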

@stnolting
Owner Author

Wow, you just implemented pretty much every hardware requirement for dual-core over the course of a weekend. Not bad. 😅

Thanks! The multi-hart DM was the most complicated part. Fortunately, the debug spec. is much clearer than other RISC-V specifications. And of course your "reference implementation" was also super helpful! 👍

For the software part: Have a look at my (crude) changes to crt0 in commit 468f233:

You did a pretty cool job! But your startup won't work when loading an application via the bootloader, right?

I was thinking about a concept similar to the RP2040: https://github.com/raspberrypi/pico-sdk/blob/master/src/rp2_common/pico_multicore/multicore.c#L134
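
A run-time launch concept of this kind could look roughly like the following host-testable sketch: the primary hart fills in a launch descriptor (entry point and stack) and the secondary hart picks it up when it wakes. This is neither the pico-sdk API nor the NEORV32 implementation; every name here is hypothetical, and the hardware steps (memory fence, software interrupt, loading `sp`) appear only as comments.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*hart_entry_t)(void);

/* Hypothetical launch descriptor shared between the two harts. */
typedef struct {
  hart_entry_t entry;      /* function the secondary hart should run */
  uintptr_t stack_top;     /* initial stack pointer for the secondary hart */
  volatile uint32_t valid; /* set by the primary, cleared by the secondary */
} hart_launch_t;

/* Primary hart: publish the launch request. Returns 0 on success. */
static int hart_launch_prepare(hart_launch_t *cfg, hart_entry_t entry,
                               uintptr_t stack_top) {
  if (entry == NULL || (stack_top & 0xF) != 0) {
    return -1; /* reject a missing entry point or a misaligned stack */
  }
  cfg->entry = entry;
  cfg->stack_top = stack_top;
  cfg->valid = 1;
  /* On hardware: issue a fence, then raise the secondary hart's
   * software interrupt to wake it from its wfi loop. */
  return 0;
}

/* Secondary hart: called after wake-up. Returns -1 on a spurious wake-up. */
static int hart_launch_consume(hart_launch_t *cfg) {
  if (!cfg->valid) {
    return -1; /* nothing to launch: go back to sleep */
  }
  cfg->valid = 0;
  /* On hardware: load sp from cfg->stack_top before calling the entry. */
  return cfg->entry();
}

/* Example entry function for the secondary hart. */
static int demo_core1_main(void) {
  return 42;
}
```

Because the entry point and stack are passed at run time, such a scheme does not depend on how the application image was loaded.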

@NikLeberg
Collaborator

But your startup won't work when loading an application via the bootloader, right?

You are right. I initially only targeted static code uploaded within the bitstream to the FPGA. Your implementation that uses configuration structures is much nicer!


2 participants