Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update internal CMSIS and HAL #582

Merged

Conversation

stellar-aria
Copy link
Contributor

This upgrades the built-in CMSIS library to v5.9.0, the CMSIS H7 Device to v1.10.3, and the STM32 HAL to v1.11.1.

This addresses issue #538.

This change has been tested with DaisyExamples and a Daisy Patch, and tested against the included test suite.

Additions:

The new HAL modules have been added to src/sys/stm32h7xx_hal_conf.h, and have been enabled to match the others already present. These include modules such as the Digital Temperature Sensor, a Filter Math Accelerator (aka Fixed-point Multiplier and Accumulator or FMAC), an Octo-SPI controller (OSPI), a Digital Filter for Sigma-Delta modulation, and of course, the CORDIC co-processor block, which can accelerate trigonometric functions like sine, cosine, arctan, and square root for q1.31 and q1.15 fixed point signed numbers.

Makefiles have been updated to reflect the source code structure changes, including Makefile, core/Makefile, CMakeLists.txt, and the libdaisy.vcxproj[.filters] files.

Major Breaking Changes:

GPIO::Mode::OUTPUT_OD has been renamed to GPIO::Mode::OPEN_DRAIN due to a name conflict collision with a macro now defined in stm32h7xx_hal_gpio.h (the primary HAL file for gpio). This is instead of a kludge like #undefing the macro, as it may crop up elsewhere depending on include order.

Documentation has been updated accordingly.

Minor Breaking Changes:

Compiled code size has increased by up to 7%. This means that the patch/Nimbus example in DaisyExamples will not compile any longer and may need to be reworked to be stored somewhere other than FLASH.

Minor Fixes:

A very minor (single-character) bugfix was done in CpuLoadMeter_gtest.cpp so that a number would no longer overflow when bitshifting left, and so that I could actually compile and run the tests.

The relevant external repos:
https://github.com/ARM-software/CMSIS_5
https://github.com/STMicroelectronics/cmsis_device_h7
https://github.com/STMicroelectronics/stm32h7xx_hal_driver

@stephenhensley
Copy link
Collaborator

Hi @stellar-aria!

Thanks for the massive contribution.
This is a much needed update, and something that we've been talking about doing for a while!


This may take a little to look over and test, but most things look okay at a glance.

Regarding the OPEN_DRAIN/OD swap:

#define GPIO_MODE_OUTPUT_OD             (MODE_OUTPUT | OUTPUT_OD)  

Seems like the offending line you mentioned. I'm not sure this is is an actual naming conflict with the GPIO enum classes since everything is scoped to the GPIO class.
That said, I could see an argument for the name change for human readability's sake.

Re. Code size -- this has been creeping up over the past several versions:

Long term, there are a few things we can do (convert HAL stuff to LL drivers, reduce const HAL-mapping tables in peripheral APIs, etc.
Short term, I think we just have to live with it, and retarget to the Nimbus example to the daisy bootloader.

#define OUTPUT_TYPE_Pos 4u
#define OUTPUT_TYPE (0x1uL << OUTPUT_TYPE_Pos)
#define OUTPUT_PP (0x0uL << OUTPUT_TYPE_Pos)
#define OUTPUT_OD (0x1uL << OUTPUT_TYPE_Pos)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the location of the offending macro that causes name-collision

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since macros are basically just global find-and-replace text replacement, the preprocessor is inserting it into the enum regardless of the code (and scoping) around it :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yup! You're totally right. I forgot that macros just do a brute-force text replace everywhere. This is totally fine.
Thanks for the added detail.

@stellar-aria
Copy link
Contributor Author

Hi! Thanks for the feedback, I've added a comment to the line number of the file which causes the name-collision.

As you said, readability is also a concern, if just because it's better to be explicit for users, however if this wasn't a critical compilation failure I'd be fine just leaving it as it was.

Yeah, I don't really think there's a quick easy fix for the HAL's size creep, especially if the Daisy platform adopts other CPUs in the future (for say, a Seed 2 or 3) and doesn't want to take on the task of writing custom platform-specific adapters between the higher-level libDaisy classes and the processor-specific LL stuff. It's all tradeoffs, and the HAL is overall just more convenient.

@stephenhensley
Copy link
Collaborator

Thanks for the update.
I think we'll be able to have someone on the team start doing some testing on this next week.

We'll keep you posted if there's anything that needs changing.

@stellar-aria
Copy link
Contributor Author

After giving it a lot of thought, the DSP subfolder of the CMSIS driver should probably also be swapped out for a submodule and pinned to a release tag now that it's been spun off into its own project. It'll also keep the code turnover in this PR down once the commits are squashed and merged.

@stellar-aria
Copy link
Contributor Author

stellar-aria commented Jun 11, 2023

This will also make #578 unnecessary, as the DSP library will be up-to-date and now includes arm_clip_f32 (at Include/dsp/basic_math_functions.h)

@stephenhensley
Copy link
Collaborator

@stellar-aria thanks for the update!

I like the conversion of the DSP to a submodule.
I was going to suggest we do the same thing with the HAL as well. However, there are/were a few manual edits done to the HAL files that were required for libDaisy (either due to bugs, or specific behaviors).

Most of those edits can be found here However, I think there may have been one or two other ones.
I recall (especially prior to libDaisy) having to edit the unrolled loops in the SDMMC DMA functions to actually check each address for validity to avoid hard-to-track-down fault issues.

I'll try to track down any details on other changes, that we've made, but I figured I'd mention them here so that we can have them in mind as we go through integration, and regression testing.
Hopefully some of the bugs (like the SDMMC) have been resolved in the latest versions of the HAL, and we won't have to maintain these manual edits.

Now that we've signed off on the changes in #581 we'll start doing some testing on this.

I can update here as we progress, and possibly provide a little table or checklist of proposed/completed tests as we go.

@stellar-aria
Copy link
Contributor Author

Yeah, I actually re-incorporated those noted edits into the updated version of the HAL driver files. If there's problems with the SDMMC DMA stuff, it'll be because I didn't see any notes and it got undone 😅. I'll do a diff of the driver pre-and-post and check what's going on in a few minutes.

In terms of switching to submodules, we can probably at least go ahead with the CMSIS driver as well then, since the preliminary commit here was just based on the source at the latest release tag.

@stellar-aria
Copy link
Contributor Author

stellar-aria commented Jun 12, 2023

@stephenhensley
So it's looking like the main barrier to swapping to a submodule for the HAL is four separate edits (that I know of):

  1. commenting out the FORCE RESET lines in HAL_DeInit(void):
diff -ru stm32h7xx_hal_driver/Src/stm32h7xx_hal.c libDaisy/Drivers/STM32H7xx_HAL_Driver/Src/stm32h7xx_hal.c
--- stm32h7xx_hal_driver/Src/stm32h7xx_hal.c    2023-06-12 14:13:05.245624400 -0400
+++ libDaisy/Drivers/STM32H7xx_HAL_Driver/Src/stm32h7xx_hal.c   2023-06-12 13:17:58.057026300 -0400
@@ -187,8 +187,8 @@
 HAL_StatusTypeDef HAL_DeInit(void)
 {
   /* Reset of all peripherals */
-  __HAL_RCC_AHB3_FORCE_RESET();
-  __HAL_RCC_AHB3_RELEASE_RESET();
+  //__HAL_RCC_AHB3_FORCE_RESET();
+  //__HAL_RCC_AHB3_RELEASE_RESET();

   __HAL_RCC_AHB1_FORCE_RESET();
   __HAL_RCC_AHB1_RELEASE_RESET();

However, the only place that HAL_DeInit() was called was in src/sys/system.cpp but it's currently commented out, so it seems this patch is unneeded.

  1. This just prevents a redefine warning, but as of the current state of this branch, doesn't seem to be necessary anymore and is also unneeded.
diff -ru stm32h7xx_hal_driver/Src/stm32h7xx_ll_spi.c libDaisy/Drivers/STM32H7xx_HAL_Driver/Src/stm32h7xx_ll_spi.c
--- stm32h7xx_hal_driver/Src/stm32h7xx_ll_spi.c 2023-06-12 14:12:53.654625100 -0400
+++ libDaisy/Drivers/STM32H7xx_HAL_Driver/Src/stm32h7xx_ll_spi.c        2023-06-12 13:17:59.039027000 -0400
@@ -24,7 +24,9 @@
 #ifdef  USE_FULL_ASSERT
 #include "stm32_assert.h"
 #else
+#ifndef assert_param
 #define assert_param(expr) ((void)0U)
+#endif
 #endif /* USE_FULL_ASSERT */

 /** @addtogroup STM32H7xx_LL_Driver
  1. The changes introduced in Changes for issues with compiling with optimisation #484:
    These currently aren't incorporated into this branch (I was unaware of them, and that's on me for not checking better), so as of this message are set to be reverted and undone. It seems to be a very specific edge-case for gcc's -O3 setting, related to loop unrolling and optimizing out unnecessary calculations, completely bypassing a number of timeouts that use the form:
uint32_t count = 0U;

do {
  if (++count > 200000U) {
    return TIMEOUT
  }
} while (condition == 0)

This is a bug with the HAL and should honestly be promoted up to an issue/PR on their repo, but the question still remains if we want to keep a patch for this in-repo while there's no fix upstream.
EDIT: On closer inspection, it looks like this bug has been fixed upstream as of v1.10.1 (all counter variables are now marked __IO aka volatile)

  1. The potential problems you mentioned with the sdmmc module (stm32h7xx_ll_sdmmc.c), probably very similar as I'm seeing register uint32_t count in a number of places that don't have them in the upstream HAL nor in this PR.

In the short-term, what would you like me to do about this?

@stellar-aria
Copy link
Contributor Author

Okay, after some thinking at lunch, the easiest thing to do would probably be to swap to using a submodule, but retain those singular patched source files in-tree until the changes are merged upstream, at which point the build managers can just be pointed back to the originals.

This is what I'm going to work on for now, unless something changes?

Alternatively, the daisy project could fork the HAL, introduce the required bugfixes, and then submit a PR or just keep the separate fork and then have that be the submodule.

@stephenhensley
Copy link
Collaborator

@stellar-aria thanks for being super thorough and digging through the changes to the previous HAL files.

I'll have a chance to go through and make a bit of a test plan later this week. Ideally we'll be doing some actual integration testing on our end (both to confirm there aren't any regressions, and to see if the HAL update resolved some of the patches we needed to do in the past).

Once I've had a chance to dig in a bit more, I'll respond more specifically to some of your comments above wrt specific changes, submodule-ification of the HAL, etc.

@stellar-aria stellar-aria force-pushed the feature/driver_and_hal_update branch from c9f131b to a6ece3f Compare June 14, 2023 00:20
@stellar-aria stellar-aria force-pushed the feature/driver_and_hal_update branch from a6ece3f to 0366831 Compare June 14, 2023 00:31
@stellar-aria
Copy link
Contributor Author

stellar-aria commented Jun 14, 2023

Another potential source of regressions is the Middleware drivers now being out of date, but unless they do cause problems, upgrading them is probably best left for a future PR.

So I'm just noting some things here for the future:

The USB Device Middleware can probably be upgraded (and swapped to a submodule) without any issues.

However there's some notes on the Host library that dynamic allocation was disabled for one of the internal Handle types, and then later notes that USBH_Free was also disabled, and the main barrier to using the Middlewares as-designed (with malloc and free) was that the heap is in DTCMRAM. As far as I can tell the only memory configuration with the heap located there is the SRAM one, but I also don't think I understand the architecture well enough yet to understand why it was placed there initially.

Host is also the one I'm most interested in upgrading eventually, since there's now in-built components for USB Host Audio and CDC classes.

@stellar-aria
Copy link
Contributor Author

Hey just wondering where everything's at, I figure testing this is a real pain. I see there's some merge conflicts from upstream, should I go ahead and resolve those or will that impact the build that's being used?

@stephenhensley
Copy link
Collaborator

Hi @stellar-aria

You can resolve the conflicts, that shouldn't cause any issues with testing.
Testing has been going okay so far.

I'm hoping to get dig into it quite a bit deeper this week, and do a few updates on some older, but more complex programs that use USB, SDMMC, etc. and do some side-by-side comparison of some of the non-trivial changes.
That should validate the new versions work consistently (or better) than the older versions.

If nothing breaks/doesn't stand out, then I think we might be able to merge this this week 😄

@stellar-aria
Copy link
Contributor Author

Since I Introduced this PR over six months ago, I've learned a lot more about the state of embedded development toolchains and build systems, primarily from my work managing such for the Deluge Community Firmware project. In that time, my overall outlook w.r.t. the management of dependencies (such as the CMSIS ones here) and build systems in the Daisy project has changed, and it boils down to a few key points:

  1. Maintaining two build systems is a hassle and can lead to problems. Make and CMake ultimately accomplish the same thing, but CMake offers far more flexibility in terms of project organization and integration into other projects as a library. From what I understand the CMake system is newer, and is potentially a migration target, but (much like with the GPIO/Pin situation) currently in an awkward middle state.

  2. Submodules are not the greatest way to do dependency management. CMake now has a very nice built-in dependency download and integration system called FetchContent that is well suited for problems like this: downloading weird out-of-source build components and integrating them without polluting the version control system. I've used this in a very similar situation for downloading and integrating AVR Dx devicepacks into a project's AVR-GCC toolchain, for example.

In the long run migrating the current build system to more idiomatic CMake is probably a good path forward, and will reduce a lot of issues with other kinds of integrations such as alternate toolchains (I've had a Clang-support PR waiting in the wings for a while now). However in the short term I think submodules are an acceptable stopgap solution for the dependency version problem, even if it requires an additional VCS step in the form of git submodule update.

I also kind of just want to see this PR merged so I can work on others, haha 😅

@stephenhensley
Copy link
Collaborator

Hi @stellar-aria

Happy New Year!
Sorry for the delay in getting to this, the last few months of the year became very busy for us.

We are doing some internal testing on this, and that should progress fairly quickly. We are aiming to have either approval, or in the case of finding any issues, feedback by the end of next week!

Regarding your comments on CMake, I've converted that into and issue (#606) so that we could discuss it separately since it is a worthwhile discussion, but a bit outside the scope of this PR.

@stellar-aria
Copy link
Contributor Author

Just checking in on this!

@stephenhensley
Copy link
Collaborator

@stellar-aria alright!

@beserge and I have both gone through quite a bit of testing at this point.
Overall, things are working great. However, there were a few things that Ben and I both experienced that are worth taking a bit more of a look at:

USB Device

I am unable to use the internal USB port (I haven't tested the external yet) with the updates.
It does trigger the first few IRQ handlers, but then seems to get stuck in a busy state. I only checked this with the CDC class/Logger class setup. I know Ben tested with MIDI and ran into a similar issue of it not showing up on his PC.

I have a feeling this could be an incompatibility in our class interface with at least one of the changes that have happened over the years.
I did see that there is a new "TransmitCplt" callback added to the interface operations for the API, although ST's patch notes mention that a subsequent version made this optional/backwards compatible with previous versions.

Did you run into this on your end with either of the device classes?

USB Host

Similar to the above, I setup some test code that worked on the current master branch that no longer works correctly on the new branch.
However, I am still getting some USB Host behavior (connect/disconnect awareness), but it does end up failing in FatFS with disk errors.

Again, I would guess this is an incompatibility with the latest changes (interestingly it looks like the USB Host middleware has been left the same. So it might be on the HAL side that's causing this particular issue.

SDRAM/FMC

So I haven't personally gotten any abnormal behavior yet, but Ben mentioned having an example not start up. Afterwards it worked okay.

I haven't done any strict profiling to see if there's been any unexpected change in performance yet either.
Just wanted to make a note about that here in case you've run into any other similar events, or have done any time-based profiling on SDRAM performance.


I'm going to dig into the USB stuff a little bit today/tomorrow to see if I can find the particular issue(s) remaining there. If you have any insight from testing you've done so far that would be welcome!

Aside from getting USB up and running, I'm thinking the rest of this is ready to go.

@stellar-aria
Copy link
Contributor Author

I haven't played much with the USB stuff yet, tbqh. Part of the incompatibility might just come from using the older middleware libraries with the newer HAL, as I noted earlier:

Another potential source of regressions is the Middleware drivers now being out of date, but unless they do cause problems, upgrading them is probably best left for a future PR.

So I'm just noting some things here for the future:

The USB Device Middleware can probably be upgraded (and swapped to a submodule) without any issues.

However there's some notes on the Host library that dynamic allocation was disabled for one of the internal Handle types, and then later notes that USBH_Free was also disabled, and the main barrier to using the Middlewares as-designed (with malloc and free) was that the heap is in DTCMRAM. As far as I can tell the only memory configuration with the heap located there is the SRAM one, but I also don't think I understand the architecture well enough yet to understand why it was placed there initially.

So it might just be worth it to upgrade the device library and see if that fixes the device problems. If it does, we might have to upgrade/merge the host library manually. It doesn't look like the heap is stored in the DTCMRAM anymore, rather in SRAM for the QSPI and Flash loaders, and the D2 RAM for the SRAM loader, so does that note need to be edited/fixed?

I haven't done any strict profiling for the SDRAM, if you need something like that I do have some profiling tools. Was this code execution from the SDRAM or just data storage there?

@stephenhensley
Copy link
Collaborator

Okay!

I have a feeling you're correct, I just wanted to confirm if you had either experienced the same thing, or had a different experience before we went super far down the rabbit hole.
We'll test around with updating the device libraries, and go from there.

As far as I can tell, the USB device/host issues are the only things left to resolve, and then I think this is good to go.


I don't think there's much to do wrt to the SDRAM, the performance seemed pretty similar to me. The only red flag was the report of an app not starting, but that could have been something unrelated as it has not been reproduced since.

I didn't do any quantitative profiling, but I have a program that just barely underruns when copying the input around the SDRAM a ton of times, and then to the output, and it was audibly the same when running it between both the master branch, and this PR.
Other tests, and code using the SDRAM for data worked as usual as well.

@stephenhensley
Copy link
Collaborator

@stellar-aria Just wanted to keep you up to date as we progress.

We've made some decent headway in identifying the issues with USB, and are hoping to have a patch for getting USB device stuff working before the end of the week.
If that's the case, we should have a decent idea of how to get USB host working again as well, and I'm hoping to have that knocked out in the first half of next week. So if that timeline works out, we can potentially be doing final testing, and be ready to merge this by the end of next week.

@GregBurns
Copy link
Contributor

GregBurns commented Feb 16, 2024 via email

@beserge
Copy link
Contributor

beserge commented Feb 16, 2024

@GregBurns Awesome! Please do. I'm going to look into this myself a bit now.

I've just pushed my patches to the USB Device Middleware required to use both USB Midi and CDC.
As part of that I updated the ref for the HAL Driver.
I tested these were working using the DaisyExamples/seed/USB_MIDI and USB_CDC examples.

@GregBurns
Copy link
Contributor

GregBurns commented Feb 16, 2024 via email

@beserge
Copy link
Contributor

beserge commented Feb 19, 2024

Ok, so I was finally able to successfully test USB Host and get it working. It was a combination of issues on my end.

  1. My example had to be compiled for the bootloader due to size constraints.
  2. The Daisy Web Programmer is (I think) still flashing an old version of the bootloader. I made sure to flash an updated version.
  3. This branch doesn't include some crucial bootloader stuff if you checkout directly to it. Once I merged with master it worked flawlessly.

Inexplicably the app still was able to be debugged, and the issues didn't present as if they were bootloader related, not really sure why yet.

Couple of things I want to do:

  • Confirm which version of the bootloader is being flashed by the web programmer.
  • Merge my usb host example into libDaisy

That being said, this PR is finally tested, and ready to be merged! Thanks for your hard work, glad we're finally able to give it the signoff! 🚀

@beserge beserge merged commit 9d93eea into electro-smith:master Feb 21, 2024
4 checks passed
@beserge
Copy link
Contributor

beserge commented Feb 21, 2024

Ready for merge! Thanks once again for your stellar contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants