Rmt: Improve codegen for enable_listen_interrupt #2960

Merged
merged 2 commits into from
Jan 16, 2025

Conversation

wisp3rwind
Contributor

@wisp3rwind wisp3rwind commented Jan 15, 2025

Thank you for your contribution!

We appreciate the time and effort you've put into this pull request.
To help us review it efficiently, please ensure you've gone through the following checklist:

Submission Checklist 📝

  • I have updated existing examples or added new ones (if applicable).
  • I have used cargo xtask fmt-packages command to ensure that all changed code is formatted correctly.
  • My changes were added to the CHANGELOG.md in the proper section.
  • I have added necessary changes to user code to the Migration Guide.
  • My changes are in accordance to the esp-rs API guidelines

Extra:

Pull Request Details 📖

Description

By chance, I noticed that the enable_listen_interrupt methods of the RMT driver generate quite bloated code at opt_level: 's'. This seems to happen because the EnumSet iterator is too complicated to be optimized away at this optimization level (it is optimized away at opt_level: 2|3).

This PR replaces the loop/match construct with simple conditionals to help the compiler generate better code.
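
To illustrate the shape of the change, here is a minimal, self-contained sketch rather than the actual driver code. `Event`, `EventSet`, the bit layout (`End` at bit `ch`, `Error` at bit `ch + 4`, `Threshold` at bit `ch + 8`) and both function names are assumptions made for this example, and the register is modelled as a plain `u32` instead of a PAC register. The first function iterates over the set and matches on each event, the second builds one mask from plain conditionals; both yield the same register value.

```rust
/// Hypothetical stand-in for the driver's event enum (names are assumptions).
#[derive(Clone, Copy)]
enum Event {
    Error = 0,
    Threshold = 1,
    End = 2,
}

/// Minimal bitset over `Event`, loosely imitating what an EnumSet provides.
#[derive(Clone, Copy)]
struct EventSet(u8);

impl EventSet {
    fn contains(self, e: Event) -> bool {
        self.0 & (1u8 << (e as u8)) != 0
    }
}

impl Iterator for EventSet {
    type Item = Event;

    /// Yields the lowest set bit and clears it, like a typical bitset iterator.
    fn next(&mut self) -> Option<Event> {
        if self.0 == 0 {
            return None;
        }
        let bit = self.0.trailing_zeros();
        self.0 &= self.0 - 1;
        Some(match bit {
            0 => Event::Error,
            1 => Event::Threshold,
            _ => Event::End,
        })
    }
}

/// Loop/match shape: one read-modify-write step per event in the set.
fn enable_loop(reg: u32, ch: u32, events: EventSet, enable: bool) -> u32 {
    let mut reg = reg;
    for e in events {
        // Assumed bit layout, for illustration only.
        let mask = match e {
            Event::End => 1u32 << ch,
            Event::Error => 1u32 << (ch + 4),
            Event::Threshold => 1u32 << (ch + 8),
        };
        if enable {
            reg |= mask;
        } else {
            reg &= !mask;
        }
    }
    reg
}

/// Conditional shape: build one mask, then apply it in a single step.
fn enable_cond(reg: u32, ch: u32, events: EventSet, enable: bool) -> u32 {
    let mut mask = 0u32;
    if events.contains(Event::End) {
        mask |= 1u32 << ch;
    }
    if events.contains(Event::Error) {
        mask |= 1u32 << (ch + 4);
    }
    if events.contains(Event::Threshold) {
        mask |= 1u32 << (ch + 8);
    }
    if enable {
        reg | mask
    } else {
        reg & !mask
    }
}
```

With a compile-time constant set, the second form leaves the compiler only a handful of constant-foldable branches, which is presumably why it still collapses to a single mask at opt_level: 's'.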

I didn't add a changelog entry since there is no user-visible change; not sure that there should be one?

Testing

  • I checked that the embassy_rmt_rx and embassy_rmt_tx examples compile for riscv architectures. I don't have an xtensa toolchain installed to easily test that architecture.
  • I've also run a heavily modified RMT driver (including this change) in async tx mode on an ESP Rust Board (ESP32-C3).

Here are some code samples of the resulting rmt async interrupt handler, obtained via cargo objdump --release --bin async_main -- -d for esp32c3 (the issue also shows up when this method is used outside of the interrupt handler).

enumset loops (old code), `opt_level: 3`: code optimized as expected
42004650 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8>:
42004650: 1101          addi    sp, sp, -0x20
42004652: ce06          sw      ra, 0x1c(sp)
42004654: cc22          sw      s0, 0x18(sp)
42004656: ca26          sw      s1, 0x14(sp)
42004658: c84a          sw      s2, 0x10(sp)
4200465a: c64e          sw      s3, 0xc(sp)
4200465c: c452          sw      s4, 0x8(sp)
4200465e: c256          sw      s5, 0x4(sp)
42004660: 1000          addi    s0, sp, 0x20
42004662: 60016537      lui     a0, 0x60016
42004666: 5d4c          lw      a1, 0x3c(a0)
42004668: 0115f613      andi    a2, a1, 0x11
4200466c: c601          beqz    a2, 0x42004674 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x24>
4200466e: 4581          li      a1, 0x0
42004670: 5639          li      a2, -0x12
42004672: a02d          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004674: 0225f613      andi    a2, a1, 0x22
42004678: c609          beqz    a2, 0x42004682 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x32>
4200467a: 4585          li      a1, 0x1
4200467c: fdd00613      li      a2, -0x23
42004680: a831          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004682: 0445f613      andi    a2, a1, 0x44
42004686: c609          beqz    a2, 0x42004690 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x40>
42004688: 4589          li      a1, 0x2
4200468a: fbb00613      li      a2, -0x45
4200468e: a039          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004690: 0885f593      andi    a1, a1, 0x88
42004694: c5bd          beqz    a1, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
42004696: 458d          li      a1, 0x3
42004698: f7700613      li      a2, -0x89
4200469c: 4134          lw      a3, 0x40(a0)
4200469e: 8e75          and     a2, a2, a3
420046a0: c130          sw      a2, 0x40(a0)
420046a2: 00259513      slli    a0, a1, 0x2
420046a6: 0592          slli    a1, a1, 0x4
420046a8: 8d89          sub     a1, a1, a0
420046aa: 3fc814b7      lui     s1, 0x3fc81
420046ae: b4848493      addi    s1, s1, -0x4b8
420046b2: 94ae          add     s1, s1, a1
420046b4: 4981          li      s3, 0x0
420046b6: 300479f3      csrrci  s3, mstatus, 0x8
420046ba: 0004ca03      lbu     s4, 0x0(s1)
420046be: 0044aa83      lw      s5, 0x4(s1)
420046c2: 0084a903      lw      s2, 0x8(s1)
420046c6: 4505          li      a0, 0x1
420046c8: 00a48023      sb      a0, 0x0(s1)
420046cc: 0004a223      sw      zero, 0x4(s1)
420046d0: 000a8f63      beqz    s5, 0x420046ee <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x9e>
420046d4: 008aa583      lw      a1, 0x8(s5)
420046d8: 854a          mv      a0, s2
420046da: 9582          jalr    a1
420046dc: 40cc          lw      a1, 0x4(s1)
420046de: 4488          lw      a0, 0x8(s1)
420046e0: 0154a223      sw      s5, 0x4(s1)
420046e4: 0124a423      sw      s2, 0x8(s1)
420046e8: c199          beqz    a1, 0x420046ee <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x9e>
420046ea: 45cc          lw      a1, 0xc(a1)
420046ec: 9582          jalr    a1
420046ee: 000a1a63      bnez    s4, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
420046f2: 0089f513      andi    a0, s3, 0x8
420046f6: 00048023      sb      zero, 0x0(s1)
420046fa: c501          beqz    a0, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
420046fc: 4521          li      a0, 0x8
420046fe: 30052073      csrs    mstatus, a0
42004702: 40f2          lw      ra, 0x1c(sp)
42004704: 4462          lw      s0, 0x18(sp)
42004706: 44d2          lw      s1, 0x14(sp)
42004708: 4942          lw      s2, 0x10(sp)
4200470a: 49b2          lw      s3, 0xc(sp)
4200470c: 4a22          lw      s4, 0x8(sp)
4200470e: 4a92          lw      s5, 0x4(sp)
42004710: 6105          addi    sp, sp, 0x20
42004712: 8082          ret
enumset loops (old code), `opt_level: 's'`: very sub-optimal code: looping over EnumSet fields, using a table in `.rodata`
42003d6e <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9>:
42003d6e: 60016837     	lui	a6, 0x60016
42003d72: 03c82503     	lw	a0, 0x3c(a6)
42003d76: 01157613     	andi	a2, a0, 0x11
42003d7a: ca39         	beqz	a2, 0x42003dd0 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x62>
42003d7c: 04082603     	lw	a2, 0x40(a6)
42003d80: 4515         	li	a0, 0x5
42003d82: 4685         	li	a3, 0x1
42003d84: 3c012737     	lui	a4, 0x3c012
42003d88: ffc70713     	addi	a4, a4, -0x4
42003d8c: fff50793     	addi	a5, a0, -0x1
42003d90: fff54593     	not	a1, a0
42003d94: 8dfd         	and	a1, a1, a5
42003d96: 0015d793     	srli	a5, a1, 0x1
42003d9a: 0557f793     	andi	a5, a5, 0x55
42003d9e: 8d9d         	sub	a1, a1, a5
42003da0: 0335f793     	andi	a5, a1, 0x33
42003da4: 8189         	srli	a1, a1, 0x2
42003da6: 0335f593     	andi	a1, a1, 0x33
42003daa: 95be         	add	a1, a1, a5
42003dac: 0045d793     	srli	a5, a1, 0x4
42003db0: 95be         	add	a1, a1, a5
42003db2: 89bd         	andi	a1, a1, 0xf
42003db4: 00b697b3     	sll	a5, a3, a1
42003db8: 058a         	slli	a1, a1, 0x2
42003dba: 95ba         	add	a1, a1, a4
42003dbc: 418c         	lw	a1, 0x0(a1)
42003dbe: fff7c793     	not	a5, a5
42003dc2: 8d7d         	and	a0, a0, a5
42003dc4: 0ff57793     	andi	a5, a0, 0xff
42003dc8: 8e6d         	and	a2, a2, a1
42003dca: f3e9         	bnez	a5, 0x42003d8c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x1e>
42003dcc: 4701         	li	a4, 0x0
42003dce: a231         	j	0x42003eda <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x16c>
42003dd0: 02257613     	andi	a2, a0, 0x22
42003dd4: ca31         	beqz	a2, 0x42003e28 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0xba>
42003dd6: 04082603     	lw	a2, 0x40(a6)
42003dda: 4515         	li	a0, 0x5
42003ddc: 3c0126b7     	lui	a3, 0x3c012
42003de0: 00868693     	addi	a3, a3, 0x8
42003de4: fff50593     	addi	a1, a0, -0x1
42003de8: fff54713     	not	a4, a0
42003dec: 8df9         	and	a1, a1, a4
42003dee: 0015d713     	srli	a4, a1, 0x1
42003df2: 05577713     	andi	a4, a4, 0x55
42003df6: 8d99         	sub	a1, a1, a4
42003df8: 0335f713     	andi	a4, a1, 0x33
42003dfc: 8189         	srli	a1, a1, 0x2
42003dfe: 0335f593     	andi	a1, a1, 0x33
42003e02: 95ba         	add	a1, a1, a4
42003e04: 0045d713     	srli	a4, a1, 0x4
42003e08: 95ba         	add	a1, a1, a4
42003e0a: 89bd         	andi	a1, a1, 0xf
42003e0c: 4705         	li	a4, 0x1
42003e0e: 00b717b3     	sll	a5, a4, a1
42003e12: 058a         	slli	a1, a1, 0x2
42003e14: 95b6         	add	a1, a1, a3
42003e16: 418c         	lw	a1, 0x0(a1)
42003e18: fff7c793     	not	a5, a5
42003e1c: 8d7d         	and	a0, a0, a5
42003e1e: 0ff57793     	andi	a5, a0, 0xff
42003e22: 8e6d         	and	a2, a2, a1
42003e24: f3e1         	bnez	a5, 0x42003de4 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x76>
42003e26: a855         	j	0x42003eda <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x16c>
42003e28: 04457613     	andi	a2, a0, 0x44
42003e2c: ca39         	beqz	a2, 0x42003e82 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x114>
42003e2e: 04082603     	lw	a2, 0x40(a6)
42003e32: 4515         	li	a0, 0x5
42003e34: 4685         	li	a3, 0x1
42003e36: 3c012737     	lui	a4, 0x3c012
42003e3a: 01470713     	addi	a4, a4, 0x14
42003e3e: fff50593     	addi	a1, a0, -0x1
42003e42: fff54793     	not	a5, a0
42003e46: 8dfd         	and	a1, a1, a5
42003e48: 0015d793     	srli	a5, a1, 0x1
42003e4c: 0557f793     	andi	a5, a5, 0x55
42003e50: 8d9d         	sub	a1, a1, a5
42003e52: 0335f793     	andi	a5, a1, 0x33
42003e56: 8189         	srli	a1, a1, 0x2
42003e58: 0335f593     	andi	a1, a1, 0x33
42003e5c: 95be         	add	a1, a1, a5
42003e5e: 0045d793     	srli	a5, a1, 0x4
42003e62: 95be         	add	a1, a1, a5
42003e64: 89bd         	andi	a1, a1, 0xf
42003e66: 00b697b3     	sll	a5, a3, a1
42003e6a: 058a         	slli	a1, a1, 0x2
42003e6c: 95ba         	add	a1, a1, a4
42003e6e: 418c         	lw	a1, 0x0(a1)
42003e70: fff7c793     	not	a5, a5
42003e74: 8d7d         	and	a0, a0, a5
42003e76: 0ff57793     	andi	a5, a0, 0xff
42003e7a: 8e6d         	and	a2, a2, a1
42003e7c: f3e9         	bnez	a5, 0x42003e3e <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0xd0>
42003e7e: 4709         	li	a4, 0x2
42003e80: a8a9         	j	0x42003eda <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x16c>
42003e82: 08857513     	andi	a0, a0, 0x88
42003e86: c151         	beqz	a0, 0x42003f0a <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x19c>
42003e88: 04082603     	lw	a2, 0x40(a6)
42003e8c: 4515         	li	a0, 0x5
42003e8e: 4685         	li	a3, 0x1
42003e90: 3c012737     	lui	a4, 0x3c012
42003e94: 02070713     	addi	a4, a4, 0x20
42003e98: fff50593     	addi	a1, a0, -0x1
42003e9c: fff54793     	not	a5, a0
42003ea0: 8dfd         	and	a1, a1, a5
42003ea2: 0015d793     	srli	a5, a1, 0x1
42003ea6: 0557f793     	andi	a5, a5, 0x55
42003eaa: 8d9d         	sub	a1, a1, a5
42003eac: 0335f793     	andi	a5, a1, 0x33
42003eb0: 8189         	srli	a1, a1, 0x2
42003eb2: 0335f593     	andi	a1, a1, 0x33
42003eb6: 95be         	add	a1, a1, a5
42003eb8: 0045d793     	srli	a5, a1, 0x4
42003ebc: 95be         	add	a1, a1, a5
42003ebe: 89bd         	andi	a1, a1, 0xf
42003ec0: 00b697b3     	sll	a5, a3, a1
42003ec4: 058a         	slli	a1, a1, 0x2
42003ec6: 95ba         	add	a1, a1, a4
42003ec8: 418c         	lw	a1, 0x0(a1)
42003eca: fff7c793     	not	a5, a5
42003ece: 8d7d         	and	a0, a0, a5
42003ed0: 0ff57793     	andi	a5, a0, 0xff
42003ed4: 8e6d         	and	a2, a2, a1
42003ed6: f3e9         	bnez	a5, 0x42003e98 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x12a>
42003ed8: 470d         	li	a4, 0x3
42003eda: 1141         	addi	sp, sp, -0x10
42003edc: c606         	sw	ra, 0xc(sp)
42003ede: c422         	sw	s0, 0x8(sp)
42003ee0: 0800         	addi	s0, sp, 0x10
42003ee2: 00271513     	slli	a0, a4, 0x2
42003ee6: 0712         	slli	a4, a4, 0x4
42003ee8: 40a70533     	sub	a0, a4, a0
42003eec: 3fc815b7     	lui	a1, 0x3fc81
42003ef0: a4858593     	addi	a1, a1, -0x5b8
42003ef4: 952e         	add	a0, a0, a1
42003ef6: 04c82023     	sw	a2, 0x40(a6)
42003efa: 85aa         	mv	a1, a0
42003efc: 40b2         	lw	ra, 0xc(sp)
42003efe: 4422         	lw	s0, 0x8(sp)
42003f00: 0141         	addi	sp, sp, 0x10
42003f02: fffff317     	auipc	t1, 0xfffff
42003f06: 3d830067     	jr	0x3d8(t1) <_rtc_fast_persistent_start+0xf20032da>
42003f0a: 8082         	ret
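
For readers following the listing above: the repeated `addi`/`not`/`and` prefix followed by the `srli`/`andi`/`add` chain appears to compute the index of the lowest set bit of the remaining set as a population count of `(x - 1) & !x`, presumably because the base RV32IMC instruction set has no count-trailing-zeros instruction. The sketch below reproduces that arithmetic on `u8`; the function names are made up for this example and it is not taken from enumset or the driver.

```rust
/// Population count of an 8-bit value using the classic shift-and-mask steps,
/// in the same order as the `srli`/`andi`/`sub`/`add` sequence in the listing.
fn popcount8(x: u8) -> u8 {
    // Sum adjacent bit pairs.
    let x = x - ((x >> 1) & 0x55);
    // Sum adjacent 2-bit fields into 4-bit fields.
    let x = (x & 0x33) + ((x >> 2) & 0x33);
    // Sum the two nibbles; the result fits in the low 4 bits.
    (x + (x >> 4)) & 0x0f
}

/// Number of trailing zero bits: `(x - 1) & !x` sets exactly the bits below
/// the lowest set bit of `x`, so counting them gives that bit's index.
fn trailing_zeros_via_popcount(x: u8) -> u8 {
    popcount8(x.wrapping_sub(1) & !x)
}
```

Each loop iteration then uses that index to load a mask from the table and to clear the processed bit, which is the work the conditional version avoids entirely.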
conditionals (new code), `opt_level: 's'`: code optimized as expected
42003d70 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9>:
42003d70: 1141          addi    sp, sp, -0x10
42003d72: c606          sw      ra, 0xc(sp)
42003d74: c422          sw      s0, 0x8(sp)
42003d76: 0800          addi    s0, sp, 0x10
42003d78: 600165b7      lui     a1, 0x60016
42003d7c: 5dc8          lw      a0, 0x3c(a1)
42003d7e: 01157613      andi    a2, a0, 0x11
42003d82: c601          beqz    a2, 0x42003d8a <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x1a>
42003d84: 4501          li      a0, 0x0
42003d86: 5639          li      a2, -0x12
42003d88: a02d          j       0x42003db2 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x42>
42003d8a: 02257613      andi    a2, a0, 0x22
42003d8e: c609          beqz    a2, 0x42003d98 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x28>
42003d90: 4505          li      a0, 0x1
42003d92: fdd00613      li      a2, -0x23
42003d96: a831          j       0x42003db2 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x42>
42003d98: 04457613      andi    a2, a0, 0x44
42003d9c: c609          beqz    a2, 0x42003da6 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x36>
42003d9e: 4509          li      a0, 0x2
42003da0: fbb00613      li      a2, -0x45
42003da4: a039          j       0x42003db2 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x42>
42003da6: 08857513      andi    a0, a0, 0x88
42003daa: c905          beqz    a0, 0x42003dda <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h6a29442e950718f9+0x6a>
42003dac: 450d          li      a0, 0x3
42003dae: f7700613      li      a2, -0x89
42003db2: 41b4          lw      a3, 0x40(a1)
42003db4: 8e75          and     a2, a2, a3
42003db6: 00251693      slli    a3, a0, 0x2
42003dba: 0512          slli    a0, a0, 0x4
42003dbc: 8d15          sub     a0, a0, a3
42003dbe: 3fc816b7      lui     a3, 0x3fc81
42003dc2: a4868693      addi    a3, a3, -0x5b8
42003dc6: 9536          add     a0, a0, a3
42003dc8: c1b0          sw      a2, 0x40(a1)
42003dca: 85aa          mv      a1, a0
42003dcc: 40b2          lw      ra, 0xc(sp)
42003dce: 4422          lw      s0, 0x8(sp)
42003dd0: 0141          addi    sp, sp, 0x10
42003dd2: fffff317      auipc   t1, 0xfffff
42003dd6: 50a30067      jr      0x50a(t1) <_rtc_fast_persistent_start+0xf20032dc>
42003dda: 40b2          lw      ra, 0xc(sp)
42003ddc: 4422          lw      s0, 0x8(sp)
42003dde: 0141          addi    sp, sp, 0x10
42003de0: 8082          ret
conditionals (new code), `opt_level: 3`: code optimized as expected, signaling the waker also inlined
42004650 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8>:
42004650: 1101          addi    sp, sp, -0x20
42004652: ce06          sw      ra, 0x1c(sp)
42004654: cc22          sw      s0, 0x18(sp)
42004656: ca26          sw      s1, 0x14(sp)
42004658: c84a          sw      s2, 0x10(sp)
4200465a: c64e          sw      s3, 0xc(sp)
4200465c: c452          sw      s4, 0x8(sp)
4200465e: c256          sw      s5, 0x4(sp)
42004660: 1000          addi    s0, sp, 0x20
42004662: 60016537      lui     a0, 0x60016
42004666: 5d4c          lw      a1, 0x3c(a0)
42004668: 0115f613      andi    a2, a1, 0x11
4200466c: c601          beqz    a2, 0x42004674 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x24>
4200466e: 4581          li      a1, 0x0
42004670: 5639          li      a2, -0x12
42004672: a02d          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004674: 0225f613      andi    a2, a1, 0x22
42004678: c609          beqz    a2, 0x42004682 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x32>
4200467a: 4585          li      a1, 0x1
4200467c: fdd00613      li      a2, -0x23
42004680: a831          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004682: 0445f613      andi    a2, a1, 0x44
42004686: c609          beqz    a2, 0x42004690 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x40>
42004688: 4589          li      a1, 0x2
4200468a: fbb00613      li      a2, -0x45
4200468e: a039          j       0x4200469c <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x4c>
42004690: 0885f593      andi    a1, a1, 0x88
42004694: c5bd          beqz    a1, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
42004696: 458d          li      a1, 0x3
42004698: f7700613      li      a2, -0x89
4200469c: 4134          lw      a3, 0x40(a0)
4200469e: 8e75          and     a2, a2, a3
420046a0: c130          sw      a2, 0x40(a0)
420046a2: 00259513      slli    a0, a1, 0x2
420046a6: 0592          slli    a1, a1, 0x4
420046a8: 8d89          sub     a1, a1, a0
420046aa: 3fc814b7      lui     s1, 0x3fc81
420046ae: b4848493      addi    s1, s1, -0x4b8
420046b2: 94ae          add     s1, s1, a1
420046b4: 4981          li      s3, 0x0
420046b6: 300479f3      csrrci  s3, mstatus, 0x8
420046ba: 0004ca03      lbu     s4, 0x0(s1)
420046be: 0044aa83      lw      s5, 0x4(s1)
420046c2: 0084a903      lw      s2, 0x8(s1)
420046c6: 4505          li      a0, 0x1
420046c8: 00a48023      sb      a0, 0x0(s1)
420046cc: 0004a223      sw      zero, 0x4(s1)
420046d0: 000a8f63      beqz    s5, 0x420046ee <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x9e>
420046d4: 008aa583      lw      a1, 0x8(s5)
420046d8: 854a          mv      a0, s2
420046da: 9582          jalr    a1
420046dc: 40cc          lw      a1, 0x4(s1)
420046de: 4488          lw      a0, 0x8(s1)
420046e0: 0154a223      sw      s5, 0x4(s1)
420046e4: 0124a423      sw      s2, 0x8(s1)
420046e8: c199          beqz    a1, 0x420046ee <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0x9e>
420046ea: 45cc          lw      a1, 0xc(a1)
420046ec: 9582          jalr    a1
420046ee: 000a1a63      bnez    s4, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
420046f2: 0089f513      andi    a0, s3, 0x8
420046f6: 00048023      sb      zero, 0x0(s1)
420046fa: c501          beqz    a0, 0x42004702 <esp_hal::rmt::__esp_hal_internal_async_interrupt_handler::h93ee520a4872cfc8+0xb2>
420046fc: 4521          li      a0, 0x8
420046fe: 30052073      csrs    mstatus, a0
42004702: 40f2          lw      ra, 0x1c(sp)
42004704: 4462          lw      s0, 0x18(sp)
42004706: 44d2          lw      s1, 0x14(sp)
42004708: 4942          lw      s2, 0x10(sp)
4200470a: 49b2          lw      s3, 0xc(sp)
4200470c: 4a22          lw      s4, 0x8(sp)
4200470e: 4a92          lw      s5, 0x4(sp)
42004710: 6105          addi    sp, sp, 0x20
42004712: 8082          ret

such that they reduce to a single load/modify/store operation
This iterator generates relatively complex code, which is not reliably
reduced to the intended simple memory access, even if the `events`
argument is constant at compile time. Rather, the loop might be
retained, which is inefficient and leads to much larger code.

Tested on riscv using cargo objdump, with the following cargo profile:

[profile.release]
codegen-units = 1
debug = true
debug-assertions = false
incremental = false
lto = 'fat'
opt-level = 3 # also tested with 's'
overflow-checks = false
@wisp3rwind wisp3rwind changed the title Rmt unroll Rmt: Improve codegen for enable_listen_interrupt Jan 15, 2025
@bugadani bugadani added the skip-changelog No changelog modification needed label Jan 15, 2025
@bugadani
Contributor

Oh hmm that's unexpectedly significant, thanks!

@wisp3rwind wisp3rwind mentioned this pull request Jan 16, 2025
6 tasks
@jessebraham jessebraham added this pull request to the merge queue Jan 16, 2025
Merged via the queue into esp-rs:main with commit 866dc91 Jan 16, 2025
28 of 29 checks passed
@wisp3rwind wisp3rwind deleted the rmt-unroll branch January 16, 2025 22:14