sequential mtspr calls fail #122

Closed
stffrdhrn opened this issue Jun 20, 2021 · 4 comments
stffrdhrn commented Jun 20, 2021

I have converted or1k-tests to use inline calls to mtspr() and mfspr(). This causes failures in the following cases in or1k-mmu.

See: openrisc/or1k-tests@32f0ba2

static int itlb_match_test (int way, int set)
{
...

  // Flush these areas in case the cache doesn't write them through immediately
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta - 8);
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta);
  // (CASE 1 - the assembly stores ta to the stack at this point)

  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta + PAGE_SIZE - 8);
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta + PAGE_SIZE);

  /* Shifting one in ITLBMR */
  add = (PAGE_SIZE*itlb_sets);
  // Space we'll access and expect the MMU to translate our requests
  ea = add + (set*PAGE_SIZE);

  /* Enable IMMU */
  immu_enable ();

  while (add != 0x00000000) {
    // (CASE 2)
    mtspr (OR1K_SPR_IMMU_ITLBW_MR_ADDR(way, set), ea | OR1K_SPR_IMMU_ITLBW_MR_V_MASK);
    mtspr (OR1K_SPR_IMMU_ITLBW_TR_ADDR(way, set), ta | ITLB_PR_NOLIMIT);

Case 1 - this happens with 1 nop after mtspr.

The first case is an mtspr followed by an l.sw: the mtspr sends an invalidate to the dcache at the same time as the store's write operation, and the write fails.

Case 2 - this happens with 0 nops after mtspr.

An mtspr followed immediately by another mtspr causes the second write, to OR1K_SPR_IMMU_ITLBW_TR_ADDR, to be lost.

Workaround

Adding 2 l.nops after each mtspr fixes the issue. In case 1 below, the l.sw comes before the l.nop due to some optimizations from gcc; to fix it, the asm for l.mtspr contains the nops inline, i.e. l.mtspr %0, %1, 0; l.nop; l.nop. The trace below is without the fix, exposing the CPU issue - but the sequence should still work.
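The workaround can be sketched as a C macro with the nops baked into the inline asm. This is a hypothetical sketch; the actual or1k support-header definition of mtspr() may differ:

```c
/* Hypothetical sketch of an mtspr() wrapper with the workaround nops
 * folded into the inline asm. l.mtspr rA, rB, 0 writes rB to SPR[rA];
 * the real or1k support-library macro may differ. */
#define mtspr(spr, value)                                   \
  __asm__ __volatile__("l.mtspr %0, %1, 0; l.nop; l.nop"    \
                       :                                     \
                       : "r"(spr), "r"(value)                \
                       : "memory")
```

Keeping the nops inside the asm statement prevents gcc from scheduling another instruction (such as the l.sw in case 1) between the mtspr and the nops.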

Trace case1 - dcache spr mix with l.sw fails

    2ff8:       9e b9 20 00     l.addi r21,r25,8192
    2ffc:       aa 20 18 02     l.ori r17,r0,0x1802
    3000:       c0 11 d8 00     l.mtspr r17,r27,0x0  <--- mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta - 8);
    3004:       15 00 00 00     l.nop 0x0
    3008:       c0 11 c8 00     l.mtspr r17,r25,0x0   <--- mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta);
    300c:       d4 01 c8 24     l.sw 36(r1),r25          <--- This store fails to go to dcache (save ta)
    3010:       15 00 00 00     l.nop 0x0
    3014:       c0 11 b8 00     l.mtspr r17,r23,0x0
    3018:       15 00 00 00     l.nop 0x0
    301c:       c0 11 a8 00     l.mtspr r17,r21,0x0
    3020:       15 00 00 00     l.nop 0x0
    3024:       18 80 00 00     l.movhi r4,0x0


Trace case2 - immu spr fails

When we have an l.mtspr writing to the IMMU followed by another l.mtspr writing to the IMMU, the second spr write may fail.

    24e0:       aa 94 08 00     l.ori r20,r20,0x800
    24e4:       aa 52 08 00     l.ori r18,r18,0x800
    24e8:       aa 10 03 c0     l.ori r16,r16,0x3c0
    24ec:       9f 51 ff ff     l.addi r26,r17,-1
    24f0:       aa 3e 00 01     l.ori r17,r30,0x1
    24f4:       c0 14 88 00     l.mtspr r20,r17,0x0  <--- This write to IMMUMR goes through
    24f8:       c0 12 80 00     l.mtspr r18,r16,0x0  <--- This write to IMMUTR never happens
    24fc:       d4 02 00 44     l.sw 68(r2),r0
    2500:       d4 02 00 48     l.sw 72(r2),r0
    2504:       d4 02 00 1c     l.sw 28(r2),r0



stffrdhrn commented Jun 20, 2021

Find the test newlib binaries and vcd files attached.

The file *mtspr.sw* is case 1; 0x300c is the PC address where the issue happens.
The file *mtspr.mtspr* is case 2; 0x2fb4 is the PC address where the issue happens.

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 24, 2021
Issue openrisc#122

Writing to the tlb rams was dependent on the spr_bus_ack as a means
to only write during the first cycle after a spr_bus_stb_i.  This works
fine for a single write, but if we have a write to itlb_match_spr
followed by itlb_trans_spr the spr_bus_ack never goes low so the write
to itlb_trans_spr is skipped.

Fix this by using the itlb_match_spr_cs_r and itlb_trans_spr_cs_r
signals to control the "first cycle after spr_bus_stb_i" condition. This
way writes can happen independently.

Note, there may still be issues with a write to itlb_match followed by
another write to itlb_match.  Fixing that requires changing the "first
cycle after spr_bus_stb_i" policy, which is similar to dmmu.

stffrdhrn commented Jun 24, 2021

I was able to fix case 2 with patch stffrdhrn@8d7b044

The test still fails due to case 1.

Case 2 - mtspr / mtspr succeeds

We can see itlb_trans_we go high, indicating a successful write.

Case 1 - store still fails

We can see that after the invalidate, the cpu_req_i signal is not high during the WRITE phase, causing the dcache to miss the write.

@stffrdhrn

To test this I am using fusesoc with iverilog, with a fusesoc.conf as:

[library.elf-loader]
location = /home/shorne/work/openrisc/local-cores/elf-loader
sync-uri = /home/shorne/work/openrisc/local-cores/elf-loader/
sync-type = local
auto-sync = true

[library.intgen]
location = fusesoc_libraries/intgen
sync-uri = https://github.com/stffrdhrn/intgen.git
sync-type = git
auto-sync = true

[library.mor1kx-generic]
location = /home/shorne/work/openrisc/local-cores/mor1kx-generic
sync-uri = /home/shorne/work/openrisc/local-cores/mor1kx-generic
sync-type = local
auto-sync = true

[library.mor1kx]
location = /home/shorne/work/openrisc/local-cores/mor1kx
sync-uri = /home/shorne/work/openrisc/local-cores/mor1kx
sync-type = local
auto-sync = true

Then running fusesoc: fusesoc run --target mor1kx_tb mor1kx-generic --elf_load /home/shorne/work/openrisc/or1k-tests/native/build/or1k/or1k-mmu --vcd

The or1k-mmu version is the same as or1k-mmu.fail.0x2fb4.mtspr.mtspr.elf.gz uploaded earlier.

@stffrdhrn

The investigation so far:

 PC 0x2b10   store FFFF FFFE into the dcache at way location 0x7ed
 PC 0x2f1c   store 0041 6000 into the dcache at way location 0x7ed - this fails
 PC 0x2f68   load FFFF FFFE from way location 0x7ed - this is written to the IMMU and causes a failure

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 30, 2021
Issue openrisc#122

This fixes case 1 from issue 122 where writes are missed after
invalidate state.

The change to invalidate_ack allows it to be asserted after the invalidate happens, not before.

For writes we allow forwarding to the WRITE state directly from invalidate; at this point pending_write will still be high and we can use that to initiate the write. When there are delays, cpu_req_i may be low by the time we get to the WRITE state.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 30, 2021
Issue openrisc#122

An additional fix for back-to-back spr writes. I was seeing that during back-to-back writes the busy_o signal would go low after the first spr write completes. This caused the fetch to proceed and present bad data to the bus, corrupted by the second in-progress write. This is because ack goes high during the first write and stays high as long as we are writing.

Introducing a busy tracker based on what writes are in progress seems to
fix this issue.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

An additional fix for back-to-back spr writes. I was seeing that during back-to-back writes the busy_o signal would go low after the first spr write completes. This caused the fetch to proceed and present bad data to the bus, corrupted by the second in-progress write. This is because ack goes high during the first write and stays high as long as we are writing.

Introducing a busy tracker based on the assumption that spr transactions take 2 clocks to complete.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Apr 23, 2022
This again fixes: openrisc#122

Testing to see how it helps with: openrisc#146

However, this time I hope it is more stable, as formal verification now catches this bug and the formal properties now pass.

The fix is slightly different from the original: now we only change the state machine to allow going directly from INVALIDATE to WRITE, with write having higher priority over invalidate as per the comment.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue May 14, 2022
This again fixes: openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue May 16, 2022
This again fixes: openrisc#122
