sequential mtspr calls fail #122

Closed
stffrdhrn opened this issue Jun 20, 2021 · 4 comments
stffrdhrn commented Jun 20, 2021

I have converted or1k-tests to use inline calls to mtspr() and mfspr(). This causes failures in the following cases in or1k-mmu.

See: openrisc/or1k-tests@32f0ba2

static int itlb_match_test (int way, int set)
{
...

  // Flush these areas in case the cache doesn't write them through immediately
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta - 8);
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta);
  // (CASE 1 - the assembly stores ta to the stack at this point)

  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta + PAGE_SIZE - 8);
  mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta + PAGE_SIZE);

  /* Shifting one in ITLBMR */
  add = (PAGE_SIZE*itlb_sets);
  // Space we'll access and expect the MMU to translate our requests
  ea = add + (set*PAGE_SIZE);

  /* Enable IMMU */
  immu_enable ();

  while (add != 0x00000000) {
    // (CASE 2)
    mtspr (OR1K_SPR_IMMU_ITLBW_MR_ADDR(way, set), ea | OR1K_SPR_IMMU_ITLBW_MR_V_MASK);
    mtspr (OR1K_SPR_IMMU_ITLBW_TR_ADDR(way, set), ta | ITLB_PR_NOLIMIT);

Case 1 - this happens with 1 nop after mtspr.

The first case is an mtspr followed by an l.sw: the mtspr sends an invalidate to the dcache at the same time as the store's write operation, and the write fails.

Case 2 - this happens with 0 nops after mtspr.

An mtspr followed immediately by another mtspr causes the second write, to OR1K_SPR_IMMU_ITLBW_TR_ADDR, to be lost.

Workaround

Adding 2 l.nops after each mtspr fixes the issue. In case 1 below, the l.sw comes before the l.nop due to some optimizations from gcc; to fix it, the asm for l.mtspr contains the nops inline, i.e. l.mtspr %0, %1, 0; l.nop; l.nop. The trace below is without the fix, exposing the CPU issue - but the sequence should still work.
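The workaround can be sketched as a C macro with the nops baked into the inline asm. This is a hypothetical sketch; the actual or1k support-header definition of mtspr() may differ:

```c
/* Hypothetical sketch of an mtspr() wrapper with the workaround nops
 * folded into the inline asm. l.mtspr rA, rB, 0 writes rB to SPR[rA];
 * the real or1k support-library macro may differ. */
#define mtspr(spr, value)                                   \
  __asm__ __volatile__("l.mtspr %0, %1, 0; l.nop; l.nop"    \
                       :                                     \
                       : "r"(spr), "r"(value)                \
                       : "memory")
```

Keeping the nops inside the asm statement prevents gcc from scheduling another instruction (such as the l.sw in case 1) between the mtspr and the nops.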

Trace case1 - dcache spr mix with l.sw fails

    2ff8:       9e b9 20 00     l.addi r21,r25,8192
    2ffc:       aa 20 18 02     l.ori r17,r0,0x1802
    3000:       c0 11 d8 00     l.mtspr r17,r27,0x0  <--- mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta - 8);
    3004:       15 00 00 00     l.nop 0x0
    3008:       c0 11 c8 00     l.mtspr r17,r25,0x0   <--- mtspr(OR1K_SPR_DCACHE_DCBFR_ADDR, ta);
    300c:       d4 01 c8 24     l.sw 36(r1),r25          <--- This store fails to go to dcache (save ta)
    3010:       15 00 00 00     l.nop 0x0
    3014:       c0 11 b8 00     l.mtspr r17,r23,0x0
    3018:       15 00 00 00     l.nop 0x0
    301c:       c0 11 a8 00     l.mtspr r17,r21,0x0
    3020:       15 00 00 00     l.nop 0x0
    3024:       18 80 00 00     l.movhi r4,0x0


Trace case2 - immu spr fails

When we have an l.mtspr writing to the IMMU followed by another l.mtspr writing to the IMMU, the second spr write may fail.

    24e0:       aa 94 08 00     l.ori r20,r20,0x800
    24e4:       aa 52 08 00     l.ori r18,r18,0x800
    24e8:       aa 10 03 c0     l.ori r16,r16,0x3c0
    24ec:       9f 51 ff ff     l.addi r26,r17,-1
    24f0:       aa 3e 00 01     l.ori r17,r30,0x1
    24f4:       c0 14 88 00     l.mtspr r20,r17,0x0  <--- This write to IMMUMR goes through
    24f8:       c0 12 80 00     l.mtspr r18,r16,0x0  <--- This write to IMMUTR never happens
    24fc:       d4 02 00 44     l.sw 68(r2),r0
    2500:       d4 02 00 48     l.sw 72(r2),r0
    2504:       d4 02 00 1c     l.sw 28(r2),r0



stffrdhrn commented Jun 20, 2021

Find the test newlib binaries and vcd files attached.

The file *mtspr.sw* is case 1; 0x300c is the PC address where the issue happens.
The file *mtspr.mtspr* is case 2; 0x2fb4 is the PC address where the issue happens.

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 24, 2021
Issue openrisc#122

Writing to the tlb rams was dependent on the spr_bus_ack as a means
to only write during the first cycle after a spr_bus_stb_i.  This works
fine for a single write, but if we have a write to itlb_match_spr
followed by itlb_trans_spr the spr_bus_ack never goes low so the write
to itlb_trans_spr is skipped.

Fix this by using the itlb_match_spr_cs_r and itlb_trans_spr_cs_r
signals to control the "first cycle after spr_bus_stb_i" condition. This
way writes can happen independently.

Note, there may still be issues with a write to itlb_match followed by
another write to itlb_match.  Fixing that requires changing the "first
cycle after spr_bus_stb_i" policy, which is similar to dmmu.

stffrdhrn commented Jun 24, 2021

I was able to fix case 2 with patch stffrdhrn@8d7b044

The test still fails due to case 1.

Case 2 - mtspr / mtspr succeeds

We can see itlb_trans_we go high, indicating a successful write.

Case 1 - store still fails

We can see that after the invalidate, the cpu_req_i signal is not high during the WRITE phase, causing the dcache to miss the write.

@stffrdhrn

To test this I am using fusesoc with iverilog, with a fusesoc.conf as:

[library.elf-loader]
location = /home/shorne/work/openrisc/local-cores/elf-loader
sync-uri = /home/shorne/work/openrisc/local-cores/elf-loader/
sync-type = local
auto-sync = true

[library.intgen]
location = fusesoc_libraries/intgen
sync-uri = https://github.com/stffrdhrn/intgen.git
sync-type = git
auto-sync = true

[library.mor1kx-generic]
location = /home/shorne/work/openrisc/local-cores/mor1kx-generic
sync-uri = /home/shorne/work/openrisc/local-cores/mor1kx-generic
sync-type = local
auto-sync = true

[library.mor1kx]
location = /home/shorne/work/openrisc/local-cores/mor1kx
sync-uri = /home/shorne/work/openrisc/local-cores/mor1kx
sync-type = local
auto-sync = true

Then running fusesoc: fusesoc run --target mor1kx_tb mor1kx-generic --elf_load /home/shorne/work/openrisc/or1k-tests/native/build/or1k/or1k-mmu --vcd

The or1k-mmu version is the same as or1k-mmu.fail.0x2fb4.mtspr.mtspr.elf.gz uploaded earlier.

@stffrdhrn

The investigation so far:

 PC 0x2b10   store FFFF FFFE into the dcache at way location 0x7ed
 PC 0x2f1c   store 0041 6000 into the dcache at way location 0x7ed - this fails
 PC 0x2f68   load FFFF FFFE from way location 0x7ed - this is written to the IMMU and causes a failure

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 30, 2021
Issue openrisc#122

This fixes case 1 from issue 122 where writes are missed after
invalidate state.

The change to invalidate_ack allows it to be asserted after the invalidate happens, not before.

For writes we allow forwarding to the WRITE state directly from invalidate; at this point pending_write will still be high and we can use that to initiate the write. When there are delays, cpu_req_i may be low by the time we get to the WRITE state.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jun 30, 2021
Issue openrisc#122

An additional fix for back-to-back spr writes. I was seeing that during back-to-back writes the busy_o signal would go low after the first spr write completes. This caused the fetch to proceed and present bad data to the bus, corrupted by the second in-progress write. This is because ack goes high during the first write and stays high as long as we are writing.

Introducing a busy tracker based on what writes are in progress seems to
fix this issue.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

An additional fix for back-to-back spr writes. I was seeing that during back-to-back writes the busy_o signal would go low after the first spr write completes. This caused the fetch to proceed and present bad data to the bus, corrupted by the second in-progress write. This is because ack goes high during the first write and stays high as long as we are writing.

Introducing a busy tracker based on the assumption that spr transactions take 2 clocks to complete.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Jul 1, 2021
Issue openrisc#122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

skristiansson pushed a commit that referenced this issue Jul 1, 2021
Issue #122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue Apr 23, 2022
This again fixes: openrisc#122

Testing to see how it helps with: openrisc#146

However, this time I hope it is more stable, as formal verification now catches this bug and the formal properties now pass.

The fix is slightly different from the original: now we only change the state machine to allow going directly from INVALIDATE to WRITE, with write having higher priority over invalidate as per the comment.
stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue May 14, 2022
This again fixes: openrisc#122

stffrdhrn added a commit to stffrdhrn/mor1kx that referenced this issue May 16, 2022
This again fixes: openrisc#122
