Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is FENCE not fully clear? #1782

Closed
AFOliveira opened this issue Dec 19, 2024 · 19 comments · Fixed by #1824
Closed

Is FENCE not fully clear? #1782

AFOliveira opened this issue Dec 19, 2024 · 19 comments · Fixed by #1824

Comments

@AFOliveira
Copy link
Contributor

This came up because riscv-opcodes specifies fence.tso as a pseudo-instruction here, but binutils and LLVM specify it as an instruction. Moreover, I found this commit stating it is not a pseudo-instruction, but it's 5 years old and the only source I could find.
Thanks in advance.

@gfavor
Copy link
Collaborator

gfavor commented Dec 20, 2024

FENCE.TSO is defined in the Unpriv spec as an instruction - which mirrors the fact that it has a distinct encoding from all other defined FENCE instructions. (Hence FENCE.TSO is not a pseudonym for some other instruction or sequence of instructions.)

@AFOliveira
Copy link
Contributor Author

Thanks for clarifying!

@trdthg
Copy link
Contributor

trdthg commented Jan 22, 2025

Hi, I would like to share what I know, https://github.com/riscv-non-isa/riscv-asm-manual has all the pseudo instructions, you can find fence in the table below (second to last)

https://github.com/riscv-non-isa/riscv-asm-manual/blob/0a043a0059b8bfd336e88660733e77dd0ca1ae97/src/asm-manual.adoc#a-listing-of-standard-risc-v-pseudoinstructions

Because I was also looking for a list of all pseudo-instructions before, and finally I found this repository from #1470

@AFOliveira
Copy link
Contributor Author

@trdthg I understand that, thanks for the pointer but I think it is a bit more tricky than just those pseudo-instructions. I believe the concept of FENCE bifurcates into 2 different things and that's where my confusion is coming from.

From what I can understand FENCE is both a category and an instruction. What I mean by category is that at some point I see FENCE being described with fm as a variable field, but to be a normal FENCE there needs to be fm=0000. If we look at the encoding description in the end of the manual fm appears again as variable.

Image

To me, it seems that FENCE.TSO is a FENCE(category), but it needs to not be confused due to a pseudo-instruction due to it's different encoding, as @gfavor explained. Wouldn't it all be more clear if when we describe FENCE encoding in the above picture the fm field is set to 0001?

It would even solve the problem we faced on this PR riscv/riscv-opcodes#330.

@ThinkOpenly
Copy link
Contributor

Wouldn't it all be more clear if when we describe FENCE encoding in the above picture the fm field is set to 0001?

[nit] That should read "0000".

I'll note here that is exactly what the Sail code does:

mapping clause encdec = FENCE(pred, succ)
  <-> 0b0000 @ pred @ succ @ 0b00000 @ 0b000 @ 0b00000 @ 0b0001111

All other encodings would be considered "ILLEGAL". (I presume the C emulator derived from Sail would fault.)

The spec, though, has a "shall" clause for the other (reserved) encodings:

Base implementations shall treat all such reserved configurations as normal fences with fm=0000, and standard software shall use only non-reserved configurations.

@allenjbaum
Copy link

allenjbaum commented Jan 22, 2025 via email

@gfavor
Copy link
Collaborator

gfavor commented Jan 22, 2025

FENCE instructions, as a category, all have (or will have, for future new fence instructions) the encoding described in the spec for Memory Ordering Instructions. The fm field, as the spec says, "defines the semantics of the FENCE." And it says that FENCE instructions with fm=0000 order all memory operations in its predecessor set before all memory operations in its successor set. Whereas the one currently defined fence with fm!=0000 is specified to have somewhat different semantics.

But these are all FENCE instructions, just different forms of such. And this is orthogonal to there being some pseudo-instructions, for programmer convenience, that equate to a few specific defined FENCE instructions.

So I guess I'm missing what is the problem or confusing aspect of all this?

@ThinkOpenly
Copy link
Contributor

One challenge is succinctly defining what comprises the FENCE instruction encoding, given the existence of the FENCE.TSO instruction which is exactly FENCE with the fm field set to 1000. So, FENCE must be every FENCE-like encoding except when fm=1000. The ISA doesn't make that perfectly clear, leaving the value of the fm field for FENCE as just "fm" (no restrictions). Conveniently, FENCE.TSO is defined in the next row of the table, but one needs to look there to observe that there is a restriction on the fm value for FENCE.

There's no obvious syntax for FENCE, but there are a number of examples like FENCE a,b where a and b are symbolic representations of the bits associated with predecessor and successor, respectively. Importantly, there is never mentioned a means to directly set the fm field. This implies that that field can only be set by changing the mnemonic, as is done with FENCE.TSO.

If the mnemonic is the only means to change the "fm" field, then the only valid value for the current FENCE mnemonic is 0000, and the documentation should reflect that. It could say that other values of the "fm" field are reserved for future "fence" instructions, but those future "fence" instructions will have new mnemonics, no?

@gfavor
Copy link
Collaborator

gfavor commented Jan 22, 2025

The category of "fence" instructions is defined by the "Memory Ordering Instructions" section of the Unpriv spec - irrespective of instruction mnemonics. That's one succinct definition of this class of instructions. Another succinct definition is all the instructions using this encoding - with any value for the fm field. In other words, the fm field distinguishes sub-classes of "fence" instructions. Where fm=0000 is one sub-class with specified general semantics, and fm=1000 is another sub-class with somewhat different semantics. And other fm values are reserved for future new forms of "fence" instructions.

One can then observe that, for all the currently defined fence-class instructions, the 'FENCE' mnemonic is used for all the fm=0000 instructions, and 'FENCE.TSO' is used for the fm=1000 instruction.

As far as instruction mnemonic implying fm field value, yes, that is currently true. And this may or may not continue to be true, although one can imagine a preference to maintain that property unless good reason emerges to do otherwise. Further, trying to predict the future as to what is going to be new forms of "fence" instructions, what flavors of them will be needed, and what their mnemonic representations should be, is unnecessary and some woudl consider unwise. Which is why the current spec stays away from trying to do so.

Lastly, irrespective of these specifics, there will always be a clear mapping from instruction mnemonics and syntax, to instruction encodings (e.g. irrespective of whether 'FENCE' and only 'FENCE' continues to imply fm=0000).

@ThinkOpenly
Copy link
Contributor

As far as instruction mnemonic implying fm field value, yes, that is currently true. And this may or may not continue to be true, although one can imagine a preference to maintain that property unless good reason emerges to do otherwise.

Does this mean the syntax for the instruction defined with the FENCE mnemonic could change? That would be the only way I see for different "fm" values for that exact mnemonic. And, that would not work for current assemblers, at least. That seems non-optimal.

I still kinda wonder what the table entries that Afonso shows above are describing:

Image

None of the other entries appear to describe a "class" of instructions where to use the entire range of field values would require new syntax or new mnemonics.

I'd argue we're stuck with the current definition of the instruction with the FENCE mnemonic, where fm=0000 always. This is how the Sail code defines it. There is only one defined instruction using the FENCE mnemonic. Other encodings with the "fm" field set to other than 0000 or 1000 are reserved, but treated as FENCE with fm=0000.

Part of the confusion may stem from the ISA spec not really talking about assembly syntax, but dabbling in it. Table 4 in section 2.8 (below) calls "a FENCE" with fm=0000 "Normal fence" and specifies "Mnemonic" as "none". With fm=1000 the "Mnemonic" is "TSO". Other values for "fm" leave the "Mnemonic" as "other" and "Reserved for future use."

Image

There is no succinct definition of the syntax in the ISA.

Is it fair to conclude:

  • The instruction defined with the FENCE mnemonic has fm=0000.
  • The instruction defined with the FENCE.TSO mnemonic has fm=1000 (and "RW,RW")
  • All other cases might be called "FENCE" instructions, but are reserved and should currently be treated as the instruction with the FENCE mnemonic with fm=0000.

@gfavor
Copy link
Collaborator

gfavor commented Jan 23, 2025

Does this mean the syntax for the instruction defined with the FENCE mnemonic could change?

For the currently defined instructions, certainly not. That would be changing a ratified arch extension.

As far as future fence instruction possibilities, one can imagine absolutes on what can and can't happen. But there are no such official/ratified absolutes constraining what may be done in the future for new arch extensions defining new fence-class instructions.

But practically speaking it would seem very unlikely to have a new instruction that simply uses the 'FENCE' mnemonic. (Although it's not hard to imagine how one could define a new fence instruction that uses this mnemonic but with different arguments - which serve to distinguish this new fence instruction from the existing instructions.)

As far as the table entry for 'FENCE.TSO', that mnemonic, readily distinguishable from the 'FENCE' mnemonic, represents a different fence instruction with a different encoding from the 'FENCE' instructions - which happens to most notably be distinguished by fm=1000.

In table 4, the 'None' for the fm field refers not to the root instruction mnemonic, but to any suffix mnemonic to the root mnemonic. So the main fence instructions have no suffix, while 'FENCE.TSO' has a '.TSO' suffix. And, naturally, nothing is specified as far as suffix for other yet-to-be-defined fence instructions with fm!=x000.

Regarding the last three bullets in the last post, the firts two are readily correct. The third bullet is not really correct. Instructions with fm!=x000 will certainly be fence-class instructions, but we can only speculate as to what their mnemonics and mnemonic suffixes may or may not be (e.g. that the root mnemonic will probably be 'FENCE' and the suffix mnemonic will probably be something other than null or '.TSO').

@ThinkOpenly
Copy link
Contributor

Does this mean the syntax for the instruction defined with the FENCE mnemonic could change?

For the currently defined instructions, certainly not.

Great!

(Although it's not hard to imagine how one could define a new fence instruction that uses this mnemonic but with different arguments - which serve to distinguish this new fence instruction from the existing instructions.)

...but would break current assemblers. I guess a new mnemonic would also not work with current assemblers. :-) If only the original definition of FENCE included an (optional) "fm" operand. :-)

In table 4, the 'None' for the fm field refers not to the root instruction mnemonic, but to any suffix mnemonic to the root mnemonic. So the main fence instructions have no suffix, while 'FENCE.TSO' has a '.TSO' suffix.

Agreed. That's apparent with some brief thought, but it could be more clear. The term "suffix" is used elsewhere in the ISA, but not here. Indeed, in the ISA text, "suffix" always refers to the full suffix, including the ".". This table should probably show the "TSO" suffix as ".TSO", and the header for that column as "mnemonic suffix". (And perhaps move "other" under just "fm field" column, and extend "reserved for future use" into the "mnemonic suffix" column.)

Regarding the last three bullets in the last post, the firts two are readily correct. The third bullet is not really correct. Instructions with fm!=x000 will certainly be fence-class instructions, but we can only speculate as to what their mnemonics and mnemonic suffixes may or may not be (e.g. that the root mnemonic will probably be 'FENCE' and the suffix mnemonic will probably be something other than null or '.TSO').

I think we are in basic agreement here. I made no inference as to mnemonics for the reserved set of fence-class instructions. Maybe I should have said "fence-class instructions" instead of "FENCE instructions".

This ISA is quite loose in distinguishing these two concepts, using "the FENCE instruction", "a FENCE instruction", "FENCE instructions", and it uses "Fence" and "fence" (e.g. "the fence mode field fm") quite a bit. The ISA is also inconsistent with respect to font attributes, sometimes all-caps, sometimes fixed-width-bold-caps, sometimes lower case. The best specific reference to a the instruction with the FENCE mnemonic is "a FENCE instruction with fm=0000".

@AFOliveira, I think the resolution here is

  • FENCE mnemonic currently only defines a "FENCE" instruction with fm=0000.
  • FENCE.TSO mnemonic currently only defines a "FENCE" instruction with fm=1000 and RW, RW operands.
  • All other cases are currently reserved, but shall be treated as FENCE (fm=0000).

@gfavor
Copy link
Collaborator

gfavor commented Jan 24, 2025

Regarding the third bullet about Reserved cases, does that mean that the behavior is the same as for fm=0000, but using all the existing encoded P* and S* bits? Or are all Reserved cases treated the same as FENCE RW,RW irrespective of those eight instruction bits? Or the same as FENCE IORW,IORW? Or the same as FENCE with all the P8 and S* bits treated as if all zeroes - effectively making the instruction behave like a nop? Off-hand that would seem like the most appropriat emapping for current Reserved cases.

Also, please close this issue once you all are satisfied.

@ThinkOpenly
Copy link
Contributor

ThinkOpenly commented Jan 24, 2025

Regarding how reserved cases of the FENCE instruction work, this is all the ISA spec says:

Base implementations shall treat all such reserved configurations as normal fences with fm=0000, and standard software shall use only non-reserved configurations.

I read this as respecting the operands, so your first option "same as for fm=0000, but using all the existing encoded P* and S* bits".

I seem to have overrun @AFOliveira's issue, but to summarize possible documentation changes suggested in this issue:

  • Make all uses of "fence" in any current representation consistent with the desired meaning/context.
  • [Possibly redundant with above] Discriminate clearly between "FENCE" instructions (as a class, if you will: all possible values of field "fm") and the instruction which uses the FENCE mnemonic (no suffixes, fm=0000).
  • Consider changing the table in chapter 34, "RV32I Base Instruction Set" to show the only non-reserved value (0000) of the "fm" field for the FENCE instruction.
  • Adopt the suggested changes to the "Fence mode encoding" table, restated here:
    • Show the "TSO" suffix as ".TSO".
    • Change the header for the mnemonic suffix column to be "mnemonic suffix".
    • Move "other" under just "fm field" column.
    • Extend "reserved for future use" into the "mnemonic suffix" column.
  • Possibly clarify the behavior for reserved opcodes.
  • Add assembly syntax to the ISA :-)

@AFOliveira, does that cover everything (and maybe more)?

@gfavor
Copy link
Collaborator

gfavor commented Jan 24, 2025

I would agree with that interpretation of the spec as to how reserved configurations should be treated.

@aswaterman
Copy link
Member

aswaterman commented Jan 24, 2025

I read this as respecting the operands, so your first option "same as for fm=0000, but using all the existing encoded P* and S* bits".

Right. Translating that paragraph into pseudocode:

/* Ignore rd/rs1 altogether. */

if (fm == TSO && P == RW && S == RW) {
  FENCE.TSO
} else {
  FENCE P, S
}

@AFOliveira
Copy link
Contributor Author

AFOliveira commented Jan 24, 2025

Thanks for all the clarification. @ThinkOpenly, I think your suggestions are indeed a way forward so that the spec is more clear. Although I'm not sure how the first two points are super clear on how to act on them. Should we state FENCE category and FENCE instructions every time it is not clear to which the text refers?

Make all uses of "fence" in any current representation consistent with the desired meaning/context.
[Possibly redundant with above] Discriminate clearly between "FENCE" instructions (as a class, if you will: all possible values of field "fm") and the instruction which uses the FENCE mnemonic (no suffixes, fm=0000).

With the last four I agree and I think I could actually just do and PR the following:

Consider changing the table in chapter 34, "RV32I Base Instruction Set" to show the only non-reserved value (0000) of the "fm" field for the FENCE instruction.
Adopt the suggested changes to the "Fence mode encoding" table, restated here:
Show the "TSO" suffix as ".TSO".
Change the header for the mnemonic suffix column to be "mnemonic suffix".
Move "other" under just "fm field" column.
Extend "reserved for future use" into the "mnemonic suffix" column.

@aswaterman @gfavor I believe these are reasonable changes that would make the spec clearer, can I PR them?

@aswaterman
Copy link
Member

Sounds reasonable on its face, but I might have some comments after I see the result.

@AFOliveira
Copy link
Contributor Author

AFOliveira commented Jan 24, 2025

Sounds reasonable on its face, but I might have some comments after I see the result.

Sure, I'll do it and link the PR to this issue. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants