Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantics of is_included_in_clobbers not ideal #4425

Closed
uweigand opened this issue Jul 9, 2022 · 2 comments
Closed

Semantics of is_included_in_clobbers not ideal #4425

uweigand opened this issue Jul 9, 2022 · 2 comments
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:discussion cranelift Issues related to the Cranelift code generator

Comments

@uweigand
Copy link
Member

uweigand commented Jul 9, 2022

In implementing SIMD support for s390x, I had to support vector registers that are defined as partially callee-saved by ABI. This is the same situation that already exists on AArch64, and I used the same method: removing Call instructions from consideration when computing function-wide clobbers via the is_included_in_clobbers callback.

This works in general, but there is one unexpected issue. On s390x, calls work by loading the return address into a "link register", which the callee will use to return back to the caller. This means of course that the call instruction clobbers that link register - so if the caller later has to return to its caller, it will need the original value of the link register at function entry, and therefore has to save/restore it.

This used to work fine since the link register is marked as "defined" by the call instruction. However, once call instructions are ignored via is_included_in_clobbers, that definition of the link register is also ignored. I would have expected that only the clobbers of the call instruction would be ignored - but the current implementation ignores the instruction completely.

Now, I can work around this by implicitly considering the link register clobbered in any non-leaf routine. But this seems a bit awkward. Any thoughts?

FYI @cfallin

uweigand added a commit to uweigand/wasmtime that referenced this issue Jul 9, 2022
This adds full support for all Cranelift SIMD instructions
to the s390x target.  Everything is matched fully via ISLE.

In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not.  This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).

The bulk of the changes are platform-specific, but there are
a few exceptions:

- Added ISLE extractors for the Immediate and Constant types,
  to enable matching the vconst and swizzle instructions.

- Added a missing accessor for call_conv to ABISig.

- Fixed endian conversion for vector types in data_value.rs
  to enable their use in runtests on the big-endian platforms.

- Enabled (nearly) all SIMD runtests on s390x.  [ Two test cases
  remain disabled due to vector shift count semantics, see below. ]

- Enabled all Wasmtime SIMD tests on s390x.

There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged.  I've opened the
following issues to track them:

- Vector shift count semantics
  bytecodealliance#4424

- is_included_in_clobbers vs. link register
  bytecodealliance#4425

- gen_constant callback
  bytecodealliance#4426

All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
uweigand added a commit to uweigand/wasmtime that referenced this issue Jul 9, 2022
This adds full support for all Cranelift SIMD instructions
to the s390x target.  Everything is matched fully via ISLE.

In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not.  This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).

The bulk of the changes are platform-specific, but there are
a few exceptions:

- Added ISLE extractors for the Immediate and Constant types,
  to enable matching the vconst and swizzle instructions.

- Added a missing accessor for call_conv to ABISig.

- Fixed endian conversion for vector types in data_value.rs
  to enable their use in runtests on the big-endian platforms.

- Enabled (nearly) all SIMD runtests on s390x.  [ Two test cases
  remain disabled due to vector shift count semantics, see below. ]

- Enabled all Wasmtime SIMD tests on s390x.

There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged.  I've opened the
following issues to track them:

- Vector shift count semantics
  bytecodealliance#4424

- is_included_in_clobbers vs. link register
  bytecodealliance#4425

- gen_constant callback
  bytecodealliance#4426

All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
uweigand added a commit to uweigand/wasmtime that referenced this issue Jul 9, 2022
This adds full support for all Cranelift SIMD instructions
to the s390x target.  Everything is matched fully via ISLE.

In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not.  This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).

The bulk of the changes are platform-specific, but there are
a few exceptions:

- Added ISLE extractors for the Immediate and Constant types,
  to enable matching the vconst and swizzle instructions.

- Added a missing accessor for call_conv to ABISig.

- Fixed endian conversion for vector types in data_value.rs
  to enable their use in runtests on the big-endian platforms.

- Enabled (nearly) all SIMD runtests on s390x.  [ Two test cases
  remain disabled due to vector shift count semantics, see below. ]

- Enabled all Wasmtime SIMD tests on s390x.

There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged.  I've opened the
following issues to track them:

- Vector shift count semantics
  bytecodealliance#4424

- is_included_in_clobbers vs. link register
  bytecodealliance#4425

- gen_constant callback
  bytecodealliance#4426

All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
@akirilov-arm
Copy link
Contributor

Now, I can work around this by implicitly considering the link register clobbered in any non-leaf routine. But this seems a bit awkward. Any thoughts?

The 64-bit Arm architecture works in pretty much the same way with respect to the link register, and that is the solution that has been chosen.

@akirilov-arm akirilov-arm added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:discussion labels Jul 11, 2022
uweigand added a commit to uweigand/wasmtime that referenced this issue Jul 12, 2022
This adds full support for all Cranelift SIMD instructions
to the s390x target.  Everything is matched fully via ISLE.

In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not.  This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).

The bulk of the changes are platform-specific, but there are
a few exceptions:

- Added ISLE extractors for the Immediate and Constant types,
  to enable matching the vconst and swizzle instructions.

- Added a missing accessor for call_conv to ABISig.

- Fixed endian conversion for vector types in data_value.rs
  to enable their use in runtests on the big-endian platforms.

- Enabled (nearly) all SIMD runtests on s390x.  [ Two test cases
  remain disabled due to vector shift count semantics, see below. ]

- Enabled all Wasmtime SIMD tests on s390x.

There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged.  I've opened the
following issues to track them:

- Vector shift count semantics
  bytecodealliance#4424

- is_included_in_clobbers vs. link register
  bytecodealliance#4425

- gen_constant callback
  bytecodealliance#4426

All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
cfallin pushed a commit that referenced this issue Jul 18, 2022
This adds full support for all Cranelift SIMD instructions
to the s390x target.  Everything is matched fully via ISLE.

In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not.  This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).

The bulk of the changes are platform-specific, but there are
a few exceptions:

- Added ISLE extractors for the Immediate and Constant types,
  to enable matching the vconst and swizzle instructions.

- Added a missing accessor for call_conv to ABISig.

- Fixed endian conversion for vector types in data_value.rs
  to enable their use in runtests on the big-endian platforms.

- Enabled (nearly) all SIMD runtests on s390x.  [ Two test cases
  remain disabled due to vector shift count semantics, see below. ]

- Enabled all Wasmtime SIMD tests on s390x.

There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged.  I've opened the
following issues to track them:

- Vector shift count semantics
  #4424

- is_included_in_clobbers vs. link register
  #4425

- gen_constant callback
  #4426

All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
@uweigand
Copy link
Member Author

OK, given we need to special-case the link register anyway to support preserve_frame_pointers (#4477), I think this is fine after all. Closing this issue now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:discussion cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

No branches or pull requests

2 participants