Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper usage of #[inline(always)] and #[target_feature] results in LLVM instruction selection error #404

Closed
shepmaster opened this issue Mar 27, 2018 · 8 comments · Fixed by rust-lang/rust#49425

Comments

@shepmaster
Copy link
Member

#![feature(stdsimd)]
#![feature(target_feature)]

use std::arch::x86_64::*;

#[inline(always)]
#[target_feature(enable = "sse4.2")]
unsafe fn alpha(needle: &[u8], haystack: &[u8]) -> i32 {
    let a = _mm_loadu_si128(needle.as_ptr() as *const _);
    let b = _mm_loadu_si128(haystack.as_ptr() as *const _);

    _mm_cmpestri(a, 3, b, 15, _SIDD_CMP_EQUAL_ORDERED)
}

fn main() {
    let haystack = b"Split \r\n\t line  ";
    let needle = b"\r\n\t ignore this ";

    let idx = unsafe { alpha(needle, haystack) };
    assert_eq!(idx, 6);
}
LLVM ERROR: Cannot select: 0x7fa0f8e25ea0: i32,i32 = X86ISD::PCMPESTRI 0x7fa0f8e25410, Constant:i32<3>, 0x7fa0f8e259c0, Constant:i32<15>, Constant:i8<12>
  0x7fa0f8e25410: v16i8 = bitcast 0x7fa0f8e257b8
    0x7fa0f8e257b8: v2i64,ch = load<LD16[ConstantPool]> 0x7fa0f8e7b040, 0x7fa0f8e25208, undef:i64
      0x7fa0f8e25208: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<<16 x i8> <i8 13, i8 10, i8 9, i8 32, i8 105, i8 103, i8 110, i8 111, i8 114, i8 101, i8 32, i8 116, i8 104, i8 105, i8 115, i8 32>> 0
        0x7fa0f98fa3a8: i64 = TargetConstantPool<<16 x i8> <i8 13, i8 10, i8 9, i8 32, i8 105, i8 103, i8 110, i8 111, i8 114, i8 101, i8 32, i8 116, i8 104, i8 105, i8 115, i8 32>> 0
      0x7fa0f8e25d00: i64 = undef
  0x7fa0f8e25138: i32 = Constant<3>
  0x7fa0f8e259c0: v16i8 = bitcast 0x7fa0f8e25618
    0x7fa0f8e25618: v2i64,ch = load<LD16[ConstantPool]> 0x7fa0f8e7b040, 0x7fa0f8e25c30, undef:i64
      0x7fa0f8e25c30: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<<16 x i8> <i8 83, i8 112, i8 108, i8 105, i8 116, i8 32, i8 13, i8 10, i8 9, i8 32, i8 108, i8 105, i8 110, i8 101, i8 32, i8 32>> 0
        0x7fa0f98fa138: i64 = TargetConstantPool<<16 x i8> <i8 83, i8 112, i8 108, i8 105, i8 116, i8 32, i8 13, i8 10, i8 9, i8 32, i8 108, i8 105, i8 110, i8 101, i8 32, i8 32>> 0
      0x7fa0f8e25d00: i64 = undef
  0x7fa0f8e25d68: i32 = Constant<15>
  0x7fa0f8e25750: i8 = Constant<12>
In function: _ZN10playground4main17ha4aeba89c162948dE

1.26.0-nightly (2018-03-25 482a913fb337855072a53c0d602cd19947f45285)

@alexcrichton
Copy link
Member

Thanks! I'll poke around at this soon

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 27, 2018

The RFC2045 target_feature states that inlining a function into another function that matches or extends its feature set is ok, but "inlining is forbidden in the opposite case". The Example 1 shows:

#[target_feature(enable = "avx")] unsafe fn foo();
#[target_feature(enable = "avx")] #[inline] unsafe fn baz(); // OK
#[target_feature(enable = "avx")] #[inline(always)] unsafe fn bar(); // OK  

#[target_feature(enable = "sse3")]
unsafe fn moo() {
  // This function supports SSE3 but not AVX
  if cfg_feature_enabled!("avx") {
      foo(); // OK: foo is not inlined into moo
      baz(); // OK: baz is not inlined into moo
      bar(); 
      // ^ ERROR: bar cannot be inlined across mismatching features
      // did you meant to make bar #[inline] instead of #[inline(always)]?
      // Note: the logic to detect this is the same as for the call 
      // to baz, but in this case rustc must emit an error because an
      // #[inline(always)] function cannot be inlined in this call site. 
  }
}

We should probably suggest doing both things:

  • making bar #[inline] instead of #[inline(always)] if bar is the part of the current crate
  • making moo #[target_feature = "avx"] (independently of whether bar is part of the current crate or not)

@alexcrichton
Copy link
Member

I've applied a hammer for this rust-lang/rust#49425 which is different than RFC 2045 (but easier to implement)

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 27, 2018

@alexcrichton I've skimmed through the RFC comments, and it looks like the RFC wasn't properly updated. In the comments you argued for the fix that you proposed in this PR, and it was agreed upon that it is the right step to take. Allowing #[inline(always)] + #[target_feature] can be done in a backwards compatible way, so it can be resolved later.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 27, 2018
Once a target feature is enabled for a function that means that it in general
can't be inlined into other functions which don't have that target feature
enabled. This can cause both safety and LLVM issues if we were to actually
inline it, so `#[inline(always)]` both can't be respected and would be an error
if we did so!

Today LLVM doesn't inline functions with different `#[target_feature]`
annotations, but it turns out that if one is tagged with `#[inline(always)]`
it'll override this and cause scary LLVM error to arise!

This commit fixes this issue by forbidding these two attributes to be used in
conjunction with one another.

cc rust-lang/stdarch#404
@Amanieu
Copy link
Member

Amanieu commented Apr 2, 2018

coresimd doesn't build on the latest nightly because of this change.

@gnzlbg
Copy link
Contributor

gnzlbg commented Apr 3, 2018

@Amanieu which coresimd ?

@sunchao
Copy link

sunchao commented Apr 4, 2018

I'm also getting a bunch of errors when trying to compile coresimd with nightly rustc, possibly due to the change rust-lang/rust#49425

......
error: cannot use #[inline(always)] with #[target_feature]
  --> /Users/chao/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/coresimd-0.0.4/src/x86/x86_64/xsave.rs:45:1
   |
45 | #[inline(always)]
   | ^^^^^^^^^^^^^^^^^

error: cannot use #[inline(always)] with #[target_feature]
  --> /Users/chao/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/coresimd-0.0.4/src/x86/x86_64/xsave.rs:59:1
   |
59 | #[inline(always)]
   | ^^^^^^^^^^^^^^^^^

error: cannot use #[inline(always)] with #[target_feature]
  --> /Users/chao/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/coresimd-0.0.4/src/x86/x86_64/xsave.rs:72:1
   |
72 | #[inline(always)]
   | ^^^^^^^^^^^^^^^^^

error: cannot use #[inline(always)] with #[target_feature]
  --> /Users/chao/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/coresimd-0.0.4/src/x86/x86_64/xsave.rs:86:1
   |
86 | #[inline(always)]
   | ^^^^^^^^^^^^^^^^^

error: cannot use #[inline(always)] with #[target_feature]
   --> /Users/chao/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/coresimd-0.0.4/src/x86/x86_64/xsave.rs:102:1

@gnzlbg
Copy link
Contributor

gnzlbg commented Apr 4, 2018

The answer is probably to migrate to @AdamNiederer 's coresimd fork here: https://github.com/AdamNiederer/stdsimd/tree/fix-bft-build for the time being till we do a new 0.0.5 release, or to migrate to core::arch / core::simd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants