Start letting through some more LLVM intrinsics disguised as calls #1565

Closed · wants to merge 3 commits
125 changes: 122 additions & 3 deletions ykrt/src/compile/jitc_yk/codegen/x64/mod.rs
@@ -1499,9 +1499,25 @@ impl<'a> Assemble<'a> {
.collect::<Vec<_>>();

// unwrap safe on account of linker symbol names not containing internal NULL bytes.
let va = symbol_to_ptr(self.m.func_decl(func_decl_idx).name())
.map_err(|e| CompilationError::General(e.to_string()))?;
self.emit_call(iidx, fty, Some(va), None, &args)
match self.m.func_decl(func_decl_idx).name() {
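// Editorial note: llvm.assume and the llvm.lifetime.* markers are purely
// optimisation hints with no runtime semantics, so the JIT can treat them
// as no-ops.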
"llvm.assume" => Ok(()),
"llvm.lifetime.start.p0" => Ok(()),
"llvm.lifetime.end.p0" => Ok(()),
x if x.starts_with("llvm.ctpop") => {
let [op] = args.try_into().unwrap();
self.cg_ctpop(iidx, op);
Ok(())
}
x if x.starts_with("llvm.smax") => {
[Review thread on the llvm.smax arm]

Contributor (reviewer): Do we know for sure that llvm.smax is never inlined at the MIR level?

Contributor (author): I see no evidence of that, but really it's beyond my purview.

Contributor (reviewer): The problem is, if an intrinsic is inlined at the codegen level, then we trace the inlined version and then call it a second time.

That's probably OK (if a little iffy) for an operation like smax, but you certainly don't want to do it for operations that are expensive (in terms of performance, since you would incur the cost twice) or that have side effects (which would break program semantics).

I'd like to hear what @ptersilie thinks, but my gut feeling is not to merge this until we have a better way of knowing whether an intrinsic has been inlined or not.

Contributor (author): Not merging this means we are not compiling a surprising number of traces: I have an extension of this PR that implements another of these intrinsics (ctpop), and it triggers a bug (not, I'm about 95% sure, caused by ctpop itself) that has been hidden by all the traces we fail to compile.
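Editorial sketch of the double-execution concern raised above, using hypothetical Rust stand-ins for the traced program (not part of this PR or of yk's internals): if codegen inlines an intrinsic, the trace records the inlined body, and compiling the remaining call would then perform the same work a second time.

    // Hypothetical illustration only: models the traced program, not yk itself.
    fn smax(a: i64, b: i64) -> i64 {
        if a > b { a } else { b }
    }

    fn traced_iteration(a: i64, b: i64) -> i64 {
        // 1. The intrinsic's inlined body, as captured in the trace.
        let inlined = if a > b { a } else { b };
        // 2. The call the JIT would additionally emit for the recorded call.
        let called = smax(a, b);
        // Redundant but harmless for smax; costly for expensive intrinsics
        // and wrong for side-effecting ones.
        assert_eq!(inlined, called);
        called
    }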

let [lhs_op, rhs_op] = args.try_into().unwrap();
self.cg_smax(iidx, lhs_op, rhs_op);
Ok(())
}
x => {
let va = symbol_to_ptr(x).map_err(|e| CompilationError::General(e.to_string()))?;
self.emit_call(iidx, fty, Some(va), None, &args)
}
}
}

/// Codegen an indirect call.
@@ -1681,6 +1697,41 @@
}
}

fn cg_ctpop(&mut self, iidx: InstIdx, op: Operand) {
let bitw = op.bitw(self.m);
let [in_reg, out_reg] = self.ra.assign_gp_regs(
&mut self.asm,
iidx,
[
RegConstraint::Input(op.clone()),
RegConstraint::OutputCanBeSameAsInput(op),
],
);
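// `popcnt` writes the number of set bits in the source register to the
// destination register; only the 32-bit case is handled for now.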
match bitw {
32 => dynasm!(self.asm; popcnt Rd(out_reg.code()), Rd(in_reg.code())),
x => todo!("{x}"),
}
}

fn cg_smax(&mut self, iidx: InstIdx, lhs: Operand, rhs: Operand) {
assert_eq!(lhs.bitw(self.m), rhs.bitw(self.m));
let bitw = lhs.bitw(self.m);
let [lhs_reg, rhs_reg] = self.ra.assign_gp_regs(
&mut self.asm,
iidx,
[RegConstraint::InputOutput(lhs), RegConstraint::Input(rhs)],
);
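// Signed maximum via `cmp` + `cmovl`: if lhs < rhs (signed), move rhs into
// the lhs register, which doubles as the output register.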
match bitw {
64 => {
dynasm!(self.asm
; cmp Rq(lhs_reg.code()), Rq(rhs_reg.code())
; cmovl Rq(lhs_reg.code()), Rq(rhs_reg.code())
);
}
x => todo!("{x}"),
}
}

/// Return the [VarLocation] an [Operand] relates to.
fn op_to_var_location(&self, op: Operand) -> VarLocation {
match op {
@@ -3865,6 +3916,74 @@ mod tests {
);
}

#[test]
fn cg_call_hints() {
codegen_and_test(
"
func_decl llvm.assume (i1)
func_decl llvm.lifetime.start.p0 (i64, ptr)
func_decl llvm.lifetime.end.p0 (i64, ptr)
entry:
%0: i1 = param 0
%1: ptr = param 1
call @llvm.assume(%0)
call @llvm.lifetime.start.p0(16i64, %1)
call @llvm.lifetime.end.p0(16i64, %1)
%5: ptr = ptr_add %1, 1
black_box %5
",
"
...
; call @llvm.assume(%0)
; call @llvm.lifetime.start.p0(16i64, %1)
; call @llvm.lifetime.end.p0(16i64, %1)
; %5: ...
...
",
false,
);
}

#[test]
fn cg_call_ctpop() {
codegen_and_test(
"
func_decl llvm.ctpop.i32 (i32) -> i32
entry:
%0: i32 = param 0
%1: i32 = call @llvm.ctpop.i32(%0)
black_box %1
",
"
...
; %1: i32 = call @llvm.ctpop.i32(%0)
popcnt r.32._, r.32._
",
false,
);
}

#[test]
fn cg_call_smax() {
codegen_and_test(
"
func_decl llvm.smax.i64 (i64, i64) -> i64
entry:
%0: i64 = param 0
%1: i64 = param 1
%2: i64 = call @llvm.smax.i64(%0, %1)
black_box %2
",
"
...
; %2: i64 = call @llvm.smax.i64(%0, %1)
cmp r.64.a, r.64.b
cmovl r.64.a, r.64.b
",
false,
);
}

#[test]
fn cg_eq() {
codegen_and_test(
4 changes: 2 additions & 2 deletions ykrt/src/compile/jitc_yk/jit_ir/jit_ir.l
@@ -1,5 +1,5 @@
%%
@[a-zA-Z_.][a-zA-Z_0-9]* "GLOBAL"
@[a-zA-Z_.][a-zA-Z_0-9.]* "GLOBAL"
%[0-9]+ "LOCAL_OPERAND"
i[0-9]+ "INT_TYPE"
float "FLOAT_TYPE"
@@ -80,7 +80,7 @@ f_true "F_TRUE"
urem "UREM"
xor "XOR"
[a-zA_Z_]+: "LABEL"
[a-zA_Z_][a-zA_Z_0-9]* "ID"
[a-zA_Z_][a-zA_Z_0-9.]* "ID"
volatile "VOLATILE"
\< "<"
\> ">"
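Editorial note: the `.` added to these two character classes is what lets dotted intrinsic names lex as a single GLOBAL/ID token, so JIT IR such as the following lines from the new tests above now parses:

    func_decl llvm.ctpop.i32 (i32) -> i32
    %1: i32 = call @llvm.ctpop.i32(%0)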