Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

krabcake: Successfully send intrinsics::assume data to Valgrind #1

Open
wants to merge 1 commit into
base: master-before-krabcakeeeee-squash
Choose a base branch
from

Conversation

bryangarza
Copy link
Owner

@bryangarza bryangarza commented Mar 23, 2023

This patch adds support to send the bool in std::intrinsics::assume to Valgrind by inserting a Valgrind Client Request using a MirPass.

Depends on pnkfelix/krabcake-vg#2

How to test: use the -Z instrument_krabcake flag. You can compile something simple like:

#![feature(core_intrinsics, intrinsics)]
fn main() {
    let mut x = 45;
    let y = &mut x;
    println!("{y}");
    unsafe {
        std::intrinsics::assume(x == 45);
    }
}

Running krabcake:

~/projects/default/krabcake/bin/valgrind --tool=krabcake ./my_simple_program
==75859== Krabcake, the Rust UB detection tool
==75859== Copyright (C) 2023, and GNU GPL'd by Felix S. Klock.
==75859== Using Valgrind-3.21.0.GIT and LibVEX; rerun with -h for copyright info
==75859== Command: ./my_simple_program
==75859==
Hello world (from `rs_hello/src/lib.rs`)! printed: 42
45
--75859-- kc_handle_client_request, dispatching client request code 4b430008
--75859-- kc_handle_client_request, handle INTRINSICS_ASSUME 1 0 0 0 0
==75859==
==75859== For lists of detected and suppressed errors, rerun with: -s
==75859== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

This patch adds support to send the `bool` in `std::intrinsics::assume`
to Valgrind by inserting a Valgrind Client Request using a MirPass.
@bryangarza bryangarza force-pushed the krabcakeeeee-squash branch from 6cbbac7 to 917f56a Compare March 23, 2023 20:11
None,
None,
)),
vec![op(true)],
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Get the bool from the existing MIR intrinsics::assume itself

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namely, extract it from the _operand parameter you are passing above, right?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

#[allow(dead_code)] // For currently unused variants
#[derive(Debug, Clone, Copy)]
#[repr(u64)]
pub(super) enum ValgrindClientRequestCode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Question: Should we be extending libcore with stuff like this?

Here are our options, as I see them:

  1. Client code code carries its own copy of the relevant enum (that's what we're doing now in https://github.com/pnkfelix/krabcake/blob/f353b59e7af59d055935022c6c3f72adf3a1e53e/sb_rs_port/src/krabcake.rs#L7 ). This is fine for demos, but a non-starter for our target audience for krabcake (where its supposed to be "just recompile and go")
  2. Rustc Compiler has access to all the integers that are used for the relevant enum, and embeds those values directly into the object code that gets generated. Demo code cannot refer to the enum itself from its source code. This is what I had originally envisaged.
  3. libcore defines the enum, and either the Rustc Compiler includes accesses to the elements of that enum in the code it injects (which is what I expect for the majority of the cases), or the client code can access the values itself for its own purposes.

I'm trying to envisage the cases where we would want to do option (3). The main benefit of option (3) AFAICT is that you get to write source code in your demos that reference the enum values, so you have more control over fine-tuning (if you wanted to e.g. experiment with alternative protocols for how you interact with the valgrind krabcake tool).

But I think in all such cases, I would opt to actually include a copy of the relevant enum in the source code (i.e. option (1.) over option (3.)

@bryangarza did you adopt this strategy because it was simpler to implement than option (2.)? Or am I missing some benefit of option (3.)?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

embeds those values directly into the object code that gets generated

Not sure I quite follow what you mean in option 2; do you mean basically being able to have the MirPass insert the krabcake request into the MIR but without needing a Lang Item? (example)

Or do you mean some method that doesn't use a MirPass?


did you adopt this strategy because it was simpler to implement than option (2.)? Or am I missing some benefit of option (3.)?

The reason I made it a lang item is so that it would be more maintainable to have most of the logic defined as Rust code instead of needing to handwrite a lot of MIR. Also, I created a separate data-type KrabcakeRequest so that we don't couple the C enum to the enum that is used in rustc, since as you said, we could in the future change the format or approach that is used in rustc itself independently of what the request looks like when sent to Valgrind (or we could even use something other than/in addition to Valgrind)

bryangarza pushed a commit that referenced this pull request Apr 21, 2023
There were a series of unfortunate interactions here. Here's an MCVE of the test this fixes (committed as `tests/ui/meta/no_std-extern-libc.rs`):
```rust
 #![crate_type = "lib"]
 #![no_std]
 #![feature(rustc_private)]
extern crate libc;
```

Before, this would give an error about duplicate versions of libc:
```
error[E0464]: multiple candidates for `rlib` dependency `libc` found
  --> fake-test-src-base/allocator/no_std-alloc-error-handler-default.rs:15:1
   |
LL | extern crate libc;
   | ^^^^^^^^^^^^^^^^^^
   |
   = note: candidate #1: /home/gh-jyn514/rust/build/aarch64-unknown-linux-gnu/stage2/lib/rustlib/aarch64-unknown-linux-gnu/lib/liblibc-358db1024b7d9957.rlib
   = note: candidate rust-lang#2: /home/gh-jyn514/rust/build/aarch64-unknown-linux-gnu/stage2/lib/rustlib/aarch64-unknown-linux-gnu/lib/liblibc-ebc478710122a279.rmeta
```
Both these versions were downloaded from CI, but one came from the `rust-std` component and one came from `rustc-dev`:
```
; tar -tf build/cache/f2d9a3d0771504f1ae776226a5799dcb4408a91a/rust-std-nightly-x86_64-unknown-linux-gnu.tar.xz | grep liblibc
rust-std-nightly-x86_64-unknown-linux-gnu/rust-std-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-68a2d9e195dd6ed2.rlib
; tar -tf build/cache/f2d9a3d0771504f1ae776226a5799dcb4408a91a/rustc-dev-nightly-x86_64-unknown-linux-gnu.tar.xz | grep liblibc
rustc-dev-nightly-x86_64-unknown-linux-gnu/rustc-dev/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f226c9fbdd92a0fd.rmeta
```
The fix was to only copy files from `rust-std` unless a Step explicitly requests for the `rustc-dev` components to be available by calling `builder.ensure(compile::Rustc)`.

To avoid having to re-parse the `rustc-dev.tar.xz` tarball every time, which is quite slow, this adds a new `build/host/ci-rustc/.rustc-dev-contents` cache file which stores only the names of files we need to copy into the sysroot.

This also allows reverting the hack in
rust-lang#110121; now that we only copy
rustc-dev on-demand, we can correctly add the `Rustc` check artifacts
into the sysroot, so that this works correctly even when
`download-rustc` is forced to `true`.

---

See rust-lang#108767 (comment) for why `no_std` is required for the MCVE test to fail; it's complicated and not particularly important.

Fixes rust-lang#108767.
bryangarza pushed a commit that referenced this pull request Apr 21, 2023
…tlarsan68

Fix no_std tests that load libc from the sysroot when download-rustc is enabled

There were a series of unfortunate interactions here. Here's an MCVE of the test this fixes (committed as `tests/ui/meta/no_std-extern-libc.rs`):
```rust
#![crate_type = "lib"]
#![no_std]
#![feature(rustc_private)]
extern crate libc;
```

Before, this would give an error about duplicate versions of libc:
```
error[E0464]: multiple candidates for `rlib` dependency `libc` found
  --> fake-test-src-base/allocator/no_std-alloc-error-handler-default.rs:15:1
   |
LL | extern crate libc;
   | ^^^^^^^^^^^^^^^^^^
   |
   = note: candidate #1: /home/gh-jyn514/rust/build/aarch64-unknown-linux-gnu/stage2/lib/rustlib/aarch64-unknown-linux-gnu/lib/liblibc-358db1024b7d9957.rlib
   = note: candidate rust-lang#2: /home/gh-jyn514/rust/build/aarch64-unknown-linux-gnu/stage2/lib/rustlib/aarch64-unknown-linux-gnu/lib/liblibc-ebc478710122a279.rmeta
```
Both these versions were downloaded from CI, but one came from the `rust-std` component and one came from `rustc-dev`:
```
; tar -tf build/cache/f2d9a3d0771504f1ae776226a5799dcb4408a91a/rust-std-nightly-x86_64-unknown-linux-gnu.tar.xz | grep liblibc
rust-std-nightly-x86_64-unknown-linux-gnu/rust-std-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-68a2d9e195dd6ed2.rlib
; tar -tf build/cache/f2d9a3d0771504f1ae776226a5799dcb4408a91a/rustc-dev-nightly-x86_64-unknown-linux-gnu.tar.xz | grep liblibc
rustc-dev-nightly-x86_64-unknown-linux-gnu/rustc-dev/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f226c9fbdd92a0fd.rmeta
```
The fix was to only copy files from `rust-std` unless a Step explicitly requests for the `rustc-dev` components to be available by calling `builder.ensure(compile::Rustc)`.

To avoid having to re-parse the `rustc-dev.tar.xz` tarball every time, which is quite slow, this adds a new `build/host/ci-rustc/.rustc-dev-contents` cache file which stores only the names of files we need to copy into the sysroot.

This also allows reverting the hack in rust-lang#110121; now that we only copy rustc-dev on-demand, we can correctly add the `Rustc` check artifacts into the sysroot, so that this works correctly even when `download-rustc` is forced to `true` and some tool depends on a local change to `compiler`.

---

See rust-lang#108767 (comment) for why `no_std` is required for the MCVE test to fail; it's complicated and not particularly important.

Fixes rust-lang#108767.
"xchg rbx, rbx",
inout("di") res,
in("ax") &args,
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have a terminator or intrinsic for emitting valgrind client requests, right? That would probably be more efficient as you can avoid the overhead of a non-inlineable call. Valgrind is relatively slow, so every bit helps I guess.

@bryangarza bryangarza changed the title krabcake: Successfully send intrinsics::asume data to Valgrind krabcake: Successfully send intrinsics::assume data to Valgrind Oct 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants