Differential fuzzing with wasm-smith isn't the most effective #4322
Comments
I suspect that for individual instruction semantics, something like #3251 might be a reasonable solution. If we focus on generating test cases for individual Wasm instructions, we can drive the random code generation in the other direction: e.g. we have a Without coverage feedback libFuzzer may not know that some of those inputs are only sensitive to particular values (e.g. zero), but we could maybe bias probabilities too, if needed. (Custom entropy based on... similar bits? Closeness of some args to others? Surely the propcheck/quickcheck folks have some useful heuristics here...) All the above won't give us coverage of multiple-instruction patterns (I think this objection was also raised in #3251) but unit-testing instruction lowerings seems like a sufficiently useful and unique use-case to me that it's worth its own fuzz target and approach...
I had forgotten about that issue! That's an excellent point though. I also filed #4338 to track other possible wasm fuzzers we could integrate. Beyond that, other wasm-smith improvements include bytecodealliance/wasm-tools#266. While this may not be the most useful issue to keep open, I'm tempted to leave it here as "if anyone searches the fuzzing tag in the Wasmtime repo this'll trigger them to think about improving wasm-smith".
Hey, I had completely forgotten about #3251 but I believe when I added
We could also add a "no control flow" mode to In fact, @abrown added the I think we also only pass zeros in as arguments to whatever functions we generate, and we could probably do much better than that as well (at least try zero, max, and one random bit pattern).
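As a rough illustration of the "zero, max, and one random bit pattern" idea, here is a minimal sketch. It assumes every parameter is a 32-bit integer passed as raw bits, reads "max" as the all-ones pattern, and takes the random bits from the caller (for example from the fuzzer's input bytes). The name `candidate_arg_sets` is hypothetical and is not part of Wasmtime or wasm-smith.

```rust
/// Hypothetical helper: builds three argument vectors for a function with
/// `random_bits.len()` 32-bit integer parameters, given as raw bit patterns.
/// The three sets are all zeros, all ones ("max" is an assumption here), and
/// the caller-supplied random pattern.
fn candidate_arg_sets(random_bits: &[u32]) -> Vec<Vec<u32>> {
    let n = random_bits.len();
    vec![
        vec![0; n],           // every argument is zero
        vec![u32::MAX; n],    // every argument has all bits set
        random_bits.to_vec(), // one arbitrary pattern per argument
    ]
}
```

Each generated function would then be invoked once per argument set in every engine under comparison, rather than once with zeros only.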
@abrown jinx ;)
Yes, definitely.
One thing I think we'd also want to change about wasm-smith is the signatures of functions as well. Right now we drop all values that don't correspond to the function's type so differential fuzzing has a hard time picking these up. Ideally we want a mode where we generate the function body first and then we wrap that in a function of the appropriate type.
Hm... how? I wasn't really aware of the limitations on function signatures.
We just choose a signature, and then generate a body. Alex is suggesting we generate a body and then derive a signature from that.
Makes sense for returns, not necessarily for parameters though, since generating a body needs to know what locals are available.
Or the first instruction dictates what locals must be available which should bubble up to the signature... This is the part that I'm having trouble seeing in wasm-smith. It doesn't seem designed for this kind of thing?
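To make the "generate a body, then derive a signature" idea concrete, here is a toy sketch. It is not how wasm-smith is structured: `ValType`, `Instr`, and `infer_signature` are hypothetical names, instructions are modelled only by their stack effect, and locals used inside the body are ignored. The sketch simulates the operand stack, turns every operand popped from an empty stack into a parameter, and treats whatever is left on the stack as the results. It assumes a wrapper that pushes the parameters in order (`local.get 0`, `local.get 1`, ...) before the body runs.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ValType {
    I32,
    I64,
    V128,
}

/// Toy instruction, described only by its stack effect (hypothetical type).
/// `pops` is listed bottom-to-top, so the last entry is the top of the stack.
struct Instr {
    pops: Vec<ValType>,
    pushes: Vec<ValType>,
}

/// Returns `(params, results)` for a straight-line body.
fn infer_signature(body: &[Instr]) -> (Vec<ValType>, Vec<ValType>) {
    let mut params = Vec::new();
    let mut stack: Vec<ValType> = Vec::new();
    for instr in body {
        // Pop operands top-first.
        for &want in instr.pops.iter().rev() {
            match stack.pop() {
                Some(have) => assert_eq!(have, want, "ill-typed body"),
                // Underflow: the value must come from the caller. It sits
                // below every parameter found so far, so it goes in front.
                None => params.insert(0, want),
            }
        }
        stack.extend(instr.pushes.iter().copied());
    }
    // Whatever remains on the stack becomes the function's results.
    (params, stack)
}
```

For a body consisting of a lone `select` on `v128` values, this yields parameters `[V128, V128, I32]` and result `[V128]`, so no value is dropped and every operand is under the fuzzer's control.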
Ok, how about this: we add
The
One idea is to perhaps:
I like the idea though of using a completely new dedicated
So you mean "arbitrary list of parameters" not "arbitrary list of parameter types"?
Oh sorry I meant types there, but I was thinking we could in actuality do:
whether that second step belongs in wasm-smith or wasmtime I dunno but either seems fine by me
The single-instruction generator from Andrew came out of this and nothing else has turned up in the meantime. Additionally Andrew did a lot of refactoring to have one
Wasmtime recently had #4315 filed against it which discovered that there were two separate bugs in the SIMD implementation on x86_64. This discovery comes after "months of continuous oss-fuzzing" for the simd feature. I wanted to file an issue here with some investigation of why this happened because this theoretically should not happen.
Specifically, the bug here was a buggy instruction lowering (two different ones). One (fixed in #4318) surfaced by corrupting an input register, which I think only causes issues if the input is reused elsewhere (e.g. a constant reused somewhere else). I don't know precisely, but my impression was that this required some register pressure, a "big" function, and constants to line up. I could see this specific bug being very difficult to discover via wasm-smith. The second bug (#4317), however, was a trivial bug in the `select` instruction which showed up with the smallest of tests for `select`. The fact that wasm-smith never discovered this is alarming to me.

Digging in, it appears to be a confluence of factors which makes wasm-smith basically unable to find these bugs:

- The `select` instruction requires 3 operands on the stack of specific types. It turns out this very rarely happens. I inserted a `panic!` whenever a `select` instruction was even considered a candidate, and it was rarely hit. Even more rarely is the instruction chosen to be emitted.
- The `i32` input to `select` is, I think, almost always nonzero at runtime. The specific bug only happened when the condition was 0, however. I think this is because a lot of i32s come from things like `i32.const`, which is practically never zero.
- When `select` is generated with v128 inputs (which happens quite rarely), it's often never actually executed at runtime. The few test cases I found which generated this instruction immediately had infinite recursion or an infinite loop with the interesting instructions far away.

I unfortunately don't know if there's really a "fix" for issues like this. We could throw a bunch more heuristics at wasm-smith, but at some point we probably need a somewhat fundamental new strategy for fuzzing here to get significantly more coverage.
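For illustration, a single-instruction differential check that always includes a zero condition could look roughly like the sketch below. Everything here is hypothetical: `select_reference` models the semantics of Wasm `select` on 128-bit values, `select_buggy` is a made-up faulty implementation (not the actual bug from #4317), and `first_mismatch` stands in for comparing two engines.

```rust
/// Reference semantics of Wasm `select`: yields `a` when `cond != 0`, else `b`.
/// `u128` stands in for the bits of a v128 value.
fn select_reference(a: u128, b: u128, cond: i32) -> u128 {
    if cond != 0 { a } else { b }
}

/// Made-up faulty implementation for illustration: it ignores the condition,
/// so it is only wrong when `cond == 0`.
fn select_buggy(a: u128, _b: u128, _cond: i32) -> u128 {
    a
}

/// Compares `candidate` against the reference on a fixed set of conditions
/// that deliberately includes zero, returning the first condition on which
/// the two disagree.
fn first_mismatch(candidate: fn(u128, u128, i32) -> u128, a: u128, b: u128) -> Option<i32> {
    const CONDS: [i32; 4] = [0, 1, -1, i32::MAX];
    CONDS
        .iter()
        .copied()
        .find(|&c| candidate(a, b, c) != select_reference(a, b, c))
}
```

With nonzero conditions only, `select_buggy` is indistinguishable from the reference, which mirrors why a generator whose `i32` values are almost never zero would miss this class of bug.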