Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue found by fuzzer #3119

Closed
MaxGraey opened this issue Sep 11, 2020 · 23 comments · Fixed by #3123
Closed

Issue found by fuzzer #3119

MaxGraey opened this issue Sep 11, 2020 · 23 comments · Fixed by #3123

Comments

@MaxGraey
Copy link
Contributor

MaxGraey commented Sep 11, 2020

Fuzzer found issue in upstream. seed 1485930439220998215

Last working commit: cd6f0d9
So seems it was introduced in #3109

cc @dcodeIO

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Hmm, reducing this one takes ages. Overall seems strange, first time seeing a fuzzing error like this.

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Apparently doesn't reduce further than w.wat.zip, and I can only guess that something goes wrong on some pass then.

@MaxGraey
Copy link
Contributor Author

I think it produce incorrect wasm module at the beginning for some situations

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

The error happens on

wasm-opt out/test/a.wasm [--flatten --local-cse --remove-unused-module-elements --roundtrip] -Oz [--memory-packing --dae-optimizing --dce --precompute-propagate --const-hoisting] --all-features --disable-mutable-globals --disable-bulk-memory --disable-sign-ext --fuzz-exec

with everything in [ ] not making a difference, ending in

...
[fuzz-exec] calling func_248
[LoggingExternalInterface logging i32x4 0xffffb82b 0xdf000000 0xffffa316 0x4f000000]
[LoggingExternalInterface logging 821644594]
[fuzz-exec] note result: func_248 => (funcref(funcref), anyref(null), 65535.77734375, 4341566586702169899008, 127, i32x4 0x00000000 0x00000000 0x00000000 0x80000000)
[fuzz-exec] calling func_248_invoker
[LoggingExternalInterface logging i32x4 0xffffb82b 0xdf000000 0xffffa316 0x4f000000]
[LoggingExternalInterface logging 821644594]
[fuzz-exec] calling func_251_invoker
[fuzz-exec] calling func_253_invoker
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[LoggingExternalInterface logging 821644594]
[fuzz-exec] calling hangLimitInitializer
[fuzz-exec] comparing func_103
[fuzz-exec] comparing func_105
[fuzz-exec] comparing func_108
[fuzz-exec] comparing func_110
[fuzz-exec] comparing func_115
[fuzz-exec] comparing func_117
[fuzz-exec] comparing func_122
[fuzz-exec] comparing func_13
[fuzz-exec] comparing func_140
[fuzz-exec] comparing func_141
[fuzz-exec] comparing func_146
[fuzz-exec] comparing func_150
[fuzz-exec] comparing func_152
[fuzz-exec] comparing func_16
[fuzz-exec] comparing func_164
not identical!
Aborted

@MaxGraey
Copy link
Contributor Author

MaxGraey commented Sep 11, 2020

It also happened for this seeds:
4144293488409637722
14732826757361179878

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Looks like

[fuzz-exec] comparing func_164
not identical!

is calling

 (func $164
  (drop
   (call $163
    (f32.const 536870912)
   )
  )
  (drop
   (call $163
    (f32.const -134217728)
   )
  )
  (call $fimport$0
   (call $0)
  )
 )

with $0 being hashMemory.

Regarding

4144293488409637722
14732826757361179878

the first somehow works for me, but the second has an error.

@MaxGraey
Copy link
Contributor Author

the first somehow works for me

Right, it could exclude

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Trying to reduce the other seed now and see how it goes, but otherwise clueless what's going on. If you find more seeds, let me know :)

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

This one looks promising

(module
 (type $none_=>_anyref (func (result anyref)))
 (global $global$0 (mut i32) (i32.const 10))
 (export "func_77" (func $0))
 (func $0 (result anyref)
  (if
   (i32.eqz
    (global.get $global$0)
   )
   (return
    (ref.null any)
   )
  )
  (global.set $global$0
   (i32.const 1)
  )
  (ref.null func)
 )
)

Smells like getLeastUpperBound.

@tlively
Copy link
Member

tlively commented Sep 11, 2020

Do the optimizations change the type of the null literal that gets returned? That might cause them to compare differently.

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Hmm, good question. Might in fact be that some pass previously assumed that ref.nulls are interchangeable, yet aren't anymore.

@tlively
Copy link
Member

tlively commented Sep 11, 2020

It looks like the implementation of operator== on Literal needs to be updated to take subtyping into account. If I have an anyref literal and a funcref literal and they are both null, they should compare equal despite having different types. Back when ref.null always generated nullref values, this wasn't an issue.

@MaxGraey
Copy link
Contributor Author

@dcodeIO another seed: 422893904766447699

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 11, 2020

Hmm, confusing. Conceptionally, operator== compares for equality of two values of two types that are not subtypes of eqref so their equality cannot be observed. Technically, one null might be 32-bit and another might be a tagged 64-bit pointer equalling null abstractly in the engine, but not sure how relevant the latter aspect is. Also leads down the rabbit hole of having an externref and a funcref that are both null, but then we need to know again whether anyref is available. Also just tried to return true for two nulls of any reftype (both subtypes of anyref then according to this logic) and the fuzzer still complains.

@tlively
Copy link
Member

tlively commented Sep 11, 2020

Hmm, yeah, the kind of equality you want is different in different situations. Here's another idea: when the interpreter returns a value, have it coerce the Literal is returns to the type of the function. That way an anyref-typed function always returns an anyref literal, even if the concrete value is actually a subtype of anyref.... But no, that won't work because erasing the type would change the semantics when the value is downcasted. Which means the optimizations that make this change are incorrect after all?

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

Looks like we need a way to change the type of a literal independently of its payload, so the interpreter knows that some anyref literal carries a funcref payload. Alternatively, the interpreter may pass along a tuple of a type and a literal.

Unfortunately I still haven't figured out which check is failing exactly here, to have a starting point. Do you have a suspicion where the error is happening / the fuzzer aborting?

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

Meanwhile just searched for "not identical" in the sources and found where it happens :)

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

When I extend the logging a bit, it's as you suspected:

not identical: anyref(null) != funcref(funcref)
Aborted

Though I don't quite understand why it would expect a null and a non-null value to be equal?

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

💡 Just understood that --fuzz-exec really just compares the binary before and after running the specified passes, so somewhere in -O2 --flatten --rereloop -Oz --dae-optimizing an optimization modifies the binary in an unexpected way so that execution differs.

Before

 (global $global$16 (mut i32) (i32.const 10))
 (func $71 (result anyref)
  (local $0 i32)
  (local $1 i64)
  (local $2 funcref)
  (local $3 externref)
  (local $4 exnref)
  (local $5 anyref)
  (if
   (i32.eqz
    (global.get $global$16)
   )
   (return
    (local.get $5) ;; (ref.null any)
   )
  )
  (global.set $global$16
   (i32.sub
    (global.get $global$16)
    (i32.const 1)
   )
  )
  (loop $label$2
   (if
    (i32.eqz
     (global.get $global$16)
    )
    (return
     (local.get $5) ;; (ref.null any)
    )
   )
   (global.set $global$16
    (i32.sub
     (global.get $global$16)
     (i32.const 1)
    )
   )
   (return
    (select (result anyref)
     (local.get $2) ;; (ref.null func)
     (local.get $4) ;; (ref.null exn)
     (i16x8.extract_lane_u 7
      (i32x4.splat
       (i32.const -6)
      )
     )
    )
   )
  )
 )

After

 (global $global$16 (mut i32) (i32.const 10))
 (func $71 (result anyref)
  (if
   (i32.eqz
    (global.get $global$16)
   )
   (return
    (ref.null any)
   )
  )
  (global.set $global$16
   (i32.sub
    (global.get $global$16)
    (i32.const 1)
   )
  )
  (if
   (global.get $global$16)
   (global.set $global$16
    (i32.sub
     (global.get $global$16)
     (i32.const 1)
    )
   )
  )
  (ref.null any)
 )

But something about that doesn't quite add up. Especially the non-null funcref(funcref).

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

Found where the funcref(funcref) is coming from:

// We cannot compare funcrefs by name because function names can
// change (after duplicate function elimination or roundtripping)
// while the function contents are still the same
for (Literal& val : ret) {
if (val.type == Type::funcref) {
val = Literal::makeFunc(Name("funcref"));
}
}

Doesn't account for null in which case we can't replace with a pseudo name. Now seeing the expected

[fuzz-exec] comparing func_77
not identical: anyref(null) != funcref(null)
Aborted

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

I guess what's happening there is on

    (select (result anyref)
     (local.get $2) ;; (ref.null func)
     (local.get $4) ;; (ref.null exn)
     (i16x8.extract_lane_u 7
      (i32x4.splat
       (i32.const -6)
      )
     )
    )

some pass(es) figure that the select is unnecessary due to a constant condition, figure a null is returned and make a new null of type anyref since that's the select's type. Really just guessing, though, as from above's discussion I would have expected that the select is replaced with the ifTrue arm or something while keeping the type. Where would this optimization happen?

@tlively
Copy link
Member

tlively commented Sep 12, 2020

I'm not sure where that would happen, but if you do BINARYEN_PASS_DEBUG=3 you can get the intermediate states from each pass and bisect on them to figure out which pass is the culprit.

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 12, 2020

but if you do BINARYEN_PASS_DEBUG=3

Very useful indeed! Found that this happens during code-folding where two expressions become compared, and I forgot to implement visitImmediates (reached from ExpressionAnalyzer::flexibleEqual) for RefNull's new type. As such the pass figured that both (ref.null func) and (ref.null any) are equal, keeping one (here, (ref.null any)) and dropping the other. Appears to be it, yay :)

(finished running seed 1485930439220998215 without error)
(finished running seed 14732826757361179878 without error)
(finished running seed 422893904766447699 without error)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants