enhance the fuzzing algorithm to be competitive with other mainstream fuzzers #20804

Open
andrewrk opened this issue Jul 26, 2024 · 2 comments
Labels
contributor friendly This issue is limited in scope and/or knowledge of Zig internals. enhancement Solving this issue will likely involve adding new logic or components to the codebase. fuzzing

@andrewrk
Member

andrewrk commented Jul 26, 2024

Extracted from #20773.

In the initial implementation of fuzzing, I threw together something rough and quick that was able to find a string used with mem.eql. However, this is far from being competitive.
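To illustrate why coverage feedback makes a byte-by-byte comparison findable at all, here is a toy sketch in Python. It is not the algorithm in fuzzer.zig. The names `SECRET`, `target`, and `fuzz` are hypothetical, and the simulated target reports one coverage point per matched byte, which is a simplifying assumption (real instrumentation might get comparable feedback from hit counts or comparison tracing).

```python
SECRET = b"zig"  # hypothetical magic string hidden in the target


def target(data):
    """Compare data to SECRET byte by byte; return the simulated coverage points hit."""
    covered = set()
    if len(data) != len(SECRET):
        return covered
    covered.add("len_ok")
    for i in range(len(SECRET)):
        if data[i] != SECRET[i]:
            return covered
        covered.add(("byte_ok", i))
    covered.add("match")
    return covered


def fuzz(max_len=4):
    """Return (input, executions) for the first input that reaches "match"."""
    # Assumption: start from all-zero seeds of every length up to max_len.
    queue = [bytes(n) for n in range(max_len + 1)]
    seen = set()
    for seed in queue:
        seen |= target(seed)  # baseline coverage of the seeds
    executions = len(queue)
    index = 0
    while index < len(queue):
        entry = queue[index]
        index += 1
        # Deterministic stage: try every byte value at every position.
        for pos in range(len(entry)):
            for value in range(256):
                candidate = entry[:pos] + bytes([value]) + entry[pos + 1:]
                executions += 1
                cov = target(candidate)
                if "match" in cov:
                    return candidate, executions
                if not cov <= seen:  # new coverage: keep this input for later mutation
                    seen |= cov
                    queue.append(candidate)
    return None, executions
```

Under these assumptions, `fuzz()` recovers the string in a few thousand executions, because each correct byte is kept as soon as it produces new coverage. Blind guessing of three bytes could need up to 256^3 attempts.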

It doesn't take much to be competitive. AFL is only 10K lines of code, and it's all open source. Some guy just sat down and threw some paint at the wall to see what sticks, and you know what, anyone can do that. So let's do it too. We'll probably come up with some novel ideas, as well as plenty of silly ones. We can keep the best, toss the rest, and then steal whatever good ideas are left over from all the other open source fuzzers out there. The more source code I read from AFL and libFuzzer, the more confident I am that we can beat these projects on every metric simultaneously.

This issue is open-ended. However, in order to close it, we should be able to run Zig's fuzzer side by side with other mainstream fuzzers on many of the same software test cases and provide a comparison of their efficiency with regard to finding bugs and exploring the state space.

Probably, solving #352 will greatly aid this issue because it will provide insight into how well the state space is being explored, as well as just being really satisfying to watch.

Despite being an area of research, this is actually quite a contributor-friendly issue because it is well-scoped and I have already hooked up all the components, so you can start trying things out just by making edits to fuzzer.zig and rerunning zig build --fuzz (perhaps also with --debug-rt).

This is marked as 0.14.0 milestone because I want to use this feature to fuzz test incremental compilation, which is the main goal of this release cycle.

Related:

@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. contributor friendly This issue is limited in scope and/or knowledge of Zig internals. fuzzing labels Jul 26, 2024
@andrewrk andrewrk added this to the 0.14.0 milestone Jul 26, 2024
@ProkopRandacek
Contributor

I am interested in working on this.

Note that mainstream fuzzers (I checked AFL, AFL++, and Angora) use custom LLVM passes instead of just the coverage pass.

> Probably, solving #352 will greatly aid this issue

The problem with coverage is that it has a slightly different goal. For example:

// ...
if (a) {
    // ...
}
// ...

Coverage information usually does not record that the if branch was not taken: from the line-coverage point of view, there are no lines to mark red when the condition is false. Some fuzzers insert dummy basic blocks to work around this and keep using the coverage information.
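The difference can be shown with a small Python simulation of the snippet above. This is only a sketch, not real instrumentation: the block names `entry`, `then`, and `exit` and the helper functions are hypothetical.

```python
def run(a):
    """Simulate the snippet; return the sequence of basic blocks executed."""
    trace = ["entry"]
    if a:
        trace.append("then")
    trace.append("exit")
    return trace


def block_coverage(traces):
    """Set of basic blocks hit by any trace (roughly what line coverage sees)."""
    return {block for trace in traces for block in trace}


def edge_coverage(traces):
    """Set of (source, target) transitions taken by any trace."""
    return {(src, dst) for trace in traces for src, dst in zip(trace, trace[1:])}
```

With only `run(True)` recorded, block coverage is already complete, so adding `run(False)` changes nothing. Edge coverage does change: the not-taken case contributes the `("entry", "exit")` transition, which is the information a fuzzer wants.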

IMO the better option is to do what Angora does and provide custom instrumentation that stores triples (source BB, target BB, call stack context). (See the paper for more details.)
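A minimal sketch of what recording such triples could look like, written in Python and based only on the description in this comment rather than on Angora's source. The class `ContextCoverage` and its methods are hypothetical, and the context is stored as the full call stack for clarity. A real implementation would likely compress it into a small hash to bound memory.

```python
class ContextCoverage:
    """Tracks (source block, target block, call stack context) triples."""

    def __init__(self):
        self.call_stack = []    # call sites currently active
        self.prev_block = None  # last basic block visited
        self.seen = set()       # triples observed so far

    def enter_call(self, call_site):
        self.call_stack.append(call_site)

    def leave_call(self):
        self.call_stack.pop()

    def visit(self, block):
        """Record the edge prev -> block under the current context.

        Returns True if this triple has not been seen before.
        """
        is_new = False
        if self.prev_block is not None:
            key = (self.prev_block, block, tuple(self.call_stack))
            is_new = key not in self.seen
            self.seen.add(key)
        self.prev_block = block
        return is_new
```

The design point is that the same edge inside a function counts as new coverage when the function is reached from a different call site, so inputs that exercise a helper from a new caller are kept even though no new block or edge appears.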

@andrewrk
Member Author

trace-pc does this already. No custom instrumentation needed.
