This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Feedback on fuzzer benchmarking setup #985
Comments
Hi, thanks for testing Echidna in your new benchmark. Happy to provide feedback. Some pointers:
Very important point to make sure the experiments make sense. There is one thing missing here:
@ggrieco-tob Thanks a lot for the quick response! I have two follow-up questions: (1) In my experiments, I observed that a small Here's a quick experiment I did:
On my machine, the last command terminates after only ~5 seconds even though the time limit is 60s. (2) It's difficult to say what value of |
Yes, this is correct. However, using a large
Well, if you expect smart contract fuzzers to accumulate a particular state over 15 transactions, clearly resetting every 15 is not enough (of course!). In fact, in our experience, it is very important to select a much larger limit (100 or 200) to avoid resetting the state earlier than needed. We would love to see empirical experiments that support this intuition. Of course, there are some specially crafted examples where a fuzzer benefits from resetting the state early (e.g. if there is a state that the fuzzer cannot leave), but these are not very common in our audits.
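To make the intuition about the sequence limit concrete, here is a rough back-of-the-envelope sketch. It is not a model of how Echidna actually generates transactions: it assumes each transaction in a sequence is independently "useful" (advances toward the target state) with a fixed probability, and asks how likely it is to collect the needed number of useful transactions before the state is reset. The names `reach_probability`, `seq_len`, `needed`, and `p_useful` are hypothetical and chosen only for this illustration.

```python
from math import comb


def reach_probability(seq_len: int, needed: int, p_useful: float) -> float:
    """Probability of at least `needed` useful transactions in one sequence.

    Assumption (illustrative only): each of the `seq_len` transactions is
    independently useful with probability `p_useful`, so the number of useful
    transactions follows a binomial distribution.
    """
    if seq_len < needed:
        # The state is reset before enough transactions can accumulate.
        return 0.0
    return sum(
        comb(seq_len, k) * p_useful**k * (1 - p_useful) ** (seq_len - k)
        for k in range(needed, seq_len + 1)
    )


if __name__ == "__main__":
    # Target state needs 15 useful transactions; half of all calls are useful.
    for limit in (14, 15, 30, 100, 200):
        print(limit, reach_probability(limit, needed=15, p_useful=0.5))
```

Under these assumptions, a limit equal to the required depth only succeeds when every single transaction is useful, while a limit of 100 or 200 leaves plenty of slack. Coverage guidance and corpus replay change the real numbers, so this only illustrates the direction of the effect.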
@ggrieco-tob Thanks for clarifying! About the It would be really useful if there was a About |
The
Echidna also uses coverage to decide which elements to add to the corpus. However, we are not sure how much that value can be reduced, even when relying on coverage guidance.
@ggrieco-tob Thanks! Then I don't quite understand the behavior I'm observing above. It seems like the entire fuzzing campaign terminates when the |
Could be the case. Can you please create a small issue to reproduce it? It is odd that the complete campaign is over, unless there is nothing else to test (e.g. everything failed). |
I tried to minimize the example:

    pragma solidity ^0.8.19;

    contract Maze {
        event AssertionFailed(string message);

        uint64 private x;
        uint64 private y;

        function moveNorth(uint64 p0, uint64 p1) payable external returns (int64) {
            uint64 ny = y + 1;
            require(ny < 7);
            y = ny;
            return step(p0, p1);
        }

        function moveSouth(uint64 p0, uint64 p1) payable external returns (int64) {
            require(0 < y);
            uint64 ny = y - 1;
            y = ny;
            return step(p0, p1);
        }

        function moveEast(uint64 p0, uint64 p1) payable external returns (int64) {
            uint64 nx = x + 1;
            require(nx < 7);
            x = nx;
            return step(p0, p1);
        }

        function moveWest(uint64 p0, uint64 p1) payable external returns (int64) {
            require(0 < x);
            uint64 nx = x - 1;
            x = nx;
            return step(p0, p1);
        }

        function step(uint64 p0, uint64 p1) internal returns (int64) {
            unchecked {
                if (x == 0 && y == 0) {
                    // start
                    return 0;
                }
                if (x == 2 && y == 2) {
                    emit AssertionFailed("1"); assert(false); // bug
                    return 1;
                }
                if (x == 6 && y == 6) {
                    if (p0 * p1 == 938957) {
                        emit AssertionFailed("2"); assert(false); // bug
                    }
                    return 2;
                }
                return 3;
            }
        }
    }

Assertion 1 is easy to cover, but assertion 2 should be more difficult to cover.
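To quantify "easy" versus "more difficult", the following sketch models the Maze contract in Python and searches for the shortest call sequence that reaches each assertion. It is an illustration of the contract's logic, not something Echidna runs; the helper names `shortest_calls`, `triggers_bug2`, and `matching_p1` are hypothetical. The model relies on two facts about the Solidity code: a failing `require` or `assert` reverts the call, so the position never actually persists at (2, 2), and the multiplication sits inside `unchecked`, so it wraps modulo 2^64.

```python
from collections import deque

GRID = 7  # require(nx < 7) / require(ny < 7) keep x and y in 0..6
MOVES = {
    "moveNorth": (0, 1),
    "moveSouth": (0, -1),
    "moveEast": (1, 0),
    "moveWest": (-1, 0),
}
BUG1 = (2, 2)  # step() always hits assert(false) here, so the call reverts
BUG2 = (6, 6)  # step() asserts only if p0 * p1 == 938957 (mod 2**64)
MODULUS = 2**64


def shortest_calls(target):
    """Breadth-first search for a shortest call list whose last call lands on `target`."""
    start = (0, 0)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (x, y), path = queue.popleft()
        for name, (dx, dy) in MOVES.items():
            nx, ny = x + dx, y + dy
            if not (0 <= nx < GRID and 0 <= ny < GRID):
                continue  # a require() fails and the call reverts
            if (nx, ny) == target:
                return path + [name]
            if (nx, ny) == BUG1 or (nx, ny) in seen:
                continue  # BUG1 reverts, so the state cannot rest there
            seen.add((nx, ny))
            queue.append(((nx, ny), path + [name]))
    return None


def triggers_bug2(p0, p1):
    """Input condition for assertion 2, with uint64 wrap-around."""
    return (p0 * p1) % MODULUS == 938957


def matching_p1(p0):
    """For an odd p0, the unique uint64 p1 that satisfies the condition."""
    return (938957 * pow(p0, -1, MODULUS)) % MODULUS


if __name__ == "__main__":
    print(len(shortest_calls(BUG1)), "calls reach assertion 1")
    print(len(shortest_calls(BUG2)), "calls reach assertion 2")
    print(triggers_bug2(3, matching_p1(3)))
```

In this model, assertion 1 needs 4 calls with arbitrary arguments, while assertion 2 needs at least 12 calls that avoid (2, 2), followed by a final call whose arguments satisfy the product constraint. Because 938957 is odd, both arguments must be odd and each odd `p0` has exactly one matching `p1`, so a uniformly random argument pair would hit the constraint with probability 2^-65. Real fuzzers do not sample uniformly, so that figure only indicates the order of difficulty.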
@ggrieco-tob I observed that setting the shrink limit to 1 or even 0 works just fine when using the exploration test-mode (instead of the assertion test-mode). Perhaps the fuzzer simply terminates after finding the first bug and uses up the shrink budget before terminating. With a small budget, it terminates very quickly whereas with a large budget it "wastes" most of the allocated time just shrinking. I changed the test-mode in my benchmarking setup to "exploration" and this improved Echidna's performance very significantly. I'm using the I also compared shrink limit 0 with the default (5000) and did not observe a noticeable difference. I'm leaning towards simply keeping the default, but I'm also happy to use 0. |
Quick update: I also tried to set |
You can also try using echidna-parade, which uses swarm testing to combine different configurations of echidna in order to get more coverage. |
Thanks for the suggestion! I'll see if I can make it work. I'm still trying to set up Hybrid-Echidna... :) The increase from 100 to 200 did not have a significant performance impact. |
I'm trying to compare Echidna with the Forge fuzzer on several benchmark contracts.
To make the comparison as fair as possible, I've created a benchmark generator that automatically generates challenging contracts. The benchmarks intentionally use a limited subset of Solidity to avoid language features that could be handled differently by different tools. Each contract contains ~50 assertions (some can fail, but others cannot due to infeasible path conditions). (If you're curious, you can find one of the benchmarks here. The benchmark-generation approach is inspired by the Fuzzle benchmark generator for C-based fuzzers.) To find the assertions that can fail, a fuzzer needs to generate up to ~15 transactions and satisfy some input constraints for each transaction.
Since I'm not deeply familiar with Echidna, I'd like to check if there are any potential issues with my benchmark setup before sharing results.
For each fuzzing campaign I'm using the following settings that deviate from the defaults:
- `testLimit`: 1073741823 (instead of 50000)
- `shrinkLimit`: 1073741823 (instead of 5000)
- `codeSize`: 0xc00000 (instead of 0x6000)

The motivation for increasing the `testLimit` and `shrinkLimit` settings is that I want to run long fuzzing campaigns (for instance, 1 hour for each contract), and I use the `timeout` setting to terminate the campaign after a fixed amount of time.

I also increased the `codeSize` setting to handle larger contracts, if necessary. Currently, all benchmark contracts are below the EVM limit when using the solc optimizer (0.8.19).

Please let me know if you see any potential issues with this setup.
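For reference, here is a small sketch of how these settings could be rendered as an Echidna YAML config for one campaign. The setting names and values come from the list above; the `timeout` value of 3600 assumes the setting is given in seconds and corresponds to the 1-hour example, and the helper name `render_config` is hypothetical.

```python
# Non-default settings from the post. `timeout` is assumed to be in seconds.
settings = {
    "testLimit": 1073741823,    # default: 50000
    "shrinkLimit": 1073741823,  # default: 5000
    "codeSize": "0xc00000",     # default: 0x6000
    "timeout": 3600,            # 1 hour per contract (assumed unit: seconds)
}


def render_config(values):
    """Render flat key/value settings as simple `key: value` YAML lines."""
    return "".join(f"{key}: {value}\n" for key, value in values.items())


config_text = render_config(settings)

if __name__ == "__main__":
    print(config_text, end="")
```

The resulting text can be saved to a file and passed to Echidna with its `--config` option. Generating it from a script keeps the per-contract campaigns identical apart from the contract under test.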