Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve fuzzing coverage #1227

Merged
merged 6 commits into from
Nov 10, 2023
Merged

Improve fuzzing coverage #1227

merged 6 commits into from
Nov 10, 2023

Conversation

sashashura
Copy link
Contributor

Improves fuzzing coverage and fixes a memory leak in SnoopFileReader.

@sashashura sashashura requested a review from seladb as a code owner November 1, 2023 20:56
Copy link

codecov bot commented Nov 1, 2023

Codecov Report

Merging #1227 (6b53531) into dev (813f15b) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev    #1227      +/-   ##
==========================================
- Coverage   82.73%   82.72%   -0.01%     
==========================================
  Files         159      159              
  Lines       20294    20294              
  Branches     7668     7671       +3     
==========================================
- Hits        16791    16789       -2     
- Misses       2882     2883       +1     
- Partials      621      622       +1     
Flag Coverage Δ
alpine317 72.44% <66.66%> (+<0.01%) ⬆️
centos7 74.61% <66.66%> (+0.03%) ⬆️
fedora37 72.40% <66.66%> (+<0.01%) ⬆️
macos-11 61.39% <33.33%> (-0.02%) ⬇️
macos-12 61.44% <33.33%> (-0.02%) ⬇️
macos-ventura 61.42% <33.33%> (-0.01%) ⬇️
mingw32 70.33% <66.66%> (-0.01%) ⬇️
mingw64 70.30% <66.66%> (-0.06%) ⬇️
npcap 83.27% <100.00%> (-0.05%) ⬇️
ubuntu1804 75.02% <66.66%> (-0.04%) ⬇️
ubuntu2004 73.24% <66.66%> (+0.01%) ⬆️
ubuntu2204 72.25% <66.66%> (+<0.01%) ⬆️
ubuntu2204-icpx 59.26% <33.33%> (-0.03%) ⬇️
unittest 82.72% <100.00%> (-0.01%) ⬇️
windows-2019 83.31% <100.00%> (-0.02%) ⬇️
windows-2022 83.31% <100.00%> (-0.03%) ⬇️
winpcap 83.30% <100.00%> (+0.01%) ⬆️
zstd 73.81% <66.66%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
Pcap++/src/PcapFileDevice.cpp 81.53% <100.00%> (+0.34%) ⬆️

... and 1 file with indirect coverage changes

@sashashura sashashura force-pushed the coverage branch 4 times, most recently from cc46024 to cc62f9f Compare November 3, 2023 08:59
@seladb
Copy link
Owner

seladb commented Nov 6, 2023

@sashashura I'm contemplating if we should add all of the protocols to the fuzzing tests:

  • It makes the fuzzing test longer and more complex
  • When we add new protocols we need to remember to update the fuzzing tests

Is it worth adding this complexity? 🤔

@sashashura
Copy link
Contributor Author

I'm contemplating if we should add all of the protocols to the fuzzing tests:

Do you mean you have an idea adding something? Could you elaborate on that? Or is it about the changes done in the pull request?

The current fuzzing covers all protocols already because it parses binary blobs. If a new protocol is added it is automatically discovered by fuzzer (New pcap test files are automatically added to the seed corpus by the build.sh script, but I don't know if oss-fuzz automatically picks them up or needs some manual bump to add them to the accumulated corpus files it maintains itself. Anyway at least eventually nothing stops the fuzzer from discovering the protocol after randomly mutating existing corpus files).

This pull request however adds new fuzzers for ngpcap and snoop file containers. Each file format is fuzzed by a separately built fuzzer that is run in parallel by oss-fuzz, so it doesn't slow the fuzzing.
Also the pull request adds a new fuzzer for pcap and ngpcap writers. They are added as separate fuzz targets exactly because otherwise they would significantly slow the existing fuzz targets. Since oss-fuzz runs all fuzz targets in parallel this is the way to write fuzzers.

The goal of the pull request is to improve the fuzzing coverage from the existing ~25% to 50-52%. More coverage - more fixed bugs.

@seladb
Copy link
Owner

seladb commented Nov 7, 2023

@sashashura I agree that we can add pcapng, but I doubt if we need to add snoop which is an old and rarely used format.
Just to make sure I understand: the only reason to add pcapng to the fuzz tests is to discover issues in pcapng format parsing, is that correct? Otherwise I'm not sure pcapng brings a lot more value than pcap. As you mentioned, the fuzzer creates binary blobs that mostly test packet parsing, the file format itself doesn't really matter.

Also, I wanted to ask about the main change in this PR which is in this file: Tests/Fuzzers/FuzzTarget.cpp. It goes over all the protocols that exist on the blob/packet generated by the fuzzer, and tries to extract some data for them. My question is: is this change really necessary?
Going back to what you mentioned - the fuzzer will eventually discover new protocols and find "dark corners" in existing protocols, so why do we need to add specific treatment for each protocol?

The main concerns that I have with this PR:

  • Adding specific treatment for each protocol is something that contributors need to remember when adding a new protocol. A lot of contributors are new to PcapPlusPlus (some are even new to C++ programming) and aren't familiar with fuzzing tests. This change will make it harder for them to contributre...
  • The Tests/Fuzzers/FuzzTarget.cpp file grows from 87 lines to 582 lines which makes it more complex and harder to understand. Is this change worth the complexity it brings?

@sashashura
Copy link
Contributor Author

sashashura commented Nov 7, 2023

I agree that we can add pcapng, but I doubt if we need to add snoop which is an old and rarely used format.

If it is supported it should be fuzzed. Otherwise we should delete it.

Just to make sure I understand: the only reason to add pcapng to the fuzz tests is to discover issues in pcapng format parsing, is that correct?

Technically yes, but with a caveat. In ideal world we would have a set of pcap files that covered all possible cases, so fuzzing with pcapng would be unnecessary overlap. My coverage tests show however that the seed for pcap and the seed for pcapng have non-overlapping parts.

But I hear you and made the changes to test only format parsing for .snoop files.

Also, I wanted to ask about the main change in this PR which is in this file: Tests/Fuzzers/FuzzTarget.cpp. It goes over all the protocols that exist on the blob/packet generated by the fuzzer, and tries to extract some data for them. My question is: is this change really necessary?
Going back to what you mentioned - the fuzzer will eventually discover new protocols and find "dark corners" in existing protocols, so why do we need to add specific treatment for each protocol?

Yes, it is. It increases functions called by fuzzer from roughly 25% to 50%.
There is a small misunderstanding. Fuzzer may mutate files and eventually discover new code paths blocked by some if condition. But the code must be theoretically callable. Say, the old fuzzer calls getNext, it triggers other functions. The calls I added are not triggered neither by getNext, nor by toString. They must be called explicitly. It is not a specific treatment for each protocol. It is an explicit calling of the parsed packet functions that are not otherwise reachable by general parsing.

Adding specific treatment for each protocol is something that contributors need to remember when adding a new protocol. A lot of contributors are new to PcapPlusPlus (some are even new to C++ programming) and aren't familiar with fuzzing tests. This change will make it harder for them to contributre...

I think it is a misunderstanding that it brings contribution complexity. Fuzzing harness is essentially one more unit test. I have picked up many of the calls from existing tests. Expanding the fuzzing harness doesn't require anything neither from existing nor from new contributors. If a new Layer public method is added or even a whole new protocol is added, it doesn't require the contributor to add it to the harness. If the method signature changes (Side note, I think you don't allow changing of public API methods) existing unit tests will stop compiling and the contributor must update all places to make it compile anyway. If the function is deleted, the contributor is welcome to delete it from the fuzzer too. Most of the functions that are added are very simple as getContent() or getContentLength().

The Tests/Fuzzers/FuzzTarget.cpp file grows from 87 lines to 582 lines which makes it more complex and harder to understand. Is this change worth the complexity it brings?

I see your point and extracted the additional calls into a separate function to improve readability.

Pcap++/src/PcapFileDevice.cpp Show resolved Hide resolved
Tests/Fuzzers/FuzzTarget.cpp Outdated Show resolved Hide resolved
Tests/Fuzzers/FuzzTarget.cpp Show resolved Hide resolved
Tests/Fuzzers/ReadParsedPacket.h Show resolved Hide resolved
Tests/Fuzzers/default.options Show resolved Hide resolved
Tests/Fuzzers/ossfuzz.sh Show resolved Hide resolved
Tests/Fuzzers/ossfuzz.sh Show resolved Hide resolved
Tests/Fuzzers/FuzzTarget.cpp Show resolved Hide resolved
Tests/Fuzzers/FuzzTarget.cpp Show resolved Hide resolved
Pcap++/src/PcapFileDevice.cpp Show resolved Hide resolved
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this pcap file relevant for this PR?

Copy link
Contributor Author

@sashashura sashashura Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it improves the coverage. That was the case when code was reachable, but the fuzzer didn't manage to reach the code without help.

@seladb seladb merged commit 6a67cab into seladb:dev Nov 10, 2023
34 checks passed
@seladb
Copy link
Owner

seladb commented Nov 10, 2023

Thank you so much @sashashura for working on this! 🙏 🙏

@sashashura
Copy link
Contributor Author

Marked google/oss-fuzz#11198 as ready for review, but this pr wasn't merged to master yet.

@seladb
Copy link
Owner

seladb commented Nov 10, 2023

@sashashura I merged your PR to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants