Adding dumps and doc creation coverage for tomlkit #10046
9978572
to
49ed5a3
Compare
Hey @DavidKorczynski, I've removed the document creation for now while I work out a more elegant way to generate them. Following on from an earlier comment, I've had some time to refine and improve the random dictionary generation, test it, make sure the results are deterministic, and publish it as a self-contained library on PyPI: dictgen. There's good potential for using it to cover the other dumps methods for YAML, JSON, TOML encoders etc. Do you mind having another look and seeing if you think we can use this library to drive coverage for the tomlkit
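For readers unfamiliar with the idea, here is a minimal sketch of deterministic random dictionary generation driven by fuzzer input. It is not the dictgen API: the function name `random_dict` and its parameters are hypothetical, and it uses only the standard library. It assumes that seeding `random.Random` with the raw input bytes is enough to make the output reproducible for a given input.

```python
import random
import string


# Hypothetical helper for illustration only; this is NOT the dictgen API.
def random_dict(data: bytes, max_depth: int = 3, max_items: int = 4) -> dict:
    """Build a nested dict whose contents depend only on `data`."""
    # Seeding with bytes gives the same sequence for the same input.
    rng = random.Random(data)

    def key() -> str:
        length = rng.randint(1, 8)
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

    def value(depth: int):
        kinds = ["int", "float", "bool", "str"]
        # Only allow nested containers while below the depth limit.
        if depth < max_depth:
            kinds += ["list", "dict"]
        kind = rng.choice(kinds)
        if kind == "int":
            return rng.randint(-1000, 1000)
        if kind == "float":
            return rng.uniform(-1000.0, 1000.0)
        if kind == "bool":
            return rng.random() < 0.5
        if kind == "str":
            return key()
        if kind == "list":
            return [rng.randint(0, 100) for _ in range(rng.randint(0, max_items))]
        return table(depth)

    def table(depth: int) -> dict:
        # Duplicate keys simply overwrite, so a table may hold fewer items.
        return {key(): value(depth + 1) for _ in range(rng.randint(0, max_items))}

    return table(0)
```

In a fuzz harness the result would be passed straight to the encoder under test, so every input exercises the dumps path instead of being rejected by the parser first.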
b6bfe44
to
d76e4ba
Compare
Yeah, I'll aim to go over this tomorrow.
projects/tomlkit/build.sh
@@ -15,6 +15,7 @@
#
################################################################################
pip3 install .
pip3 install dictgen
Could you move this to the Dockerfile?
I've moved this to the Dockerfile and tested it locally using introspector. All working.
lgtm with the minor nit above. Have you compared dictgen to a simple construct e.g. in terms of how much code coverage is achieved?
I tried an initial
fdp = atheris.FuzzedDataProvider(input_bytes)
try:
    test_data = tomlkit.loads(fdp.ConsumeUnicodeNoSurrogates(sys.maxsize))
    tomlkit.api.dumps(test_data, sort_keys=fdp.ConsumeBool())
except Exception as e:
    # Ignore tomlkit's own exceptions; re-raise anything unexpected.
    if "tomlkit.exceptions." in str(e.__class__):
        pass
    else:
        raise
For 100k runs we get total project coverage around
d76e4ba
to
0d66ad6
Compare
These are some additional improvements to the toml encoder.
- Remove unneeded fuzzers `fuzz_loads` and `fuzz_dumps`. These can both be reached by the `fuzz_load` and `fuzz_dump` methods
- Cover additional encoders `TomlOrderedEncoder` & `TomlNumpyEncoder`
- Cover additional decoders `TomlOrderedDecoder` & `TomlPreserveCommentDecoder`
- Use the `dictgen` library approach, the same as #10046

Using local runs this gives us a coverage of an extra 8 methods.
Improving the coverage for tomlkit by adding coverage for the dumps fuzzer. This is based on my existing improvements to the toml fuzzer #9834, using a modified version of the atheris_dict.py approach to creating random dictionaries.
Doing some local runs, this takes the coverage:
26% -> 37%
31% -> 39%