-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --entropy-sensitivity option for controlling entropy checks #272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make more sense to add deprecation warnings to v2.x for this and any other option(s) that we want to remove in v3.0 & then just remove them in v3.0?
I dunno. This feels a lot like obeying the letter of the law ("we warned you in 2.11 (?) that this would disappear in 3.0") while violating the spirit by jamming it in at chronologically the last second and effectively saying "you don't HAVE to get all the new features in 3.0, so you can take as long as you want to convert, and suffer in the meantime...". The extra logic/code for the backward compatibility is pretty minor and cleanly separated, so I don't view it as a problem to leave it for now (marked as deprecated) and then tear it out in 3.future or even 4.0. Especially if it means we can avoid yet another 2.x update right on the heels of all the ones we already published. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🦅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the update to the command line options, please remember to update the --help
output in the README.md
!
9abcd64
to
c7e0718
Compare
01b3cca
to
61497ac
Compare
61497ac
to
b8f3a5a
Compare
Fixes #265 Provide more comprehensible alternative for tuning entropy checking. This is applied consistently across all target character sets, and stated in a way that is slightly easier to understand ("higher means more likely to flag a given string"). The older `--b64-entropy-score` and `--hex-entropy-score` options are marked as deprecated but retained for backwards compatibility (and they override `--sensitivity` if used together with it).
Use `--entropy-sensitivity` instead of `--sensitivity`
...in response to review comments
...and use them instead of private members
Do everything from scratch instead of storing explicitly. This is a PITA because you can't combine `@property` and `@lru_cache()` and skipping the caching would be a killer.
* Consolidate common code for entropy limit back into a single method, and rework properties related to it so they are cleaner. * Invert sensitivity scale; adjust math and doc to match. It's still weird but aligns more closely to the underlying entropy metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
To help us get this pull request reviewed and merged quickly, please be sure to include the following items:
PR Type
What kind of change does this PR introduce?
Backward Compatibility
Is this change backward compatible with the most recently released version? Does it introduce changes which might change the user experience in any way? Does it alter the API in any way?
Issue Linking
fixes #265
What's new?
This PR introduces a new
--entropy-sensitivity
option that replaces--b64-entropy-score
and--hex-entropy-score
. The older options are marked "deprecated" and generate warnings when used.The default sensitivity is exactly the same as it was before, but it is presented differently and applied consistently to both hexadecimal and base64 strings. "Sensitivity" is expressed on a scale of 0 (totally random) to 100 (totally nonrandom) -- the default is 25 -- and intuitively higher sensitivity means more borderline-random strings will be reported as suspicious.
For purposes of minimizing this PR's footprint, the default is applied both in
cli.py
(real world) and inscanner.py
(because existing unit tests do not specify most options).