But doesn't this invite a bypass via 11 characters?
@dune73 try it :-)
and here is the bypass
I wasn't able to bypass it.
removed unnecessary +
@dune73 I think the [^\w\s]+ in combination with \W*? couldn't work because you have a lazy and a greedy quantifier, both matching non-word characters, right next to each other...
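To illustrate the point about adjacent quantifiers, here is a minimal sketch in Python. The two patterns are hypothetical, simplified fragments written for this example (capture groups and a trailing literal "a" added for visibility); they are not the rule's actual regex, and Python's re module is assumed to behave like the rule's engine for this simple case. Since every character matched by [^\w\s] is also matched by \W, the same run of punctuation can be split between the two quantifiers in several ways, and greediness decides which split is tried first.

```python
import re

# Hypothetical, simplified fragments for illustration only;
# NOT the full regex of the rule under discussion.
GREEDY_THEN_LAZY = re.compile(r"([^\w\s]+)(\W*?)a")
LAZY_THEN_GREEDY = re.compile(r"([^\w\s]+?)(\W*)a")


def split_run(pattern, text):
    """Return how the two adjacent quantifiers divided the input, or None."""
    match = pattern.match(text)
    return match.groups() if match else None


sample = "!!!!a"
print(split_run(GREEDY_THEN_LAZY, sample))  # ('!!!!', '')
print(split_run(LAZY_THEN_GREEDY, sample))  # ('!', '!!!')
```

Both patterns accept the same input; only the split differs. When the overall match fails, the engine has to try every such split before giving up, which is where the extra steps come from.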
I'm not deep enough into regexes to really understand this. @fgs: Could you chime in, please?
Ooops. I meant @fgsch.
This looks good, but I believe we can go even further and remove the Finally, the comment is stale and should be updated. We should also drop the separate regexp file.
I will benchmark both on Monday. Thank you, @fgsch.
@fgsch test files: 942490.test_fgsch
I was not able to get a performance win with your suggestion.
The payloads.txt file was 50 MB; with a 500 MB file I saw a small difference of 0.008 sec.
The performance difference might be negligible when measured using time, but the regexp takes 2 fewer steps. As for the regexp, you are right, sorry.
@fgsch here is a comparison with a 5.4 GB file; here is a POST payload file of 50 MB.
The rule is 7 or 8 times faster here.
Using grep to measure the speed is not the right approach. Remove the non-capturing group and I have no objections.
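The thread does not say which measurement method is preferred. One common alternative, sketched below, is to time the compiled pattern in-process so that process startup and file reading are not part of the measurement. Everything here is an assumption made for illustration: the two pattern fragments, the payload, and the helper name best_time are hypothetical, and Python's re module is not the engine the rule runs on, so absolute numbers will not transfer.

```python
import re
import timeit

# Hypothetical, simplified fragments; NOT the actual rule regex.
PATTERN_A = re.compile(r"[^\w\s]+\W*?\d")
PATTERN_B = re.compile(r"[^\w\s]{1,10}\W*?\d")


def best_time(pattern, payload, repeat=3, number=100):
    """Return the best observed time per search, in seconds."""
    timer = timeit.Timer(lambda: pattern.search(payload))
    # Take the minimum of several runs to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Illustrative payload with no digit, so every search has to fail.
payload = "name=" + "!" * 40 + "&id=x"

for label, pattern in (("unbounded +", PATTERN_A), ("{1,10}", PATTERN_B)):
    print(label, best_time(pattern, payload))
```

Taking the minimum over repeated runs is a design choice: it estimates the cost of the match itself rather than the average load on the machine.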
@fgsch
limit substitution [^\w\s] from + to {1,10}
I tested it against 5276 matches and the results are exactly the same.
Even {1,2} produced the same results.
I think {1,10} is sufficient.
according to #1359
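A minimal sketch of why the bounded and unbounded quantifiers can give identical verdicts, assuming (as the earlier comments suggest) that [^\w\s] is followed by \W*? in the rule. The fragments and sample payloads below are hypothetical and much simpler than the real rule; the trailing \d is only a stand-in for whatever comes next in the regex. The idea is that any punctuation beyond the bound is still consumed by the following \W*?, so a run of 11 characters does not escape the match.

```python
import re

# Hypothetical, simplified fragments; NOT the actual rule regex.
UNBOUNDED = re.compile(r"[^\w\s]+\W*?\d")
BOUNDED_10 = re.compile(r"[^\w\s]{1,10}\W*?\d")
BOUNDED_2 = re.compile(r"[^\w\s]{1,2}\W*?\d")


def verdicts(payload):
    """Return (unbounded, {1,10}, {1,2}) match results for one payload."""
    patterns = (UNBOUNDED, BOUNDED_10, BOUNDED_2)
    return tuple(bool(p.search(payload)) for p in patterns)


# Illustrative payloads, including a run longer than the {1,10} bound.
samples = [
    "'" + "!" * 11 + "1",  # 12 punctuation characters, then a digit
    "!1",
    "!" * 50 + " 7",
    "abc",
    "a 1",
]

for payload in samples:
    print(repr(payload), verdicts(payload))
```

On these samples all three variants agree, which is consistent with the result reported above for {1,10} and {1,2}; it is not a substitute for testing against the real rule and real traffic.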