-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better readInt
take two
#31
Conversation
9ced9d4
to
8901586
Compare
@vdukhovni Would you mind pinging me less often about things related to |
Sure (sorry about a final notification this will cause). The reason I thought this is relevant is that there's an open discussion re |
8901586
to
da70bbd
Compare
-- on a never-ending stream of whitespace from an untrusted source. | ||
-- For unconditional dropping of all leading whitespace, use `dropWhile` | ||
-- with a suitable predicate. | ||
skipSomeWS :: Monad m => ByteString m r -> ByteString m r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unfamiliar with streaming-bytestring
, but do you need to export this function? Is it expected to be generally useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that skipping whitespace is generally useful, and limiting the quantity skipped makes it an interesting new element in the toolkit. Perhaps @fosskers can comment on whether skipSomeWS
makes sense to export or not. It is needed for bulk-reading a stream of whitespace-separated integer data, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that detecting whitespace is tricky to get right (see w8IsSpace
), I think it's good for us to export skipSomeWS
.
-- on a never-ending stream of whitespace from an untrusted source. | ||
-- For unconditional dropping of all leading whitespace, use `dropWhile` | ||
-- with a suitable predicate. | ||
skipSomeWS :: Monad m => ByteString m r -> ByteString m r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that detecting whitespace is tricky to get right (see w8IsSpace
), I think it's good for us to export skipSomeWS
.
This version outperforms the original by ~4-5x, and implements correct overflow checks for the full range [minBound..maxBound] of `Int` values. A "reasonable" number of leading zeros is supported after an optional explicit sign, but once ~32k (defaultChunkSize) of leading zeros have been seen, no further chunks will be processed and the read fails (returns `Nothing` as the `Maybe Int` payload). It also adds a `skipSomeWS` function, which skips a "reasonable" quantity of leading whitespace, but stops at the first chunk boundary, having consumed at least ~32k (defaultChunkSize) whitespace bytes.
c688bc2
to
87fe41a
Compare
Many thanks! Just pushed a squashed version. |
It is the same as what you reviewed, just squashed into one commit. So I think this is ready to be merged... |
Thanks! |
Prepping for release soon, once I get mine merged in too. |
This version still outperforms the original by ~4-5x, and now implements
correct overflow checks for the full range of
Int
values.A "reasonable" number of leading zeros is supported after an optional
explicit sign, but once ~32k of leading zeros have been seen, no further
chunks will be processed and the read fails (returns
Nothing
as theMaybe Int
payload).It also adds
skipSomeWS
which skips a "reasonable" quantity ofleading whitespace before giving up if more than ~32k of whitespace
is encountered.
The new
w8IsSpace
function is somewhat out of place here, but itis more efficient than any stock versions I know of in any other
module (barring writing some FFI code for this). It could be
kept internal, with folks who want to roll their own skipping of
leading whitespace also providing their own implementation of the
predicate...
[ EDIT: got rid of the
readIntSkip
function, turns out just providingskipSomeWS
separately actually works just as well, with no need to provide the composition. ]