Implement #scan_integer to efficiently parse Integer #115

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

byroot
Member

@byroot byroot commented Nov 25, 2024

Fix: #113

This makes it possible to parse an Integer directly from a String without first allocating a substring.

Notes:

The implementation is limited by design. It is meant as a first step, so only the most straightforward case, base 10 integers, is supported.

Reopening #114 for logistical reasons.
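
To illustrate the motivation in the description above, here is a minimal sketch comparing the substring-based approach with a direct call. The input string and variable names are made up for illustration, and the `respond_to?` guard is an assumption added so the snippet also runs on strscan releases that predate `scan_integer`.

```ruby
require "strscan"

scanner = StringScanner.new("42 apples")

# Substring-based approach: `scan` allocates the String "42",
# which `to_i` then parses into an Integer.
digits = scanner.scan(/\d+/)
via_substring = digits.to_i # => 42

scanner.reset

# Direct approach: ask the scanner for the Integer itself.
# Guarded because older strscan versions do not define the method.
direct =
  if scanner.respond_to?(:scan_integer)
    scanner.scan_integer
  else
    scanner.scan(/\d+/)&.to_i
  end
```

In both cases the scan pointer ends up just past the digits, so subsequent scans continue from the space before `apples`.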

Contributor

@headius headius left a comment

Code looks great, thanks for adding it! Only two minor "tips" in my review.

@byroot byroot requested a review from kou November 26, 2024 07:37
@kou kou merged commit 6a3c74b into ruby:master Nov 26, 2024
37 checks passed
@kou
Member

kou commented Nov 26, 2024

Thanks.

@byroot
Member Author

byroot commented Nov 26, 2024

Thank you for the merge.

Let me know what you would like as next steps. I think we more or less identified the need for a `base:` argument, and perhaps a `pattern:` one?

I suppose they are kind of tied together, e.g. to parse things like `0x123DEF`, you might call `scan_integer(pattern: /0x[0-9a-f]+/, base: 16)`?

@byroot byroot deleted the scan-integer branch November 26, 2024 08:24
@kou
Member

kou commented Nov 26, 2024

How about `scan_integer(base: 16)` that accepts `/0x[0-9a-fA-F]+/` as the next step?
What patterns should we accept by default for `base: 16`?
Do we have use cases for other patterns for now?

@byroot
Member Author

byroot commented Nov 26, 2024

> What patterns should we accept by default for `base: 16`?

I would say `(0x)?[0-9a-fA-F]+`?

> Do we have use cases for other patterns for now?

Personally I don't.

One question though, if we accept `base: 16`, should we error on anything other than `base: 10` and `base: 16`?

@kou
Member

kou commented Nov 26, 2024

> > What patterns should we accept by default for `base: 16`?
>
> I would say `(0x)?[0-9a-fA-F]+`?

OK. Let's use it.

> One question though, if we accept `base: 16`, should we error on anything other than `base: 10` and `base: 16`?

Yes. We may add `base: 8` or something later, but we should raise an error for now.
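
The behaviour agreed on in this exchange can be sketched in plain Ruby. This is not the strscan implementation: `scan_integer_sketch` is a hypothetical helper, the base 10 pattern `/\d+/` and the error message are assumptions, and because it goes through `scan` it still allocates the substring that the real method is meant to avoid. Only the base 16 pattern and the "raise for other bases" rule come from the discussion.

```ruby
require "strscan"

# Hypothetical helper emulating the agreed semantics on top of `scan`.
# Returns an Integer, or nil when nothing matches at the scan pointer.
def scan_integer_sketch(scanner, base: 10)
  case base
  when 10
    # Assumed pattern for the base 10 case.
    scanner.scan(/\d+/)&.to_i
  when 16
    # Pattern proposed in the discussion: optional 0x prefix, hex digits.
    scanner.scan(/(0x)?[0-9a-fA-F]+/)&.hex
  else
    # Any other base is rejected for now, as agreed above.
    raise ArgumentError, "unsupported base: #{base.inspect}"
  end
end

scanner = StringScanner.new("0x123DEF;")
scan_integer_sketch(scanner, base: 16) # => 1195503 (0x123DEF)
```

Calling the helper with `base: 8` raises `ArgumentError`, which leaves room to add more bases later without changing the behaviour of existing calls.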

byroot added a commit to byroot/strscan that referenced this pull request Nov 26, 2024
Followup: ruby#115

`scan_integer` is now implemented in Ruby so that it can handle keyword
arguments efficiently, without allocating a Hash. Given that the goal of `scan_integer`
is to parse integers more efficiently, without having to allocate an intermediary
object, using `rb_scan_args` would defeat the purpose.

Additionally, the C implementation now uses `rb_isdigit` and `rb_isxdigit`,
because on Windows `isdigit` is locale dependent.
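
The two ideas in this commit message, parsing without an intermediary String and testing digits independently of the locale, can be illustrated in Ruby. This is only a sketch of the technique, not the C code from the commit: `ascii_digit?` and `parse_decimal_at` are hypothetical names, and the helper returns an Array for convenience, which the real implementation has no need for.

```ruby
# ASCII-only digit test: compares the raw byte against '0'..'9',
# so the result never depends on the process locale.
def ascii_digit?(byte)
  byte >= 0x30 && byte <= 0x39
end

# Hypothetical helper: accumulates a base 10 Integer byte by byte,
# starting at byte offset `pos`, without slicing a substring.
# Returns [value, next_pos], or nil when there is no digit at `pos`.
def parse_decimal_at(string, pos = 0)
  value = 0
  cursor = pos
  while (byte = string.getbyte(cursor)) && ascii_digit?(byte)
    value = value * 10 + (byte - 0x30)
    cursor += 1
  end
  cursor == pos ? nil : [value, cursor]
end

parse_decimal_at("id=1234;", 3) # => [1234, 7]
```

Because the check works on bytes in the ASCII range only, non-ASCII digits such as `٣` are not treated as digits, whatever the platform's locale settings are.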
matzbot pushed a commit to ruby/ruby that referenced this pull request Nov 27, 2024
(ruby/strscan#115)

Fix: ruby/strscan#113

ruby/strscan@6a3c74b4c8
byroot added a commit to byroot/strscan that referenced this pull request Nov 27, 2024
Followup: ruby#115
kou pushed a commit that referenced this pull request Nov 27, 2024
Followup: #115

`scan_integer` is now implemented in Ruby so that it can handle
keyword arguments efficiently, without allocating a Hash. Given that the
goal of `scan_integer` is to parse integers more efficiently, without
having to allocate an intermediary object, using `rb_scan_args` would
defeat the purpose.

Additionally, the C implementation now uses `rb_isdigit` and
`rb_isxdigit`, because on Windows `isdigit` is locale dependent.
hsbt pushed a commit to hsbt/ruby that referenced this pull request Dec 2, 2024
Followup: ruby/strscan#115
hsbt pushed a commit to ruby/ruby that referenced this pull request Dec 2, 2024
Followup: ruby/strscan#115
Successfully merging this pull request may close these issues.

Provide an efficient way to parse Integers (and Floats)?