Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unstable feature 'split_rinclusive', adding a right-inclusive version of str::split_inclusive #90388

Closed
wants to merge 14 commits into from

Conversation

logannc
Copy link

@logannc logannc commented Oct 28, 2021

Apologies, no issue associated with this one. If that is necessary for a change this small, please let me know and I'll create an issue or pFCP as needed.

Motivation

str::split_inclusive allows for easily splitting a string and including the separator in the substring on the "left".

This change proposes str::split_rinclusive which allows to easily split a string but include the separator on the "right".

This does not change result order like rsplit, so it is not, for example, a hypothetical rsplit_inclusive. Actually, they might be somewhat inverses of each other, but telling people to use rsplit_inclusive(...).rev() is less ergonomic and harder to reason about. Either way, neither currently exists.

Implementation

For the most part, just copying the existing SplitInternal impl and changing the matcher indices used.

We also introduce another param: allow_leading_empty, an opposite to the existing allow_trailing_empty. This is to mirror the functionality of split_inclusive when the final character matches (we do not return a trailing empty substring). Likewise, for this implementation, we do not return a leading empty string if the first character matches.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @m-ou-se (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 28, 2021
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@logannc
Copy link
Author

logannc commented Oct 28, 2021

Sorry, didn't meant to use the CI as a testing platform. Thought this little afternoon PR was small enough I could skate by without figuring out x.py on Windows. If this one fails, I'll get things set up properly.

@rust-log-analyzer

This comment has been minimized.

@logannc
Copy link
Author

logannc commented Oct 28, 2021

Hm, not sure what it means when the test targets time out.

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
configure: rust.debug-assertions := True
configure: rust.overflow-checks := True
configure: llvm.assertions      := True
configure: dist.missing-tools   := True
configure: build.configure-args := ['--enable-sccache', '--disable-manage-submodu ...
configure: writing `config.toml` in current directory
configure: 
configure: run `python /checkout/x.py --help`
configure: 
---
skip untracked path cpu-usage.csv during rustfmt invocations
skip untracked path src/doc/book/ during rustfmt invocations
skip untracked path src/doc/rust-by-example/ during rustfmt invocations
skip untracked path src/llvm-project/ during rustfmt invocations
Diff in /checkout/library/alloc/tests/str.rs at line 1325:
     assert_eq!(split, ["SheeP", "SharK", "TurtlE", "CaT"]);
 
-
 #[test]
 fn test_split_char_iterator_rinclusive() {
 fn test_split_char_iterator_rinclusive() {
     let data = "\nMäry häd ä little lämb\nLittle lämb\n";
Diff in /checkout/library/alloc/tests/str.rs at line 1334:
     assert_eq!(split, ["\nMäry häd ä little lämb", "\nLittle lämb", "\n"]);
 
     let uppercase_separated = "SheepSharkTurtleCat";
-    let split: Vec<&str> = uppercase_separated
-        .split_rinclusive(char::is_uppercase)
-        .collect();
+    let split: Vec<&str> = uppercase_separated.split_rinclusive(char::is_uppercase).collect();
     assert_eq!(split, ["Sheep", "Shark", "Turtle", "Cat"]);
 
Diff in /checkout/library/alloc/tests/str.rs at line 1372:
Diff in /checkout/library/alloc/tests/str.rs at line 1372:
     assert_eq!(split, ["\n", "\nLittle lämb", "\nMäry häd ä little lämb"]);
 
     let uppercase_separated = "SheepSharkTurtleCat";
-    let split: Vec<&str> = uppercase_separated
-        .split_rinclusive(char::is_uppercase)
-        .rev()
-        .collect();
+    let split: Vec<&str> = uppercase_separated.split_rinclusive(char::is_uppercase).rev().collect();
     assert_eq!(split, ["Cat", "Turtle", "Shark", "Sheep"]);
 
 
Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/rustfmt" "--config-path" "/checkout" "--edition" "2018" "--unstable-features" "--skip-children" "--check" "/checkout/library/core/benches/num/dec2flt/mod.rs" "/checkout/library/alloc/tests/str.rs" "/checkout/library/alloc/tests/arc.rs" "/checkout/library/alloc/tests/cow_str.rs" "/checkout/library/alloc/tests/fmt.rs" "/checkout/library/alloc/tests/boxed.rs" "/checkout/library/alloc/tests/string.rs" "/checkout/library/core/benches/num/flt2dec/strategy/dragon.rs"` failed.

@joshtriplett
Copy link
Member

Can you give some examples of use cases for this method? (I can imagine some myself, but I'd like to know what you had in mind.)

@logannc
Copy link
Author

logannc commented Nov 1, 2021

To be perfectly honest, I'm having trouble thinking of examples besides something like CamelCaseSplitting. I'm sure there are some formats which use sentinel values to indicate the start of a new section, but I'm not widely versed enough to name any.

An argument could be made for symmetry between split_inclusive and split_rinclusive, but more examples would be better.

A sillier but still valid argument would be that I think there should be an easy way to split on caps without using Regex! Even if we include the iterator methods, its hard to implement. We can do it with fold or scan, but its more effort than I'd expect.

Of course, we could also just implement such a thing imperatively, so it's not like we're stuck without it. :)

Alternatively, I could see omitting this in favor of a dedicated function for it - it isn't an uncommon operation.

But maybe this is just my own itch.

@joshtriplett
Copy link
Member

CamelCase splitting and similar "records start with something that looks like this" formats seem like a reasonable use case. My main concern is the number of potential variations here: split_rinclusive, split_inclusive, rsplit_inclusive, rsplit_rinclusive...

@apiraino apiraino added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Nov 11, 2021
@bors
Copy link
Contributor

bors commented Nov 18, 2021

☔ The latest upstream changes (presumably #91019) made this pull request unmergeable. Please resolve the merge conflicts.

@dtolnay dtolnay removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Dec 10, 2021
@dtolnay dtolnay added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Dec 10, 2021
@JohnCSimon
Copy link
Member

Ping from triage:
@logannc Can you please address the build failures?

@JohnCSimon
Copy link
Member

ping from triage:
@logannc
Returning to you to address comments from reviewer

FYI: when a PR is ready for review, post a message containing
@rustbot ready to switch the PR to S-waiting-on-review so the PR appears in the reviewer's backlog.
Thanks

@JohnCSimon
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 6, 2022
@JohnCSimon
Copy link
Member

@logannc
Ping from triage: I'm closing this due to inactivity, Please reopen when you are ready to continue with this.
Note: if you do please open the PR BEFORE you force push to it, else you won't be able to reopen.
Thanks for your contribution.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this Apr 23, 2022
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Apr 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants