rust: CStr overhaul #273

nbdd0121 · 2021-05-17T18:51:13Z

CStr is overhauled to make using it more similar to use a str.

This is split from #258, without procedural macros and CBoundedStr and does not depend on build_assert!.

Some additional improvements:

Added a BStr which is just [u8] but provide an extra semantical annotation;
Make CStr deref toBStr so many functions could be used;
Add AsRef<BStr> for CStr, combining with deref this allows CStr to be used when &BStr is exepcted;
Allow CStr to be sliced like a [u8];
Change c_string_bytes to CStr::from_char_ptr

r? @ojeda @wedsonaf

wedsonaf

I like this way more than the other one. Thanks!

wedsonaf · 2021-05-18T11:07:48Z

rust/kernel/str.rs

+
+    #[inline]
+    fn index(&self, index: ops::RangeFrom<usize>) -> &Self::Output {
+        assert!(


Would it make sense to return an empty string (pointer to null terminator) if index is greater than or equal to the length? This way we avoid an additional runtime panic while still returning a value that makes sense.

Maybe in release mode that could be done, but in debug mode I think a failing assertion is the best option. It will avoid a potentially confusing result and if the start of the range is after the last byte of the string, that is very likely a bug.

By the way the index operation would panic anyway if the range starts 2 or more bytes after the null terminator. This panic only handles the additional case where the range starts exactly one byte after the null terminator. So for example this assertion catches CStr::from_bytes_with_nul(b"\0")[1..], but CStr::from_bytes_with_nul(b"\0")[2..] would panic even without this assertion.

The current design is that CStr behaves just like a BStr (with the NUL hidden when dereferencing to BStr).

let c = CStr::from_bytes_with_nul(b"\0"); let b = &*c; b[1]; // This will panic b[1..]; // This will panic c[1]; // This will panic c[1..]; // so this also should panic

Bounds check violation is one of the cases where I think panic is preferred to fixing the issue silently (as doing so in C would be UB).

rust/kernel/str.rs

soruh · 2021-05-20T01:26:52Z

rust/kernel/str.rs

+                return Err(CStrConvertError::InteriorNul);
+            }
+            i += 1;
+        }


This could be

if bytes[..bytes.len() - 1].iter().any(|x| x == 0) { return Err(CStrConvertError::InteriorNul); }

which I would ensure that there are no excessive bounds checks for bytes[i].
(The compiler can probably elide these here but I would generally rather not rely on it.)

Alternatively it could also be

for x in &bytes[..bytes.len() - 1] { if x == 0 { return Err(CStrConvertError::InteriorNul); } }

Slicing could not be used in const fn.

Originally this is written as bytes[..bytes.len() - 1].contains(&0), but following suggestion from @wedsonaf I changed it to a while loop as it currently is (for loops also couldn't be used in const fn).

Ah, I missed that this has to be const, sorry.
I guess we'll just have to hope/check that the bounds checks are properly elided. (Or provide separate const/runtime versions but that doesn't seem very nice)

Originally this is written as bytes[..bytes.len() - 1].contains(&0)

Yeah that would have been even better, it's quite unfortunate that const support in rust is still so immature.

I looked at the assembly and it seems that LLVM couldn't infer that i < len - 1 implies i < len. This boils down to the fact that len - 1 is not UB even when len == 0, so there isn't sufficient info to optimize out the boundary check. It's unfortunate that LLVM couldn't take advantage of the knowledge that len > 0.

However somehow i + 1 < len would allow LLVM to optimize away the bound check. Perhaps it's more common to have comparison against len so LLVM is better at doing it.

That is indeed weird. I guess if i + 1 < len works we can just go with that and add a comment why it is written this way.

Alternatively we could of course also use get_unchecked to ensure there are no bounds checks but I don't know if that is actually worth the extra unsafetly here since it is technically possible without it.

We could also maybe use a NonZeroUsize for the len which I would hope LLVM can optimize properly.
We could probably even make this completely safe without losing perf by replacing the initial is_empty() check with a NonZeroUsize::new(bytes.len()) call.

I don't think get_unchecked is worth it, and it is not const fn either. TBH I don't think this function'll be used in hot path; usually CStr is constructed either statically or from char* with from_char_ptr. So even bounds checks are left, it wouldn't be a big deal. It's too premature to optimize it. I would prefer simplicity to a few unnecessary bounds checks.

NonZeroUsize is usually for memory optimisations. LLVM already knows that length is not 0 after the is_empty check, it just simply couldn't use it to infer i < len, so replacing it with NonZeroUsize wouldn't help either.

I agree that this is probably premature, I guess my usual use cases for CStr in regular Rust code may have made me overestimate the importance of this quite a bit.

nbdd0121 · 2021-05-25T18:46:08Z

Can we get this PR merged? I don't want to rebase each time another PR get merged 😂. I also had a few other PRs that are based on this one.

wedsonaf · 2021-05-25T19:22:51Z

Can we get this PR merged? I don't want to rebase each time another PR get merged 😂. I also had a few other PRs that are based on this one.

Taking one last look now.

ojeda

Couple nits.

rust/kernel/str.rs

wedsonaf

LGTM after Miguel's comments are addressed.

wedsonaf · 2021-05-25T19:44:01Z

rust/kernel/str.rs

+    pub const fn from_bytes_with_nul_unwrap(bytes: &[u8]) -> &Self {
+        match Self::from_bytes_with_nul(bytes) {
+            Ok(v) => v,
+            Err(_) => panic!("string contains interior NUL"),


Does this turn into a compile-time error?

As the comment above the function said, this function is only called by c_str macro, which uses it to initialize a const, so this will only be trigger at compile-time. Once const_panic is stablised we can just do a match at c_str! and remove this function completely.

`CStr` is overhauled to make using it more similar to use a `str`. Signed-off-by: Gary Guo <[email protected]>

Suggested-by: Wedson Almeida Filho <[email protected]> Signed-off-by: Gary Guo <[email protected]>

Signed-off-by: Gary Guo <[email protected]>

ksquirrel · 2021-05-26T01:46:58Z

Review of 3a859fde34d5:

✔️ Commit faa3cbc: Looks fine!
✔️ Commit 27f0e08: Looks fine!
✔️ Commit 3a859fd: Looks fine!

TheSven73 · 2021-05-26T11:54:48Z

Hey, this PR breaks an existing file drivers/char/hw_random/bcm2835_rng_rust.rs, that's not nice :(

Should I add that file to the CI defconfig so we can catch these things in the future?

Fix build error from Rust-for-Linux#273 ("rust: CStr overhaul"). Signed-off-by: Sven Van Asbroeck <[email protected]>

ojeda · 2021-05-26T17:55:48Z

Hey, this PR breaks an existing file drivers/char/hw_random/bcm2835_rng_rust.rs, that's not nice :(

Should I add that file to the CI defconfig so we can catch these things in the future?

Definitely!

Fix build error from Rust-for-Linux#273 ("rust: CStr overhaul"). Signed-off-by: Sven Van Asbroeck <[email protected]>

The function meson_clk_pll_enable() can be invoked under the enable_lock spinlock from the clk core logic, which risks a kernel panic during the usleep_range() call: BUG: scheduling while atomic: kworker/u4:2/36/0x00000002 Modules linked in: g_ffs usb_f_fs libcomposite CPU: 1 PID: 36 Comm: kworker/u4:2 Not tainted 6.4.0-rc5 #273 Workqueue: events_unbound async_run_entry_fn Call trace: dump_backtrace+0x9c/0x128 show_stack+0x20/0x38 dump_stack_lvl+0x48/0x60 dump_stack+0x18/0x28 __schedule_bug+0x58/0x78 __schedule+0x828/0xa88 schedule+0x64/0xd8 schedule_hrtimeout_range_clock+0xd0/0x208 schedule_hrtimeout_range+0x1c/0x30 usleep_range_state+0x6c/0xa8 meson_clk_pll_enable+0x1f4/0x310 clk_core_enable+0x78/0x200 clk_core_enable+0x58/0x200 clk_core_enable+0x58/0x200 clk_core_enable+0x58/0x200 clk_enable+0x34/0x60 So it is required to use the udelay() function instead of usleep_range() for the atomic context safety. Fixes: b6ec400 ("clk: meson: introduce new pll power-on sequence for A1 SoC family") Reported-by: Jan Dakinevich <[email protected]> Signed-off-by: Dmitry Rokosov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jerome Brunet <[email protected]>

nbdd0121 force-pushed the cstr_v2 branch 2 times, most recently from 029c83a to e27e37b Compare May 17, 2021 19:42

wedsonaf reviewed May 18, 2021

View reviewed changes

bjorn3 reviewed May 18, 2021

View reviewed changes

rust/kernel/str.rs Show resolved Hide resolved

bjorn3 reviewed May 18, 2021

View reviewed changes

rust/kernel/str.rs Outdated Show resolved Hide resolved

bjorn3 reviewed May 18, 2021

View reviewed changes

rust/kernel/str.rs Outdated Show resolved Hide resolved

nbdd0121 force-pushed the cstr_v2 branch from e27e37b to 9485c3a Compare May 18, 2021 19:29

bjorn3 reviewed May 18, 2021

View reviewed changes

rust/kernel/str.rs Show resolved Hide resolved

bjorn3 reviewed May 18, 2021

View reviewed changes

rust/kernel/str.rs Show resolved Hide resolved