-
Notifications
You must be signed in to change notification settings - Fork 438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: rust: chrdev: fix use-after-free on module unload #207
Conversation
4e1ecf0
to
bc76bef
Compare
This is probably an incomplete solution (hence the RFC) because:
https://github.com/Rust-for-Linux/linux/blob/rust/fs/char_dev.c#L583 What happens when the module is unloaded while someone in userspace has the |
Thanks! That also seems to work in practice:
So all good on that front. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tracking down the issue!
rust/kernel/chrdev.rs
Outdated
// SAFETY: Calling unsafe functions and manipulating `MaybeUninit` | ||
// pointer. | ||
unsafe { | ||
let cdev = bindings::cdev_alloc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lately we've been trying to keep the scope of unsafe
as tight as possible, with proper justifications for each unsafe
operation. Would you mind moving the above out to its own unsafe
block?
Something like:
// SAFETY: This only allocates memory.
let cdev = unsafe { bindings::cdev_alloc() };
if cdev.is_null() {
return Err(Error::ENOMEM);
}
rust/kernel/chrdev.rs
Outdated
@@ -121,8 +125,10 @@ impl<const N: usize> Registration<{ N }> { | |||
(*cdev).owner = this.this_module.0; | |||
let rc = bindings::cdev_add(cdev, inner.dev + inner.used as bindings::dev_t, 1); | |||
if rc != 0 { | |||
bindings::cdev_del(cdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to think that we need a wrapper for a cdev
, such that clean up is automatic. Without it, it is very much possible for us to add an extra error path and forget to clean up; this is the sort of thing we're trying to be better at than C.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the exact same thing while I was typing up the cdev_del
in the error path. That didn't feel very oxidated!
A wrapper/abstraction would be cool. The more kobject
instances we need to juggle around in Rust, the more useful such an abstraction would be. I guess its complexity has to balance its benefit?
rust/kernel/chrdev.rs
Outdated
// SAFETY: Calling unsafe functions and manipulating `MaybeUninit` | ||
// pointer. | ||
unsafe { | ||
let cdev = bindings::cdev_alloc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem with this approach is that it doesn't allow us to pass any context to open
-- when we control cdev
, we can embed it in a outer struct and use container_of
to get per-device shared state. How can we do this sort of thing after this change? (Example here.)
More generally, how are drivers supposed to use cdev_init
safely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you can retrieve the cdev
from the struct inode
passed into the fops's open
call. E.g.:
linux/drivers/char/virtio_console.c
Line 1028 in 30d13c6
static int port_fops_open(struct inode *inode, struct file *filp) |
static int port_fops_open(struct inode *inode, struct file *filp)
{
struct cdev *cdev = inode->i_cdev;
....
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but if it is allocated via cdev_alloc
, we can't use the traditional approach of defining a struct that contains a cdev
, then using container_of
to go from the cdev
in inode->i_cdev
to the outer struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know how do drivers that do use cdev_init
ensure that the memory is freed only when the refcount on the kobject goes to zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's something I've wanted to bring up as well. I've been testing the simplest module I could find which uses a cdev
-(drivers/char/raw.ko
). It appears to (mis)treat the cdev
in the same way as the chrdev
module. Interestingly, it doesn't appear to trigger DEBUG_KOBJECT_RELEASE
warnings, like the chrdev
module does! Not the few times I ran it, anyways. So I wonder what's going on there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reading of the
raw
module was that it couldn't possibly be working but I didn't want to poison your analysis :)
Thanks for letting me discover that on my own, it's the best way :)
I agree that raw
is broken, or at least the module unload section of it. But a quick grep shows that many more drivers are in the same boat, e.g.
char/pc8736x_gpio.c
usb/mon/mon_bin.c
hid/hid-roccat.c
hid/hidraw.c
From what I can tell, any driver which uses cdev_init/cdev_del
is at risk of triggering the debug warning, if the cdev
is still alive when the module is unloaded. I guess for most drivers that won't happen so easily, as in practice there tends to be a delay between closing the last handle (which calls cdev_del
) and running rmmod
.
That said, I found out in the past that plenty of drivers have use-after-free issues (and more!) in their module unload path. A common mistake is forgetting to call cancel_work_sync
, you can find it all over. A few years ago, I submitted a bunch of fixes for various drivers to try and address some of these issues.
From what I understand, the kernel community does not view module onload bugs as a problem. That's probably why so many remain, and few people care. See Greg KH's opinion here.
So module unload might be broken wrt DEBUG_KOBJECT_RELEASE
simply because no-one cares enough right now, and the "correct" mechanism does not yet exist. Should we work around that in the Rust code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a thought: using cdev_set_parent
the kernel will make sure that the parent (which holds the cdev's memory) is not freed before the cdev
. So maybe we can make the module itself the cdev
s parent? But will that increment the module's refcnt, and prevent it from getting unloaded?
Edit: looking at kernel/module.c
, it's obviously way more complicated than that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm speaking to Greg about this and indeed this isn't something that they care about. He advises us to not expose these APIs in Rust just yet.
So for now, let's do the workaround that you implemented so that we don't get DEBUG_KOBJECT_RELEASE
warnings and we don't have to disable it either.
We'll find a way to share per-device state later (we already have a TODO for this in convert
below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I will proceed with your review comments then, does that sound ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great, thank you!
70a2bab
to
dad1a84
Compare
rust/kernel/chrdev.rs
Outdated
} | ||
} | ||
|
||
fn new_none_array<T, const N: usize>() -> [Option<T>; N] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function looks pretty bad. I couldn't work out how to initialize an [Option<SafeCdev>; N]
array to None
:
cdevs: [None; N],
Rust insists that SafeCdev
needs the Copy
trait, which is obviously a really bad idea for a refcounted object:
error[E0277]: the trait bound `Option<SafeCdev>: core::marker::Copy` is not satisfied
--> rust/kernel/chrdev.rs:155:24
|
155 | cdevs: [None; N],
| ^^^^^^^^^ the trait `core::marker::Copy` is not implemented for `Option<SafeCdev>`
|
= help: the following implementations were found:
<Option<T> as core::marker::Copy>
= note: the `Copy` trait is required because the repeated element will be copied
Does anyone know of an elegant, safe way to accomplish this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can fill an array with const
values even if they aren't Copy
: https://blog.rust-lang.org/2021/02/11/Rust-1.50.0.html#const-value-repetition-for-arrays. So
const NONE: Option<SafeCdev> = None;
[NONE; N]
should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works. Concise, elegant, and safe - thanks ! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the codegen is better, even, from a quick test in Compiler Explorer.
rust/kernel/chrdev.rs
Outdated
@@ -20,10 +20,66 @@ use crate::error::{Error, KernelResult}; | |||
use crate::file_operations; | |||
use crate::types::CStr; | |||
|
|||
struct SafeCdev(*mut bindings::cdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't think we need this Safe
prefix. How about just Cdev
or CharDev
?
rust/kernel/chrdev.rs
Outdated
fn init(&mut self, fops: *const bindings::file_operations) { | ||
// SAFETY: call an unsafe function | ||
unsafe { | ||
bindings::cdev_init(self.0, fops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we're supposed to call cdev_init
on what's returned by cdev_alloc
. Perhaps we should have the constructor above (alloc
) also take a file_operations
as argument and we just initialise cdev.ops
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right, that pattern is incorrect! Yes, let's fold the cdev.ops
assignment into alloc
.
rust/kernel/chrdev.rs
Outdated
|
||
impl SafeCdev { | ||
fn alloc() -> KernelResult<Self> { | ||
// SAFETY: call an unsafe function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 'm not sure if this is a final version meant for revision (in case it isn't, please ignore this), but we'd like to see a justification for the call being safe. Also, for consistency, please end the sentences in comments with a punctuation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I understand the ins-and-outs of how we're planning to manage unsafe code in the kernel, so bear with me. How could I justify that this code is safe?
rust/kernel/chrdev.rs
Outdated
} | ||
} | ||
|
||
fn add(&mut self, dev: bindings::dev_t, pos: c_types::c_uint) -> KernelResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pos
should be called count
, no?
.github/workflows/ci.yaml
Outdated
@@ -223,6 +223,10 @@ jobs: | |||
| sed s:$'\r'$:: \ | |||
| tee qemu-stdout.log | |||
|
|||
# the kernel should not be generating any warnings | |||
- run: | | |||
grep -L '] WARNING:' qemu-stdout.log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why -L
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-L
inverts grep
's return value. So the workflow will fail if a warning is present in the kernel log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wow -- this has changed 3 years ago. In my version of grep
it didn't, and the manual didn't mention it either.
It turns out the behavior changed in https://git.savannah.gnu.org/cgit/grep.git/commit/?id=92526f7246464825c5547ceb08e01433e035c867
Since this is just for the CI, it is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting ! Is there a better, more universal, etc way to invert the grep
result? No problem switching to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, it was changed back again: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=0435ebca64fbafcd62008c991dd9377d8a792dfc
So only grep
3.2, 3.3 and 3.4 work that way. I think it is best to just do it ourselves even if we never change the image we run the CI on (e.g. we may want to move all this into a script so that the CI can be run somewhere else).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll use (! grep '] WARNING:' qemu-stdout.log)
then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that should do the trick (not sure if the parenthesis are needed -- it seems to work without them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This indicates that parentheses are needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are multiline strings in a "literal style block", so I think !
is not reserved there, but I may be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one way to find out, lol
rust/kernel/chrdev.rs
Outdated
} | ||
|
||
fn set_owner(&mut self, module: &crate::ThisModule) { | ||
// SAFETY: dereference of a raw pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, as Wedson said -- i.e. the SAFETY
comment is not about describing what operation is unsafe (the compiler can tell us that), but about describing why that operation is sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're looking for a description of a proof-of-safety, which runs inside a human brain? :) Could you point me to an instance of this in the existing codebase? I often find it easier to work from an existing example.
No concern that new commits will break existing proofs-of-safety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I inherited the SAFETY
comment practice from Rust's standard library; and most of the code we have already follows the convention.
New commits that fail to update the comment are a bug. That is, any commit updating any unsafe
block must re-check the SAFETY
comment that goes with it. Same for unsafe
functions and their Safety
section. That is why having unsafe
blocks as small as possible and as close as possible to the operation helps avoiding "comment bugs" :)
I know the C side of the kernel does not follow such strict comment guidelines, but I/we believe this leads to better code overall (specially when maintaining it). And yes, many of those comments will be trivial, but the point is not just to document what is non-trivial, but to also know there is nothing non-trivial going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree 100% with the need for rigour here. One of the big justifications for this effort is memory safety issues, so we better get this right, and keep it right. I would like to learn what's required here - I'll see if I can make a rigorous SAFETY argument in the next push.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As to how to write the soundness "proof", a good way is to check the documentation of what you are calling to see why it is unsafe
(or the language reference, for other operations). Then, think why those preconditions hold, and write it down.
So, for instance, here, the comment should explain why self.0
is guaranteed in all cases to be a valid (dereferenceable) pointer. Thus you can ask yourself the question "Is there any way someone can build a SafeCdev
object without .0
initialized properly and call it?"
From a quick look, it seems the only way to build a SafeCdev
object is through alloc()
, which ensures it is a valid pointer, and it never gets modified. So you could write something like this.
However, this is one of those cases where what we have is a type invariant. For these, we started the convention of writing an # Invariants
section in the type's doc-comment, and then referencing that in the SAFETY
comments. Doing it this way makes things much easier to reason about, and we avoid having to repeat the same explanation in several places (e.g. every time we dereference the pointer member).
Examples of type invariants being used are at pages.rs
, file_operations.rs
and sync/arc.rs
:
/// Wraps the kernel's `struct file`.
///
/// # Invariants
///
/// The pointer [`File::ptr`] is non-null and valid.
pub struct File {
ptr: *const bindings::file,
}
And then:
/// Returns the current seek/cursor/pointer position (`struct file::f_pos`).
pub fn pos(&self) -> u64 {
// SAFETY: `File::ptr` is guaranteed to be valid by the type invariants.
unsafe { (*self.ptr).f_pos as u64 }
}
I think I will put this explanation somewhere in coding.rst
, since it will probably come up several times :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry -- I was writing this second comment after submitting the other; I didn't mean to leave you hanging without a proper explanation/example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, don't worry about stuff like that! And thank you so much for the great explanation, this should give a chance of getting it right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're welcome!
Also, please don't forget to add a // INVARIANTS: ...
comment in the place you actually build (or modify) the object, explaining why the invariant starts to hold (or still holds, respectively). See the files I mentioned for examples or that.
rust/kernel/chrdev.rs
Outdated
} | ||
} | ||
|
||
fn new_none_array<T, const N: usize>() -> [Option<T>; N] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the codegen is better, even, from a quick test in Compiler Explorer.
dad1a84
to
8ecbe0c
Compare
rust/kernel/chrdev.rs
Outdated
fops: &'static bindings::file_operations, | ||
module: &'static crate::ThisModule, | ||
) -> KernelResult<Self> { | ||
// SAFETY: `bindings::cdev_alloc` returns a valid pointer or null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value is not what makes this particular block sound (although it is what makes future things sound -- but those should be documented at that point; see the other comments). If there are no additional preconditions to satisfy (but just that the bindings are correct), you can say something like "FFI call".
Sometimes some kernel functions may not be called from certain contexts, so in those cases that "context precondition" needs to be lifted to the outer function's doc-comment # Context
. Since this allocates memory, this may be one of those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell me more about Contexts? Are you saying that every function in Rust needs a # Context
comment which indicates whether or not it can run in atomic context? Does kernel Rust have atomic contexts yet? A grep for GFP_ATOMIC
brings up nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only those that cannot be called from all contexts. This is one of the policies we introduced very recently and we do not really follow it in many places, so bear with us...
Furthermore, we do not require documenting private items like this struct (at least so far, still thinking about that one...), but callers that are public still need to be documented. So e.g. Registration::register()
should have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The long-term idea is to have some analyzer or lint tell us if we are following the rules -- or even the compiler itself by annotating the functions somehow (e.g. something like the "colored unsafe
" I proposed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying, that makes sense! From what I understand, only a relatively small subset of kernel functions can be called from atomic contexts. So it might make sense to add a documentation marker (or code attribute?) only to those functions that are atomic compatible.
I can't think of any reason why one would want to create/register a cdev
from an atomic context. But it's an interesting area to think about, for sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS looks like our messages crossed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, when making things like this so strict, it brings up situations that do not make much sense (like registering, as you say). Similar to the SAFETY
comments, it is more about being consistent, even if it should never be the case that someone attempts to register something in an atomic context etc.
If we manage to get the # Context
comments right, it is my hope that reasoning about contexts is easier compared to the C side, plus we would be in a much better position later on to switch to actual annotations and automated checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the # Context
doc-comment, you can ignore it for the moment -- we still need to see how we will write them and do a pass over the current code (we can discuss it in one of the meetings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! I'll push the updated SAFETY
comments in a few minutes.
rust/kernel/chrdev.rs
Outdated
if cdev.is_null() { | ||
return Err(Error::ENOMEM); | ||
} | ||
// SAFETY: [`Cdev::cdev`] is valid and non-null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we don't have a Cdev
object yet, so the type invariants do not guarantee the pointer is valid.
Instead, consider something like:
// SAFETY: [`Cdev::cdev`] is valid and non-null. | |
// SAFETY: `cdev` should be valid since `cdev_alloc()` returned a non-null pointer. |
rust/kernel/chrdev.rs
Outdated
} | ||
|
||
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> { | ||
// SAFETY: [`Cdev::cdev`] is valid and non-null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// SAFETY: [`Cdev::cdev`] is valid and non-null. | |
// SAFETY: `self.0` is valid and non-null by the type invariants. |
rust/kernel/chrdev.rs
Outdated
return Err(Error::ENOMEM); | ||
} | ||
// SAFETY: [`Cdev::cdev`] is valid and non-null. | ||
// `fops` and `module` have `'static` lifetime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second line does not matter for the soundness of this unsafe
block. It may matter for future usage of this struct, such as other FFI calls etc. So this fact perhaps needs to be part of the type invariants instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that true, though? If fops
or module
get deallocated before the struct cdev
they're part of... very interesting and unsound things could happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is what I meant by "for future usage of this struct, such as other FFI calls". But for this unsafe
block in particular, it is not making it sound or otherwise, i.e. even if both were null the block would still be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True! I'm not sure where the crashes could happen in such a case. If owner
is freed prematurely, the call to cdev_del
in Drop
could crash. But worse, if fops
is freed prematurely, the crash could happen anywhere in the kernel's C code!
I'm not sure how this could/should be documented or analysed. (I'm sure this can be addressed, I just can't see how myself yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion (sorry if it was not clear) is to record this fact in the type # Invariants
section, since it is something that the type guarantees. Then perhaps you can mention the fields are properly filled and will never become invalid thanks to the invariant when calling cdev_add()
.
rust/kernel/chrdev.rs
Outdated
/// | ||
/// # Invariants | ||
/// | ||
/// The pointer [`Cdev::cdev`] is valid and non-null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no Cdev::cdev
since you have a tuple struct. Perhaps change to a named member if you prefer self.cdev
etc.
rust/kernel/chrdev.rs
Outdated
|
||
impl Drop for Cdev { | ||
fn drop(&mut self) { | ||
// SAFETY: [`Cdev::cdev`] is guaranteed valid and non-null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// SAFETY: [`Cdev::cdev`] is guaranteed valid and non-null. | |
// SAFETY: `self.0` is guaranteed valid and non-null by the type invariants. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for the review @ojeda ! I will fix up those SAFETY
comments. My internal safety comment generator isn't properly calibrated yet, bear with me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries at all -- thanks to you for being one of the early ones joining. We all needed to get accustomed, and if this effort is successful, we will have many other kernel developers asking the same things sooner or later. So hopefully these reviews help other readers as well (we can even reuse a few good review chains as documentation for newcomers etc., since they will likely have the same questions etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's awesome being a guinea pig :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, we should avoid linking to the PRs themselves unless everyone involved agrees and instead write better docs based on the review chains found in the PRs :-)
(The reason is that some people may prefer not to be shown in that light, i.e. asking "newbie" questions -- which they are not -- we all asked the same questions at some point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this experience extremely interesting - it shows how much SAFETY
rigour is/will be required to guarantee a safe interface between kernel C and Rust drivers. One spends a few orders of magnitude more effort in the kernel Rust core, in order to get Rust driver safety.
A few potential problems spring to mind, though:
- human fallibility regarding initial
SAFETY
reasoning - even more importantly, kernel C behaviour changing due to refactoring or bug fixes, and existing
SAFETY
reasoning going stale. this could be a serious concern, as the kernel is in constant flux (internally).
On the whole I believe Rust-for-Linux has unbelievable potential (that's why I'm here!). The optics and politics of the above could be an issue. The more technical, automated mitigations we can find, the better.
8ecbe0c
to
14f12d1
Compare
14f12d1
to
5115e9a
Compare
.github/workflows/ci.yaml
Outdated
@@ -223,6 +223,10 @@ jobs: | |||
| sed s:$'\r'$:: \ | |||
| tee qemu-stdout.log | |||
|
|||
# the kernel should not be generating any warnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# the kernel should not be generating any warnings | |
# The kernel should not be generating any warnings |
rust/kernel/chrdev.rs
Outdated
// [`(*self.0).ops`] and [`(*self.0).owner`] are valid and have | ||
// static lifetime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention why, at least for owner
since it may confuse a future reader, i.e.:
- In the case of
ops
it holds because it was coerced from a reference with'static
lifetime. - However, the
owner
case is different: it is because themodule.0
pointer comes from aThisModule
(the fact that the reference to theThisModule
has static lifetime does not imply the pointer insideThisModule
does -- it is the invariants ofThisModule
that ensure that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like owner
has module lifetime, not static lifetime. That means we could potentially cause bad stuff to happen by giving Cdev::alloc
the this_module
of a different module! I wonder if we need to consider that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry I haven't followed this closely, I'll take a closer look tomorrow. For now just answer the question above.)
We need to consider it. The ultimate one should ask to help answer this is: can someone write code without unsafe
blocks that violate memory safety? If the answer is yes, then we need to do something about it; in general we have two possible routes: mark the function unsafe
and document the safety requirements that callers are required to manually satisfy, or (preferred) find a way to prevent abuse of the function (e.g., by not allowing users to specify arbitrary modules) though it's not always possible...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ThisModule
is only to be used only within the current module (the constructor is unsafe to ensure proper usage, but I opened those two issues because it should be improved). In fact, I think nobody should be creating any ThisModule
object in normal circumstances, only using the one provided via the module!
macro (which is static
). We can create a lint to ensure this later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall I put a TODO
in the SAFETY
comments for now re. owner
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should ensure nobody falls into that trap, so I wouldn't put in chrdev.rs
but in ThisModule
. Probably we should create a get()
unsafe method in ThisModule
and explain that you cannot pass it to another module. I will add this point to the issue I created above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that we might be in a serious spot of lifetime trouble here. I'll push the updated SAFETY
s, and place a review comment outlining the problem.
rust/kernel/chrdev.rs
Outdated
/// # Invariants | ||
/// | ||
/// - [`self.0`] is valid and non-null. | ||
/// - [`(*self.0).ops`] is valid and has static lifetime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// - [`(*self.0).ops`] is valid and has static lifetime. | |
/// - [`(*self.0).ops`] is valid, non-null and has static lifetime. |
5115e9a
to
9a00c70
Compare
rust/kernel/chrdev.rs
Outdated
// - [`(*self.0).ops`] is valid, non-null, and has a static lifetime. | ||
// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime. | ||
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null) | ||
// will live at least as long as [`self.0`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong! The whole reason we're doing this, is because our struct cdev
instance might outlive the module - it's a kobject
. So the cdev
ends up still alive, but with an owner
pointer to freed memory.
But the kernel C code probably has the same issue - I'm guessing the cdev
won't touch owner
when all file handles to it are closed. In that case, we are good.
What's your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, in the kernel, module removal is best effort. I don't know what is nowadays supposed to be guaranteed to work or not, we can ask.
cdev_del
states that the ops can still be called even after it returns (that one should be fine, since they are built into the kernel
crate rather than the modules), but does not mention the other fields. We would need to check if they dereference the module
anywhere.
In any case, for the comment, I would mention the detail about ops
, since it is the important one to guarantee to be available forever; and create an issue for owner
s in general (of all kinds of devices etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I would avoid repeating all the invariants. It is more important to explain why those invariants are good enough for the unsafe
block (e.g. that cdev_del()
requires ops
to live forever, and so the 'static
lifetime is OK to satisfy that).
rust/kernel/chrdev.rs
Outdated
// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime. | ||
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null) | ||
// will live at least as long as [`self.0`]. | ||
// dev and count are scalars. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line can be removed, i.e. it does not make it sound or not (nevertheless, if e.g. dev
or count
have a range precondition, then it would need to be here. From a quick look, cdev_add
does not say; although kobj_map
limits n
to 255
, so looks fine).
// dev and count are scalars. |
rust/kernel/chrdev.rs
Outdated
Ok(Self(cdev)) | ||
} | ||
|
||
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> { | |
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult { |
rust/kernel/chrdev.rs
Outdated
// - [`(*self.0).ops`] is valid, non-null, and has a static lifetime. | ||
// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime. | ||
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null) | ||
// will live at least as long as [`self.0`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, in the kernel, module removal is best effort. I don't know what is nowadays supposed to be guaranteed to work or not, we can ask.
cdev_del
states that the ops can still be called even after it returns (that one should be fine, since they are built into the kernel
crate rather than the modules), but does not mention the other fields. We would need to check if they dereference the module
anywhere.
In any case, for the comment, I would mention the detail about ops
, since it is the important one to guarantee to be available forever; and create an issue for owner
s in general (of all kinds of devices etc.).
9a00c70
to
1c83703
Compare
rust/kernel/chrdev.rs
Outdated
@@ -149,12 +197,9 @@ unsafe impl<const N: usize> Sync for Registration<{ N }> {} | |||
impl<const N: usize> Drop for Registration<{ N }> { | |||
fn drop(&mut self) { | |||
if let Some(inner) = self.inner.as_mut() { | |||
// SAFETY: Calling unsafe functions, `0..inner.used` of | |||
// `inner.cdevs` are initialized in `Registration::register`. | |||
// SAFETY: TODO needs explanation why unregister_chrdev_region |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something that says that self.inner
being Some(_)
means that a previous call to alloc_chrdev_region
completed successfully would be fine here.
No need to apologise at all! Your suggestion made perfect sense to everyone at the time, including me. |
1c83703
to
d1276fd
Compare
unsafe { | ||
for i in 0..inner.used { | ||
bindings::cdev_del(inner.cdevs[i].as_mut_ptr()); | ||
} | ||
bindings::unregister_chrdev_region(inner.dev, N.try_into().unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a comment here about why the unwrap()
can never panic? E.g.
// N.try_into() returned Some in [`register`], and N is constant, so it will return
// Some here too, which ensures [`unwrap`] will not panic.
bindings::unregister_chrdev_region(inner.dev, N.try_into().unwrap());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please!
In fact, I think we could start // PANIC:
and // NOPANIC:
annotations nearby unwrap()
, expect()
, etc.
- The former would annotate those that we believe may actually panic -- so it needs to be a justification of why the outer function was not made fallible to begin with.
- The latter would annotate those that we believe cannot panic -- so it needs to justify why that is so. i.e. this is the category that could have potentially been
unwrap_unchecked()
.
At the moment, we have likely too many places due to allocations, so it can wait since otherwise we would be flooded with this comments; but later on we really should not have too many of these comments, which makes it worth the noise to have them, I think.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the whole fallible allocations thing is still in flux, so adding these annotations all over the place probably won't be productive. So let's defer until later.
That said, I'm not sure, but I think we can eliminate the unwrap
here:
let mut inner = this.inner.as_mut().unwrap();
simply by folding the alloc_chrdev_region
into new
. In that case we wouldn't need self.inner
to be an Option<RegistrationInner>
, it could simply be a RegistrationInner
. Things become simpler, and there's one fewer potential unwrap panic.
Not sure if it's worth doing a follow-up PR for this (if it does indeed work)? Maybe there are more important fish to fry at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS I'm not sure what the CI failure is about? The CI worked fine in my cloned repo. Same git id. https://github.com/TheSven73/linux/actions/runs/778067926
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I just re-triggered them. kobject_add
appears there, but looks it is not coming from us:
[ 0.628134] RIP: 0010:sysfs_create_file_ns+0x73/0x90
...
[ 0.628165] kobject_add_internal+0x21a/0x3c0
[ 0.628167] kobject_add+0x88/0xe0
[ 0.628170] add_sysfs_fw_map_entry+0x61/0x70
[ 0.628173] firmware_memmap_init+0x1f/0x2a
[ 0.628175] ? firmware_map_add_early+0x57/0x57
[ 0.628178] do_one_initcall+0xb3/0x240
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the whole fallible allocations thing is still in flux, so adding these annotations all over the place probably won't be productive. So let's defer until later.
We could still start annotating the ones unrelated to allocations, though (nevertheless, I am not asking to be added here, since we still need to discuss whether to do it or not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's worth doing a follow-up PR for this (if it does indeed work)? Maybe there are more important fish to fry at this point?
Never mind, I'm running into borrow checker issue due to the Pin
, a concept which I don't understand well enough yet. I'm going to leave this well alone :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pin
is indeed a bit tricky due to limitations of the language; the concept itself is easy. Hopefully the language will improve the ergonomics. Also the docs around it could be improved a bit.
d1276fd
to
7f96c1b
Compare
Updated title and commit message to make it clear that this issue occurs widely but is not seen as a problem - because module onloading is best effort only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got one question, and two places that'll need a rebase, but otherwise this looks good to me. Thanks for wokring on i!
rust/kernel/chrdev.rs
Outdated
fn alloc( | ||
fops: &'static bindings::file_operations, | ||
module: &'static crate::ThisModule, | ||
) -> KernelResult<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need to rebase this on rust
and change this to Result
, see #259
this.inner = Some(RegistrationInner { | ||
dev, | ||
used: 0, | ||
cdevs: [MaybeUninit::<bindings::cdev>::uninit(); N], | ||
cdevs: [NONE; N], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does just putting None
here not work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not... see here
rust/kernel/chrdev.rs
Outdated
Ok(Self(cdev)) | ||
} | ||
|
||
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
Note that this issue is potentially present on any driver module which stores its `cdev` in `kmalloc`-ed memory. This is not seen as a problem, as module unloading is currently "best effort" only. The kernel's `struct cdev` is a reference-counted `kobject`. This means that the object isn't guaranteed to be cleaned up after a call to `cdev_del` - the cleanup may occur later. Rust's `chrdev` places the `struct cdev` in `kmalloc`-ed memory. On module unload, it calls `cdev_del` and `kfree`s all module memory, including the `struct cdev`. But that structure might only be cleaned up later - resulting in a potential use-after-free. This issue is reliably triggered using CONFIG_DEBUG_KOBJECT_RELEASE, which has been developed specifically to catch this subtle class of bugs. Fix by allocating the `cdev` using `cdev_alloc`, which stores the object on the kernel's `kalloc` heap. Now it can outlive the module, and be cleaned up+released when the kernel decides it's time. Signed-off-by: Sven Van Asbroeck <[email protected]>
Pull Requests should not introduce any kernel warnings. Fail the workflow if warnings are spotted in the log. Signed-off-by: Sven Van Asbroeck <[email protected]>
7f96c1b
to
2bcc169
Compare
Rebased to latest master branch, as requested. Thank you @alex for the review, really appreciate it ! |
Thanks for your work on this! |
Note that this issue is potentially present on any driver module which
stores its
cdev
inkmalloc
-ed memory. This is not seen as a problem,as module unloading is currently "best effort" only.
The kernel's
struct cdev
is a reference-countedkobject
. Thismeans that the object isn't guaranteed to be cleaned up after a
call to
cdev_del
- the cleanup may occur later.Rust's
chrdev
places thestruct cdev
inkmalloc
-ed memory.On module unload, it calls
cdev_del
andkfree
s all module memory,including the
struct cdev
. But that structure might only be cleanedup later - resulting in a potential use-after-free.
This issue is reliably triggered using CONFIG_DEBUG_KOBJECT_RELEASE,
which has been developed specifically to catch this subtle class of
bugs.
Fix by allocating the
cdev
usingcdev_alloc
, which stores theobject on the kernel's
kalloc
heap. Now it can outlive themodule, and be cleaned up+released when the kernel decides it's time.