-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RootedGuard<T> construction related unsoundness #564
Conversation
We're storing a `&mut Rooted<T>` in `RootedGuard` even though that pointer is aliased by the pointer held by the GC. This is... suspicious. Background information: There are 2 proposed memory models for rust right now, [tree borrows](https://perso.crans.org/vanille/treebor/index.html) and [stacked borrows](https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md). In the stacked borrows world, there's a stack-like structure containing "tags" describing which pointers have permission to access memory. Writing through a raw pointer will always invalidate all mutable references above it, and writing though a mutable reference will always invalidate all raw pointers above it. As such any ABAB interleaving off writes accesses through a mutable pointer A and a mutable reference B is always undefined behavior. In the tree borrows world, a mutable pointer is *equivalent* to the mutable reference it was created from. This could actually excuse the pattern here, *if* the pointer stored in the GC was created from the `&mut Rooted<T>` in `RootedGuard`. That's not how we construct this. *Maybe* we could modify things so it is, but that would would be playing with fire, and there's no reason to. --- That's not all. When we construct `RootedGuard`, we do so with the following code ```rust impl<'a, T: 'a + RootKind> RootedGuard<'a, T> { pub fn new(cx: *mut JSContext, root: &'a mut Rooted<T>, initial: T) -> Self { root.ptr.write(initial); unsafe { root.add_to_root_stack(cx); } RootedGuard { root } } } impl<T: RootKind> JS::Rooted<T> { pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self { ... Eventually create a raw pointer from self } } ``` This pattern is inevitably undefined behavior, under both stacked borrows and tree borrows. As demonstrated by this toy example that can be run in miri. ```rust fn get_ptr(this: &mut usize) -> *mut usize { let ptr: *mut usize = this; ptr } fn main() { let rooted = &mut 0; let gc_ptr: *mut usize = get_ptr(rooted); let root_ptr: *mut usize = rooted; unsafe{ // Use the RootedGuard (writing to it) *root_ptr = 1; // Run a GC (even with just a read) *gc_ptr; }; } ``` Under stacked borrows: We start with a stack for rooted that looks like `[Unique(rooted), SharedReadWrite(root_ptr), Unique(get_ptr::this), SharedReadWrite(gc_ptr)]`. Note that stacked borrows inserts new raw pointers into the stack above the item that they are created from (it doesn't work like a stack). Then when we write to `root_ptr` we remove `Unique(get_ptr::this)` (because it's a unique pointer above us), and `SharedReadWrite(gc_ptr)` (because it's another `SharedReadWrite`, above us, with a non-`SharedReadWrite` between us and it - this is Stacked Borrows 2.1). Under tree borrows: `gc_ptr` shares permissions with `get_ptr::this`. `root_ptr` shares permissions with `rooted`. By writing to `root_ptr` we disable `get_ptr::this` and thus `gc_ptr`, resulting in undefined behavior. --- One more nitpick, `add_to_root_stack` in full is ```rust impl<T: RootKind> JS::Rooted<T> { pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self { let ptr = self as *mut _; self.base.add_to_root_stack(cx, T::KIND) } } impl RootedBase { unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext, kind: JS::RootKind) { let stack = Self::get_root_stack(cx, kind); self.stack = stack; self.prev = *stack; *stack = self as *mut _ as usize as _; } } ``` This is worrying because we create a `RootedBase` mutable reference and then cast that to a raw pointer. My understanding is that under the stacked borrows model (but not the tree borrows model) the set of memory allowed to be accessed to via a pointer created from a reference to `RootedBase` is limited to the size of the `RootedBase`. Thus the `GC` should not be using this pointer to look at the rest of `Rooted<T>` under the stacked borrows model. See rust-lang/unsafe-code-guidelines#256 --- All of these issues have the same fix. Cast the mutable reference to a mutable pointer *once*, derive everything else from that mutable pointer. In stacked borrows nothing below that pointer is ever used, so it is never invalidated. In tree borrows the same except the terminology would be no parent of that pointer. That's all this pull request does - it was originally going to be an issue but the change itself is small enough that it felt better to just make a PR. Note that the public `add_to_root_stack` function goes unused in `servo`, and that's the only public change in the API, so no coordination is necessary. Signed-off-by: Greg Morenz <[email protected]>
ee09c6d
to
6e19074
Compare
|
||
*stack = self as *mut _ as usize as _; | ||
*stack = this as usize as _; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: This cast is a roundtrip. I assume the intent is to expose the provenance.
I don't believe that should be necessary, but given that there are currently a lot of uses of deref
and deref_mut
on this type, creating &T
and &mut T
types across GC barriers, I'm not willing to change it. It looks like the sort of thing someone did to stop the compiler from performing some optimizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure there was a note in one of these cast hoops, but I cannot seem to find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That usize -> _ cast was introduced in d40cfd5#diff-e27e06105da7e8ebf851404e43b36501f85aaf53097cdccf96b9f86c8a7cf1daR392 and I'm pretty sure it was needed since stackRoots_ is a u64 value, so it was doing usize -> u64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks right to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! So it seems like the rule here is that public APIs should use safe references, but internally we need to turn those into raw pointers as soon as possible and only ever use those raw pointers to interact with the pointed-at objects.
We're storing a
&mut Rooted<T>
inRootedGuard
even though that pointer is aliased by the pointer held by the GC. This is... suspicious.Background information: There are 2 proposed memory models for rust right now, tree borrows and stacked borrows.
In the stacked borrows world, there's a stack-like structure containing "tags" describing which pointers have permission to access memory. Writing through a raw pointer will always invalidate all mutable references above it, and writing though a mutable reference will always invalidate all raw pointers above it. As such any ABAB interleaving off writes accesses through a mutable pointer A and a mutable reference B is always undefined behavior.
In the tree borrows world, a mutable pointer is equivalent to the mutable reference it was created from. This could actually excuse the pattern here, if the pointer stored in the GC was created from the
&mut Rooted<T>
inRootedGuard
. That's not how we construct this. Maybe we could modify things so it is, but that would would be playing with fire, and there's no reason to.That's not all.
When we construct
RootedGuard
, we do so with the following codeThis pattern is inevitably undefined behavior, under both stacked borrows and tree borrows. As demonstrated by this toy example that can be run in miri.
Under stacked borrows:
We start with a stack for rooted that looks like
[Unique(rooted), SharedReadWrite(root_ptr), Unique(get_ptr::this), SharedReadWrite(gc_ptr)]
. Note that stacked borrows inserts new raw pointers into the stack above the item that they are created from (it doesn't work like a stack).Then when we write to
root_ptr
we removeUnique(get_ptr::this)
(because it's a unique pointer above us), andSharedReadWrite(gc_ptr)
(because it's anotherSharedReadWrite
, above us, with a non-SharedReadWrite
between us and it - this is Stacked Borrows 2.1).Under tree borrows:
gc_ptr
shares permissions withget_ptr::this
.root_ptr
shares permissions withrooted
. By writing toroot_ptr
we disableget_ptr::this
and thusgc_ptr
, resulting in undefined behavior.One more nitpick,
add_to_root_stack
in full isThis is worrying because we create a
RootedBase
mutable reference and then cast that to a raw pointer. My understanding is that under the stacked borrows model (but not the tree borrows model) the set of memory allowed to be accessed to via a pointer created from a reference toRootedBase
is limited to the size of theRootedBase
. Thus theGC
should not be using this pointer to look at the rest ofRooted<T>
under the stacked borrows model. See rust-lang/unsafe-code-guidelines#256All of these issues have the same fix. Cast the mutable reference to a mutable pointer once, derive everything else from that mutable pointer. In stacked borrows nothing below that pointer is ever used, so it is never invalidated. In tree borrows the same except the terminology would be no parent of that pointer.
That's all this pull request does - it was originally going to be an issue but the diff part is really less work than the understanding part so it felt natural to add it on.
Note that the public
add_to_root_stack
function goes unused inservo
, and that's the only public change in the API, so no coordination is necessary.