Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RootedGuard<T> construction related unsoundness #564

Merged
merged 1 commit into from
Mar 22, 2025

Conversation

gmorenz
Copy link
Contributor

@gmorenz gmorenz commented Mar 17, 2025

We're storing a &mut Rooted<T> in RootedGuard even though that pointer is aliased by the pointer held by the GC. This is... suspicious.

Background information: There are 2 proposed memory models for rust right now, tree borrows and stacked borrows.

In the stacked borrows world, there's a stack-like structure containing "tags" describing which pointers have permission to access memory. Writing through a raw pointer will always invalidate all mutable references above it, and writing though a mutable reference will always invalidate all raw pointers above it. As such any ABAB interleaving off writes accesses through a mutable pointer A and a mutable reference B is always undefined behavior.

In the tree borrows world, a mutable pointer is equivalent to the mutable reference it was created from. This could actually excuse the pattern here, if the pointer stored in the GC was created from the &mut Rooted<T> in RootedGuard. That's not how we construct this. Maybe we could modify things so it is, but that would would be playing with fire, and there's no reason to.


That's not all.

When we construct RootedGuard, we do so with the following code

impl<'a, T: 'a + RootKind> RootedGuard<'a, T> {
    pub fn new(cx: *mut JSContext, root: &'a mut Rooted<T>, initial: T) -> Self {
        root.ptr.write(initial);
        unsafe {
            root.add_to_root_stack(cx);
        }
        RootedGuard { root }
    }
}


impl<T: RootKind> JS::Rooted<T> {
    pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self {
        ... Eventually create a raw pointer from self
    }
}

This pattern is inevitably undefined behavior, under both stacked borrows and tree borrows. As demonstrated by this toy example that can be run in miri.

fn get_ptr(this: &mut usize) -> *mut usize {
    let ptr: *mut usize = this;
    ptr
}

fn main()  {
    let rooted = &mut 0;
    let gc_ptr: *mut usize = get_ptr(rooted);
    let root_ptr: *mut usize = rooted;

    unsafe{
        // Use the RootedGuard (writing to it)
        *root_ptr = 1;
        // Run a GC (even with just a read)
        *gc_ptr;
    };
}

Under stacked borrows:

We start with a stack for rooted that looks like [Unique(rooted), SharedReadWrite(root_ptr), Unique(get_ptr::this), SharedReadWrite(gc_ptr)]. Note that stacked borrows inserts new raw pointers into the stack above the item that they are created from (it doesn't work like a stack).

Then when we write to root_ptr we remove Unique(get_ptr::this) (because it's a unique pointer above us), and SharedReadWrite(gc_ptr) (because it's another SharedReadWrite, above us, with a non-SharedReadWrite between us and it - this is Stacked Borrows 2.1).

Under tree borrows:

gc_ptr shares permissions with get_ptr::this. root_ptr shares permissions with rooted. By writing to root_ptr we disable get_ptr::this and thus gc_ptr, resulting in undefined behavior.


One more nitpick, add_to_root_stack in full is

impl<T: RootKind> JS::Rooted<T> {
    pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self {
        let ptr = self as *mut _;
        self.base.add_to_root_stack(cx, T::KIND)
    }
}

impl RootedBase {
    unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext, kind: JS::RootKind) {
        let stack = Self::get_root_stack(cx, kind);
        self.stack = stack;
        self.prev = *stack;

        *stack = self as *mut _ as usize as _;
    }
}

This is worrying because we create a RootedBase mutable reference and then cast that to a raw pointer. My understanding is that under the stacked borrows model (but not the tree borrows model) the set of memory allowed to be accessed to via a pointer created from a reference to RootedBase is limited to the size of the RootedBase. Thus the GC should not be using this pointer to look at the rest of Rooted<T> under the stacked borrows model. See rust-lang/unsafe-code-guidelines#256


All of these issues have the same fix. Cast the mutable reference to a mutable pointer once, derive everything else from that mutable pointer. In stacked borrows nothing below that pointer is ever used, so it is never invalidated. In tree borrows the same except the terminology would be no parent of that pointer.

That's all this pull request does - it was originally going to be an issue but the diff part is really less work than the understanding part so it felt natural to add it on.

Note that the public add_to_root_stack function goes unused in servo, and that's the only public change in the API, so no coordination is necessary.

We're storing a `&mut Rooted<T>` in `RootedGuard` even though that pointer is aliased by the pointer held by the GC. This is... suspicious.

Background information: There are 2 proposed memory models for rust right now, [tree borrows](https://perso.crans.org/vanille/treebor/index.html) and [stacked borrows](https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md). 

In the stacked borrows world, there's a stack-like structure containing "tags" describing which pointers have permission to access memory. Writing through a raw pointer will always invalidate all mutable references above it, and writing though a mutable reference will always invalidate all raw pointers above it. As such any ABAB interleaving off  writes accesses through a mutable pointer A and a mutable reference B is always undefined behavior.

In the tree borrows world, a mutable pointer is *equivalent* to the mutable reference it was created from. This could actually excuse the pattern here, *if* the pointer stored in the GC was created from the `&mut Rooted<T>` in `RootedGuard`. That's not how we construct this. *Maybe* we could modify things so it is, but that would would be playing with fire, and there's no reason to.

---

That's not all.

When we construct `RootedGuard`, we do so with the following code

```rust
impl<'a, T: 'a + RootKind> RootedGuard<'a, T> {
    pub fn new(cx: *mut JSContext, root: &'a mut Rooted<T>, initial: T) -> Self {
        root.ptr.write(initial);
        unsafe {
            root.add_to_root_stack(cx);
        }
        RootedGuard { root }
    }
}


impl<T: RootKind> JS::Rooted<T> {
    pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self {
        ... Eventually create a raw pointer from self
    }
}
```

This pattern is inevitably undefined behavior, under both stacked borrows and tree borrows. As demonstrated by this toy example that can be run in miri.

```rust
fn get_ptr(this: &mut usize) -> *mut usize {
    let ptr: *mut usize = this;
    ptr
}

fn main()  {
    let rooted = &mut 0;
    let gc_ptr: *mut usize = get_ptr(rooted);
    let root_ptr: *mut usize = rooted;

    unsafe{
        // Use the RootedGuard (writing to it)
        *root_ptr = 1;
        // Run a GC (even with just a read)
        *gc_ptr;
    };
}
```

Under stacked borrows:

We start with a stack for rooted that looks like `[Unique(rooted), SharedReadWrite(root_ptr), Unique(get_ptr::this), SharedReadWrite(gc_ptr)]`. Note that stacked borrows inserts new raw pointers into the stack above the item that they are created from (it doesn't work like a stack).

Then when we write to `root_ptr` we remove `Unique(get_ptr::this)` (because it's a unique pointer above us), and `SharedReadWrite(gc_ptr)` (because it's another `SharedReadWrite`, above us, with a non-`SharedReadWrite` between us and it - this is Stacked Borrows 2.1).

Under tree borrows:

`gc_ptr` shares permissions with `get_ptr::this`. `root_ptr` shares permissions with `rooted`. By writing to `root_ptr` we disable `get_ptr::this` and thus `gc_ptr`, resulting in undefined behavior.

---

One more nitpick, `add_to_root_stack` in full is


```rust
impl<T: RootKind> JS::Rooted<T> {
    pub unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext) -> *mut Self {
        let ptr = self as *mut _;
        self.base.add_to_root_stack(cx, T::KIND)
    }
}

impl RootedBase {
    unsafe fn add_to_root_stack(&mut self, cx: *mut JSContext, kind: JS::RootKind) {
        let stack = Self::get_root_stack(cx, kind);
        self.stack = stack;
        self.prev = *stack;

        *stack = self as *mut _ as usize as _;
    }
}
```

This is worrying because we create a `RootedBase` mutable reference and then cast that to a raw pointer. My understanding is that under the stacked borrows model (but not the tree borrows model) the set of memory allowed to be accessed to via a pointer created from a reference to `RootedBase` is limited to the size of the `RootedBase`. Thus the `GC` should not be using this pointer to look at the rest of `Rooted<T>` under the stacked borrows model. See rust-lang/unsafe-code-guidelines#256

---

All of these issues have the same fix. Cast the mutable reference to a mutable pointer *once*, derive everything else from that mutable pointer. In stacked borrows nothing below that pointer is ever used, so it is never invalidated. In tree borrows the same except the terminology would be no parent of that pointer.

That's all this pull request does - it was originally going to be an issue but the change itself is small enough that it felt better to just make a PR.

Note that the public `add_to_root_stack` function goes unused in `servo`, and that's the only public change in the API, so no coordination is necessary.

Signed-off-by: Greg Morenz <[email protected]>
@gmorenz gmorenz force-pushed the rooted_construction branch from ee09c6d to 6e19074 Compare March 17, 2025 00:17

*stack = self as *mut _ as usize as _;
*stack = this as usize as _;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: This cast is a roundtrip. I assume the intent is to expose the provenance.

I don't believe that should be necessary, but given that there are currently a lot of uses of deref and deref_mut on this type, creating &T and &mut T types across GC barriers, I'm not willing to change it. It looks like the sort of thing someone did to stop the compiler from performing some optimizations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure there was a note in one of these cast hoops, but I cannot seem to find it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That usize -> _ cast was introduced in d40cfd5#diff-e27e06105da7e8ebf851404e43b36501f85aaf53097cdccf96b9f86c8a7cf1daR392 and I'm pretty sure it was needed since stackRoots_ is a u64 value, so it was doing usize -> u64.

Copy link
Member

@sagudev sagudev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks right to me.

@sagudev sagudev requested a review from jdm March 17, 2025 17:22
Copy link
Member

@jdm jdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! So it seems like the rule here is that public APIs should use safe references, but internally we need to turn those into raw pointers as soon as possible and only ever use those raw pointers to interact with the pointed-at objects.

@jdm jdm added this pull request to the merge queue Mar 22, 2025
@sagudev sagudev removed this pull request from the merge queue due to a manual request Mar 22, 2025
@sagudev sagudev added this pull request to the merge queue Mar 22, 2025
Merged via the queue into servo:main with commit 26c3518 Mar 22, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants