docs(data_structures): improve docs for stack types (#8356)
Improve docs for `Stack`, `NonEmptyStack` and `SparseStack`.
overlookmotel committed Jan 8, 2025
1 parent fb389f7 commit e0a09ab
Showing 4 changed files with 83 additions and 32 deletions.
11 changes: 7 additions & 4 deletions crates/oxc_data_structures/src/stack/mod.rs
@@ -1,8 +1,11 @@
 //! Contains the following FILO data structures:
-//! - [`Stack`]: A growable stack
-//! - [`SparseStack`]: A stack that can have empty entries
-//! - [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient
-//! operations
+//!
+//! * [`Stack`]: A growable stack, equivalent to [`Vec`], but more efficient for stack usage (push/pop).
+//! * [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient operations
+//! (very fast `last` / `last_mut`).
+//! * [`SparseStack`]: A growable stack of `Option`s, optimized for low memory usage when many entries in
+//! the stack are empty (`None`).
 mod capacity;
 mod common;
 mod non_empty;
59 changes: 41 additions & 18 deletions crates/oxc_data_structures/src/stack/non_empty.rs
@@ -9,44 +9,67 @@ use super::{NonNull, StackCapacity, StackCommon};

 /// A stack which can never be empty.
 ///
-/// `NonEmptyStack` is created initially with 1 entry, and `pop` does not allow removing it
-/// (though that initial entry can be mutated with `last_mut`).
+/// [`NonEmptyStack`] is created initially with 1 entry, and [`pop`] does not allow removing it
+/// (though that initial entry can be mutated with [`last_mut`]).
 ///
-/// The fact that the stack is never empty makes all operations except `pop` infallible.
-/// `last` and `last_mut` are branchless.
+/// The fact that the stack is never empty makes all operations except [`pop`] infallible.
+/// [`last`] and [`last_mut`] are branchless.
 ///
-/// The trade-off is that you cannot create a `NonEmptyStack` without allocating.
+/// The trade-off is that you cannot create a [`NonEmptyStack`] without allocating,
+/// and you must create an initial value for the "dummy" initial entry.
 /// If that is not a good trade-off for your use case, prefer [`Stack`], which can be empty.
 ///
+/// [`NonEmptyStack`] is usually a better choice than [`Stack`], unless either:
+///
+/// 1. The stack will likely never have anything pushed to it.
+/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
+/// So if stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
+/// This is the same as how [`Vec`] does not allocate until you push a value into it.
+///
+/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
+/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
+///
+/// [`SparseStack`] may be preferable if the type you're storing is an `Option`.
+///
 /// To simplify implementation, zero size types are not supported (e.g. `NonEmptyStack<()>`).
 ///
 /// ## Design
-/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack.
+/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
+/// ([`last`] / [`last_mut`]).
 ///
-/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
+/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
 /// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
 /// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
 ///
-/// In comparison, `NonEmptyStack` contains a `cursor` pointer, which always points to last entry
+/// In comparison, [`NonEmptyStack`] contains a `cursor` pointer, which always points to last entry
 /// on stack, so it can be read/written with a minimum of operations.
 ///
-/// This design is similar to `std`'s slice iterator.
+/// This design is similar to [`std`'s slice iterators].
 ///
-/// Comparison to `Vec`:
-/// * `last` and `last_mut` are 1 instruction, instead of `Vec`'s 4.
-/// * `pop` is 1 instruction shorter than `Vec`'s equivalent.
-/// * `push` is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
+/// Comparison to [`Vec`]:
+/// * [`last`] and [`last_mut`] are 1 instruction, instead of `Vec`'s 4.
+/// * [`pop`] is 1 instruction shorter than `Vec`'s equivalent.
+/// * [`push`] is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
 ///
 /// ### Possible alternative designs
-/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that `pop`
-/// uses 1 less register, but disadvantage that `last` and `last_mut` are 2 instructions, not 1.
+/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that [`pop`]
+/// uses 1 less register, but disadvantage that [`last`] and [`last_mut`] are 2 instructions, not 1.
 /// <https://godbolt.org/z/xnx7YP5de>
 ///
-/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make `pop` use
-/// 1 less register, but at the cost that the stack can never grow in place, which would incur more
-/// memory copies when the stack grows.
+/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make [`pop`] use
+/// 1 less register, but at the cost that: (a) the stack can never grow in place, which would incur
+/// more memory copies when the stack grows, and (b) [`as_slice`] would have the entries in
+/// reverse order.
+///
+/// [`push`]: NonEmptyStack::push
+/// [`pop`]: NonEmptyStack::pop
+/// [`last`]: NonEmptyStack::last
+/// [`last_mut`]: NonEmptyStack::last_mut
+/// [`as_slice`]: NonEmptyStack::as_slice
+/// [`Stack`]: super::Stack
+/// [`Stack::new`]: super::Stack::new
+/// [`SparseStack`]: super::SparseStack
+/// [`std`'s slice iterators]: std::slice::Iter
 pub struct NonEmptyStack<T> {
 /// Pointer to last entry on stack.
 /// Points *to* last entry, not *after* last entry.
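
The never-empty contract described in the `NonEmptyStack` docs can be illustrated with a small sketch. This is not the crate's implementation: `TinyNonEmptyStack` is a hypothetical name, it is backed by a plain `Vec` rather than a `cursor` pointer (so it still pays for a bounds check), and returning `Option` from `pop` is this sketch's own choice rather than a statement about the real signature.

```rust
/// Hypothetical, simplified stand-in for a stack that can never be empty.
/// Backed by a `Vec` for clarity; the real type uses a raw `cursor` pointer instead.
pub struct TinyNonEmptyStack<T> {
    // Invariant: `entries` always holds at least 1 element.
    entries: Vec<T>,
}

impl<T> TinyNonEmptyStack<T> {
    /// Create the stack with its mandatory initial ("dummy") entry.
    pub fn new(initial: T) -> Self {
        Self { entries: vec![initial] }
    }

    /// Push a value onto the stack.
    pub fn push(&mut self, value: T) {
        self.entries.push(value);
    }

    /// Pop the last entry, unless it is the initial one (assumption: report that as `None`).
    pub fn pop(&mut self) -> Option<T> {
        if self.entries.len() > 1 {
            self.entries.pop()
        } else {
            None
        }
    }

    /// Infallible: the invariant guarantees a last entry exists.
    pub fn last(&self) -> &T {
        &self.entries[self.entries.len() - 1]
    }

    /// Infallible mutable access to the last entry.
    pub fn last_mut(&mut self) -> &mut T {
        let index = self.entries.len() - 1;
        &mut self.entries[index]
    }
}
```

Callers of `last` and `last_mut` never handle an "empty" case; the cost is that a value of `T` must exist before the stack does.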
15 changes: 13 additions & 2 deletions crates/oxc_data_structures/src/stack/sparse.rs
@@ -2,12 +2,16 @@ use super::{NonEmptyStack, Stack};

 /// Stack which is sparsely filled.
 ///
-/// Functionally equivalent to a stack implemented as `Vec<Option<T>>`, but more memory-efficient
+/// Functionally equivalent to [`NonEmptyStack<Option<T>>`], but more memory-efficient
 /// in cases where majority of entries in the stack will be empty (`None`).
 ///
+/// It has the same advantages as [`NonEmptyStack`] in terms of [`last`] and [`last_mut`] being
+/// infallible and branchless, and with very fast lookup (without any pointer maths).
+/// [`SparseStack`]'s advantage over [`NonEmptyStack`] is less memory usage for empty entries (`None`).
+///
 /// Stack is initialized with a single entry which can never be popped off.
 /// If `Program` has a entry on the stack, can use this initial entry for it. Get value for `Program`
-/// in `exit_program` visitor with `SparseStack::take_last` instead of `SparseStack::pop`.
+/// in `exit_program` visitor with [`take_last`] instead of [`pop`].
 ///
 /// The stack is stored as 2 arrays:
 /// 1. `has_values` - Records whether an entry on the stack has a value or not (`Some` or `None`).
@@ -19,12 +23,19 @@ use super::{NonEmptyStack, Stack};
 ///
 /// e.g. if `T` is 24 bytes, and 90% of stack entries have no values:
 /// * `Vec<Option<T>>` is 24 bytes per entry (or 32 bytes if `T` has no niche).
+/// * `NonEmptyStack<Option<T>>` is same.
 /// * `SparseStack<T>` is 4 bytes per entry.
 ///
 /// When the stack grows and reallocates, `SparseStack` has less memory to copy, which is a performance
 /// win too.
 ///
 /// To simplify implementation, zero size types are not supported (`SparseStack<()>`).
+///
+/// [`last`]: SparseStack::last
+/// [`last_mut`]: SparseStack::last_mut
+/// [`take_last`]: SparseStack::take_last
+/// [`pop`]: SparseStack::pop
+/// [`NonEmptyStack<Option<T>>`]: NonEmptyStack
 pub struct SparseStack<T> {
 has_values: NonEmptyStack<bool>,
 values: Stack<T>,
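
The two-array layout (`has_values` plus `values`) can be sketched with two `Vec`s. Everything here is illustrative: `TinySparseStack`, `stored_values` and `len` are hypothetical names, the initial entry is assumed to start empty, and returning `None` when asked to pop that initial entry is this sketch's choice, not necessarily the crate's behavior.

```rust
/// Hypothetical sketch of a sparse stack stored as two arrays.
pub struct TinySparseStack<T> {
    // One flag per stack entry. Invariant: never empty.
    has_values: Vec<bool>,
    // Only the payloads of non-empty entries, in push order.
    values: Vec<T>,
}

impl<T> TinySparseStack<T> {
    /// Create the stack with one initial, empty entry that can never be popped.
    pub fn new() -> Self {
        Self { has_values: vec![false], values: Vec::new() }
    }

    /// Push an entry; a payload slot is only used when the entry is `Some`.
    pub fn push(&mut self, value: Option<T>) {
        self.has_values.push(value.is_some());
        if let Some(value) = value {
            self.values.push(value);
        }
    }

    /// Pop the last entry. Returns `None` for an empty entry,
    /// and also when only the initial entry remains (assumption).
    pub fn pop(&mut self) -> Option<T> {
        if self.has_values.len() == 1 {
            return None;
        }
        let has_value = self.has_values.pop().unwrap_or(false);
        if has_value { self.values.pop() } else { None }
    }

    /// Take the value out of the last entry, leaving the entry on the stack but empty.
    pub fn take_last(&mut self) -> Option<T> {
        let flag = self.has_values.last_mut()?;
        if std::mem::replace(flag, false) { self.values.pop() } else { None }
    }

    /// Borrow the value of the last entry, if it has one.
    pub fn last(&self) -> Option<&T> {
        if *self.has_values.last()? { self.values.last() } else { None }
    }

    /// Number of entries on the stack, empty or not.
    pub fn len(&self) -> usize {
        self.has_values.len()
    }

    /// Number of payload slots actually in use.
    pub fn stored_values(&self) -> usize {
        self.values.len()
    }
}
```

Because an empty entry costs only a `bool`, a stack that is mostly `None` stores far fewer `T`-sized slots than a stack of `Option<T>` would.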
30 changes: 22 additions & 8 deletions crates/oxc_data_structures/src/stack/standard.rs
@@ -12,24 +12,38 @@ use super::{NonNull, StackCapacity, StackCommon};
 /// If a non-empty stack is viable for your use case, prefer [`NonEmptyStack`], which is cheaper for
 /// all operations.
 ///
-/// [`NonEmptyStack`] is usually the better choice, unless:
-/// 1. You want `new()` not to allocate.
-/// 2. Creating initial value for `NonEmptyStack::new()` is expensive.
+/// [`NonEmptyStack`] is usually the better choice, unless either:
+///
+/// 1. The stack will likely never have anything pushed to it.
+/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
+/// So if stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
+/// This is the same as how [`Vec`] does not allocate until you push a value into it.
+///
+/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
+/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
 ///
 /// To simplify implementation, zero size types are not supported (`Stack<()>`).
 ///
 /// ## Design
-/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack
-/// (although, unlike [`NonEmptyStack`], `last` and `last_mut` are fallible, and not branchless).
+/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
+/// ([`last`] / [`last_mut`]). Although, unlike [`NonEmptyStack`], [`last`] and [`last_mut`] are
+/// fallible, and not branchless. So [`Stack::last`] and [`Stack::last_mut`] are a bit more expensive
+/// than [`NonEmptyStack`]'s equivalents.
 ///
-/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
+/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
 /// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
 /// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
 ///
-/// In comparison, `Stack` uses a `cursor` pointer, so avoids these calculations.
-/// This is similar to how `std`'s slice iterators work.
+/// In comparison, [`Stack`] uses a `cursor` pointer, so avoids these calculations.
+/// This is similar to how [`std`'s slice iterators] work.
+///
+/// [`push`]: Stack::push
+/// [`pop`]: Stack::pop
+/// [`last`]: Stack::last
+/// [`last_mut`]: Stack::last_mut
+/// [`NonEmptyStack`]: super::NonEmptyStack
+/// [`NonEmptyStack::new`]: super::NonEmptyStack::new
+/// [`std`'s slice iterators]: std::slice::Iter
 pub struct Stack<T> {
 // Pointer to *after* last entry on stack.
 cursor: NonNull<T>,
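
The `Stack` docs lean on two `Vec` behaviors: `new` does not allocate, and `last` is fallible because the container may be empty. Both can be observed with the standard library alone. The helper names below are hypothetical and the sketch uses `Vec` and slices, not the crate's `Stack`.

```rust
/// Capacity of a freshly created `Vec<u64>`; 0 means no heap allocation has happened yet.
pub fn fresh_capacity() -> usize {
    Vec::<u64>::new().capacity()
}

/// Capacity after a single push; by now backing memory must have been allocated.
pub fn capacity_after_push() -> usize {
    let mut v = Vec::new();
    v.push(1u64);
    v.capacity()
}

/// `last` on a possibly-empty stack returns an `Option`, so the caller must handle the empty case.
/// Here the empty case falls back to 0.
pub fn last_or_default(stack: &[u64]) -> u64 {
    stack.last().copied().unwrap_or(0)
}
```

That `Option` check is the branch the docs refer to when they call `Stack::last` fallible and not branchless.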
