Skip to content

Commit

Permalink
New article: VecDeque::resize() optimization (#7)
Browse files Browse the repository at this point in the history
* New article: VecDeque::resize() optimization

* Improve the article by ChatGPT.

* Unpublish the article

* Improve links

* Apply suggestions from code review

* Add an alternative section and mention PR #106564

* Improve by ChatGPT

* Add release note links
  • Loading branch information
Folyd authored Jan 30, 2023
1 parent e6993f8 commit 30f2a5b
Show file tree
Hide file tree
Showing 3 changed files with 279 additions and 0 deletions.
268 changes: 268 additions & 0 deletions content/issue-1/vecdeque-resize-optimization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,268 @@
> This is the first article on **pr-demystifying** topics. Each article labeled **pr-demystifying** will attempt to demystify the details behind the PR.
While browsing the rust-lang repository's list of pull requests last month, I came across PR [#104435], titled `VecDeque::resize should re-use the buffer in the passed-in element`. This PR caught my attention because it seemed interesting and I wanted to understand more about it. I began to wonder why it was necessary to optimize `VecDeque::resize()` and how the old version might be lacking. I also wanted to know how the author had optimized the new version. After delving into the code in the PR, I was able to gain a deeper understanding of these issues.

## VecDeque::resize()

Firstly, let's get familiar with the [VecDeque::resize()](https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.resize).

> `pub fn resize(&mut self, new_len: usize, value: T)`
>
> Modifies the deque in-place so that `len()` is equal to `new_len`, either by removing excess elements from the back or by appending clones of `value` to the back.
```rust
use std::collections::VecDeque;

let mut buf = VecDeque::new();
buf.push_back(5);
buf.push_back(10);
buf.push_back(15);
assert_eq!(buf, [5, 10, 15]);

buf.resize(2, 0);
assert_eq!(buf, [5, 10]);

buf.resize(5, 20);
assert_eq!(buf, [5, 10, 20, 20, 20]);
```

The `VecDeque::resize()` API is simple to use. It takes two arguments: the new length to which the `VecDeque` should be resized, and a value to use for any new elements that are added to the `VecDeque` when it expands.

## The problem

However, if we don't look at the implementation details of this function, we might not realize that there is room for optimization. As the PR's author [@scottmcm] pointed out, the old version did not reuse the value that was passed in as the default, resulting in unnecessary cloning of values.

```rust
use std::collections::VecDeque;

let mut buf = VecDeque::new();
buf.resize(5, String::from("rust"));
```

For example, the string "rust" was cloned five times, even though only four were needed, because the `VecDeque::resize()` function used `VecDeque::resize_with()` underneath, which passed a closure to the `repeat_with().take()`.

```rust
pub fn resize(&mut self, new_len: usize, value: T) {
self.resize_with(new_len, || value.clone());
}

pub fn resize_with(&mut self, new_len: usize, generator: impl FnMut() -> T) {
let len = self.len();

if new_len > len {
self.extend(repeat_with(generator).take(new_len - len))
} else {
self.truncate(new_len);
}
}
```

This closure was called repeatedly until it reached the `take` limit, causing unnecessary cloning.

```rust
pub fn repeat_with<A, F: FnMut() -> A>(repeater: F) -> RepeatWith<F> {
RepeatWith { repeater }
}

/// An iterator that repeats elements of type `A` endlessly by
/// applying the provided closure `F: FnMut() -> A`.
#[derive(Copy, Clone, Debug)]
pub struct RepeatWith<F> {
repeater: F,
}

impl<A, F: FnMut() -> A> Iterator for RepeatWith<F> {
type Item = A;

#[inline]
fn next(&mut self) -> Option<A> {
Some((self.repeater)())
}

#[inline]
fn size_hint(&self) -> (usize, Option<usize>) {
(usize::MAX, None)
}
}
```

Now that we have identified the problem, let's move on to how the author fixed it.

## iter::repeat_n

The most significant change made in PR [#104435] was the replacement of `repeat_with().take()` with `repeat_n()`.

```diff
pub fn resize(&mut self, new_len: usize, value: T) {
if new_len > self.len() {
let extra = new_len - self.len();
- self.extend(repeat_with(generator).take(new_len - len))
+ self.extend(repeat_n(value, extra))
} else {
self.truncate(new_len);
}
}
```

We can learn more about `repeat_n` from [ACP: Uplift iter::repeat_n from itertools](https://github.com/rust-lang/libs-team/issues/120). The author proposes to uplift [itertools::repeat_n()](https://docs.rs/itertools/latest/itertools/fn.repeat_n.html) into the standard library, just like how [iter::repeat_with()]() has obviated [itertools::repeat_call()](https://docs.rs/itertools/latest/itertools/fn.repeat_call.html).

How does `repeat_n()` avoid the unnecessary cloning? Let’s dive into the code:

```rust
pub fn repeat_n<T: Clone>(element: T, count: usize) -> RepeatN<T> {
let mut element = ManuallyDrop::new(element);

if count == 0 {
// SAFETY: we definitely haven't dropped it yet, since we only just got
// passed it in, and because the count is zero the instance we're about
// to create won't drop it, so to avoid leaking we need to now.
unsafe { ManuallyDrop::drop(&mut element) };
}

RepeatN { element, count }
}

pub struct RepeatN<A> {
count: usize,
// Invariant: has been dropped iff count == 0.
element: ManuallyDrop<A>,
}

impl<A: Clone> Iterator for RepeatN<A> {
type Item = A;

#[inline]
fn next(&mut self) -> Option<A> {
if self.count == 0 {
return None;
}

self.count -= 1;
Some(if self.count == 0 {
// SAFETY: the check above ensured that the count used to be non-zero,
// so element hasn't been dropped yet, and we just lowered the count to
// zero so it won't be dropped later, and thus it's okay to take it here.
unsafe { ManuallyDrop::take(&mut self.element) }
} else {
A::clone(&mut self.element)
})
}
}
```

Not too much code, we can easily grasp it. The `RepeatN` struct returned by `repeat_n()` is the point. To save a cloning, `RepeatN` declares its `element` as the `ManuallyDrop` type.

## ManuallyDrop\<T\>

[ManuallyDrop\<T\>](https://doc.rust-lang.org/std/mem/struct.ManuallyDrop.html) is a zero-cost wrapper to inhibit compiler from automatically calling **T**’s destructor unless you call `ManuallyDrop::drop()` method.

The two most important APIs that can help us avoid unnecessary cloning are:

- `ManuallyDrop::take()`

```rust
pub unsafe fn take(slot: &mut ManuallyDrop<T>) -> T {
// SAFETY: we are reading from a reference, which is guaranteed
// to be valid for reads.
unsafe { ptr::read(&slot.value) }
}
```

- The `DerefMut` implementation

In the `Iterator` implementation of `RepeatN`, the `next()` method clones the element in each iteration until the `count` reaches 0. In the final iteration, the `ManuallyDrop::take()` function is used to reuse the buffer and avoid an extra clone.

Wait, but why does `A::clone(&mut self.element)` will get an instance of `A`? The type of `&mut self.element` is `&mut ManuallyDrop`, not `&mut A`. Well, the use of `ManuallyDrop` may seem obscure at first, but it becomes clearer when we consider the `Deref` and `DerefMut` traits that it implements. These traits allow for a type like `&mut ManuallyDrop<A>` to be treated as if it were a type like `&mut A`. This is known as [Deref coercion](https://doc.rust-lang.org/std/ops/trait.DerefMut.html#more-on-deref-coercion). As an example, consider the following code:

```rust
fn main() {
let mut a = String::from("A");
test(&mut a);
}

fn test(s: &str) {
println!("{s}");
}
```

Here, we are able to pass a `&mut String` value to the `test()` function, even though the function's argument type is `&str`. This is because `String` implements both [Deref](https://doc.rust-lang.org/src/alloc/string.rs.html#2455) and [DerefMut](https://doc.rust-lang.org/src/alloc/string.rs.html#2465), allowing it to be treated as if it were a `&str` value.

Similarly, `ManuallyDrop<A>` also implements `DerefMut`, allowing it to be treated as if it were an `&mut A` value.

```rust
impl<T: ?Sized> const DerefMut for ManuallyDrop<T> {
#[inline(always)]
fn deref_mut(&mut self) -> &mut T {
&mut self.value
}
}
```

It's important to note that the `Deref` trait also has a implementation for `&mut T`:

```rust
impl<T: ?Sized> const Deref for &mut T {
type Target = T;

fn deref(&self) -> &T {
*self
}
}
```

So `A::clone(&mut self.element)` works because `&mut ManuallyDrop<A>` can convert to `&mut A` due to **Deref coercion**, then `&mut A` can convert to `&A` also due to **Deref coercion**.

> Thanks to [@scottmcm]'s kindful reminder, we can change `A::clone(&mut self.element)` to `A::clone(&self.element)`. So I submitted a PR([#106564]) to fix it.
## Alternative?

During the review of this article, [@XeCycle] suggested that it might be possible to achieve the same result without using *unsafe* code by using `Option<(NonZeroUsize, T)>` instead. [@XeCycle] provided a [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=fd82d81d0f485acd769c3b0aaeeb5a3b) to demonstrate this idea.

However, [@scottmcm] also commented on this issue, explaining that `ManuallyDrop` is a more appropriate choice for this case because it is designed for *niche-filling*. As an example, consider the following code:

```rust
use core::mem::MaybeUninit;
use core::mem::ManuallyDrop;
use core::num::NonZeroUsize;
use core::alloc::Layout;

fn main() {
dbg!(Layout::new::<Option<Option<(NonZeroUsize, String)>>>());
dbg!(Layout::new::<Option<(usize, MaybeUninit<String>)>>());
dbg!(Layout::new::<Option<(usize, ManuallyDrop<String>)>>());
}
```

The output of this code is:

```
[src/main.rs:6] Layout::new::<Option<Option<(NonZeroUsize, String)>>>() = Layout {
size: 40,
align: 8 (1 << 3),
}
[src/main.rs:7] Layout::new::<Option<(usize, MaybeUninit<String>)>>() = Layout {
size: 40,
align: 8 (1 << 3),
}
[src/main.rs:8] Layout::new::<Option<(usize, ManuallyDrop<String>)>>() = Layout {
size: 32,
align: 8 (1 << 3),
}
```

As we can see, using `Option<(usize, ManuallyDrop<String>)>` results in a smaller size than the other options. Using `ManuallyDrop` allows `Option<RepeatN<String>>` to have the same size as `RepeatN<String>`, which is not the case with the other options.

> For more information on this topic, you can check out the article PR [#104435].
## Conclusion

As a Rust developer, I am often most concerned with the changes listed in the stable [release notes](https://github.com/rust-lang/rust/blob/master/RELEASES.md). However, this does not mean that I should not be interested in the individual pull requests (PRs) that are being merged into the project. There are hundreds of PRs merged each week, and each one has a story and an author behind it. That's why I propose the creation of a topic called [#pr-demystifying], where we can share articles about interesting or educational PRs in the Rust community. The PR [#104435], for example, may not be a major optimization, but it allowed me to learn a lot. I would like to thank the author [@scottmcm] for their work on this PR. I hope that this article and others like it will be helpful to others in the community.

> Thanks to [@scottmcm], [@XeCycle], [@zhanghandong] for proof reading this post!
[#pr-demystifying]: /topic/pr-demystifying
[#104435]: https://github.com/rust-lang/rust/pull/104435
[#106564]: https://github.com/rust-lang/rust/pull/106564
[@scottmcm]: https://github.com/scottmcm
[@XeCycle]: https://github.com/XeCycle
[@zhanghandong]: https://github.com/zhanghandong
9 changes: 9 additions & 0 deletions content/issue-1/zine.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,12 @@ pub_date = "2022-12-10"
publish = true
featured = true
i18n.zh = { path = "/announcing-zh", file = "announcing-zh.md", title = "宣布 Rust 杂志", author = ["folyd", "rust-magazine"] }

[[article]]
file = "vecdeque-resize-optimization.md"
title = "VecDeque::resize() optimization"
author = "folyd"
topic = ["pr-demystifying", "optimization"]
pub_date = "2022-12-24"
publish = false
featured = true
2 changes: 2 additions & 0 deletions zine.toml
Original file line number Diff line number Diff line change
Expand Up @@ -40,3 +40,5 @@ highlight_theme = "ayu-light"
[topics]
announcement = {}
game = { name = "Game development", description = "Topic of rust game development." }
pr-demystifying = { name = "PR Demystifying", description = "Topic of PR demystifying in Rust community."}
optimization = {}

0 comments on commit 30f2a5b

Please sign in to comment.