New article: VecDeque::resize() optimization (#7)

* New article: VecDeque::resize() optimization * Improve the article by ChatGPT. * Unpublish the article * Improve links * Apply suggestions from code review * Add an alternative section and mention PR #106564 * Improve by ChatGPT * Add release note links
RustMagazine · Jan 30, 2023 · 30f2a5b · 30f2a5b
1 parent e6993f8
commit 30f2a5b
Show file tree

Hide file tree

Showing 3 changed files with 279 additions and 0 deletions.
diff --git a/content/issue-1/vecdeque-resize-optimization.md b/content/issue-1/vecdeque-resize-optimization.md
@@ -0,0 +1,268 @@
+> This is the first article on **pr-demystifying** topics. Each article labeled **pr-demystifying** will attempt to demystify the details behind the PR.
+
+While browsing the rust-lang repository's list of pull requests last month, I came across PR [#104435], titled `VecDeque::resize should re-use the buffer in the passed-in element`. This PR caught my attention because it seemed interesting and I wanted to understand more about it. I began to wonder why it was necessary to optimize `VecDeque::resize()` and how the old version might be lacking. I also wanted to know how the author had optimized the new version. After delving into the code in the PR, I was able to gain a deeper understanding of these issues.
+
+## VecDeque::resize()
+
+Firstly, let's get familiar with the [VecDeque::resize()](https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.resize).
+
+> `pub fn resize(&mut self, new_len: usize, value: T)`
+>
+> Modifies the deque in-place so that `len()` is equal to `new_len`, either by removing excess elements from the back or by appending clones of `value` to the back.
+
+```rust
+use std::collections::VecDeque;
+
+let mut buf = VecDeque::new();
+buf.push_back(5);
+buf.push_back(10);
+buf.push_back(15);
+assert_eq!(buf, [5, 10, 15]);
+
+buf.resize(2, 0);
+assert_eq!(buf, [5, 10]);
+
+buf.resize(5, 20);
+assert_eq!(buf, [5, 10, 20, 20, 20]);
+```
+
+The `VecDeque::resize()` API is simple to use. It takes two arguments: the new length to which the `VecDeque` should be resized, and a value to use for any new elements that are added to the `VecDeque` when it expands.
+
+## The problem
+
+However, if we don't look at the implementation details of this function, we might not realize that there is room for optimization. As the PR's author [@scottmcm] pointed out, the old version did not reuse the value that was passed in as the default, resulting in unnecessary cloning of values.
+
+```rust
+use std::collections::VecDeque;
+
+let mut buf = VecDeque::new();
+buf.resize(5, String::from("rust"));
+```
+
+ For example, the string "rust" was cloned five times, even though only four were needed, because the `VecDeque::resize()` function used `VecDeque::resize_with()` underneath, which passed a closure to the `repeat_with().take()`.
+
+```rust
+pub fn resize(&mut self, new_len: usize, value: T) {
+    self.resize_with(new_len, || value.clone());
+}
+
+pub fn resize_with(&mut self, new_len: usize, generator: impl FnMut() -> T) {
+    let len = self.len();
+
+    if new_len > len {
+        self.extend(repeat_with(generator).take(new_len - len))
+    } else {
+        self.truncate(new_len);
+    }
+}
+```
+
+This closure was called repeatedly until it reached the `take` limit, causing unnecessary cloning. 
+
+```rust
+pub fn repeat_with<A, F: FnMut() -> A>(repeater: F) -> RepeatWith<F> {
+    RepeatWith { repeater }
+}
+
+/// An iterator that repeats elements of type `A` endlessly by
+/// applying the provided closure `F: FnMut() -> A`.
+#[derive(Copy, Clone, Debug)]
+pub struct RepeatWith<F> {
+    repeater: F,
+}
+
+impl<A, F: FnMut() -> A> Iterator for RepeatWith<F> {
+    type Item = A;
+
+    #[inline]
+    fn next(&mut self) -> Option<A> {
+        Some((self.repeater)())
+    }
+
+    #[inline]
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        (usize::MAX, None)
+    }
+}
+```
+
+Now that we have identified the problem, let's move on to how the author fixed it.
+
+## iter::repeat_n
+
+The most significant change made in PR [#104435] was the replacement of `repeat_with().take()` with `repeat_n()`.
+
+```diff
+pub fn resize(&mut self, new_len: usize, value: T) {
+    if new_len > self.len() {
+        let extra = new_len - self.len();
+-       self.extend(repeat_with(generator).take(new_len - len))
++       self.extend(repeat_n(value, extra))
+    } else {
+        self.truncate(new_len);
+    }
+}
+```
+
+We can learn more about `repeat_n` from [ACP: Uplift iter::repeat_n from itertools](https://github.com/rust-lang/libs-team/issues/120). The author proposes to uplift [itertools::repeat_n()](https://docs.rs/itertools/latest/itertools/fn.repeat_n.html) into the standard library, just like how [iter::repeat_with()]() has obviated [itertools::repeat_call()](https://docs.rs/itertools/latest/itertools/fn.repeat_call.html).
+
+How does `repeat_n()` avoid the unnecessary cloning? Let’s dive into the code:
+
+```rust
+pub fn repeat_n<T: Clone>(element: T, count: usize) -> RepeatN<T> {
+    let mut element = ManuallyDrop::new(element);
+
+    if count == 0 {
+        // SAFETY: we definitely haven't dropped it yet, since we only just got
+        // passed it in, and because the count is zero the instance we're about
+        // to create won't drop it, so to avoid leaking we need to now.
+        unsafe { ManuallyDrop::drop(&mut element) };
+    }
+
+    RepeatN { element, count }
+}
+
+pub struct RepeatN<A> {
+    count: usize,
+    // Invariant: has been dropped iff count == 0.
+    element: ManuallyDrop<A>,
+}
+
+impl<A: Clone> Iterator for RepeatN<A> {
+    type Item = A;
+
+    #[inline]
+    fn next(&mut self) -> Option<A> {
+        if self.count == 0 {
+            return None;
+        }
+
+        self.count -= 1;
+        Some(if self.count == 0 {
+            // SAFETY: the check above ensured that the count used to be non-zero,
+            // so element hasn't been dropped yet, and we just lowered the count to
+            // zero so it won't be dropped later, and thus it's okay to take it here.
+            unsafe { ManuallyDrop::take(&mut self.element) }
+        } else {
+            A::clone(&mut self.element)
+        })
+    }
+}
+```
+
+Not too much code, we can easily grasp it. The `RepeatN` struct returned by `repeat_n()` is the point. To save a cloning, `RepeatN` declares its `element` as the `ManuallyDrop` type.
+
+## ManuallyDrop\<T\>
+
+[ManuallyDrop\<T\>](https://doc.rust-lang.org/std/mem/struct.ManuallyDrop.html) is a zero-cost wrapper to inhibit compiler from automatically calling **T**’s destructor unless you call `ManuallyDrop::drop()` method. 
+
+The two most important APIs that can help us avoid unnecessary cloning are:
+
+- `ManuallyDrop::take()`
+
+```rust
+pub unsafe fn take(slot: &mut ManuallyDrop<T>) -> T {
+    // SAFETY: we are reading from a reference, which is guaranteed
+    // to be valid for reads.
+    unsafe { ptr::read(&slot.value) }
+}
+```
+
+- The `DerefMut` implementation
+
+In the `Iterator` implementation of `RepeatN`, the `next()` method clones the element in each iteration until the `count` reaches 0. In the final iteration, the `ManuallyDrop::take()` function is used to reuse the buffer and avoid an extra clone. 
+
+Wait, but why does `A::clone(&mut self.element)` will get an instance of `A`? The type of `&mut self.element` is `&mut ManuallyDrop`, not `&mut A`. Well, the use of `ManuallyDrop` may seem obscure at first, but it becomes clearer when we consider the `Deref` and `DerefMut` traits that it implements. These traits allow for a type like `&mut ManuallyDrop<A>` to be treated as if it were a type like `&mut A`. This is known as [Deref coercion](https://doc.rust-lang.org/std/ops/trait.DerefMut.html#more-on-deref-coercion). As an example, consider the following code:
+
+```rust
+fn main() {
+    let mut a = String::from("A");
+    test(&mut a);
+}
+
+fn test(s: &str) {
+    println!("{s}");
+}
+```
+
+Here, we are able to pass a `&mut String` value to the `test()` function, even though the function's argument type is `&str`. This is because `String` implements both [Deref](https://doc.rust-lang.org/src/alloc/string.rs.html#2455) and [DerefMut](https://doc.rust-lang.org/src/alloc/string.rs.html#2465), allowing it to be treated as if it were a `&str` value. 
+
+Similarly, `ManuallyDrop<A>` also implements `DerefMut`, allowing it to be treated as if it were an `&mut A` value.
+
+```rust
+impl<T: ?Sized> const DerefMut for ManuallyDrop<T> {
+    #[inline(always)]
+    fn deref_mut(&mut self) -> &mut T {
+        &mut self.value
+    }
+}
+```
+
+It's important to note that the `Deref` trait also has a implementation for `&mut T`:
+
+```rust
+impl<T: ?Sized> const Deref for &mut T {
+    type Target = T;
+
+    fn deref(&self) -> &T {
+        *self
+    }
+}
+```
+
+So `A::clone(&mut self.element)` works because `&mut ManuallyDrop<A>` can convert to `&mut A` due to **Deref coercion**, then `&mut A` can convert to `&A` also due to **Deref coercion**.
+
+> Thanks to [@scottmcm]'s kindful reminder, we can change `A::clone(&mut self.element)` to `A::clone(&self.element)`. So I submitted a PR([#106564]) to fix it.
+
+## Alternative?
+
+During the review of this article, [@XeCycle] suggested that it might be possible to achieve the same result without using *unsafe* code by using `Option<(NonZeroUsize, T)>` instead. [@XeCycle] provided a [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=fd82d81d0f485acd769c3b0aaeeb5a3b) to demonstrate this idea.
+
+However, [@scottmcm] also commented on this issue, explaining that `ManuallyDrop` is a more appropriate choice for this case because it is designed for *niche-filling*. As an example, consider the following code:
+
+```rust
+use core::mem::MaybeUninit;
+use core::mem::ManuallyDrop;
+use core::num::NonZeroUsize;
+use core::alloc::Layout;
+
+fn main() {
+    dbg!(Layout::new::<Option<Option<(NonZeroUsize, String)>>>());
+    dbg!(Layout::new::<Option<(usize, MaybeUninit<String>)>>());
+    dbg!(Layout::new::<Option<(usize, ManuallyDrop<String>)>>());
+}
+```
+
+The output of this code is:
+
+```
+[src/main.rs:6] Layout::new::<Option<Option<(NonZeroUsize, String)>>>() = Layout {
+    size: 40,
+    align: 8 (1 << 3),
+}
+[src/main.rs:7] Layout::new::<Option<(usize, MaybeUninit<String>)>>() = Layout {
+    size: 40,
+    align: 8 (1 << 3),
+}
+[src/main.rs:8] Layout::new::<Option<(usize, ManuallyDrop<String>)>>() = Layout {
+    size: 32,
+    align: 8 (1 << 3),
+}
+```
+
+As we can see, using `Option<(usize, ManuallyDrop<String>)>` results in a smaller size than the other options. Using `ManuallyDrop` allows `Option<RepeatN<String>>` to have the same size as `RepeatN<String>`, which is not the case with the other options.
+
+> For more information on this topic, you can check out the article PR [#104435].
+
+## Conclusion
+
+As a Rust developer, I am often most concerned with the changes listed in the stable [release notes](https://github.com/rust-lang/rust/blob/master/RELEASES.md). However, this does not mean that I should not be interested in the individual pull requests (PRs) that are being merged into the project. There are hundreds of PRs merged each week, and each one has a story and an author behind it. That's why I propose the creation of a topic called [#pr-demystifying], where we can share articles about interesting or educational PRs in the Rust community. The PR [#104435], for example, may not be a major optimization, but it allowed me to learn a lot. I would like to thank the author [@scottmcm] for their work on this PR. I hope that this article and others like it will be helpful to others in the community.
+
+> Thanks to [@scottmcm], [@XeCycle], [@zhanghandong] for proof reading this post!
+
+[#pr-demystifying]: /topic/pr-demystifying
+[#104435]: https://github.com/rust-lang/rust/pull/104435
+[#106564]: https://github.com/rust-lang/rust/pull/106564
+[@scottmcm]: https://github.com/scottmcm
+[@XeCycle]: https://github.com/XeCycle
+[@zhanghandong]: https://github.com/zhanghandong
diff --git a/content/issue-1/zine.toml b/content/issue-1/zine.toml
@@ -12,3 +12,12 @@ pub_date = "2022-12-10"
 publish = true
 featured = true
 i18n.zh = { path = "/announcing-zh", file = "announcing-zh.md", title = "宣布 Rust 杂志", author = ["folyd", "rust-magazine"] }
+
+[[article]]
+file = "vecdeque-resize-optimization.md"
+title = "VecDeque::resize() optimization"
+author = "folyd"
+topic = ["pr-demystifying", "optimization"]
+pub_date = "2022-12-24"
+publish = false
+featured = true
diff --git a/zine.toml b/zine.toml
@@ -40,3 +40,5 @@ highlight_theme = "ayu-light"
 [topics]
 announcement = {}
 game = { name = "Game development", description = "Topic of rust game development." }
+pr-demystifying = { name = "PR Demystifying", description = "Topic of PR demystifying in Rust community."}
+optimization = {}