Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom vtable API #567

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
add Bytes::downcast_impl to extract underlying implementation
  • Loading branch information
HyeonuPark committed Aug 21, 2022
commit 169cd444fe00acbc3105079613ba03327b78b275
77 changes: 75 additions & 2 deletions src/bytes.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
use core::any::TypeId;
use core::iter::FromIterator;
use core::ops::{Deref, RangeBounds};
use core::{cmp, fmt, hash, mem, ptr, slice, usize};
Expand Down Expand Up @@ -114,6 +115,9 @@ pub unsafe trait BytesImpl: 'static {
/// Decompose `Self` into parts used by `Bytes`.
fn into_bytes_parts(this: Self) -> (AtomicPtr<()>, *const u8, usize);

/// Creates itself directly from the raw bytes parts decomposed with `into_bytes_parts`.
unsafe fn from_bytes_parts(data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self;

/// Returns new `Bytes` based on the current parts.
unsafe fn clone(data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyeonuPark - Is there a case in which one would need to clone bytes using the components?
If so, is there another way to go about this?
A third party container implementor (say, Mmap https://docs.rs/memmap/latest/memmap/struct.Mmap.html) isn't going to be able to construct a Bytes instance because they won't have access to the internals of Bytes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current design cloning bytes should use the component, since cloning mechanisms like reference counting are implemented as component. For example they should do things like Arc::increment_strong_count() within it. Currently the Bytes is shipped with number of builtin components:

  • Static component backed by &'static [u8] and does nothing on cloning
  • Shared component which increase refcount on cloning.
  • Promotable component which initially store bytes without Arc and be promoted to shared component on first cloning.

And the BytesMut has number of components similar but slightly different stretagies. And the promotable one is the only reason we have &AtomicPtr<()> not *const (). Is it necessary? I'm not sure, but it was one of the goals of this PR to not change existing behaviors.

For the mmap part, I built my own buffer type from aligned slices of giant single mmap region. First few bytes of each slices are considered header and one of its fields are used as refcount since I don't want additional allocation for it. This refcount is modified on clone and drop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to the fact that clone returns Bytes. A 3rd party implementation isn't going to be able to construct Bytes using the parts, because they don't have access to the internals of Bytes.

We would have to provide some sort of public Bytes::from_parts but I think that isn't the best approach.

Better would be that the trait doesn't specify that the implementation returns Bytes. Instead, the consumer of the clone function (a Bytes instance) would receive the new parts, and know how to construct them.


Expand All @@ -132,6 +136,7 @@ pub unsafe trait BytesImpl: 'static {
}

struct Vtable {
type_id: fn() -> TypeId,
/// fn(data, ptr, len)
clone: unsafe fn(&AtomicPtr<()>, *const u8, usize) -> Bytes,
/// fn(data, ptr, len)
Expand Down Expand Up @@ -192,6 +197,7 @@ impl Bytes {
#[cfg(not(all(loom, test)))]
pub const fn from_static(bytes: &'static [u8]) -> Bytes {
const STATIC_VTABLE: Vtable = Vtable {
type_id: TypeId::of::<StaticImpl>,
clone: <StaticImpl as BytesImpl>::clone,
will_truncate: <StaticImpl as BytesImpl>::will_truncate,
into_vec: <StaticImpl as BytesImpl>::into_vec,
Expand All @@ -209,6 +215,7 @@ impl Bytes {
#[cfg(all(loom, test))]
pub fn from_static(bytes: &'static [u8]) -> Bytes {
const STATIC_VTABLE: Vtable = Vtable {
type_id: TypeId::of::<StaticImpl>,
clone: <StaticImpl as BytesImpl>::clone,
will_truncate: <StaticImpl as BytesImpl>::will_truncate,
into_vec: <StaticImpl as BytesImpl>::into_vec,
Expand All @@ -235,6 +242,7 @@ impl Bytes {
len,
data,
vtable: &Vtable {
type_id: TypeId::of::<T>,
clone: T::clone,
will_truncate: T::will_truncate,
into_vec: T::into_vec,
Expand Down Expand Up @@ -543,6 +551,19 @@ impl Bytes {
self.truncate(0);
}

/// Downcast this `Bytes` into its underlying implementation.
#[inline]
pub fn downcast_impl<T: BytesImpl>(self) -> Result<T, Bytes> {
if TypeId::of::<T>() == (self.vtable.type_id)() {
Ok(unsafe {
let this = &mut *mem::ManuallyDrop::new(self);
T::from_bytes_parts(&mut this.data, this.ptr, this.len)
})
} else {
Err(self)
}
}

// private

#[inline]
Expand Down Expand Up @@ -891,6 +912,7 @@ impl From<Bytes> for Vec<u8> {
impl fmt::Debug for Vtable {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("Vtable")
.field("type_id", &self.type_id)
.field("clone", &(self.clone as *const ()))
.field("will_truncate", &(self.will_truncate as *const ()))
.field("into_vec", &(self.into_vec as *const ()))
Expand All @@ -906,7 +928,15 @@ struct StaticImpl(&'static [u8]);
unsafe impl BytesImpl for StaticImpl {
fn into_bytes_parts(this: Self) -> (AtomicPtr<()>, *const u8, usize) {
let mut bytes = mem::ManuallyDrop::new(Bytes::from_static(this.0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

into_bytes_parts is used to construct a bytes instance using with_impl, correct?
So this constructs a Bytes instance in order to decompose the Bytes instance so that we can construct a Bytes instance?
That is a tad confusing.

Do we even need to have a StaticImpl?
Since from_static "manually" constructs a VTable and the Bytes struct, it could just use free method versions of into_bytes_parts, clone, drop, into_vec.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree things are confusing with StaticImpl but it's because Bytes::from_static() is const fn so we can't use traits within it on stable. And I tried to DRY as much as possible in other places. Maybe it would be better to have right dependencies between functions even if it duplicates some logic?

StaticImpl would be needed to allow downcasting Bytes to &'static [u8] without reinventing the wheel. Actual API for it is not included. I'm not sure which one is better, adding Bytes::try_into_static() or impl TryFrom<Bytes> for &'static [u8] or exposing SharedImpl to public.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I didn't notice the const. I think further decomposing the logic into utility functions, and then duplicating the minimum amount is the best approach.

... or exposing SharedImpl to public.

I think the solution here is "layers". We offer some default implementations, similar to SharedImpl, as a lower-level public API so that people can compose their own higher-level implementations of BytesImpl

(mem::take(&mut bytes.data), bytes.ptr, bytes.len)
(
mem::replace(&mut bytes.data, AtomicPtr::default()),
bytes.ptr,
bytes.len,
)
}

unsafe fn from_bytes_parts(_data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self {
StaticImpl(slice::from_raw_parts(ptr, len))
}

unsafe fn clone(_: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes {
Expand All @@ -932,7 +962,6 @@ struct PromotableOddImpl(Promotable);

enum Promotable {
Owned(Box<[u8]>),
#[allow(dead_code)]
Shared(SharedImpl),
}

Expand All @@ -952,6 +981,12 @@ unsafe impl BytesImpl for PromotableEvenImpl {
(AtomicPtr::new(data.cast()), ptr, len)
}

unsafe fn from_bytes_parts(data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self {
PromotableEvenImpl(promotable_from_bytes_parts(data, ptr, len, |shared| {
ptr_map(shared.cast(), |addr| addr & !KIND_MASK)
}))
}

unsafe fn clone(data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes {
let shared = data.load(Ordering::Acquire);
let kind = shared as usize & KIND_MASK;
Expand Down Expand Up @@ -994,6 +1029,30 @@ unsafe impl BytesImpl for PromotableEvenImpl {
}
}

unsafe fn promotable_from_bytes_parts(
data: &mut AtomicPtr<()>,
ptr: *const u8,
len: usize,
f: fn(*mut ()) -> *mut u8,
) -> Promotable {
let shared = data.with_mut(|p| *p);
let kind = shared as usize & KIND_MASK;

if kind == KIND_ARC {
Promotable::Shared(SharedImpl::from_bytes_parts(data, ptr, len))
} else {
debug_assert_eq!(kind, KIND_VEC);

let buf = f(shared);

let cap = (ptr as usize - buf as usize) + len;

let vec = Vec::from_raw_parts(buf, cap, cap);

Promotable::Owned(vec.into_boxed_slice())
}
}

unsafe fn promotable_into_vec(
data: &mut AtomicPtr<()>,
ptr: *const u8,
Expand Down Expand Up @@ -1034,6 +1093,12 @@ unsafe impl BytesImpl for PromotableOddImpl {
(AtomicPtr::new(ptr.cast()), ptr, len)
}

unsafe fn from_bytes_parts(data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self {
PromotableOddImpl(promotable_from_bytes_parts(data, ptr, len, |shared| {
shared.cast()
}))
}

unsafe fn clone(data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes {
let shared = data.load(Ordering::Acquire);
let kind = shared as usize & KIND_MASK;
Expand Down Expand Up @@ -1114,6 +1179,14 @@ unsafe impl BytesImpl for SharedImpl {
(AtomicPtr::new(this.shared.cast()), this.offset, this.len)
}

unsafe fn from_bytes_parts(data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self {
SharedImpl {
shared: (data.with_mut(|p| *p)).cast(),
offset: ptr,
len,
}
}

unsafe fn clone(data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes {
let shared = data.load(Ordering::Relaxed);
shallow_clone_arc(shared as _, ptr, len)
Expand Down
8 changes: 8 additions & 0 deletions src/bytes_mut.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1683,6 +1683,14 @@ unsafe impl crate::BytesImpl for SharedImpl {
(AtomicPtr::new(this.shared.cast()), this.ptr, this.len)
}

unsafe fn from_bytes_parts(data: &mut AtomicPtr<()>, ptr: *const u8, len: usize) -> Self {
SharedImpl {
shared: (data.with_mut(|p| *p)).cast(),
ptr,
len,
}
}

unsafe fn clone(data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Bytes {
let shared = data.load(Ordering::Relaxed) as *mut Shared;
increment_shared(shared);
Expand Down