Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix chained .slice to not produce nested Slice wrappers #163

Closed
wants to merge 26 commits into from

Conversation

mzabaluev
Copy link
Contributor

A rework of #53 without introducing a utility trait, based in part on suggestions in #53 (comment)

Slice no longer implements IoBuf, but gains an inherent slice method that returns a Self subslice.
There is impl<T: IoBuf> From<T> for Slice<T>, and the generic buffers in the API methods are bound as impl Into<Slice<T>>.
Like in #53, the buffers are returned back as Slice<T>.

mzabaluev and others added 13 commits May 3, 2022 19:08
The module's code is no longer in use.
b"hello".to_vec().slice(1..).slice(..3) produces a
Slice<Slice<Vec<u8>>>.
The API makes it easy to inadvertendly end up with nested Slice
wrappers, which adds unnecessary overhead to the buffer accessor methods
and makes the results of .begin() and .end() difficult to interpret.

Introduce trait IntoSlice to uniformly convert all buffer parameters
to a Slice<T>, with the impl of InfoSlice for Slice<T> closing onto
Self.

The buffer parameter type in the generic I/O operation methods is
bound with IntoSlice to keep the polymorphism good, while reducing
code bloat in monomorphisation when the application passes both buffers
and their slices to tokio-uring operations. Consequently, all buffers
returned from the operations are wrapped into Slice.
Judging by the impls and the tests, it's clearly meant to work
as a high water mark for initialized portion of the buffer,
there is no setting the position back.
The documentation on Slice::set_init has already been made correct,
so it makes sense to fix this in the same PR.
The doc tests need update after the original IntoSlice refactoring.
There's no need to expose raw buffer access methods of Slice
in the public API.
@FrankReh
Copy link
Collaborator

FrankReh commented Nov 3, 2022

Nice, the synergies of multiple sets of eyes. I'm going to hold off reviewing until tomorrow. I am very interested in comparing this set against what I had reviewed in #53. I might pull both branches and run git vimdiff against the files in the two tips unless there is an easier way I can be pointed to.

@mzabaluev
Copy link
Contributor Author

I might pull both branches and run git vimdiff against the files in the two tips unless there is an easier way I can be pointed to.

@FrankReh This branch only adds 99aef45 and a small doc revert on top of #53.

@mzabaluev mzabaluev marked this pull request as draft November 3, 2022 08:58
@mzabaluev mzabaluev marked this pull request as ready for review November 3, 2022 10:07
@FrankReh
Copy link
Collaborator

FrankReh commented Nov 3, 2022

@mzabaluev Thanks for the confirmation. I've started taking a look.

@FrankReh
Copy link
Collaborator

FrankReh commented Nov 3, 2022

Random musings:

rust 1.65 is released today. Does the GAT support make any of this crate's IoBuf/Slice support even cleaner?

Should some of the crate's unit tests work with the bytes type when the bytes feature is enabled?

@ollie-etl
Copy link
Contributor

ollie-etl commented Nov 3, 2022

I still think we could go further, although maybe these items are for a different PR

IoBuf and IoBufMut are used for several purposes:

  1. To ensure that buffers are stable for the duration of an Op
  2. To access the pointer to the start of the buffer,
  3. To get and mark the initialised bytes

w.r.t 1., the same thing can be achieve by using the constraint T: Unpin + 'static where we currently use T: IoBuf

w.r.t 2. we can achieve the same by using T: AsRef<[u8]> where we currently use T: IoBuf, and T: AsMut<[u8]> where we currently use T: IoBufMut,

w.r.t 3. This can be managed with Slice directly

@FrankReh
Copy link
Collaborator

FrankReh commented Nov 3, 2022

@ollie-etl Interesting. Yes, let me finish my review. @Noah-Kennedy and @mzabaluev can then decide what to do from there.

@Noah-Kennedy
Copy link
Contributor

@ollie-etl @FrankReh I went over this on #161, but IoBuf is needed, at the very least as a marker type.

@Noah-Kennedy
Copy link
Contributor

Unpin just means "I can move this", it does not guarantee a stable pointer from a deref. We need our own trait to signal this.

Copy link
Collaborator

@FrankReh FrankReh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Sorry that was a lot of of signatures to change.

src/buf/slice.rs Show resolved Hide resolved
src/buf/slice.rs Show resolved Hide resolved
src/buf/io_buf.rs Show resolved Hide resolved
src/fs/file.rs Show resolved Hide resolved
examples/udp_socket.rs Show resolved Hide resolved
src/net/tcp/listener.rs Outdated Show resolved Hide resolved
src/net/tcp/stream.rs Show resolved Hide resolved
src/net/tcp/stream.rs Show resolved Hide resolved
src/net/udp.rs Show resolved Hide resolved
tests/buf.rs Show resolved Hide resolved
src/fs/file.rs Show resolved Hide resolved
src/fs/file.rs Show resolved Hide resolved
@Noah-Kennedy
Copy link
Contributor

I should be able to take a look at this tonight.

Clarify the behavior of Slice::slice, mention the From<T: IoBuf>
conversion for Slice.
The impl Into<Slice<T> buffer arguments and their corresponding
Slice<T> returned values do require some more explaining.
Reduce generic parameters on the implementation of write_all to
just the buffer type, to cut back on code bloat and improve branch
prediction.
Copy link
Collaborator

@FrankReh FrankReh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Waiting for @Noah-Kennedy to review.

@FrankReh
Copy link
Collaborator

FrankReh commented Nov 5, 2022

@mzabaluev Another PR was merged today that added two functions, write_all_at and read_exact_at. Do you want to update your branch and handle those two, they come with a unit test each I think. Or would you like me to?

@mzabaluev
Copy link
Contributor Author

@FrankReh I will update the branch.

src/fs/file.rs Outdated Show resolved Hide resolved
Remove checks for ErrorKind::Interrupted from:
- File::read_exact_at
- File::write_all_at

As mentioned in the implementation of write_all for sockets,
this error code should never be returned with the way the runtime
drives io-uring.
Also update the similar comments in net that needed a bit more
editing since the commented-out match arms were removed.
Another post-merge fix for a recently added method.
Describe the semantics of the returned Slice.
Copy link
Collaborator

@FrankReh FrankReh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, unless I missed the explanation somewhere else in the comment, there is probably one comment that could be made clearer.

Your comments are great and you're being super thorough.

///
/// If the method returns [`Ok(())`], then the read was successful.
///
/// # Errors
///
/// If this function encounters an "end of file" before completely filling
/// the buffer, it returns an error of the kind [`ErrorKind::UnexpectedEof`].
/// The buffer is returned on error.
/// The buffer is returned on error as described above.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add, "but the data that could be read will have been written to the buffer.

So it's clear that when there's an error, the buffer may have been written to somewhat. These write_all type functions are just helpers and can't be expected to be magic. If the caller cares, they should make their own copy beforehand.

}

impl<T: IoBufMut> Op<Read<T>> {
pub(crate) fn read_at(fd: &SharedFd, buf: T, offset: u64) -> io::Result<Op<Read<T>>> {
pub(crate) fn read_at(fd: &SharedFd, buf: Slice<T>, offset: u64) -> io::Result<Op<Read<T>>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how I feel about changing the APIs like this. Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also feel unhappy about Slice popping up everywhere even when not needed, and perhaps it's not justified by fixing .slice(..).slice(..).

I've got another idea which I'm going to try first.

@mzabaluev
Copy link
Contributor Author

Superseded by #172.

@mzabaluev mzabaluev closed this Nov 7, 2022
FrankReh pushed a commit that referenced this pull request Nov 21, 2022
Introduce new traits, `BoundedBuf` and `BoundedBufMut`, taking the
`slice` method and related infrastructure off `IoBuf`.
`Slice` no longer implements `IoBuf`/`IoBufMut`, instead it does
`BoundedBuf` and `BoundedBufMut`.
These traits are used as bounds in most generic I/O methods in place of
`IoBuf`/`IoBufMut`.

The purpose of this is to get rid of `Slice<Slice<T>>` arising from
chained slice calls.

* Any buffer the user has can be defined to be an IoBuf (subject to the
safety constraints of IoBuf)
* An IoBuf is by definition a BoundedBuf
* BoundedBuf<IoBuf>.slice(range) -> Slice<IoBuf> (i.e. slicing an IoBuf
BoundedBuf produces an IoBuf Slice)
* A Slice<IoBuf> is also a BoundedBuf by definition
* And to void the problem of a Slice<Slice<>> hierarchy building up
* A BoundedBuf<Slice<IoBuf>>.slice(range) -> Slice<IoBuf> (keeping the
slice flat)

All calls that take a BoundedBuf input type can take an IoBuf and a
Slice<IoBuf>.

And a brief explanation for the few functions that were split into two
layers, a generic public layer and a slice specific private layer:

The purpose of factoring out the bulk of the public method's body into
the private method working over Slice is twofold:

* Unify the code path recovering the buffer passed in for the BufResult,
rather than have it replicated in multiple return expressions as in
#163;
* Reduce monomorphization bloat, so that most of the machine code to
work with buffers passed in as, say, Vec<u8> and Slice<Vec<u8>> is
emitted in one instance, rather than replicated instances of exactly the
same code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants