File operations don't use the whole buffer #1976
Looking at your original issue, I think we can support better performance by working directly with …
As hinted in tokio-rs#1976 (comment), this change replaces the inner buf attribute of the Buf struct.
I’d like to work on this. Some guidance would be appreciated.
@carllerche …
Not with the current implementation. It would probably require a special function for …
@carllerche Why 16K and not some other size? Is there a special reason, or did testing show better performance with 16K?
The reason to have a maximum buffer size is that the file API allocates an intermediate buffer separate from the user-provided buffer. I don't think the exact choice of size was benchmarked.
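A minimal sketch of the mechanism described here; the names and copy logic below are assumptions modeled on the description, not tokio's actual internals:

```rust
// Illustrative sketch: writes are staged through a capped intermediate
// buffer before being handed to a blocking thread, so a single write
// can report fewer bytes than the caller passed in.
const MAX_BUF: usize = 16 * 1024; // the 16 KiB cap under discussion

fn stage_for_blocking_write(user_buf: &[u8], intermediate: &mut Vec<u8>) -> usize {
    // Only up to MAX_BUF bytes are copied; the rest must wait for
    // another write call.
    let n = user_buf.len().min(MAX_BUF);
    intermediate.clear();
    intermediate.extend_from_slice(&user_buf[..n]);
    n
}
```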
Why not replace …?
Well, why would we? If both can be used, it's better to use a …
You're right. Any hints on how to move forward with this?
The file operations that are offloaded to the … E.g., maybe there is a way to have some machines on AWS with the desired kernel version participate in running CI? Not sure on the details.
It's an extremely narrow test, but in my quest to optimize I/O performance on something reading/writing whole (1-50 GiB) files sequentially, I tested a quick hack that simply changes …

This is an obnoxiously large I/O size, but it does work on this very specific use case: the files are on cephfs, in an erasure-coded pool, with a 128 MiB object size. Aligning I/O to read whole objects per request substantially improves performance (from approx. 80 MiB/s per task with four tasks up to approx. 220 MiB/s in my case).

This is on Fedora with kernel 5.6.13.

EDIT: Fixed numbers
I am open to changing the buffer size used by the …
@Darksonn is there anything we could help with to get this issue addressed? At Deno we got several reports regarding poor performance when working on large files (denoland/deno#10157) due to the need to perform thousands of roundtrips between Rust and JavaScript to read the data.
Well, the options are the following:

…

I think I would be ok with all of these. What do you think?
@Darksonn I can make a PR with my changes later today. I landed on 32 MiB as an optimal size for my use case, but I feel like making …
@Darksonn at Deno, we will be going with option 3 for now, but ideally we would like to see option 2 happen.
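For context, a workaround in the spirit of option 3 might look like the sketch below: do the large read on the blocking pool with std::fs, bypassing tokio::fs's capped intermediate buffer entirely. The function name and error handling here are illustrative, not an established API:

```rust
use std::io::Read;
use tokio::task::spawn_blocking;

// Hedged sketch: read a whole file in one go on the blocking pool.
// std::fs reads are not split into 16 KiB chunks by tokio.
async fn read_whole_file(path: std::path::PathBuf) -> std::io::Result<Vec<u8>> {
    spawn_blocking(move || -> std::io::Result<Vec<u8>> {
        let mut file = std::fs::File::open(path)?;
        let mut buf = Vec::new();
        // Issues reads as large as the OS allows, with no intermediate
        // truncation in between.
        file.read_to_end(&mut buf)?;
        Ok(buf)
    })
    .await
    .expect("blocking task panicked")
}
```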
Pushed up #4580 |
Tokio issue: tokio-rs/tokio#1976. This means when we pass a buffer larger than 16 KB to the OS, tokio truncates it to 16 KB, blowing up 9pfs msize expectations.
It just took me a good hour to find a bug while writing files using …
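For anyone hitting the same bug class (a single write reporting fewer bytes than the slice it was given), a sketch of the safer pattern using write_all, which loops until the whole buffer has been written:

```rust
use tokio::fs::File;
use tokio::io::AsyncWriteExt;

async fn save(path: &str, data: &[u8]) -> std::io::Result<()> {
    let mut file = File::create(path).await?;
    // `write` may report a short write (e.g. capped at the internal
    // buffer size), so its result must be checked; `write_all` loops
    // internally until every byte has been written.
    file.write_all(data).await?;
    file.flush().await?;
    Ok(())
}
```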
#5397 increased the buf size to 2 MB (tokio/src/io/blocking.rs, line 29 at 0d382fa). Is that enough to close this issue, or should some refinement be done?
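For reference, if I'm reading the referenced line correctly, after #5397 it defines the cap roughly as:

```rust
// tokio/src/io/blocking.rs (the line linked above), as of #5397.
pub(crate) const MAX_BUF: usize = 2 * 1024 * 1024;
```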
I haven't tested it lately, but I suppose it's fine. @carllerche suggested using …
tokio 0.2.4, see tokio/src/io/blocking.rs, line 29 at 0d38936.

There's an unexpected 16 KB limit for I/O operations; e.g., this will print 16384. It seems intended, but it's somewhat confusing.
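(The snippet itself did not survive extraction; below is a minimal sketch, assuming a 1 MiB write to a freshly created file, that would exhibit the reported behavior on tokio 0.2.x. The file name and buffer size are placeholders.)

```rust
use tokio::fs::File;
use tokio::io::AsyncWriteExt;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut file = File::create("test.bin").await?;
    let buf = vec![0u8; 1024 * 1024]; // 1 MiB, well above the 16 KiB cap
    // A single `write` is truncated to the internal buffer size,
    // so this prints 16384 rather than 1048576.
    let n = file.write(&buf).await?;
    println!("{}", n);
    Ok(())
}
```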