There is an outstanding issue in Tokio (see tokio-rs/tokio#1976) where a call to read an AsyncFile into a buffer will read at most 16KB, no matter how much data is available or how large the buffer passed into the function is. Based on my initial testing, this can cause inferior performance compared to reading up to the capacity of the buffer using the logic in buffer.js (deno/runtime/js/13_buffer.js, lines 159 to 181 at 2077864).
If you log nread per loop iteration for a large file, you will see it is always 16KB. If you log nread in the synchronous version of this function for the same large file, you will see that nread grows as the resize logic increases the buffer size.
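For reference, the read pattern described above boils down to a grow-and-read loop. The sketch below is not the actual 13_buffer.js source, just a minimal illustration of the pattern, assuming a reader that follows Deno's Reader interface (read(buf) resolves to the number of bytes read, or null at EOF):

// Minimal sketch of the grow-and-read loop (illustrative, not the real
// 13_buffer.js code). `reader` is anything with a Deno-style
// read(buf) -> Promise<number | null> method, e.g. an open file.
async function readAllInto(reader) {
  let buf = new Uint8Array(32 * 1024); // starting capacity (arbitrary here)
  let length = 0; // bytes of `buf` that hold data so far
  while (true) {
    if (length === buf.length) {
      // Out of room: grow the buffer (the real implementation has its own
      // growth policy; doubling is just for illustration).
      const bigger = new Uint8Array(buf.length * 2);
      bigger.set(buf);
      buf = bigger;
    }
    // Hand the read all of the remaining free capacity. Logging nread here
    // is what shows the 16KB ceiling for async file reads.
    const nread = await reader.read(buf.subarray(length));
    if (nread === null) break; // EOF
    length += nread;
  }
  return buf.subarray(0, length);
}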
My proposed solution is to modify the read logic in io.rs to keep reading until the buffer is full or the underlying read is complete (EOF). Below is a diff for testing:
diff --git a/runtime/ops/io.rs b/runtime/ops/io.rs
index 607da74c..75f3a699 100644
--- a/runtime/ops/io.rs
+++ b/runtime/ops/io.rs
@@ -498,6 +498,15 @@ pub fn op_read(
     stream.read(&mut zero_copy).await?
   } else if let Some(stream) = resource.downcast_rc::<StreamResource>() {
-    stream.clone().read(&mut zero_copy).await?
+    let mut total_bytes_read = 0;
+    let len = zero_copy.len();
+    let mut slice = &mut zero_copy[0..len];
+    while total_bytes_read < len {
+      let read = stream.clone().read(&mut slice).await?;
+      if read == 0 { break; }
+      total_bytes_read += read;
+      slice = &mut zero_copy[total_bytes_read..len];
+    }
+    total_bytes_read
   } else {
     return Err(bad_resource_id());
   };
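To see the effect of the patch from the JavaScript side, a single read into a buffer sized to the whole file is a quick check. This snippet is my own sketch (the file name is assumed), not part of the patch:

// Quick check of read() behavior (sketch; "large-file" is assumed to exist).
// Without the patch, n is capped at roughly 16KB per call for async files;
// with the patch, a single call fills the buffer (or stops at EOF).
const file = await Deno.open("large-file", { read: true });
const info = await Deno.stat("large-file");
const buf = new Uint8Array(info.size);
const n = await file.read(buf);
console.log(`read ${n} of ${info.size} bytes in one call`);
file.close();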
Using the following minibenchmark, I am able to see a performance improvement for reading large files multiple times. Output is in milliseconds.
user@ubuntu:~/proj/ng-async-files$ deno run --allow-read benchmark.js
Starting...
3046
user@ubuntu:~/proj/ng-async-files$ deno run --allow-read benchmark.js
Starting...
2899
user@ubuntu:~/proj/ng-async-files$ deno run --allow-read benchmark.js
Starting...
2980
user@ubuntu:~/proj/ng-async-files$ ../deno/target/release/deno run --allow-read benchmark.js
Starting...
2462
user@ubuntu:~/proj/ng-async-files$ ../deno/target/release/deno run --allow-read benchmark.js
Starting...
2444
user@ubuntu:~/proj/ng-async-files$ ../deno/target/release/deno run --allow-read benchmark.js
Starting...
2458
user@ubuntu:~/proj/ng-async-files$
In the runs above, deno is the latest release (downloaded today), and ../deno/target/release/deno is a build with this patch applied. The benchmark code is as follows; large-file is a 110MB plain text file that I generated at random before starting the test:
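The original benchmark script is not reproduced in this text; the sketch below is only an assumption of what a minimal version could look like, timing repeated whole-file reads of large-file and printing the total in milliseconds:

// benchmark.js (sketch; the real script may differ, and the iteration count
// is assumed). Reads the 110MB file repeatedly and prints elapsed time in ms.
const ITERATIONS = 10;
console.log("Starting...");
const start = performance.now();
for (let i = 0; i < ITERATIONS; i++) {
  await Deno.readFile("large-file"); // exercises the async read path discussed above
}
console.log(Math.round(performance.now() - start));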
If the core maintainers are interested in me submitting this patch as a PR, I would be more than happy to. If there are additional tests or benchmarks they would like me to run, I would be happy to do that as well.
I found a flaw in my initial patch: I wasn't incrementing the index into the buffer where the data is inserted. I've updated the patch and benchmarks in the initial issue.
After discussing this on chat with some contributors, I am going to close this issue. While this patch can improve performance, it would be a breaking change for anything that implements a stream where a read legitimately returns less data than the size of the buffer (think a slow UDP stream). Even if the change were limited to files, it would alter behavior for things like reading a single character from stdin when the buffer is larger than one character (see raw_mode.ts).
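For a concrete picture of the breaking change, consider reading a keypress from stdin with raw mode enabled; this is a sketch of the semantics, not code from raw_mode.ts:

// With the current semantics, this resolves as soon as any input is
// available, so a single keypress returns n === 1 even though the buffer
// holds 1024 bytes. With a fill-the-buffer read, the call would keep
// waiting until 1024 bytes arrive (or EOF), breaking interactive use.
const buf = new Uint8Array(1024);
const n = await Deno.stdin.read(buf);
console.log(`got ${n} byte(s)`);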