Add AsyncIO support for tuning readahead_size by block cache lookup #11936
@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
LGTM! A couple of minor comments.
@akankshamahajan15 merged this pull request in c77b50a. |
Summary: Add support for tuning of readahead_size by block cache lookup for async_io.
Design / Implementation -

BlockBasedTableIterator.cc -
The `BlockCacheLookupForReadAheadSize` callback API performs lookups in the block cache and tries to narrow the start and end offsets passed to it. This function looks into the block cache for the blocks between `start_offset` and `end_offset` and adds all the handles to the queue. It then iterates over the handles from the end to find the first missing block and updates the end offset to that block.
It also iterates from the start to find the first missing block and updates the start offset to that block.
If there is no data to be read in that callback (because of upper_bound, or because all blocks are in the cache), it sets the start and end offsets to be equal, and `FilePrefetchBuffer` interprets that as a length of 0 to be read.

FilePrefetchBuffer.cc -
FilePrefetchBuffer calls the callback `ReadAheadSizeTuning` and passes the start and end offsets to that callback to get the updated start and end offsets to read, based on cache hits/misses.
For example, suppose the following are the data blocks with their cache hit/miss status and start offsets, and the Read API finds a miss on DB1; based on readahead_size (50), it passes an end offset of 50.
[DB1 - miss - 0] [DB2 - hit - 10] [DB3 - miss - 20] [DB4 - miss - 30] [DB5 - hit - 40]
[DB6 - hit - 50] [DB7 - miss - 60] [DB8 - miss - 70] [DB9 - hit - 80] [DB10 - hit - 90]
For the Read call, the updated start offset remains 0, but the end offset is updated to DB4, as DB5 is in the cache.
The Read call saves the initial end offset, 50, as that was what was meant to be prefetched.
For the next ReadAsync call, the start offset will be 50 (the previous buffer's initial end offset) and, based on readahead_size, the end offset will be 100.
In the callback, because of cache hits, the start offset is updated to 60 and the end offset to 80, so that only 2 data blocks (DB7 and DB8) are read.
For that ReadAsync call, the initial end offset is set to 100, which will in turn be used by the next ReadAsync call as its start offset.
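
The trimming described above can be sketched in isolation. This is only an illustrative model, not the code in this PR: the `Block` struct, `TrimByCacheHits`, and `ExampleBlocks` are hypothetical names, every block is assumed to be 10 bytes long, and returning `{start, start}` for the "nothing to read" case is an arbitrary choice (the description only says the two offsets become equal).

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data block: where it starts, how long it is,
// and whether the block cache already holds it.
struct Block {
  uint64_t offset;
  uint64_t size;
  bool in_cache;
};

// Shrinks [start, end) so that it begins at the first cache miss and stops
// right after the last cache miss. Cache hits in the middle stay in range.
// Returns equal offsets when nothing needs to be read.
std::pair<uint64_t, uint64_t> TrimByCacheHits(const std::vector<Block>& blocks,
                                              uint64_t start, uint64_t end) {
  // Collect the blocks that lie fully inside the requested range.
  std::vector<const Block*> in_range;
  for (const Block& b : blocks) {
    if (b.offset >= start && b.offset + b.size <= end) {
      in_range.push_back(&b);
    }
  }
  // Skip cached blocks at the front.
  size_t first = 0;
  while (first < in_range.size() && in_range[first]->in_cache) {
    ++first;
  }
  if (first == in_range.size()) {
    return {start, start};  // Everything is cached: zero-length read.
  }
  // Skip cached blocks at the back; in_range[first] is a miss, so this loop
  // stops at index `first` at the latest.
  size_t last = in_range.size() - 1;
  while (in_range[last]->in_cache) {
    --last;
  }
  return {in_range[first]->offset,
          in_range[last]->offset + in_range[last]->size};
}

// The ten-block layout from the example, assuming 10-byte blocks.
std::vector<Block> ExampleBlocks() {
  return {{0, 10, false},  {10, 10, true},  {20, 10, false}, {30, 10, false},
          {40, 10, true},  {50, 10, true},  {60, 10, false}, {70, 10, false},
          {80, 10, true},  {90, 10, true}};
}
```

Under these assumptions, a request for offsets 0 to 50 is narrowed to 0 to 40 (DB5 is dropped), and a request for 50 to 100 is narrowed to 60 to 80 (only DB7 and DB8 remain), matching the walkthrough.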
`initial_end_offset_` in `BufferInfo` is used to save the initial end offset of that buffer. If, say, DB5 and DB6 overlap in 2 buffers (because of alignment), `prev_buf_end_offset` is passed to make sure that already prefetched data is not prefetched again in the second buffer.
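
How the saved initial end offset chains one request to the next can be sketched as follows. This is a simplified model rather than the actual `FilePrefetchBuffer` code: `BufferPlan`, `TuneFn`, and `PlanRead` are hypothetical names, and alignment and the `prev_buf_end_offset` handling are left out.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical, simplified view of what one prefetch buffer ends up reading.
struct BufferPlan {
  uint64_t read_start;          // first byte actually requested
  uint64_t read_end;            // one past the last byte actually requested
  uint64_t initial_end_offset;  // untrimmed end; the next request starts here
};

// Callback that may shrink [start, end) in place based on block cache hits.
using TuneFn = std::function<void(uint64_t& start, uint64_t& end)>;

BufferPlan PlanRead(uint64_t start, uint64_t readahead_size,
                    const TuneFn& tune) {
  BufferPlan plan;
  uint64_t end = start + readahead_size;
  plan.initial_end_offset = end;  // remember the end before any trimming
  tune(start, end);               // the callback may narrow both offsets
  plan.read_start = start;
  plan.read_end = end;  // read_start == read_end means nothing to read
  return plan;
}
```

In this model, the second request starts at the first plan's `initial_end_offset` (50 in the example) rather than at its trimmed `read_end` (40), so the cached block DB5 is not requested again.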