Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend SegmentDirectory.Reader iface to abstract access to index buffers from StarTree index #10158

Merged
merged 4 commits into from
Jan 24, 2023

Conversation

klsince
Copy link
Contributor

@klsince klsince commented Jan 20, 2023

Move the access of StarTree index data buffers behind the SegmentDirectory interface, so that the access to ST index data can be customized via custom SegmentDirectory implementations.


public abstract String toString();

public PinotDataBuffer getStarTreeIndex()
Copy link
Contributor Author

@klsince klsince Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi, getStarTreeIndex and getStarTreeIndexMap were added very recenlty by #9828 but they didn't abstract the access to ST index very cleanly and not very generic. This PR improves on this.

@klsince klsince force-pushed the segment_reader_for_st_index branch from 43ffdea to 0fc2506 Compare January 20, 2023 19:48
@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2023

Codecov Report

Merging #10158 (7d099d0) into master (4d7da0a) will increase coverage by 35.55%.
The diff coverage is 79.38%.

@@              Coverage Diff              @@
##             master   #10158       +/-   ##
=============================================
+ Coverage     34.84%   70.39%   +35.55%     
- Complexity      200     5804     +5604     
=============================================
  Files          2006     2018       +12     
  Lines        108718   109001      +283     
  Branches      16501    16541       +40     
=============================================
+ Hits          37880    76734    +38854     
+ Misses        67566    26891    -40675     
- Partials       3272     5376     +2104     
Flag Coverage Δ
integration1 24.54% <0.00%> (+0.03%) ⬆️
integration2 24.39% <0.00%> (+0.09%) ⬆️
unittests1 67.87% <79.38%> (?)
unittests2 13.62% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ent/local/segment/store/FilePerIndexDirectory.java 94.79% <ø> (+94.79%) ⬆️
.../local/segment/store/SingleFileIndexDirectory.java 90.72% <0.00%> (+90.72%) ⬆️
.../pinot/segment/spi/store/ColumnIndexDirectory.java 40.00% <ø> (+40.00%) ⬆️
...ache/pinot/segment/spi/store/SegmentDirectory.java 45.45% <33.33%> (+45.45%) ⬆️
...t/local/segment/store/SegmentLocalFSDirectory.java 73.52% <76.47%> (+73.52%) ⬆️
...gment/local/segment/store/StarTreeIndexReader.java 81.15% <81.15%> (ø)
...indexsegment/immutable/ImmutableSegmentLoader.java 85.71% <100.00%> (+85.71%) ⬆️
...ocal/startree/v2/store/StarTreeIndexContainer.java 100.00% <100.00%> (+100.00%) ⬆️
...t/local/startree/v2/store/StarTreeLoaderUtils.java 89.18% <100.00%> (+89.18%) ⬆️
...ot/core/operator/blocks/InstanceResponseBlock.java 69.81% <0.00%> (-9.44%) ⬇️
... and 1168 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@@ -248,25 +250,26 @@ protected void load()

private synchronized void loadData()
throws IOException {
if (_columnIndexDirectory != null) {
return;
if (_columnIndexDirectory == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest reverting this part. We do the check to prevent loadData() twice. We only need to add if (CollectionUtils.isNotEmpty(_segmentMetadata.getStarTreeV2MetadataList()) {...}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes continue to ensure _columnIndexDirectory (and _starTreeIndexReader) is loaded once. I separate their handling because the SegmentDirectory.Writer closes and sets _columnIndexDirectory to null, but Writer doesn't need to close and set _starTreeIndexReader to null, as it is for readonly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it doesn't seem correct to me. When closing the writer, we should close the star-tree as well

Copy link
Contributor Author

@klsince klsince Jan 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think SegmentDirectory.close() is supposed to release everything owned by the directory, and IMO the SegmentDirectory..Writer.close() is more about to flush any changes, but _starTreeIndexReader is for read only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But after writer.close() is called, I don't think we can read the segment anymore.
The reader/writer name can be misleading as they are not independent of the segment directory. Unless I am missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about SegmentDirectory interface suggests that's possible, but need to call createReader() to get a new reader. The buffers are not loaded redundantly with the null-checks in loadData() method.

We only need to add if (CollectionUtils.isNotEmpty(_segmentMetadata.getStarTreeV2MetadataList()) {...}

Then I should close _starTreeIndexReader in Writer.close() method too. I'm fine with that way.

return _starTreeIndexReader != null;
}

public SegmentDirectory.Reader getSegmentIndexReaderFor(String indexName, SegmentIndexType type) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a common abstraction for segment level index. Suggest for now just add APIs for star-tree: getStarTreeReader(int starTreeId)

@Jackie-Jiang Jackie-Jiang merged commit 06ed8c3 into apache:master Jan 24, 2023
@klsince klsince deleted the segment_reader_for_st_index branch January 24, 2023 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants