Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make index creator provision pluggable #7885

Merged
merged 4 commits into from
Dec 16, 2021

Conversation

richardstartin
Copy link
Member

@richardstartin richardstartin commented Dec 9, 2021

This allows index creation to be intercepted, so that the current static logic in SegmentIndexCreator can be extended or overridden. This is achieved by introducing a new interface IndexCreatorProvider which provides various new index creators from an IndexCreationContext which bundles all information about index creation. External users can register a decorator which can enhance or entirely replace the default index creator provision logic. Typically, a registered decorator should pattern match the IndexCreatorContext for very specific cases, and return a custom implementation only in those cases, and merely delegate to the default provisioning logic otherwise.

Design doc at #7895

@richardstartin richardstartin force-pushed the pluggable-index-creation branch from d42823d to 4cc35e3 Compare December 9, 2021 16:13
@richardstartin richardstartin marked this pull request as draft December 9, 2021 16:18
@codecov-commenter
Copy link

codecov-commenter commented Dec 9, 2021

Codecov Report

Merging #7885 (f861dd6) into master (fb12a50) will decrease coverage by 43.69%.
The diff coverage is 15.66%.

❗ Current head f861dd6 differs from pull request most recent head 21140ba. Consider uploading reports for the commit 21140ba to get more accurate results
Impacted file tree graph

@@              Coverage Diff              @@
##             master    #7885       +/-   ##
=============================================
- Coverage     71.34%   27.65%   -43.70%     
=============================================
  Files          1587     1583        -4     
  Lines         82071    81996       -75     
  Branches      12267    12241       -26     
=============================================
- Hits          58553    22675    -35878     
- Misses        19550    57261    +37711     
+ Partials       3968     2060     -1908     
Flag Coverage Δ
integration1 ?
integration2 27.65% <15.66%> (+<0.01%) ⬆️
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../java/org/apache/pinot/common/utils/DataTable.java 89.47% <ø> (ø)
...e/pinot/core/common/datatable/DataTableImplV2.java 0.00% <0.00%> (-77.78%) ⬇️
...rg/apache/pinot/core/minion/RawIndexConverter.java 0.00% <0.00%> (-56.61%) ⬇️
...ache/pinot/core/plan/InstanceResponsePlanNode.java 100.00% <ø> (ø)
...nverttorawindex/ConvertToRawIndexTaskExecutor.java 0.00% <0.00%> (-100.00%) ⬇️
...ment/creator/impl/DefaultIndexCreatorProvider.java 0.00% <0.00%> (ø)
...ment/creator/impl/SegmentColumnarIndexCreator.java 0.00% <0.00%> (-84.40%) ⬇️
...ocal/segment/index/loader/IndexHandlerFactory.java 0.00% <0.00%> (-100.00%) ⬇️
...t/index/loader/bloomfilter/BloomFilterHandler.java 0.00% <0.00%> (-77.59%) ⬇️
...nt/index/loader/invertedindex/FSTIndexHandler.java 0.00% <0.00%> (-82.70%) ⬇️
... and 1171 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb12a50...21140ba. Read the comment docs.

@richardstartin richardstartin force-pushed the pluggable-index-creation branch from 4cc35e3 to a28f516 Compare December 9, 2021 17:35
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. With this we can easily plug in out own index creators

@richardstartin richardstartin force-pushed the pluggable-index-creation branch 2 times, most recently from 874335b to 989f68b Compare December 13, 2021 12:46
@richardstartin richardstartin changed the title make index creator provision interceptible make index creator provision pluggable Dec 13, 2021
@richardstartin richardstartin marked this pull request as ready for review December 13, 2021 13:00
@richardstartin richardstartin force-pushed the pluggable-index-creation branch from 989f68b to d4a8970 Compare December 13, 2021 14:11
@richardstartin richardstartin force-pushed the pluggable-index-creation branch from b9344ef to 24b64ff Compare December 13, 2021 19:42
@siddharthteotia
Copy link
Contributor

@richardstartin just seeing this. I will review it today. Looks like the design doc link was intended to be an issue but it is linking to this PR ?

@richardstartin richardstartin force-pushed the pluggable-index-creation branch 6 times, most recently from bb122fc to d9c38cf Compare December 14, 2021 18:51
@amrishlal
Copy link
Contributor

amrishlal commented Dec 15, 2021

If needed in future, would it be possible to extend these interfaces to allow for creation of multiple indices on a column or would it lead to breaking existing index plugins? For example, I could create an H3 index and a quad-tree index on a geospatial column, or a flat index and a tree index on a JSON column and then depending upon the type of query being evaluated, pick one out of the two available indices to evaluate the query.

@richardstartin
Copy link
Member Author

If needed in future, would it be possible to extend these interfaces to allow for creation of multiple indices on a column or would it lead to breaking existing index plugins? For example, I could create an H3 index and a quad-tree index on a geospatial column, or a flat index and a tree index on a JSON column and then depending upon the type of query being evaluated, pick one out of the two available indices to evaluate the query.

great question. No, the ability to create multiple indexes on a column is not limited by this PR: that logic stays in SegmentColumnarIndexCreator.

@richardstartin richardstartin force-pushed the pluggable-index-creation branch from 81495a5 to 02aa71d Compare December 15, 2021 22:18
@richardstartin richardstartin force-pushed the pluggable-index-creation branch from ea56de2 to 9dfc034 Compare December 15, 2021 22:23
@richardstartin richardstartin force-pushed the pluggable-index-creation branch from 1cf8942 to ca18760 Compare December 16, 2021 09:57
@richardstartin
Copy link
Member Author

richardstartin commented Dec 16, 2021

@siddharthteotia there is now only one method for creating text indexes now.

@richardstartin richardstartin force-pushed the pluggable-index-creation branch from ca18760 to 1b3eb5a Compare December 16, 2021 10:38
@richardstartin richardstartin force-pushed the pluggable-index-creation branch from 2b42bc9 to 21140ba Compare December 16, 2021 13:35
Copy link
Contributor

@siddharthteotia siddharthteotia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for addressing comments

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

}

class Text extends Wrapper {
private final boolean _commitOnClose;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really required as it only applies to mutable index (for consuming segment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, it was hardcoded to be always true, with a comment to name it. Now it has a variable to give it a name and declare its purpose.

* The responsibility for ensuring that the correct parameters for a particular
* index type lies with the caller.
*/
public interface IndexCreationContext {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some index independent properties that can be moved to the common: min, max, sortedUniqueValues

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can be added later if they're ever necessary. It's always easier to add things than take them away.

@Jackie-Jiang Jackie-Jiang merged commit f9ab252 into apache:master Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants