[Refactor] Introduce BaseProjectOperator and ValueBlock #10405
5f71c9a to 8bfe328
Codecov Report
@@ Coverage Diff @@
## master #10405 +/- ##
=============================================
- Coverage 68.37% 35.19% -33.19%
+ Complexity 6096 282 -5814
=============================================
Files 2055 2053 -2
Lines 111389 111397 +8
Branches 16939 16939
=============================================
- Hits 76167 39206 -36961
- Misses 29799 68778 +38979
+ Partials 5423 3413 -2010
... and 1153 files with indirect coverage changes
8bfe328 to 9ea9261
Shall we consider making ExpressionDocIdSet also take columnContextMap instead of _dataSourceMap?
lgtm
lgtm overall. let's add more detail on the intellij refactor steps done to the PR description. looks like we renamed several classes/interfaces.
public void init(List<TransformFunction> arguments, Map<String, DataSource> dataSourceMap) {
Preconditions
.checkArgument(arguments.size() == 2, "2 arguments are required for transform function: %s", getName());
public void init(List<TransformFunction> arguments, Map<String, ColumnContext> columnContextMap) {
probably include `ColumnContext` in the PR description for the refactor/rename
}

// Build intermediate result block based on aggregation result from the executor
return new AggregationResultsBlock(_aggregationFunctions, aggregationExecutor.getResult());
}

@Override
public List<Operator> getChildOperators() {
return Collections.singletonList(_transformOperator);
public List<BaseProjectOperator<?>> getChildOperators() {
This API doesn't need to change, right?
The IDE shows a warning because the generic type is not included
import org.apache.pinot.core.operator.blocks.ValueBlock;


public abstract class BaseProjectOperator<T extends ValueBlock> extends BaseOperator<T> {
do we need the generic here? seems like the usages are always `ValueBlock` instead of the generic extension of `T`, right? any chance we want to use it differently going forward?
We don't right now, but I do want to make it more specific since `ValueBlock` is not a concrete class
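To make the two generics threads above concrete, here is a minimal sketch of what a bounded type parameter can buy. All names (`GenericOperatorSketch`, `Block`, `ProjBlock`, `BaseOp`, `ProjOp`, `Consumer`) are hypothetical stand-ins, not the actual Pinot classes. It only illustrates the general Java mechanics: a subclass can pin the block type so callers need no cast, and `List<BaseOp<?>>` avoids the raw-type warning the IDE reports.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the real Pinot classes or signatures.
final class GenericOperatorSketch {

  /** Plays the role of a non-concrete block type. */
  interface Block {
    int numDocs();
  }

  /** One concrete block type with an extra method of its own. */
  static final class ProjBlock implements Block {
    private final int _numDocs;

    ProjBlock(int numDocs) {
      _numDocs = numDocs;
    }

    @Override
    public int numDocs() {
      return _numDocs;
    }

    String describe() {
      return "projection block with " + _numDocs + " docs";
    }
  }

  /** Base operator whose type parameter is bounded by the block interface. */
  abstract static class BaseOp<T extends Block> {
    abstract T nextBlock();
  }

  /** Subclass pins T, so callers get ProjBlock back without a cast. */
  static final class ProjOp extends BaseOp<ProjBlock> {
    @Override
    ProjBlock nextBlock() {
      return new ProjBlock(3);
    }
  }

  /** A parent operator that does not care which block type its child emits. */
  static final class Consumer {
    private final BaseOp<?> _child;

    Consumer(BaseOp<?> child) {
      _child = child;
    }

    // The wildcard keeps the element type generic; List<BaseOp> would be a raw type.
    List<BaseOp<?>> getChildOperators() {
      return Collections.singletonList(_child);
    }

    int numDocs() {
      // Through the wildcard, the result is only known to be a Block.
      return _child.nextBlock().numDocs();
    }
  }

  public static void main(String[] args) {
    ProjOp op = new ProjOp();
    System.out.println(op.nextBlock().describe());
    System.out.println(new Consumer(op).numDocs());
  }
}
```

If every usage really only needs the interface type, dropping the parameter would also work; the sketch just shows what the more specific return type allows.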
Both `Projection` and `Transform` in Pinot are SQL project operations.

This PR introduces the `BaseProjectOperator` base class, which represents the executor for SQL project and generates the `ValueBlock`.

The `BaseProjectOperator` is designed so that it can chain itself (it takes `ValueBlock` as both input and output).

With this change, pass-through transform is no longer needed because high-level operators (e.g. selection/aggregation/group-by) can directly take the output of `ProjectionOperator`.

In the future, we may introduce more project operators (e.g. filter on transform, local join) and build more flexible query plans to solve more complex queries.
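The chaining idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `ProjectChainSketch`, `Block`, `ProjectStep`, `Projection`, `Transform`, and `sum` are all hypothetical names, and blocks are reduced to plain `int[]` columns. The point it shows is that a consumer written against the common project type can read a leaf projection directly or read any stack of transforms on top of it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical, simplified stand-ins; NOT the real Pinot classes or signatures.
final class ProjectChainSketch {

  /** A block of values, looked up by column or expression name. */
  interface Block {
    int[] values(String column);
  }

  /** Common type for every project step: each one produces a Block. */
  interface ProjectStep {
    Block nextBlock();
  }

  /** Leaf step that reads stored columns (plays the role of a projection). */
  static final class Projection implements ProjectStep {
    private final Map<String, int[]> _columns;

    Projection(Map<String, int[]> columns) {
      _columns = columns;
    }

    @Override
    public Block nextBlock() {
      return _columns::get;
    }
  }

  /** Chained step: consumes the Block of any other step and adds one computed column. */
  static final class Transform implements ProjectStep {
    private final ProjectStep _input;
    private final String _source;
    private final String _result;
    private final IntUnaryOperator _function;

    Transform(ProjectStep input, String source, String result, IntUnaryOperator function) {
      _input = input;
      _source = source;
      _result = result;
      _function = function;
    }

    @Override
    public Block nextBlock() {
      Block in = _input.nextBlock();
      return column -> {
        if (!column.equals(_result)) {
          return in.values(column); // columns this step did not compute come from the input
        }
        return Arrays.stream(in.values(_source)).map(_function).toArray();
      };
    }
  }

  /** Consumer (plays the role of an aggregation): accepts any ProjectStep. */
  static long sum(ProjectStep step, String column) {
    long total = 0;
    for (int value : step.nextBlock().values(column)) {
      total += value;
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, int[]> columns = new HashMap<>();
    columns.put("a", new int[] {1, 2, 3});
    ProjectStep projection = new Projection(columns);

    // The consumer reads the projection directly; no pass-through step is needed.
    System.out.println(sum(projection, "a"));

    // The same consumer reads two transforms stacked on the projection.
    ProjectStep doubled = new Transform(projection, "a", "a2", v -> v * 2);
    ProjectStep plusOne = new Transform(doubled, "a2", "a2p1", v -> v + 1);
    System.out.println(sum(plusOne, "a2p1"));
  }
}
```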
This PR includes the following refactors:
- Introduce `BaseProjectOperator` and make `ProjectionOperator` and `TransformOperator` extend it
- Remove `PassThroughTransformOperator`
- Introduce `ProjectPlanNode` and `StarTreeProjectPlanNode` that handle the project operator creation
- Introduce `ValueBlock` interface as the result for `ProjectOperator`, and make `ProjectionBlock` and `TransformBlock` extend it
- Introduce `ColumnContext` as the wrapper for column metadata and optional `DataSource`
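As a rough illustration of the last item, a wrapper for "column metadata and optional data source" could look like the sketch below. Everything here is hypothetical (`ColumnContextSketch`, `Source`, `Context`, and their methods); the real `ColumnContext` may carry different fields. The sketch also assumes that a column computed by an upstream project operator has metadata but no backing data source, which is an inference from the chaining design rather than something stated in the description.

```java
import java.util.Optional;

// Hypothetical, simplified stand-ins; NOT the real Pinot classes or signatures.
final class ColumnContextSketch {

  enum DataType { INT, LONG, STRING }

  /** Stand-in for a physical column stored in a segment. */
  static final class Source {
    final DataType _dataType;
    final boolean _singleValue;

    Source(DataType dataType, boolean singleValue) {
      _dataType = dataType;
      _singleValue = singleValue;
    }
  }

  /** Column metadata plus an optional backing source. */
  static final class Context {
    private final DataType _dataType;
    private final boolean _singleValue;
    private final Source _source; // null when the column is computed (assumption)

    private Context(DataType dataType, boolean singleValue, Source source) {
      _dataType = dataType;
      _singleValue = singleValue;
      _source = source;
    }

    /** Metadata is copied from the physical column, and the source is kept. */
    static Context fromSource(Source source) {
      return new Context(source._dataType, source._singleValue, source);
    }

    /** Metadata only: there is nothing physical behind a computed column. */
    static Context forComputedColumn(DataType dataType, boolean singleValue) {
      return new Context(dataType, singleValue, null);
    }

    DataType dataType() {
      return _dataType;
    }

    boolean isSingleValue() {
      return _singleValue;
    }

    Optional<Source> source() {
      return Optional.ofNullable(_source);
    }
  }

  public static void main(String[] args) {
    Context physical = Context.fromSource(new Source(DataType.INT, true));
    Context computed = Context.forComputedColumn(DataType.LONG, true);
    System.out.println(physical.dataType() + " hasSource=" + physical.source().isPresent());
    System.out.println(computed.dataType() + " hasSource=" + computed.source().isPresent());
  }
}
```

With a shape like this, code that only needs the type information (for example a transform function's `init`) can treat both kinds of column uniformly and consult the source only when one is present.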
Incompatible (compile)
Several interfaces are changed to use the new interface/class