-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[colocated-join] Adds Support for instancePartitionsMap in Table Config #8989
Conversation
Codecov Report
@@ Coverage Diff @@
## master #8989 +/- ##
=============================================
+ Coverage 28.83% 69.99% +41.15%
- Complexity 47 4685 +4638
=============================================
Files 1805 1852 +47
Lines 94438 98822 +4384
Branches 14140 15032 +892
=============================================
+ Hits 27234 69171 +41937
+ Misses 64654 24779 -39875
- Partials 2550 4872 +2322
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
@@ -55,19 +58,36 @@ public InstanceAssignmentDriver(TableConfig tableConfig) { | |||
public InstancePartitions assignInstances(InstancePartitionsType instancePartitionsType, | |||
List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions existingInstancePartitions) { | |||
String tableNameWithType = _tableConfig.getTableName(); | |||
LOGGER.info("Starting {} instance assignment for table: {}", instancePartitionsType, tableNameWithType); | |||
Preconditions.checkState(!TableConfigUtils.isTableInGroup(_tableConfig)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any concerns with making ZK propertyStore a member of InstanceAssignmentDriver
? That would allow to get rid of this Precondition here and return the pre-computed instance-partitions by doing a look-up on the propertyStore here.
Otherwise we will have to do a check whenever we call this method to see if the table is part of a table-group and handle that case every time.
...t-common/src/main/java/org/apache/pinot/common/assignment/InstanceAssignmentConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableGroupConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableGroupConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/Constants.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/pinot/controller/api/resources/PinotTableGroupRestletResource.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/pinot/controller/api/resources/PinotTableGroupRestletResource.java
Outdated
Show resolved
Hide resolved
...apache/pinot/controller/helix/core/assignment/segment/ColocatedOfflineSegmentAssignment.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableGroupConfig.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/pinot/controller/api/resources/PinotTableGroupRestletResource.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
Outdated
Show resolved
Hide resolved
if (StringUtils.isBlank(tableConfig.getTableGroupName())) { | ||
return; | ||
} | ||
Preconditions.checkState(tableConfig.getValidationConfig().getReplicaGroupStrategyConfig() != null, "Must provide" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check may not be needed.
replicaGroupStrategyConfig
is the legacy way (which at some point has to be deleted so let's try not to leak it more in the code) of specifying replica groups and partitioned replica groups config.
Since adding tables to TableGroup
is wired via TableGroupConfig
and the instanceAssignmentConfig
within TableGroupConfig, it is sufficient and you don't need to enforce that each table within the TableGroup should also have their own replicaGroupStrategyConfig
.
Also, the current check does not really ensure that table is indeed partitioned. So you need to check for segmentPartitionConfig
.
The partitioning column info coming out of replicaGroupStrategyConfig is only relevant in the context when a partitioned table is partitioned at the replica group level as well. In any case, InstanceAssignmentConfig
/ InstanceReplicalGroupPartitionConfig
handle both scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I was also confused about replicaGroupStrategyConfig
. I have removed it from the validation logic now and instead check for column partition map.
Just left a few comments. Still reviewing rest of the changes. |
return assignInstances(instancePartitionsName, groupName, instanceConfigs, assignmentConfig, null); | ||
} | ||
|
||
private static InstancePartitions assignInstances(String instancePartitionsName, String entityName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the concept of entityName
and why is it needed. Can you please elaborate ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so this method is essentially getting called for both a table-group and a table itself, so it could either be a table-group name or a table name. This name is being used for logging and performing hash-based rotations (e.g. HashBasedRotateInstanceConstraintApplier
).
@@ -1688,13 +1690,22 @@ private void assignInstances(TableConfig tableConfig, boolean override) { | |||
} | |||
} | |||
|
|||
boolean isTableInGroup = TableConfigUtils.isTableInGroup(tableConfig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the scenarios that we support ?
Adding a new table to TableGroup that pre-exists with 1 or more tables.
In such scenario, the replica group based InstancePartitions for the group should pre-exist and when addTable
calls comes in PinotHelixResourceManager
and internally calls assignInstances
, no new instance assignment should happen and the group's existing InstancePartitions
should be retrieved and persisted in ZK for the new table. Looks like this is supported ?
Creating a new TableGroup with new tables.
- When the group is created, we don't do instance assignment and just do metadata change w.r.t that group exists without any tables and no instances.
- When the first new table is created and added to the group, the
addTable -> assignInstances
call should recognize that this table is part of a group for which instance assignment is yet to happen. So we call the driver to assign instances for the group. - When the 2nd, 3rd ... table is created and added to the group, the
addTable -> assignInstances
call should recognize that this table is part of a group for which instance assignment is already done.
The other way to do is that when TableGroup is created, it triggers InstanceAssignment by calling the driver. So when 1st, 2nd, 3rd table is added, it finds it is member of an existing group with existing assignment so it is pretty much NO-OP
It seems like code at line 1701 is assuming that InstancePartitions for a TableGroup should exist. But I don't see instance assignment being done during creation of TableGroup. So how do we handle the case of 1st table ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore
Creating a new TableGroup out of N existing tables
Option 1 -
- Create TableGroup and assign instances so InstancePartitions for the group are computed and persisted
- Trigger rebalance on each of the N tables with reassignInstances set to true
- The rebalance code will call the instance assignment driver and it should return the existing InstancePartitions for the group.
- Segments are then assigned based on the returned InstancePartitions
Option 2 -
- Create TableGroup but don't do instance assignment.
- Trigger rebalance on each of the N tables with reassignInstances set to true.
- The rebalance code will call the instance assignment driver and it should detect that InstancePartitions don't exit for the group, so it assigns instance which are then used for segment assignment.
Adding an existing table to an existing TableGroup
This is like steps 2,3,4 of option 1 mentioned above. Correct ?
It would be good to get clarity (and add brief details to code comments/docs as well) on how such scenarios are handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a new table to TableGroup that pre-exists with 1 or more tables.
Yes this is supported. To do this one can edit the table-config for the corresponding table and set the groupName
field. You will have to initiate a rebalance on the table to ensure data is moved around appropriately for that table.
The other way to do is that when TableGroup is created, it triggers InstanceAssignment by calling the driver.
Yup this is how I am doing it right now. You can refer to the createTableGroup API in PinotTableGroupRestletResource. When the table-group is created, we perform instance assignment for the same and persist it in ZK for future re-use.
When we need to update the instance assignment for the group (due to addition/removal of instances), we can initiate a "Re-compute Table Group" operation from the UI and will then have to call Rebalance for each of the tables. The "Re-compute" operation will do the instance assignment for the group again and update the same in ZK. When we initiate rebalance for a table in the group, data will move around for that table based on the newly updated instance partitions for the group.
Later, I am also planning to add a "Re-balance Table Group" option in the UI which should move the data for all the tables in the group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new TableGroup out of N existing tables
Yes current approach is based on option-1.
Adding an existing table to an existing TableGroup
This is like steps 2,3,4 of option 1 mentioned above. Correct ?
Yes that's correct.
I'll update the code and doc with more details on this.
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
Outdated
Show resolved
Hide resolved
Hi @ankitsultana, Thanks for writing that doc and also making the video. It was very helpful. I think we can simplify the design a bit and align it with how the other features achieve segment assignment. Please take a look at it and we can discuss on the doc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only some minor comments
pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/utils/builder/TableConfigBuilder.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/assignment/InstancePartitions.java
Show resolved
Hide resolved
...n/java/org/apache/pinot/controller/api/resources/PinotInstanceAssignmentRestletResource.java
Outdated
Show resolved
Hide resolved
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
Outdated
Show resolved
Hide resolved
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java
Outdated
Show resolved
Hide resolved
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor non-blocking comments
You can find the design doc here: https://docs.google.com/document/d/1HWkRmjUxLBnkKETw21GJ--yQdiFa9hEE4v6Y68nNWx4/edit
I am planning to raise PRs in 4 increments: