Allow segment upload via Metadata in MergeRollup Minion task #9825
@@ -242,6 +255,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig
new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER,
segmentZKMetadataCustomMapModifier.toJsonString());

URI outputSegmentTarURI = moveSegmentToOutputPinotFS(pinotTaskConfig.getConfigs(), convertedTarredSegmentFile);
I think we should check the push mode before moving the segment. For a tar push, this intermediate step is not needed.
makes sense
+1
As @zhtaoxiang mentioned, we should NOT move the segments to deep storage when we do the segment tar file push. In that case, the controller will move the segment to the deep storage during the upload phase.
...minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/MinionPushUtils.java
singleFileGenerationTaskConfig.put(
BatchConfigProperties.OUTPUT_SEGMENT_DIR_URI, outputSegmentDirURI.toString());
}
if ((outputDirURI == null) || (pushMode == null)) {
When the push mode is MetadataPush, we should default the location to data_dir/tablename when OUTPUT_SEGMENT_DIR_URI is not set by users. Most of the time, users will not need to set it.
String downloadUrl = downloadURLList[0];
URI downloadURI = URI.create(downloadUrl);
URI outputDirURI = null;
if (!downloadURI.getScheme().contentEquals("http")) {
Maybe I misunderstand something, but I don't fully follow the logic here. Why is the outputDir decided by the downloadUrl?
Because the segment needs to be copied somewhere the controller can access in the case of a metadata push.
If the user doesn't set outputDirUri, the best place to copy it is the deep store. The download URL refers to a deep store path, so it is used to derive the output directory.
If that's the goal, then we just need to use controllerConfig.getDataDir()/tableName. What do you think?
That makes more sense.
Added a commit to use dataDir.
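A minimal sketch of the fallback agreed on in this thread, not the code from the PR: prefer a user-supplied output directory and otherwise fall back to dataDir/tableName. The class name, method name, and the literal config key below are hypothetical stand-ins (the diff refers to the key as BatchConfigProperties.OUTPUT_SEGMENT_DIR_URI).

```java
import java.net.URI;
import java.util.Map;

final class OutputDirResolver {
  // Hypothetical key string for illustration only; the PR reads
  // BatchConfigProperties.OUTPUT_SEGMENT_DIR_URI instead.
  static final String OUTPUT_SEGMENT_DIR_URI = "outputSegmentDirURI";

  /** Returns the user-configured output dir if present, else dataDir/tableName. */
  static URI resolveOutputDir(Map<String, String> taskConfigs, String dataDir, String tableName) {
    String configured = taskConfigs.get(OUTPUT_SEGMENT_DIR_URI);
    if (configured != null && !configured.isEmpty()) {
      return URI.create(configured);
    }
    // Avoid a double slash when dataDir already ends with one.
    String base = dataDir.endsWith("/") ? dataDir.substring(0, dataDir.length() - 1) : dataDir;
    return URI.create(base + "/" + tableName);
  }
}
```

With an empty task config and a data dir of s3://bucket/data/, the sketch resolves the output directory for table myTable to s3://bucket/data/myTable.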
...pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/TaskUtils.java
...c/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java
@@ -242,6 +256,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig
new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER,
segmentZKMetadataCustomMapModifier.toJsonString());

URI outputSegmentTarURI = moveSegmentToOutputPinotFS(pinotTaskConfig.getConfigs(), convertedTarredSegmentFile);
LOGGER.info("Moved generated segment from [{}] to location: [{}]",
What happens if the Minion is killed (e.g. by a restart) after it finishes moving the segment to outputPinotFS but before it finishes the metadata push to the controller?
I think this could leave files in deep storage with no Pinot-side segment metadata pointing to them. That will become a problem unless we have a cleanup mechanism. Do we have a design or solution to address this?
One solution would be to always specify a separate OUTPUT_DIR and always clean that directory if the task fails. Even if we fail to clean the outputDir, it should not be a problem as long as it is separate from the deep store.
Personally, I think this may not be a problem. Because Minion tasks should be idempotent, a later retry should generate segments with the same names, and the existing files will be reused or overwritten.
We should not assume that a later retry will always come along to finish this particular task. Similarly, the controller can be down and fail the metadata update. If the controller recovers only after the Minion's retry attempts have failed, I think some files will be left sitting in deep storage.
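The "clean the output dir when the task fails" idea from this thread can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: it uses java.nio.file on a local path in place of PinotFS, and StagedSegmentPush, MetadataPusher, and stageAndPush are hypothetical names. It only covers failures the task itself observes; a Minion that is killed between the copy and the push would still leave the staged file behind, which is the gap raised above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StagedSegmentPush {
  /** Hypothetical stand-in for the metadata push to the controller. */
  interface MetadataPusher {
    void push(Path stagedSegment) throws Exception;
  }

  /** Copies the tar file to outputDir, pushes metadata, and removes the copy if the push fails. */
  static Path stageAndPush(Path localTar, Path outputDir, MetadataPusher pusher) throws Exception {
    Files.createDirectories(outputDir);
    Path staged = outputDir.resolve(localTar.getFileName());
    // REPLACE_EXISTING keeps a retry with the same segment name idempotent.
    Files.copy(localTar, staged, StandardCopyOption.REPLACE_EXISTING);
    try {
      pusher.push(staged);
      return staged;
    } catch (Exception e) {
      try {
        // Best-effort cleanup so a failed push does not orphan the staged file.
        Files.deleteIfExists(staged);
      } catch (IOException cleanupFailure) {
        e.addSuppressed(cleanupFailure);
      }
      throw e;
    }
  }
}
```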
Codecov Report
@@ Coverage Diff @@
## master #9825 +/- ##
=============================================
+ Coverage 28.06% 70.43% +42.36%
- Complexity 53 4984 +4931
=============================================
Files 1949 1981 +32
Lines 104632 106370 +1738
Branches 15847 16117 +270
=============================================
+ Hits 29362 74918 +45556
+ Misses 72395 26235 -46160
- Partials 2875 5217 +2342
public static Map<String, String> getPushTaskConfig(String tableName, Map<String, String> taskConfigs,
ClusterInfoAccessor clusterInfoAccessor) {
try {
URI outputDirURI = URI.create(clusterInfoAccessor.getDataDir() + "/" + tableName);
I think we should check the push mode before deciding the outputDir.
If the push mode is tar, we only need to build the segment locally and tar-push it to Pinot.
Let me know if I misunderstood something.
It is not really used in the case when pushMode is TAR.
if (BatchConfigProperties.SegmentPushType.valueOf(pushMode.toUpperCase())
    != BatchConfigProperties.SegmentPushType.TAR) {
  outputSegmentTarURI = moveSegmentToOutputPinotFS(configs, convertedTarredSegmentFile);
  LOGGER.info("Moved generated segment from [{}] to location: [{}]", convertedTarredSegmentFile,
      outputSegmentTarURI);
} else {
  outputSegmentTarURI = convertedTarredSegmentFile.toURI();
}
......
case TAR:
  try (PinotFS pinotFS = MinionTaskUtils.getLocalPinotFs()) {
    SegmentPushUtils.pushSegments(
        spec, pinotFS, Arrays.asList(outputSegmentTarURI.toString()), headers, parameters);
  } catch (RetriableOperationException | AttemptsExceededException e) {
    throw new RuntimeException(e);
  }
  break;
I agree that it is not really used in the case when push mode is TAR. However, I feel the current logic is a little bit hard to follow if we don't have the full context.
If we can check push mode first then decide what to do, it's easier to follow. WDYT?
Cool. Made the changes. A lot of the code used here has been borrowed from SegmentGenerationAndPushTask. I will also raise another PR to port some of these changes from here to there.
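The "check push mode first, then decide what to do" structure requested in this thread could look roughly like the sketch below. It is self-contained and therefore uses local stand-ins: the PushMode enum mirrors the TAR, URI, and METADATA push types named in the discussion, and SegmentMover is a hypothetical interface standing in for moveSegmentToOutputPinotFS. The actual change in the PR may be organized differently.

```java
import java.io.File;
import java.net.URI;
import java.util.Locale;

final class PushModeFirst {
  // Local stand-in for the push types discussed in the thread.
  enum PushMode { TAR, URI, METADATA }

  /** Hypothetical stand-in for moveSegmentToOutputPinotFS. */
  interface SegmentMover {
    URI moveToOutputFs(File tarFile) throws Exception;
  }

  /** Decides where the segment tar lives based on the push mode, before any move happens. */
  static URI resolveSegmentTarUri(String pushModeConfig, File tarFile, SegmentMover mover) throws Exception {
    PushMode mode = PushMode.valueOf(pushModeConfig.toUpperCase(Locale.ROOT));
    switch (mode) {
      case TAR:
        // Tar push uploads the local file directly; the controller stores it in deep storage.
        return tarFile.toURI();
      case URI:
      case METADATA:
        // The controller only receives a pointer, so the tar must be moved somewhere it can reach.
        return mover.moveToOutputFs(tarFile);
      default:
        throw new IllegalArgumentException("Unsupported push mode: " + mode);
    }
  }
}
```

Putting the branch on the push mode at the top makes it visible at a glance that the move step is skipped entirely for a tar push.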
@@ -97,6 +97,49 @@ public static URI generateSegmentTarURI(URI dirURI, URI fileURI, String prefix,
public static void pushSegments(SegmentGenerationJobSpec spec, PinotFS fileSystem, List<String> tarFilePaths)
throws RetriableOperationException, AttemptsExceededException {
String tableName = spec.getTableSpec().getTableName();
AuthProvider authProvider = AuthProviderUtils.makeAuthProvider(spec.getAuthToken());
Just curious: why do we need to create those 3 new methods? It seems that there is no logic change.
We need to pass this custom header in the new flow, and the older methods don't allow passing new headers:
// Set segment ZK metadata custom map modifier into HTTP header to modify the segment ZK metadata
SegmentZKMetadataCustomMapModifier segmentZKMetadataCustomMapModifier =
    getSegmentZKMetadataCustomMapModifier(pinotTaskConfig, segmentConversionResult);
Header segmentZKMetadataCustomMapModifierHeader =
    new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER,
        segmentZKMetadataCustomMapModifier.toJsonString());
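As a rough illustration of why new overloads are needed without any logic change, the sketch below keeps the existing signature and has it delegate to a new overload that accepts extra headers. Everything here is a hypothetical stand-in: HttpHeader replaces the Header/BasicHeader types seen in the diff, and the method returns request descriptions instead of performing uploads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PushOverloads {
  /** Minimal header type for illustration only. */
  static final class HttpHeader {
    final String name;
    final String value;

    HttpHeader(String name, String value) {
      this.name = name;
      this.value = value;
    }
  }

  // Existing signature: unchanged for current callers, delegates with no extra headers.
  static List<String> pushSegments(List<String> tarFilePaths) {
    return pushSegments(tarFilePaths, Collections.emptyList());
  }

  // New overload: caller-supplied headers are attached to every simulated upload request.
  static List<String> pushSegments(List<String> tarFilePaths, List<HttpHeader> headers) {
    List<String> requests = new ArrayList<>();
    for (String path : tarFilePaths) {
      StringBuilder request = new StringBuilder("POST " + path);
      for (HttpHeader header : headers) {
        request.append(" [").append(header.name).append('=').append(header.value).append(']');
      }
      requests.add(request.toString());
    }
    return requests;
  }
}
```

Existing callers keep compiling against the old signature, while the Minion flow can pass the custom map modifier header through the new one.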
...ot/plugin/minion/tasks/realtimetoofflinesegments/RealtimeToOfflineSegmentsTaskGenerator.java
...c/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java
...rc/test/java/org/apache/pinot/integration/tests/MergeRollupMinionClusterIntegrationTest.java
...c/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java
if (!isLocalOutputDir(outputDirURIScheme)) {
singleFileGenerationTaskConfig.put(BatchConfigProperties.OUTPUT_SEGMENT_DIR_URI, outputDirURI.toString());
singleFileGenerationTaskConfig.put(BatchConfigProperties.PUSH_MODE, pushMode);
nit: should we set push_mode to metadata when the provided one is URI? And we should log this change.
370bc3d to 72efe9e
LGTM
Please address the following issue.
...c/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java
Let's fix the test. LGTM other than minor comments. Thank you for working on this!
String uploadURL = taskConfigs.get(MinionConstants.UPLOAD_URL_KEY);
SegmentConversionUtils.uploadSegment(taskConfigs, headers, parameters, tableNameWithType, segmentName,
uploadURL, tarFile);
} catch (RetriableOperationException | AttemptsExceededException e) {
Why do we need to catch specific exceptions and wrap them in RuntimeException here? The top-level caller is already declared as executeTask() throws Exception.
SegmentPushUtils.getSegmentUriToTarPathMap(outputSegmentDirURI, pushJobSpec,
new String[]{outputSegmentTarURI.toString()});
SegmentPushUtils.sendSegmentUriAndMetadata(spec, outputFileFS, segmentUriToTarPathMap, headers, parameters);
} catch (RetriableOperationException | AttemptsExceededException e) {
same here
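The point of these two comments can be shown with a small sketch: when the calling method already declares throws Exception, a checked exception from the upload step can simply propagate, and the caller still sees its original type. The names below (PropagateExceptions, UploadFailedException, uploadSegment) are hypothetical stand-ins, with UploadFailedException playing the role of RetriableOperationException or AttemptsExceededException.

```java
final class PropagateExceptions {
  /** Hypothetical checked exception standing in for the retry-related exceptions in the diff. */
  static final class UploadFailedException extends Exception {
    UploadFailedException(String message) {
      super(message);
    }
  }

  /** Hypothetical upload step that fails on demand. */
  static void uploadSegment(boolean fail) throws UploadFailedException {
    if (fail) {
      throw new UploadFailedException("upload failed");
    }
  }

  // The method declares `throws Exception`, so no catch-and-wrap in RuntimeException is required.
  static String executeTask(boolean failUpload) throws Exception {
    uploadSegment(failUpload);
    return "uploaded";
  }
}
```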
Users need to use the same configs as SegmentGenerationAndPushTask to switch to metadata push, e.g.
pushType: METADATA