Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding isInstantDelete API for segment deletion #8069

Closed

Conversation

walterddr
Copy link
Contributor

Support instant delete instead of asking the retention manager to remove stuff after 7 days.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise.

We might also want to add a controller config for retention manager to skip moving segments to deleted dir. That can be done in a separate PR

@codecov-commenter
Copy link

codecov-commenter commented Jan 25, 2022

Codecov Report

Merging #8069 (bc86448) into master (7ec47c4) will decrease coverage by 57.05%.
The diff coverage is 10.03%.

❗ Current head bc86448 differs from pull request most recent head fc6d981. Consider uploading reports for the commit fc6d981 to get more accurate results
Impacted file tree graph

@@              Coverage Diff              @@
##             master    #8069       +/-   ##
=============================================
- Coverage     71.29%   14.23%   -57.06%     
+ Complexity     4224       81     -4143     
=============================================
  Files          1599     1561       -38     
  Lines         82957    81538     -1419     
  Branches      12375    12253      -122     
=============================================
- Hits          59148    11611    -47537     
- Misses        19815    69065    +49250     
+ Partials       3994      862     -3132     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 ?
unittests2 14.23% <10.03%> (-0.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...a/org/apache/pinot/common/metrics/ServerMeter.java 0.00% <0.00%> (-100.00%) ⬇️
...e/pinot/common/utils/FileUploadDownloadClient.java 0.00% <0.00%> (-62.46%) ⬇️
...apache/pinot/common/utils/SqlResultComparator.java 0.00% <ø> (ø)
...pinot/common/utils/fetcher/BaseSegmentFetcher.java 0.00% <0.00%> (-81.82%) ⬇️
...ler/api/resources/PinotSegmentRestletResource.java 3.38% <0.00%> (-19.81%) ⬇️
.../controller/helix/ControllerRequestURLBuilder.java 74.01% <0.00%> (-9.32%) ⬇️
...org/apache/pinot/core/auth/BasicAuthPrincipal.java 0.00% <0.00%> (-70.00%) ⬇️
...manager/realtime/LLRealtimeSegmentDataManager.java 0.00% <0.00%> (-71.00%) ⬇️
...ata/manager/realtime/RealtimeTableDataManager.java 0.00% <0.00%> (-69.11%) ⬇️
...a/org/apache/pinot/core/transport/QueryServer.java 0.00% <0.00%> (-74.47%) ⬇️
... and 1311 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ec47c4...fc6d981. Read the comment docs.

}

protected void deleteSegmentsWithDelay(final String tableName, final Collection<String> segmentIds,
final long deletionDelaySeconds) {
public void deleteSegments(String tableName, Collection<String> segmentIds, boolean isInstantDeletion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: make the delete segments api with a time argument public, and call it with 0 (or -1) to indicate "delete right away"

deletedSegmentDestURI.toString());
URI fileToDeleteURI = URIUtils.getUri(_dataDir, rawTableName, URIUtils.encode(segmentId));
PinotFS pinotFS = PinotFSFactory.create(fileToDeleteURI.getScheme());
if (isInstantDeletion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not take this from the config for number of days to save the segment? If the config is 0, then delete it right here. This will avoid changing the API

Copy link
Contributor Author

@walterddr walterddr Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the config is global across the entire cluster, where this API applies to a specific table/segment only on demand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe add a table config to override the setting, so that the behavior is the same for all segments in the table, and not dependent on how someone chooses to delete a segment?

@mcvsubbu
Copy link
Contributor

@walterddr looks like you created a new issue. I am not sure about the problem you are trying to solve, so once we have that, we can come up with a solution that makes sense. Please do clarify the problem on the issue, thanks. Would appreciate if you can hold the merge until then

/**
* Delete a list of segments from ideal state and remove them from the local storage.
*
* @param tableNameWithType Table name with type suffix
* @param segmentNames List of names of segment to be deleted
* @param isInstanceDelete Indicate if the deleted segments will be put in retention before deletion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update this description a bit? What happens if it's true and what happens if it's false.

* @return Request response
*/
public synchronized PinotResourceManagerResponse deleteSegments(String tableNameWithType, List<String> segmentNames) {
public synchronized PinotResourceManagerResponse deleteSegments(String tableNameWithType, List<String> segmentNames,
boolean isInstanceDelete) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter name seems inconsistent across the PR. It'd be good to unify it.

@walterddr
Copy link
Contributor Author

closing

@walterddr walterddr closed this Jan 26, 2022
@walterddr walterddr reopened this Feb 3, 2022
@walterddr
Copy link
Contributor Author

after some thought and discussion in #8078 i think this is probably the only way to do it. because once table config is deleted, there's no way to honor the table-level retention override. (since tableConfig has already gone).

The only thing we can honor is whether to instantly delete (e.g. retention can only be 0 or cluster default) which essentially making this a binary override

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants