Disable recursion in PinotFS copy #8162

Merged on Feb 8, 2022 (4 commits)

@mcvsubbu mcvsubbu commented Feb 8, 2022

Every use of the PinotFS copy API copies a tarred segment and then
untars it, so recursively copying a directory does not work (the
untar fails). Recursive copying also wastes effort when copying
across file systems.
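
As a rough illustration of what disabling recursion means on a local file system, the sketch below copies a single file and fails fast when the source is a directory. `NonRecursiveCopy` and its `copy` method are hypothetical names for this example, not the actual LocalPinotFS code, and the exception type and message are assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration only; not the actual LocalPinotFS implementation. */
final class NonRecursiveCopy {

  private NonRecursiveCopy() {
  }

  /**
   * Copies one file from srcUri to dstUri (both must be file: URIs).
   * Assumption: a directory source is rejected with an IOException instead of being walked.
   */
  static boolean copy(URI srcUri, URI dstUri)
      throws IOException {
    Path src = Paths.get(srcUri);
    Path dst = Paths.get(dstUri);
    if (Files.isDirectory(src)) {
      throw new IOException("Recursive copy is disabled; source is a directory: " + src);
    }
    // Create the destination's parent directories so a single-file copy can succeed.
    if (dst.getParent() != null) {
      Files.createDirectories(dst.getParent());
    }
    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    return true;
  }
}
```

A caller that passes a tarred segment file gets a plain file copy, while a caller that passes a directory sees the failure at copy time rather than later at untar time.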

Also disabled the file scheme during segment upload on the controller,
since URL-based upload is meant to provide an external URL for the
controller to pick up.
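
A minimal sketch of the kind of check this implies, assuming the controller validates the download URI before fetching it. `UploadUriCheck` and `validateDownloadUri` are hypothetical names, and rejecting a URI with no scheme is an assumption of this example rather than something stated in the PR.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical validation helper; not the controller's actual code. */
final class UploadUriCheck {

  private UploadUriCheck() {
  }

  /**
   * Parses the download URI and rejects ones that point at the local file system.
   * Assumption: a missing scheme is treated like "file" and rejected as well.
   */
  static URI validateDownloadUri(String uriString) {
    URI uri;
    try {
      uri = new URI(uriString);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Malformed download URI: " + uriString, e);
    }
    String scheme = uri.getScheme();
    if (scheme == null || scheme.equalsIgnoreCase("file")) {
      throw new IllegalArgumentException(
          "Unsupported scheme for URL-based segment upload: " + uriString);
    }
    return uri;
  }
}
```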


@snleee snleee left a comment

LGTM otherwise

@snleee snleee left a comment

LGTM

@codecov-commenter codecov-commenter commented Feb 8, 2022

Codecov Report

Merging #8162 (71cac36) into master (26bad8b) will decrease coverage by 0.02%.
The diff coverage is 14.28%.

@@             Coverage Diff              @@
##             master    #8162      +/-   ##
============================================
- Coverage     71.39%   71.37%   -0.03%     
+ Complexity     4306     4303       -3     
============================================
  Files          1620     1624       +4     
  Lines         83917    84302     +385     
  Branches      12545    12637      +92     
============================================
+ Hits          59915    60168     +253     
- Misses        19916    20014      +98     
- Partials       4086     4120      +34     
Flag Coverage Δ
integration1 28.83% <0.00%> (-0.14%) ⬇️
integration2 27.68% <0.00%> (-0.09%) ⬇️
unittests1 67.89% <18.18%> (-0.06%) ⬇️
unittests2 14.17% <0.00%> (-0.05%) ⬇️

Impacted Files Coverage Δ
...ot/common/utils/fetcher/SegmentFetcherFactory.java 96.49% <ø> (ø)
...ces/PinotSegmentUploadDownloadRestletResource.java 58.79% <0.00%> (+0.53%) ⬆️
.../java/org/apache/pinot/spi/filesystem/PinotFS.java 0.00% <0.00%> (ø)
.../org/apache/pinot/spi/filesystem/LocalPinotFS.java 72.72% <20.00%> (-17.07%) ⬇️
...data/manager/realtime/DefaultSegmentCommitter.java 0.00% <0.00%> (-80.00%) ⬇️
...er/api/resources/LLCSegmentCompletionHandlers.java 43.56% <0.00%> (-18.82%) ⬇️
...nt/local/startree/v2/store/StarTreeDataSource.java 40.00% <0.00%> (-13.34%) ⬇️
...data/manager/realtime/SegmentCommitterFactory.java 88.23% <0.00%> (-11.77%) ⬇️
...ache/pinot/core/operator/docidsets/OrDocIdSet.java 86.36% <0.00%> (-11.37%) ⬇️
...altime/ServerSegmentCompletionProtocolHandler.java 51.42% <0.00%> (-6.67%) ⬇️
... and 62 more

@mcvsubbu mcvsubbu merged commit 1382d29 into apache:master Feb 8, 2022
@mcvsubbu mcvsubbu deleted the disable-recursion branch February 8, 2022 23:10
@@ -85,7 +85,7 @@ public boolean doMove(URI srcUri, URI dstUri)
   @Override
   public boolean copy(URI srcUri, URI dstUri)
       throws IOException {
-    copy(toFile(srcUri), toFile(dstUri));
+    copy(toFile(srcUri), toFile(dstUri), false);
Contributor

The behavior of this API is not the same in LocalPinotFS vs. S3PinotFS.

Should we also expose a copy(URI srcUri, URI dstUri, boolean isRecursive)?

Contributor Author

Why?

When there is a need, we can add it. In general, we should discourage recursive copy, since we do not know what is being copied.

@walterddr walterddr Feb 10, 2022

The fact is that after this PR:
LocalPinotFS.copy(srcUri, dstUri) throws an exception if srcUri is a directory.
S3PinotFS.copy(srcUri, dstUri) returns true and performs a recursive copy if srcUri is a directory.

This basically means users who need to do a recursive copy would need to know exactly which underlying PinotFS implementation they have. I am not sure that's the right abstraction for this API.

My suggestion is to:

  1. Make the current S3PinotFS.copy(srcUri, dstUri) also throw an exception when the URI is a folder, making it consistent with LocalPinotFS.
  2. Move the current recursive copy implementation in S3 to a new API: PinotFS.copy(srcUri, dstUri, recursive = true).
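
The two-step suggestion above could be sketched roughly as follows. This is only an illustration of the proposed shape, assuming a shared default implementation; `SketchFS`, `copyFile`, and `copyDir` are hypothetical names and do not reflect the actual PinotFS interface or what was eventually merged.

```java
import java.io.IOException;
import java.net.URI;

/** Hypothetical interface sketching the proposal; not the real PinotFS API. */
interface SketchFS {

  /** Implementation-specific check for whether the URI denotes a directory. */
  boolean isDirectory(URI uri)
      throws IOException;

  /** Implementation-specific single-file copy (hypothetical hook). */
  boolean copyFile(URI srcUri, URI dstUri)
      throws IOException;

  /** Implementation-specific recursive directory copy (hypothetical hook). */
  boolean copyDir(URI srcUri, URI dstUri)
      throws IOException;

  /** Two-argument copy is always non-recursive, so every implementation behaves the same. */
  default boolean copy(URI srcUri, URI dstUri)
      throws IOException {
    return copy(srcUri, dstUri, false);
  }

  /** Directories are copied only when the caller explicitly opts in. */
  default boolean copy(URI srcUri, URI dstUri, boolean recursive)
      throws IOException {
    if (isDirectory(srcUri)) {
      if (!recursive) {
        throw new IOException("Source is a directory; recursive copy was not requested: " + srcUri);
      }
      return copyDir(srcUri, dstUri);
    }
    return copyFile(srcUri, dstUri);
  }
}
```

Putting the directory check in one shared place is what would keep local and S3 implementations consistent: each backend supplies only the hooks, and callers who want a recursive copy must ask for it explicitly.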

Contributor

@mcvsubbu any additional comments? I made a PR based on my idea above: #8200.
