Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make upsert inner segment update atomic #7844

Merged
merged 2 commits into from
Dec 1, 2021

Conversation

Jackie-Jiang
Copy link
Contributor

When updating the upsert metadata, if the update (invalidate an old record, and validate a new record) applies to the same segment, make the update atomic to reduce the temporary inconsistency.

_result = _partialUpsertHandler.merge(previousRecord, record);
}
assert currentSegment.getValidDocIds() != null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to move this up? can you add some comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert is used to suppress the IDE warning on potential NPE without adding overhead. It generally means it is not possible to have null validDocIds. I don't think it is worth commenting though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant segment.getValidDocIds() is called earlier

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert statements generally do add some overhead because they add about 10 bytes of bytecode each, which interferes with hotspot’s bytecode size based inlining heuristics and may result in a method containing asserts not being inlined. I absolutely wouldn’t suggest removing it, but it’s important to know they aren’t free.

Copy link
Member

@richardstartin richardstartin Nov 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, don’t be concerned about adding null checks, they aren’t compiled the way you see them - they get speculatively optimised away based on profiling data collected in lower tiers. So this is not more efficient than a null check which never fails, it’s exactly as efficient but with more bytecode weight. This is a good related read on the topic (it’s about implicit null checks which raise NPEs, but the same mechanism applies to branches not taken): https://shipilev.net/jvm/anatomy-quarks/25-implicit-null-checks/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yupeng9 That shouldn't matter as it just reads a member variable from the segment into a local variable. I simply extracted that common logic ahead to avoid making multiple assertions.

@richardstartin Great article, thanks for sharing. In that case should we prefer Objects.requireNonNull(currentSegment.getValidDocIds()).remove(currentDocId); as IDE suggested to suppress the warning?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jackie-Jiang what about just a null check? If it really can't be null then it doesn't change the logic. Also, if it really can't be null, why not annotate the method @Nonnull?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardstartin The method can return null for non-upsert case, but in the current context it can never be null. Objects.requireNonNull() is basically the concise version null check without putting the if/else block

@codecov-commenter
Copy link

codecov-commenter commented Nov 30, 2021

Codecov Report

Merging #7844 (183c30c) into master (eff570b) will decrease coverage by 6.40%.
The diff coverage is 68.87%.

❗ Current head 183c30c differs from pull request most recent head 82aab9e. Consider uploading reports for the commit 82aab9e to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7844      +/-   ##
============================================
- Coverage     71.68%   65.28%   -6.41%     
- Complexity     4089     4091       +2     
============================================
  Files          1579     1535      -44     
  Lines         80819    79223    -1596     
  Branches      12017    11865     -152     
============================================
- Hits          57932    51717    -6215     
- Misses        18982    23833    +4851     
+ Partials       3905     3673     -232     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 68.76% <70.96%> (+0.08%) ⬆️
unittests2 14.57% <2.20%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../core/realtime/PinotLLCRealtimeSegmentManager.java 68.20% <ø> (-10.85%) ⬇️
...lix/core/realtime/PinotRealtimeSegmentManager.java 76.96% <ø> (-3.87%) ⬇️
...ot/controller/helix/core/util/ZKMetadataUtils.java 100.00% <ø> (ø)
...in/java/org/apache/pinot/core/common/Operator.java 0.00% <ø> (ø)
...manager/realtime/HLRealtimeSegmentDataManager.java 0.00% <ø> (-82.95%) ⬇️
...operator/AcquireReleaseColumnsSegmentOperator.java 0.00% <0.00%> (ø)
...he/pinot/core/operator/BitmapDocIdSetOperator.java 68.18% <0.00%> (-6.82%) ⬇️
.../core/operator/combine/GroupByCombineOperator.java 73.91% <0.00%> (-1.09%) ⬇️
...nMaxValueBasedSelectionOrderByCombineOperator.java 60.60% <0.00%> (-11.15%) ⬇️
...or/dociditerators/ExpressionScanDocIdIterator.java 47.18% <0.00%> (-19.25%) ⬇️
... and 421 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eff570b...82aab9e. Read the comment docs.

@Jackie-Jiang Jackie-Jiang merged commit 0290d74 into apache:master Dec 1, 2021
@Jackie-Jiang Jackie-Jiang deleted the upsert_consistency branch December 1, 2021 21:41
kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021
When updating the upsert metadata, if the update (invalidate an old record, and validate a new record) applies to the same segment, make the update atomic to reduce the temporary inconsistency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants