Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix DataSchema thread-safe issue #9619

Merged
merged 3 commits into from
Oct 18, 2022

Conversation

walterddr
Copy link
Contributor

@walterddr walterddr commented Oct 18, 2022

Observed some NPE occuring in multistage engine for stored column data type.

this is b/c the data schema object were created during distributed stage planner to create the operator chain

  • it is shared among multiple threads (operator chain is executed in multi-thread fashion) and upstream/downstream operator (most operators required to know its upstream data schema)
  • however, due to a single data schema object, multiple thread could've accessed at the same time to gather stored column data type.
  • the check-null, then create empty array, fill in each position individual could cause race-condition.

@walterddr walterddr changed the title DataSchema thread-safe issue fix DataSchema thread-safe issue Oct 18, 2022
@codecov-commenter
Copy link

codecov-commenter commented Oct 18, 2022

Codecov Report

Merging #9619 (1a51b55) into master (209c060) will increase coverage by 3.47%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master    #9619      +/-   ##
============================================
+ Coverage     60.33%   63.80%   +3.47%     
- Complexity     5148     5299     +151     
============================================
  Files          1926     1890      -36     
  Lines        103209   101451    -1758     
  Branches      15670    15467     -203     
============================================
+ Hits          62266    64733    +2467     
+ Misses        36271    31983    -4288     
- Partials       4672     4735      +63     
Flag Coverage Δ
integration2 ?
unittests1 67.32% <100.00%> (+0.02%) ⬆️
unittests2 15.63% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...java/org/apache/pinot/common/utils/DataSchema.java 72.80% <100.00%> (-3.72%) ⬇️
...t/core/query/reduce/SelectionDataTableReducer.java 69.04% <100.00%> (-2.39%) ⬇️
...va/org/apache/pinot/core/routing/RoutingTable.java 0.00% <0.00%> (-100.00%) ⬇️
...va/org/apache/pinot/common/config/NettyConfig.java 0.00% <0.00%> (-100.00%) ⬇️
...a/org/apache/pinot/common/metrics/MinionMeter.java 0.00% <0.00%> (-100.00%) ⬇️
...g/apache/pinot/common/metrics/ControllerMeter.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/BrokerQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/MinionQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
...ache/pinot/server/access/AccessControlFactory.java 0.00% <0.00%> (-100.00%) ⬇️
...he/pinot/common/messages/SegmentReloadMessage.java 0.00% <0.00%> (-100.00%) ⬇️
... and 603 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@walterddr walterddr marked this pull request as ready for review October 18, 2022 18:51
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! elegant fix - we may want to leave some comments inline explaining that this is necessary for thread-safety so it doesn't get refactored away

@walterddr walterddr merged commit c3a5280 into apache:master Oct 18, 2022
@walterddr walterddr deleted the dataschema_threadsafe branch December 6, 2023 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants