[pinot-spark-connector] Fix empty data table handling in GRPC reader #9837

cbalci · 2022-11-21T03:16:06Z

Fixes a bug where an exception is thrown when all of the data tables returned in a Spark 'split' are empty.

When the Spark reader pushes down a filter that reduces the returned data for every segment to zero rows, the GRPC data fetcher (PinotGrpcServerDataFetcher) wrongly assumes that there was an error. This behavior was introduced when porting logic from the non-GRPC fetcher (PinotServerDataFetcher), where the assumption is correct. The HTTP and GRPC interfaces differ here: the HTTP interface returns empty data table instances, whereas with GRPC you only get metadata.

I'm updating the error handling to remove this assumption and to record an error only when an actual exception is thrown by the GRPC server.
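The intended behavior can be sketched roughly as follows. This is a minimal, self-contained illustration in Java, not the connector's actual code: `GrpcEmptyResultSketch`, `Kind`, `Response`, and `collectRows` are hypothetical stand-ins, and the way a server reports exceptions is assumed here to be a plain list of strings on each message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Pinot or connector classes.
final class GrpcEmptyResultSketch {

  enum Kind { DATA, METADATA }

  // One streamed server message: either a block of rows or a metadata-only message.
  static final class Response {
    final Kind kind;
    final List<String> rows;        // populated for DATA messages
    final List<String> exceptions;  // server-reported errors, if any (assumed shape)

    Response(Kind kind, List<String> rows, List<String> exceptions) {
      this.kind = kind;
      this.rows = rows;
      this.exceptions = exceptions;
    }
  }

  // Collects rows from the streamed responses.
  // Fails only when the server reported an exception; an empty result is returned as-is.
  static List<String> collectRows(List<Response> responses) {
    List<String> rows = new ArrayList<>();
    for (Response r : responses) {
      if (!r.exceptions.isEmpty()) {
        throw new IllegalStateException("Server reported errors: " + r.exceptions);
      }
      if (r.kind == Kind.DATA) {
        rows.addAll(r.rows);
      }
    }
    // Deliberately no "no rows means error" check: a pushed-down filter
    // can legitimately leave only metadata messages in the stream.
    return rows;
  }
}
```

With this shape, a stream containing only metadata messages yields an empty list, while a message that carries a server exception still fails the read.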

bugfix

codecov-commenter · 2022-11-21T03:54:44Z

Codecov Report

Merging #9837 (abc8c65) into master (3724ba2) will increase coverage by 35.60%.
The diff coverage is n/a.

@@              Coverage Diff              @@
##             master    #9837       +/-   ##
=============================================
+ Coverage     34.69%   70.29%   +35.60%     
- Complexity      190     5000     +4810     
=============================================
  Files          1965     1965               
  Lines        105115   105115               
  Branches      15909    15909               
=============================================
+ Hits          36474    73895    +37421     
+ Misses        65542    26079    -39463     
- Partials       3099     5141     +2042

Flag	Coverage Δ
integration1	`25.32% <ø> (+0.19%)`	⬆️
integration2	`24.53% <ø> (?)`
unittests1	`67.96% <ø> (?)`
unittests2	`15.82% <ø> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...lix/core/realtime/PinotRealtimeSegmentManager.java	`76.43% <0.00%> (-1.58%)`	⬇️
...pache/pinot/core/query/utils/idset/EmptyIdSet.java	`25.00% <0.00%> (ø)`
...anager/realtime/SegmentBuildTimeLeaseExtender.java	`63.23% <0.00%> (ø)`
.../helix/core/realtime/SegmentCompletionManager.java	`73.17% <0.00%> (+0.20%)`	⬆️
.../core/realtime/PinotLLCRealtimeSegmentManager.java	`72.07% <0.00%> (+0.45%)`	⬆️
...che/pinot/broker/routing/BrokerRoutingManager.java	`86.07% <0.00%> (+0.55%)`	⬆️
...e/pinot/common/function/TransformFunctionType.java	`100.00% <0.00%> (+0.94%)`	⬆️
...x/core/realtime/MissingConsumingSegmentFinder.java	`87.09% <0.00%> (+1.07%)`	⬆️
.../apache/pinot/common/exception/QueryException.java	`94.44% <0.00%> (+1.11%)`	⬆️
...va/org/apache/pinot/controller/ControllerConf.java	`58.11% <0.00%> (+1.13%)`	⬆️
... and 1116 more

cbalci · 2022-11-23T20:55:46Z

Thanks @Jackie-Jiang !

Fix Spark connector empty datatable handling in GRPC reader

abc8c65

Jackie-Jiang added the bugfix label Nov 22, 2022

Jackie-Jiang approved these changes Nov 22, 2022

Jackie-Jiang merged commit 32314bb into apache:master Nov 22, 2022
