Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor timeValueCheck for Pinot TableConfig #9349

Merged
merged 1 commit into from
Sep 9, 2022

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Sep 9, 2022

  • Move _continueOnError to TableConfig.IngestionConfig
  • Rename _validateTimeValue to _rowTimeValueCheck and move to TableConfig.IngestionConfig
  • Add _segmentTimeValueCheck to TableConfig.IngestionConfig and use it to set _skipTimeValueCheck in SegmentGeneratorConfig.

Release Notes

  • Added a new table ingestion config to skip problematic rows:
    • _continueOnError which if set to true will let Pinot skip any row indexing error and move on to the next row.
    • _rowTimeValueCheck which if set to true will let Pinot check the time value for each row and replace the value with null if the value is invalid.
    • _segmentTimeValueCheck which if set to false will let Pinot bypass any segment metadata level time validation.

@xiangfu0 xiangfu0 force-pushed the skip_time_value_check branch 4 times, most recently from 19371ef to 02cb9d8 Compare September 9, 2022 01:12
@xiangfu0 xiangfu0 changed the title Set SkipTimeValueCheck from table config Refactor timeValueCheck for Pinot TableConfig Sep 9, 2022
@xiangfu0 xiangfu0 force-pushed the skip_time_value_check branch 2 times, most recently from 116db56 to 1fbe2b8 Compare September 9, 2022 03:01
@KKcorps
Copy link
Contributor

KKcorps commented Sep 9, 2022

The only issue I see is these two configs will be pretty difficult to understand for a new user. e.g. why are there two time checks in record and segment? What are records in the first place?

Copy link
Contributor

@KKcorps KKcorps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@xiangfu0 xiangfu0 force-pushed the skip_time_value_check branch 6 times, most recently from 8f54296 to 074becf Compare September 9, 2022 06:54
@xiangfu0 xiangfu0 force-pushed the skip_time_value_check branch from 074becf to 3ed8662 Compare September 9, 2022 06:54
@codecov-commenter
Copy link

Codecov Report

Merging #9349 (3ed8662) into master (33dc520) will decrease coverage by 28.36%.
The diff coverage is 2.85%.

@@              Coverage Diff              @@
##             master    #9349       +/-   ##
=============================================
- Coverage     63.38%   35.02%   -28.37%     
+ Complexity     4755      185     -4570     
=============================================
  Files          1831     1884       +53     
  Lines         97999   100273     +2274     
  Branches      14990    15252      +262     
=============================================
- Hits          62115    35118    -26997     
- Misses        31299    62147    +30848     
+ Partials       4585     3008     -1577     
Flag Coverage Δ
integration1 26.09% <2.85%> (?)
unittests1 ?
unittests2 15.27% <2.85%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...l/realtime/converter/RealtimeSegmentConverter.java 0.00% <0.00%> (-80.71%) ⬇️
...t/local/recordtransformer/DataTypeTransformer.java 0.00% <0.00%> (-89.00%) ⬇️
...ot/segment/spi/creator/SegmentGeneratorConfig.java 0.00% <0.00%> (-81.36%) ⬇️
.../apache/pinot/spi/config/table/IndexingConfig.java 0.00% <ø> (-93.26%) ⬇️
...ot/spi/config/table/ingestion/IngestionConfig.java 0.00% <0.00%> (-100.00%) ⬇️
...he/pinot/spi/utils/builder/TableConfigBuilder.java 0.00% <ø> (-83.98%) ⬇️
...ces/PinotSegmentUploadDownloadRestletResource.java 53.94% <50.00%> (+14.77%) ⬆️
...in/java/org/apache/pinot/spi/utils/BytesUtils.java 0.00% <0.00%> (-100.00%) ⬇️
.../java/org/apache/pinot/spi/utils/BooleanUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/BaseRecording.java 0.00% <0.00%> (-100.00%) ⬇️
... and 1268 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xiangfu0 xiangfu0 merged commit c8a114d into apache:master Sep 9, 2022
@xiangfu0 xiangfu0 deleted the skip_time_value_check branch September 9, 2022 08:12
@xiangfu0 xiangfu0 added the release-notes Referenced by PRs that need attention when compiling the next release notes label Sep 9, 2022
@Jackie-Jiang Jackie-Jiang added the Configuration Config changes (addition/deletion/change in behavior) label Sep 9, 2022
@Jackie-Jiang
Copy link
Contributor

Note that since we enabled the row time value check by default, it might slow down the ingestion for existing use cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) release-notes Referenced by PRs that need attention when compiling the next release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants