-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for gracefully handling the errors while transformations #9377
Conversation
e1fe791
to
2574c9f
Compare
Codecov Report
@@ Coverage Diff @@
## master #9377 +/- ##
============================================
- Coverage 69.76% 68.40% -1.37%
+ Complexity 4787 4705 -82
============================================
Files 1885 1885
Lines 100293 100464 +171
Branches 15256 15292 +36
============================================
- Hits 69966 68718 -1248
- Misses 25381 26793 +1412
- Partials 4946 4953 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
for (String collection : _fieldsToUnnest) { | ||
unnestCollection(record, collection); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc: @Jackie-Jiang I remember you mentioned that try-catch will impact the performance? shall do if-else check first?
Codewise I think this is simpler.
if (_continueOnError) {
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet()));
for (String collection : _fieldsToUnnest) {
unnestCollection(record, collection);
}
} else {
try {
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet()));
for (String collection : _fieldsToUnnest) {
unnestCollection(record, collection);
}
} catch (Exception e) {
record.putValue(GenericRow.INCOMPLETE_RECORD_KEY, true);
LOGGER.debug("Caught exception while transforming complex types for record: {}", record.toString(), e);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that. It just looked like code was being duplicated and hence didn't do it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jackie-Jiang AFAIK, compiler doesn't optimize the code inside the try block, but other than that there's no performance overhead for just using a try block. Let me know if you have experienced otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current way should be fine. I don't remember seeing extra overhead on try-catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Jackie-Jiang for confirming! @KKcorps can you also update the change in this PR(https://github.com/apache/pinot/pull/9376/files) to make the code consistent?
...-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/FilterTransformer.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR only handles segment generation part. We need to take care of the realtime ingestion as well.
...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java
Outdated
Show resolved
Hide resolved
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
for (String collection : _fieldsToUnnest) { | ||
unnestCollection(record, collection); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jackie-Jiang AFAIK, compiler doesn't optimize the code inside the try block, but other than that there's no performance overhead for just using a try block. Let me know if you have experienced otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
for (String collection : _fieldsToUnnest) { | ||
unnestCollection(record, collection); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current way should be fine. I don't remember seeing extra overhead on try-catch
...al/src/main/java/org/apache/pinot/segment/local/recordtransformer/ExpressionTransformer.java
Show resolved
Hide resolved
f7b3844
to
ebe6dad
Compare
Agreed. The reason I haven't tackled it in this PR is because it is not clear what user can do to fix missing data in case we ignore it in realtime-ingestion. |
IMO it should be up to users. If they set |
…pache#9377) * Add support for gracefully handling the errors while transformations * Maintain consistent checks across transformers * Add incompleteRowCount metric to realtime servers Co-authored-by: Kartik Khare <[email protected]>
Add support for honouring
continueOnError
duringAlso, adds support for keeping track of such records using
incompleteRecordCount