-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Dwrf Sequence Ids in Writer #16037
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor comments, but the change looks good once they are addressed. I will approve them once they are addressed.
presto-orc/src/main/java/com/facebook/presto/orc/writer/ByteColumnWriter.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/stream/DecimalOutputStream.java
Outdated
Show resolved
Hide resolved
Can you please rebase on the latest master ? This change will have lot of merge conflicts with, 30cd883 |
0867ec0
to
ffb9aca
Compare
presto-orc/src/main/java/com/facebook/presto/orc/metadata/DwrfMetadataWriter.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/metadata/DwrfMetadataWriter.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/metadata/OrcMetadataReader.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/metadata/Stream.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/stream/LongOutputStream.java
Outdated
Show resolved
Hide resolved
fcf3ba4
to
f79a5d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good once both the comments are addressed.
presto-orc/src/main/java/com/facebook/presto/orc/metadata/DwrfMetadataWriter.java
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/writer/BooleanColumnWriter.java
Outdated
Show resolved
Hide resolved
ee7515e
to
1ce3d07
Compare
@@ -283,6 +283,7 @@ public int writeStripeFooter(SliceOutput output, StripeFooter footer) | |||
|
|||
private static OrcProto.Stream toStream(Stream stream) | |||
{ | |||
checkArgument(stream.getSequence() == ColumnEncoding.DEFAULT_SEQUENCE_ID, "Writing streams with non-zero sequence IDs is not supported in ORC " + stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- stream will always call stream.toString() which will create garbage. Remove the + stream or use format specifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stream has custom toString already implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use format instead of concat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will print the right details (some readable string, instead of some memory address), that is not a concern. Calling toString will create a temporary string and lot of string concatenation. We want to avoid that.
two ways to avoid it are
- Do not include the stream in the error message. less perfereable, as we will not see what stream failed.
- call checkArgument(condition, "some message {}", object). Here object.toString() will be only called condition is false and avoiding the garbage on the hot path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also fixed the checkArgument in toColumnEncoding() to use format instead of concat
presto-orc/src/main/java/com/facebook/presto/orc/metadata/DwrfMetadataReader.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
presto-orc/src/test/java/com/facebook/presto/orc/TestDecryption.java
Outdated
Show resolved
Hide resolved
Sequence Ids are used by Dwrf FlatMap implementations. This commit introduces sequence for DWRF writer. Following PR of flat map implementation will use this sequence id to add multiple streams per column.
Sequence Ids are used by Dwrf FlatMap implementations. This commit introduces sequence for DWRF writer.
Following PR of flat map implementation will use this sequence id to add multiple streams per column.