ORC-1697: Fix IllegalArgumentException when reading json timestamp type in benchmark

### What changes were proposed in this pull request?
This PR aims to fix the `IllegalArgumentException` thrown when reading the json timestamp type in the benchmark. When writing and reading json, timestamp values are now converted to a long instead of a string.

### Why are the changes needed?
ORC-1191 switched the taxi data from csv to parquet. The benchmark now reads the timestamps from parquet, but they are stored in microseconds, whereas Java's `java.sql.Timestamp` works in milliseconds.

taxi source parquet meta
```bash
optional int64 tpep_pickup_datetime (TIMESTAMP(MICROS,false));
optional int64 tpep_dropoff_datetime (TIMESTAMP(MICROS,false));
```

When we write the data into json and then run the scan command, we get the following error.

```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json
```

```
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
	at java.sql/java.sql.Timestamp.valueOf(Timestamp.java:224)
	at org.apache.orc.bench.core.convert.json.JsonReader$TimestampColumnConverter.convert(JsonReader.java:175)
	at org.apache.orc.bench.core.convert.json.JsonReader.nextBatch(JsonReader.java:86)
	at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:92)
	at org.apache.orc.bench.core.Driver.main(Driver.java:64)
```

This happens because json data of type timestamp is written via `java.sql.Timestamp#toString`, but reading it back via `java.sql.Timestamp#valueOf` fails: the microsecond value, interpreted as milliseconds, produces a five-digit year that `valueOf` does not accept.

```java
Timestamp ts = new Timestamp(1446341079000000L);
System.out.println(ts);
System.out.println(Timestamp.valueOf(ts.toString()));
```

```
47802-09-23 02:50:00.0
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
	at java.sql.Timestamp.valueOf(Timestamp.java:237)
```

### How was this patch tested?
Local test:

```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar generate data -format json -data taxi -compress snappy
```

```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json -data taxi -compress snappy
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #1902
Closes #1930 from cxzl25/ORC-1697_v2.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
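
As an illustration of the mismatch described under "Why are the changes needed?", here is a minimal, self-contained sketch. It is not the code from this patch: the class `TimestampRoundTrip` and its helper methods are hypothetical names introduced only for this example, and it assumes nothing beyond the standard `java.sql.Timestamp` API. It contrasts the string round trip (`toString` then `valueOf`) with a round trip through a plain `long`.

```java
import java.sql.Timestamp;

// Hypothetical demo class, not part of the ORC benchmark code.
public class TimestampRoundTrip {

    // True if Timestamp.valueOf accepts the text produced by toString.
    static boolean canParseOwnString(Timestamp ts) {
        try {
            Timestamp.valueOf(ts.toString());
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // True if the value survives being stored as epoch milliseconds.
    // Note: this only preserves millisecond precision; any finer
    // nanosecond component of the Timestamp would be lost.
    static boolean survivesLongRoundTrip(Timestamp ts) {
        long millis = ts.getTime();
        return new Timestamp(millis).equals(ts);
    }

    public static void main(String[] args) {
        // Microsecond value taken from the example in the commit message.
        long micros = 1446341079000000L;

        // Passing microseconds where milliseconds are expected
        // yields a date far in the future (five-digit year).
        Timestamp misread = new Timestamp(micros);

        // Scaling to milliseconds first gives an ordinary four-digit year.
        Timestamp scaled = new Timestamp(micros / 1000);

        System.out.println("misread, string round trip: " + canParseOwnString(misread));
        System.out.println("scaled,  string round trip: " + canParseOwnString(scaled));
        System.out.println("misread, long round trip:   " + survivesLongRoundTrip(misread));
    }
}
```

The first check is expected to fail for the same reason as the stack trace above, while the `long` round trip does not depend on the textual format at all, which is the motivation for storing the json value as a long.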