[PIP-71][SQL]Pulsar SQL migrate SchemaHandle to presto decoder (#8422)
Fixes #4747 
Fixes #7652 

### Motivation

PIP-71: https://github.com/apache/pulsar/wiki/PIP-71:-Pulsar-SQL-migrate-SchemaHandle-to-presto-decoder

**Pip-Doc** : [[PIP-71][SQL]Migrate SchemaHandle to Presto-decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing)

In the current version, pulsar-presto relies on SchemaHandler to deserialize fields, which causes the following restrictions:

- **Metadata**: nested fields are currently dissociated from Presto's ParameterizedType. Each nested field is treated as a separate field, so the Presto compiler can't understand the type hierarchy. A nested field should be a Row type in Presto (e.g. Hive struct type support). In the same way, array and map types should also be associated with Presto ParameterizedTypes.
- **Decoder**: SchemaHandler is hard to make work with `RecordCursor.getObject()` to support ROW, MAP, ARRAY, etc.
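
To make the metadata restriction concrete, here is a minimal sketch contrasting the two ways of exposing a nested record: flattened into separate dotted columns, or kept as a single column whose type signature preserves the hierarchy. Every name in it (`NestedTypeSketch`, `Field`, `flatten`, `rowSignature`) is hypothetical and chosen for illustration; it uses neither the Pulsar nor the Presto API, and the `row(...)` string only imitates the shape of a Presto row type.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration only; none of these names come from Pulsar or Presto. */
final class NestedTypeSketch {

    /** A schema field: either a leaf with a SQL type name, or a record with children. */
    static final class Field {
        final String name;
        final String leafType;      // null when this field is a record
        final List<Field> children; // empty when this field is a leaf

        private Field(String name, String leafType, List<Field> children) {
            this.name = name;
            this.leafType = leafType;
            this.children = children;
        }

        static Field leaf(String name, String type) {
            return new Field(name, type, Collections.emptyList());
        }

        static Field record(String name, Field... children) {
            return new Field(name, null, Arrays.asList(children));
        }

        boolean isLeaf() {
            return leafType != null;
        }
    }

    /** Flattened view: every nested leaf becomes its own column, e.g. "address.city". */
    static Map<String, String> flatten(Field field) {
        Map<String, String> columns = new LinkedHashMap<>();
        flattenInto("", field, columns);
        return columns;
    }

    private static void flattenInto(String prefix, Field field, Map<String, String> out) {
        String path = prefix.isEmpty() ? field.name : prefix + "." + field.name;
        if (field.isLeaf()) {
            out.put(path, field.leafType);
            return;
        }
        for (Field child : field.children) {
            flattenInto(path, child, out);
        }
    }

    /** Hierarchical view: one type signature that keeps the nesting, written as row(...). */
    static String rowSignature(Field field) {
        if (field.isLeaf()) {
            return field.leafType;
        }
        StringJoiner joiner = new StringJoiner(", ", "row(", ")");
        for (Field child : field.children) {
            joiner.add(child.name + " " + rowSignature(child));
        }
        return joiner.toString();
    }
}
```

For an `address` record holding `city` and a nested `geo` record with `lat` and `lon`, `flatten` yields three unrelated columns (`address.city`, `address.geo.lat`, `address.geo.lon`), while `rowSignature` yields `row(city varchar, geo row(lat double, lon double))`, which is the kind of hierarchy a query engine can reason about.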

The **motivations** of this pull request:
- `PulsarMetadata` takes advantage of `ParameterizedType` to describe `row/array/map` types instead of resolving nested columns in the pulsar-presto connector.
- Customize `RowDecoder | RowDecoderFactory | ColumnDecoder` to work with the Pulsar interface. With some extensions of our own compared to the original Presto version, we can support more types for backward compatibility (e.g. `TIMESTAMP\DATE\TIME\Real\ARRAY\MAP\ROW` support).
- Decouple Avro or any other schema type from the `pulsar-presto main module` (RecordSet, ConnectorMetadata, etc.), to be friendly to other schema types ([ProtobufNative](apache/pulsar#8372), Thrift, etc.).
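
The decoupling described above can be sketched roughly as follows: the main module asks a factory for a row decoder by schema type and never touches the wire format itself, while each row decoder delegates to per-column decoders. This is a simplified illustration under stated assumptions, not the code in this pull request. The interfaces below are hypothetical stand-ins that only borrow the role names `RowDecoder`, `RowDecoderFactory`, and `ColumnDecoder`; their signatures do not match the Presto or Pulsar classes, and the `k=v;k=v` text format is a toy substitute for Avro or JSON.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch; signatures are simplified and are not the Presto or Pulsar API. */
final class DecoderSketch {

    /** Extracts and converts one column value from an already-parsed message. */
    interface ColumnDecoder {
        Object decode(Map<String, String> fields);
    }

    /** Turns one raw message payload into column values, or empty if it cannot be parsed. */
    interface RowDecoder {
        Optional<Map<String, Object>> decodeRow(byte[] payload);
    }

    /** Registry keyed by schema type, so callers never depend on a concrete format. */
    static final class RowDecoderFactory {
        private final Map<String, Function<Map<String, ColumnDecoder>, RowDecoder>> builders =
                new HashMap<>();

        void register(String schemaType, Function<Map<String, ColumnDecoder>, RowDecoder> builder) {
            builders.put(schemaType, builder);
        }

        RowDecoder create(String schemaType, Map<String, ColumnDecoder> columns) {
            Function<Map<String, ColumnDecoder>, RowDecoder> builder = builders.get(schemaType);
            if (builder == null) {
                throw new IllegalArgumentException("unsupported schema type: " + schemaType);
            }
            return builder.apply(columns);
        }
    }

    /** Toy format "k=v;k=v" standing in for a real schema type such as Avro or JSON. */
    static RowDecoder keyValueTextDecoder(Map<String, ColumnDecoder> columns) {
        return payload -> {
            Map<String, String> fields = new HashMap<>();
            String text = new String(payload, StandardCharsets.UTF_8);
            for (String pair : text.split(";")) {
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    return Optional.empty(); // malformed pair: report an undecodable row
                }
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
            // Each column decoder picks and converts its own value from the parsed fields.
            Map<String, Object> row = new LinkedHashMap<>();
            columns.forEach((name, decoder) -> row.put(name, decoder.decode(fields)));
            return Optional.of(row);
        };
    }
}
```

Supporting another schema type in this sketch means registering one more builder with the factory; the code that iterates over rows stays unchanged, which is the property the last bullet is after.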

### Modifications

Described in [PIP-71: Pulsar SQL migrate SchemaHandle to presto decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing).

----

### Does this pull request potentially affect one of the following parts:

*If `yes` was chosen, please highlight the changes*

  - Dependencies (does it add or upgrade a dependency): (**yes** )
  - The public API: (no)
  - The schema: ( no)
  - The default values of configurations: (no)
  - The wire protocol: (no)
  - The rest endpoints: (no)
  - The admin cli options: (no)
  - Anything that affects deployment: (no)

### Documentation

  - Does this pull request introduce a new feature? (yes)
  [[PIP][SQL]Migrate SchemaHandle to Presto-decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing)

* codeStyle fix

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* Add KeyValue and primitive schema tests; add detection of cyclic schema definitions

* merge master

* merge master

Co-authored-by: wangguowei <[email protected]>
Co-authored-by: ran <[email protected]>
3 people authored Feb 1, 2021
1 parent 8f00033 commit de1b8b3
Showing 48 changed files with 4,324 additions and 2,192 deletions.
3 changes: 3 additions & 0 deletions pulsar-sql/presto-distribution/LICENSE
@@ -387,6 +387,7 @@ The Apache Software License, Version 2.0
- presto-parser-332.jar
- presto-plugin-toolkit-332.jar
- presto-spi-332.jar
- presto-record-decoder-332.jar
* RocksDB JNI
- rocksdbjni-6.10.2.jar
* SnakeYAML
@@ -433,6 +434,8 @@ The Apache Software License, Version 2.0
- commons-logging-1.2.jar
* GSON
- gson-2.8.6.jar
* Snappy
- snappy-java-1.1.7.3.jar
* Jackson
- jackson-module-parameter-names-2.10.0.jar
- jackson-module-parameter-names-2.11.1.jar
26 changes: 26 additions & 0 deletions pulsar-sql/presto-pulsar/pom.xml
@@ -101,6 +101,32 @@
<version>${joda.version}</version>
</dependency>

<dependency>
<groupId>io.prestosql</groupId>
<artifactId>presto-record-decoder</artifactId>
<version>${presto.version}</version>
</dependency>

<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>pulsar-client-original</artifactId>
<version>${project.version}</version>
</dependency>

<dependency>
<groupId>io.prestosql</groupId>
<artifactId>presto-main</artifactId>
<version>${presto.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>io.prestosql</groupId>
<artifactId>presto-testing</artifactId>
<version>${presto.version}</version>
<scope>test</scope>
</dependency>

</dependencies>

<build>