[PIP-71][SQL]Pulsar SQL migrate SchemaHandle to presto decoder (#8422)
Fixes #4747
Fixes #7652

### Motivation

PIP-71: https://github.com/apache/pulsar/wiki/PIP-71:-Pulsar-SQL-migrate-SchemaHandle-to-presto-decoder

**Pip-Doc**: [[PIP-71][SQL]Migrate SchemaHandle to Presto-decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing)

In the current version, pulsar-presto relies on SchemaHandler to deserialize fields, which causes the following restrictions:

- **Metadata**: a nested field is currently not associated with a presto ParameterizedType. It is treated as a separate field, so the presto compiler can't understand the type hierarchy. A nested field should be a Row type in presto (e.g. Hive struct type support). In the same way, array and map types should also be associated with presto ParameterizedTypes.
- **Decoder**: SchemaHandler is hard to combine with `RecordCursor.getObject()` to support ROW, MAP, ARRAY, etc.

The **motivations** of this pull request:

- `PulsarMetadata` takes advantage of `ParameterizedType` to describe `row/array/map` types instead of resolving nested columns in the pulsar-presto connector.
- Customize `RowDecoder | RowDecoderFactory | ColumnDecoder` to work with the pulsar interface. With some extensions of our own compared to the original presto version, we can support more types for backward compatibility (e.g. `TIMESTAMP\DATE\TIME\Real\ARRAY\MAP\ROW` support).
- Decouple avro or any other schema type from the `pulsar-presto main module` (RecordSet, ConnectorMetadata, etc.), aiming to be friendly to other schema types ([ProtobufNative](apache/pulsar#8372), thrift, etc.).
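To illustrate the metadata point above, here is a minimal, dependency-free sketch that contrasts the two views of a nested schema: leaves flattened into separate dotted columns, and a single parameterized `row(...)` type that preserves the hierarchy. The `Node` class and the `flatten` and `signature` helpers are hypothetical names invented for this illustration. They are not part of the Presto SPI or of this pull request, and the type strings only approximate how a row type is usually written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedTypeSketch {

    /** Hypothetical schema node: a primitive (no children) or a struct (named children). */
    static final class Node {
        final String primitive; // e.g. "varchar"; null for a struct
        final Map<String, Node> fields = new LinkedHashMap<>();

        Node(String primitive) {
            this.primitive = primitive;
        }

        static Node struct() {
            return new Node(null);
        }

        Node with(String name, Node child) {
            fields.put(name, child);
            return this;
        }
    }

    /** Flattened view: every leaf becomes its own column, so the hierarchy is lost. */
    static List<String> flatten(String prefix, Node node) {
        List<String> out = new ArrayList<>();
        if (node.primitive != null) {
            out.add(prefix + " " + node.primitive);
            return out;
        }
        for (Map.Entry<String, Node> e : node.fields.entrySet()) {
            String name = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            out.addAll(flatten(name, e.getValue()));
        }
        return out;
    }

    /** Nested view: a struct becomes one parameterized row(...) type. */
    static String signature(Node node) {
        if (node.primitive != null) {
            return node.primitive;
        }
        StringBuilder sb = new StringBuilder("row(");
        boolean first = true;
        for (Map.Entry<String, Node> e : node.fields.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append(' ').append(signature(e.getValue()));
            first = false;
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node address = Node.struct()
                .with("city", new Node("varchar"))
                .with("zip", new Node("integer"));
        Node user = Node.struct()
                .with("name", new Node("varchar"))
                .with("address", address);

        System.out.println(flatten("", user)); // three unrelated columns
        System.out.println(signature(user));   // one type that keeps the nesting
    }
}
```

In the flattened view, `address.city` and `address.zip` are independent columns with no shared parent, which is the restriction described under **Metadata**. In the nested view, `address` is a single column whose type carries its own fields.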

### Modifications

Described in [PIP-71: Pulsar SQL migrate SchemaHandle to presto decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing)

----

### Does this pull request potentially affect one of the following parts:

*If `yes` was chosen, please highlight the changes*

- Dependencies (does it add or upgrade a dependency): (**yes**)
- The public API: (no)
- The schema: (no)
- The default values of configurations: (no)
- The wire protocol: (no)
- The rest endpoints: (no)
- The admin cli options: (no)
- Anything that affects deployment: (no)

### Documentation

- Does this pull request introduce a new feature? (yes) [[PIP][SQL]Migrate SchemaHandle to Presto-decoder](https://docs.google.com/document/d/1KwG0GoHccju4-QNPfvT6tOwhp5Fvs6-iZlfLooPxTDM/edit?usp=sharing)

* codeStyle fix

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* Update pulsar-sql/presto-pulsar/src/test/java/org/apache/pulsar/sql/presto/TestPulsarConnector.java

Co-authored-by: ran <[email protected]>

* add keyValue\Primitive schema test && add schema cyclic definition detect

* merge master

* merge master

Co-authored-by: wangguowei <[email protected]>
Co-authored-by: ran <[email protected]>