
[Umbrella] Flink Engine Improvement and Quality Assurance #2100

Open
4 of 9 tasks
yaooqinn opened this issue Mar 11, 2022 · 15 comments

Comments

@yaooqinn (Member) commented Mar 11, 2022

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the proposal

We introduced the Flink engine in #1322.

In this ticket, we collect feedback, improvements, and bugfixes, aiming to make it production-ready.

Task list

Bugs

Improvements

Documentation

Brainstorming

Misc

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
@SteNicholas (Member) commented:

@yaooqinn, the module label should be flink, not hive.

@yaooqinn (Member, Author) commented:

> @yaooqinn, the module label should be flink, not hive.

oops..

@link3280 (Contributor) commented:

@yaooqinn shall we make this a KPIP and let the corresponding issues follow the naming pattern like [SUBTASK][KPIP-X]?

@yaooqinn (Member, Author) commented:

I am not sure we can propose a KPIP given the current status of this ticket, which does not seem to meet the requirements for a KPIP.

In fact, we should not create subtasks under KPIP-2 as it has been resolved. [SUBTASK][#2100] may be enough?

@link3280 (Contributor) commented:

@yaooqinn LGTM

zhaomin1423 added commits to zhaomin1423/kyuubi that referenced this issue (May 2–5, 2022)
pan3793 pushed a commit that referenced this issue May 23, 2022
### _Why are the changes needed?_
Currently, Flink uses its legacy data type system in CollectSink, but it will soon move to the new type system (see https://issues.apache.org/jira/browse/FLINK-12251). Kyuubi should adapt to the new data type system beforehand.

This PR supports StringData in Flink.

This is a subtask of #2100 .
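To illustrate the kind of adaptation the PR describes, here is a minimal, hypothetical sketch of converting row values under both type systems. `StringDataStub` is a stand-in written for this example; the real class is `org.apache.flink.table.data.StringData` in Flink's `flink-table-common` module, and the method names here are illustrative, not Kyuubi's actual API.

```java
import java.nio.charset.StandardCharsets;

public class TypeAdaptSketch {
    // Stand-in for Flink's internal binary string representation
    // (hypothetical stub; the real class is StringData).
    static final class StringDataStub {
        private final byte[] bytes;
        StringDataStub(String s) { this.bytes = s.getBytes(StandardCharsets.UTF_8); }
        @Override public String toString() { return new String(bytes, StandardCharsets.UTF_8); }
    }

    // Convert a column value to the string sent back to the client,
    // accepting both the legacy type (java.lang.String) and the new
    // internal representation.
    static String toClientString(Object value) {
        if (value instanceof String) return (String) value;            // legacy type system
        if (value instanceof StringDataStub) return value.toString();  // new type system
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(toClientString("plain"));
        System.out.println(toClientString(new StringDataStub("internal")));
    }
}
```

The point of handling both branches is that the engine keeps working regardless of which type system the Flink runtime hands it.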

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #2718 from link3280/KYUUBI-2405.


951b20a [Paul Lin] [KYUUBI#2405] Optimize code style
9236083 [Paul Lin] [KYUUBI#2405] Simplify sampling code
8708fa8 [Paul Lin] [KYUUBI#2405] Update comments
773d860 [Paul Lin] [KYUUBI#2405] Fix index out of range when sampling
b087b41 [Paul Lin] [KYUUBI#2405] Update externals/kyuubi-flink-sql-engine/src/main/scala/org/apache/kyuubi/engine/flink/schema/RowSet.scala
dfeeda9 [Paul Lin] [KYUUBI#2405] Fix index out of range when result set is empty
e627e5f [Paul Lin] [KYUUBI#2405] Support Flink StringData Data Type

Authored-by: Paul Lin <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
@yaooqinn yaooqinn unpinned this issue Jul 14, 2022
@yaooqinn yaooqinn modified the milestones: v1.6.0, v1.7.0 Dec 6, 2022
@pan3793 pan3793 modified the milestones: v1.7.0, v1.8.0 Feb 7, 2023
@pan3793 (Member) commented Feb 7, 2023:

Postponed to 1.8, because this feature is not under rapid development and is not expected to be completed in a short time.

@waywtdcc (Contributor) commented:

Can the JDBC interface obtain results from asynchronous real-time tasks while they run? Can this be done? @pan3793

@pan3793 (Member) commented Mar 18, 2023:

@waywtdcc technically, I don't think there is any blocker in the Kyuubi framework: the JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, called incremental collection.

So it could work if the Flink engine can return the streaming data as an Iterator.

cc the Flink experts @SteNicholas @link3280 @yanghua
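The mini-batch retrieval described above can be sketched in plain Java: the client sees an ordinary `Iterator`, while rows are pulled from a source in fixed-size batches, analogous to a JDBC fetch size. `RowSource` and `incrementalIterator` are names invented for this sketch, not Kyuubi's actual API.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class MiniBatchFetch {
    interface RowSource {
        // Returns up to maxRows rows starting at offset; empty when exhausted.
        List<String> fetch(long offset, int maxRows);
    }

    // Wrap a batched source in a plain Iterator that fetches lazily.
    static Iterator<String> incrementalIterator(RowSource source, int batchSize) {
        return new Iterator<String>() {
            private long offset = 0;
            private Iterator<String> batch = Collections.emptyIterator();

            @Override public boolean hasNext() {
                if (batch.hasNext()) return true;
                List<String> next = source.fetch(offset, batchSize);
                offset += next.size();
                batch = next.iterator();
                return batch.hasNext();
            }

            @Override public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                return batch.next();
            }
        };
    }
}
```

With this shape, a streaming engine only needs to make `fetch` block until the next batch of rows is available for the client to consume results incrementally.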

@pan3793 (Member) commented Mar 18, 2023:

@waywtdcc are you using Flink 1.14? Actually, the Kyuubi community is going to add support for Flink 1.17 and drop support for Flink 1.14, because of the lack of developer resources.

It would be great if you could share more about your use cases, challenges, and expectations for the Kyuubi Flink engine :)

@waywtdcc (Contributor) commented:

> @waywtdcc are you using Flink 1.14? Actually, the Kyuubi community is going to add support for Flink 1.17 and drop support for Flink 1.14, because of the lack of developer resources.
>
> It would be great if you could share more about your use cases, challenges, and expectations for the Kyuubi Flink engine :)

We use Flink 1.14 for data synchronization and real-time computing.

@waywtdcc (Contributor) commented:

> @waywtdcc technically, I don't think there is any blocker in the Kyuubi framework: the JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, called incremental collection.
>
> So it could work if the Flink engine can return the streaming data as an Iterator.
>
> cc the Flink experts @SteNicholas @link3280 @yanghua

Ok, I see. So what if I need to get the list of historical checkpoints, or stop a job after taking a savepoint?

@pan3793 (Member) commented Mar 20, 2023:

All you need to do is construct a proper FetchIterator on the Flink engine side.
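As a rough illustration of what such an iterator involves, here is a simplified stand-in that tracks its position so the server can re-fetch from an absolute offset. This is a sketch over an in-memory array; the method names approximate the idea, not the exact Kyuubi `FetchIterator` API, and a streaming version would pull from the Flink job instead of an array.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified engine-side fetch iterator with position tracking
// (illustrative only; not the exact Kyuubi API).
public class ArrayFetchIterator<T> implements Iterator<T> {
    private final T[] rows;
    private long position = 0;

    public ArrayFetchIterator(T[] rows) { this.rows = rows; }

    // Reposition to an absolute offset, e.g. when a client re-requests rows.
    public void fetchAbsolute(long pos) {
        position = Math.max(0, Math.min(pos, rows.length));
    }

    public long getPosition() { return position; }

    @Override public boolean hasNext() { return position < rows.length; }

    @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return rows[(int) position++];
    }
}
```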

@link3280 (Contributor) commented:

> @waywtdcc technically, I don't think there is any blocker in the Kyuubi framework: the JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, called incremental collection.
> So it could work if the Flink engine can return the streaming data as an Iterator.
> cc the Flink experts @SteNicholas @link3280 @yanghua

> Ok, I see. So what if I need to get the list of historical checkpoints, or stop a job after taking a savepoint?

@waywtdcc There are ongoing efforts in Flink to improve savepoint management via SQL (see FLIP-222 for details). Kyuubi will support these statements once they are available.

@waywtdcc (Contributor) commented:

If we add a jar package, how can we execute a particular method from that jar?

@waywtdcc (Contributor) commented:

> All you need to do is construct a proper FetchIterator on the Flink engine side.

Yes, we also need to get the resulting data in a streaming manner.

@pan3793 pan3793 removed this from the v1.8.0 milestone Nov 6, 2023