Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Connect from PowerBI #3032

Closed
3 tasks done
hanna-liashchuk opened this issue Jul 8, 2022 · 11 comments
Closed
3 tasks done

[Bug] Connect from PowerBI #3032

hanna-liashchuk opened this issue Jul 8, 2022 · 11 comments
Labels
kind:bug This is a clearly a bug priority:major

Comments

@hanna-liashchuk
Copy link
Contributor

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Since Kyuubi is an alternative to STS, I've tried to connect to it from PowerBI (PB) for some reports but met some errors along the way. PB refuses to connect to version 1.5.2 with the error "The host and port specified for the connection do not seem to belong to a Spark server", connection to 1.4.1 works OK, but functionality is very limited, e.g. creating a "Top N" filter fails with an error.
After some investigation on the second one, I found out PB thinks that 1.4.1 is a Spark version and cuts functionality as it's too old.

Simba::SparkODBC::SOConnection::Connect: Server version (1.4.1-incubating) is lower than 2.1.0. Setting FETCH_FIRST fetch orientation as                 not supported because server lower than version 2.1.0 treats FETCH_FIRST as                 FETCH_NEXT.
Simba::SparkODBC::HardySettings::SetServerNotSupportsFetchFirst: +++++ enter +++++

ODBC logs can bee found here
SparkODBC-1.4.1.log
SparkODBC-1.5.2.log

I've built Kyuubi with spark version (3.2.1) instead of 1.4.1 and PB was working as expected. Under the hood, PB is using Microsoft Spark ODBC Driver.
Is there a way to configure Kyuubi metadata to work with Power BI? Otherwise it's a great tool that can't be used in companies that have found themselves in a vendor lock-in and since Power BI is a Microsoft tool, I believe there are enough of them

Affects Version(s)

1.4.1-incubating, 1.5.2-incubating

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@hanna-liashchuk hanna-liashchuk added kind:bug This is a clearly a bug priority:major labels Jul 8, 2022
@pan3793
Copy link
Member

pan3793 commented Jul 8, 2022

I'm sorry I'm not familiar w/ PB and ODBC, and I think the Microsoft Spark ODBC Driver is not an open-source project.

I see Simba::SparkODBC from your logs, and we got reported that "Simba" ODBC drvier use a tricky way to retrieve server‘s version, please try to set spark.sql.hive.version=2.3.9 and spark.sql.hive.metastore.version=2.3.9 (actually I forget which one take effect) in your $SPAKR_HOME/spark-defaults.conf (2.3.9 is chosen because you are using Spark 3.2.1)

@pan3793
Copy link
Member

pan3793 commented Jul 8, 2022

And for Simba::SparkODBC::SOConnection::Connect: Server version (1.4.1-incubating) ...

The DBMS_NAME and DBMS_VER are defined at
https://github.com/apache/incubator-kyuubi/blob/95cb57e80bf9ca8dd0510df3d45c52d9f22ef448/kyuubi-common/src/main/scala/org/apache/kyuubi/session/AbstractSession.scala#L110-L111
You can change it to "2.3.9" to see what will happen, if it does work, I think we can make these two values configurable.

@hanna-liashchuk
Copy link
Contributor Author

hi @pan3793, thanks for your help
setting spark.sql.* params didn't help, but changing the source code - did!
So far, everything seems to be working fine.
I checked with 1.4.1-incubating version, let me check with 1.5.2

@hanna-liashchuk
Copy link
Contributor Author

So, great news, 1.5.2 with the code edit is working as well :)

@pan3793
Copy link
Member

pan3793 commented Jul 8, 2022

Glad to see it works! Would you like to send a PR to "make these two values configurable" and(or) add a document in "quick start" section to share how to use Power BI to connect Kyuubi?

@hanna-liashchuk
Copy link
Contributor Author

Yes, would like to. Except, this would be my first commit to kyuubi project, so guide me here :)
By configurable you mean that a user can override that value in kyuubi-defaults.conf via some parameter?

@pan3793
Copy link
Member

pan3793 commented Jul 10, 2022

this would be my first commit to kyuubi project, so guide me here :)

Yes, of course.

By configurable you mean that a user can override that value in kyuubi-defaults.conf via some parameter?

Yes. To achieve this, you should add the ConfigEntrys(they should be named as kyuubi.server.cli.dbms.name and kyuubi.server.cli.dbms.ver or something) in KyuubiConf, and refer to them in AbstractSession#getInfo, then we can configure them in kyuubi-defaults.conf

For documents, you can add a new page in the folder docs/quick_start and follow the https://kyuubi.readthedocs.io/en/latest/develop_tools/build_document.html to build and preview the document site in local

pan3793 added a commit that referenced this issue Sep 8, 2022
### _Why are the changes needed?_

Workaround for #3032, close #3323

There are known some ODBC drivers e.g. [Databricks ODBC driver](https://www.databricks.com/spark/odbc-drivers-download) depending on `TGetInfoType.CLI_DBMS_VER` and `TGetInfoType.CLI_DBMS_NAME` to check server compatibilities and abilities.

This PR proposes to introduce a new configuration `kyuubi.server.info.provider=SERVER/ENGINE` to make GetInfo support return either server or engine information.

Since beeline will call GetInfo in the initialization phase, to make sure the beeline fast open experience, in async launch mode, when the engine is not ready, return server info regardless `kyuubi.server.info.provider`.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

Testing w/ PowerBI

![image](https://user-images.githubusercontent.com/26535726/188945975-0d0fc95c-f989-4025-ad7d-c024e23ec328.png)

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3122 from pan3793/info.

Closes #3122

742bdbe [Cheng Pan] nit
bb85d2b [Cheng Pan] style
fd75238 [Cheng Pan] fix
9ddb2af [Cheng Pan] nit
fd8f797 [Cheng Pan] fix ut
840205e [Cheng Pan] nit
f9996d5 [Cheng Pan] GetInfo supports returning engine info

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
pan3793 added a commit that referenced this issue Sep 8, 2022
### _Why are the changes needed?_

Workaround for #3032, close #3323

There are known some ODBC drivers e.g. [Databricks ODBC driver](https://www.databricks.com/spark/odbc-drivers-download) depending on `TGetInfoType.CLI_DBMS_VER` and `TGetInfoType.CLI_DBMS_NAME` to check server compatibilities and abilities.

This PR proposes to introduce a new configuration `kyuubi.server.info.provider=SERVER/ENGINE` to make GetInfo support return either server or engine information.

Since beeline will call GetInfo in the initialization phase, to make sure the beeline fast open experience, in async launch mode, when the engine is not ready, return server info regardless `kyuubi.server.info.provider`.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

Testing w/ PowerBI

![image](https://user-images.githubusercontent.com/26535726/188945975-0d0fc95c-f989-4025-ad7d-c024e23ec328.png)

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3122 from pan3793/info.

Closes #3122

742bdbe [Cheng Pan] nit
bb85d2b [Cheng Pan] style
fd75238 [Cheng Pan] fix
9ddb2af [Cheng Pan] nit
fd8f797 [Cheng Pan] fix ut
840205e [Cheng Pan] nit
f9996d5 [Cheng Pan] GetInfo supports returning engine info

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
@pan3793
Copy link
Member

pan3793 commented Sep 11, 2022

This issue has been addressed in #3122, w/ configuration kyuubi.server.info.provider=ENGINE, the GetInfo will return the engine(e.g. Spark)'s information instead of Kyuubi.

The feature will be available in 1.6.1-incubating.

@pan3793 pan3793 closed this as completed Sep 11, 2022
@pan3793
Copy link
Member

pan3793 commented Sep 11, 2022

cc @WANGHui2022, since you reported this issue in another channel.

mahmoudbahaa pushed a commit to mahmoudbahaa/kyuubi that referenced this issue Sep 12, 2022
### _Why are the changes needed?_

Workaround for apache#3032, close apache#3323

There are known some ODBC drivers e.g. [Databricks ODBC driver](https://www.databricks.com/spark/odbc-drivers-download) depending on `TGetInfoType.CLI_DBMS_VER` and `TGetInfoType.CLI_DBMS_NAME` to check server compatibilities and abilities.

This PR proposes to introduce a new configuration `kyuubi.server.info.provider=SERVER/ENGINE` to make GetInfo support return either server or engine information.

Since beeline will call GetInfo in the initialization phase, to make sure the beeline fast open experience, in async launch mode, when the engine is not ready, return server info regardless `kyuubi.server.info.provider`.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

Testing w/ PowerBI

![image](https://user-images.githubusercontent.com/26535726/188945975-0d0fc95c-f989-4025-ad7d-c024e23ec328.png)

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes apache#3122 from pan3793/info.

Closes apache#3122

742bdbe [Cheng Pan] nit
bb85d2b [Cheng Pan] style
fd75238 [Cheng Pan] fix
9ddb2af [Cheng Pan] nit
fd8f797 [Cheng Pan] fix ut
840205e [Cheng Pan] nit
f9996d5 [Cheng Pan] GetInfo supports returning engine info

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
@marszd
Copy link

marszd commented Feb 16, 2023

@hanna-liashchuk I used PowerBi to connect kyuub_1.6.0, but there was a problem. Because there was an extra layer of "spark_catalog" in kyuubi's schema, PB would treat the database as a table. I saw that you also used PB to connect kyuubi, could you please give me some advice.

  1. ODBC configuration
    image

  2. Problem phenomenon
    image

@hanna-liashchuk
Copy link
Contributor Author

@marszd we are using Spark connector with protocol: Standart. Have you tried using it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug This is a clearly a bug priority:major
Projects
None yet
Development

No branches or pull requests

3 participants