Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support connector-x read_sql #7793

Open
tempbottle opened this issue Feb 8, 2023 · 5 comments
Open

support connector-x read_sql #7793

tempbottle opened this issue Feb 8, 2023 · 5 comments
Assignees

Comments

@tempbottle
Copy link

Is your feature request related to a problem? Please describe.

I want to read MV from risingwave and join it with some data from pandas.
Code like this:

import connectorx as cx
db1 = "postgresql://root:@127.0.0.1:4566/dev"
sql = """
SELECT * FROM mv_avg_speed;
"""
data = cx.read_sql(db1, sql)
# do data join pandas/polars dataframe which is read from other DBMS

this code got error, on ./risedev playground console, error is

ERROR risingwave_frontend::session: failed to parse sql:
COPY (
SELECT * FROM mv_avg_speed;
) TO STDOUT WITH BINARY:
sql parser error: Expected identifier, found: (

the COPY stmt may not support in risingwave.

Describe the solution you'd like

support COPY stmt and return data as postgresql would be great!

Describe alternatives you've considered

psycopg2 works fine, but I want to use connector-x to make things faster.

Additional context

No response

@github-actions github-actions bot added this to the release-0.1.17 milestone Feb 8, 2023
@xxchan
Copy link
Member

xxchan commented Feb 8, 2023

I found connector-x support specifying protocols:

cursor: Conventional wire protocol (slowest one), recommend to use only when binary and csv is not supported by the source (e.g. Redshift).

Could you try if that works? We may also investigate whether we can support COPY TO later :)

@tempbottle
Copy link
Author

tempbottle commented Feb 8, 2023

this code can not work:

sql = """
SELECT * FROM mv_avg_speed;
"""
db1 = "postgresql://root:@127.0.0.1:4566/dev"
data = cx.read_sql(db1, query=sql, protocol="cursor", return_type="polars")
print(data)

as connectorx emit error:

python3.11/site-packages/connectorx/__init__.py", line 330, in reconstruct_arrow 
   raise ValueError("Empty result")

maybe the response of risingwave miss something here, or connectorx can not parse the response.

@neverchanje
Copy link
Contributor

Hi tempbottle, Thanks for raising this issue. This requirement looks very interesting!

From our end, supporting COPY TO STDOUT is not difficult. What could bring a little complexity is supporting Postgres's COPY protocol: https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-COPY I've created an issue for this task: #7983. We can prioritize development if it really blocks your adoption of RisingWave.

@tempbottle
Copy link
Author

That will be great! work with connector-x is rusty and fast.

@jon-chuang
Copy link
Contributor

maybe the response of risingwave miss something here, or connectorx can not parse the respons

I wonder if this is due to some missing features in our binary PSQL response. #8213

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants