
read_sql unable to handle missing values #2023

Closed
hamzamohdzubair opened this issue Dec 10, 2021 · 1 comment


hamzamohdzubair commented Dec 10, 2021

Are you using Python or Rust?

Python.

What version of polars are you using?

polars==0.10.27
connectorx==0.2.2

What operating system are you using polars on?

CentOS Linux release 7.9.2009 (Core)

Describe your bug.

read_sql raises the following exceptions if any column contains missing values, i.e. NULLs.

Note: no error occurs if all rows have values.

Errors

import polars as pl
# <url_to_database> is a placeholder for the actual database connection URI
pl.read_sql("select column1 from Table", <url_to_database>)

If column1 is completely empty

RuntimeError: Invalid argument error: column types must match schema types, expected Time64(Nanosecond) but found LargeBinary at column index 0
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-67-2867bd04669a> in <module>
----> 1 df_pl = pl.read_sql("""
      2 select Replaced_Query_Bill
      3 from Event_Master limit 10""", cart)#.to_pandas()
      4 # Follows_Waterfall, Condition, Event_Limit, Slug, Conversation, Campaign_Key
      5 # Arms, Is_MAB_Driven, Is_Good_Time, Is_On_Demand

~/ipynbs/.venv/lib/python3.9/site-packages/polars/io.py in read_sql(sql, connection_uri, partition_on, partition_range, partition_num)
    802     """
    803     if _WITH_CX:
--> 804         tbl = cx.read_sql(
    805             conn=connection_uri,
    806             query=sql,

~/ipynbs/.venv/lib/python3.9/site-packages/connectorx/__init__.py in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
    133             raise ValueError("You need to install pyarrow first")
    134 
--> 135         result = _read_sql(
    136             conn,
    137             "arrow",

RuntimeError: Invalid argument error: column types must match schema types, expected Time64(Nanosecond) but found LargeBinary at column index 0

If column1 has datetime values but even one row has no value

PanicException: Could not retrieve chrono::naive::datetime::NaiveDateTime from Value
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
<ipython-input-101-17df85ddf35d> in <module>
----> 1 cart.read_sql('select * from Event_Master limit 10')

<ipython-input-97-7fb8f6c351ab> in read_sql(self, query)
     25     def read_sql(self, query):
     26         try:
---> 27             result = pl.read_sql(query, self.__url)
     28         except Exception as e:
     29             print('Polar failed trying pandas')

~/ipynbs/.venv/lib/python3.9/site-packages/polars/io.py in read_sql(sql, connection_uri, partition_on, partition_range, partition_num)
    802     """
    803     if _WITH_CX:
--> 804         tbl = cx.read_sql(
    805             conn=connection_uri,
    806             query=sql,

~/ipynbs/.venv/lib/python3.9/site-packages/connectorx/__init__.py in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
    133             raise ValueError("You need to install pyarrow first")
    134 
--> 135         result = _read_sql(
    136             conn,
    137             "arrow",

PanicException: Could not retrieve chrono::naive::datetime::NaiveDateTime from Value

What is the expected behavior?

Read missing values into the DataFrame as nulls, just like pandas does.
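
For illustration, a hedged workaround sketch (not taken from this issue): read through pandas first, which maps SQL NULLs to NaN/NaT, then convert to polars, where they become nulls. The connection URL, table, and column names below are placeholders, and pandas plus SQLAlchemy are assumed to be installed.

import pandas as pd
import polars as pl
from sqlalchemy import create_engine

# Placeholder connection URI; substitute your own database URL.
engine = create_engine("<url_to_database>")
# pandas reads SQL NULLs as NaN/NaT instead of raising.
pdf = pd.read_sql("select column1 from Table", engine)
# Converting to polars turns NaN/NaT into proper nulls.
df = pl.from_pandas(pdf)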

@hamzamohdzubair
Author

Since the issue is mainly in connectorx, I'm closing this one and filing it in connectorx.

This is actually related to a closed issue: sfu-db/connector-x#111
Note: this was apparently solved in connectorx==0.2.1a1, and I am using connectorx==0.2.2, so I shouldn't be hitting this issue.
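
As a sanity check, one way to confirm which versions are actually installed in the environment (importlib.metadata is in the standard library from Python 3.8 onwards):

from importlib.metadata import version

# Print the installed versions of the packages involved in this report.
print("polars", version("polars"))
print("connectorx", version("connectorx"))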
