import dataframe using set_dataframe, TypeError when column dtype: 'Int64' (with <NA> and integer) #405
Comments
Can you please provide a small example to reproduce this?
Example snippet: set_dataframe() on a column with two dtypes, float and Int64 (nullable integer), see https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html
import numpy as np
pygsheets authorize and worksheet:
gc = pygsheets.authorize()
Try 1: a dataframe with 2 columns, 'col2' with a NaN
d = {'col1': [1, 2], 'col2': [3, np.nan]}
print(df)
wks.set_dataframe(df, (1,1), nan="")
In the worksheet, 'col2' stays float with "". The "" is good, but 3.0 is not what I want.
Try 2: make a copy and change the dtype of 'col2' to 'Int64' (not int64)
df_Int64 = df.copy()
print(df_Int64) # 'col2' keeps integer 3 and NaN, both are what I want
wks.set_dataframe(df_Int64, (1,1), nan="")
TypeError: <U1 cannot be converted to an IntegerDtype
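The failure can probably be reproduced without a Google account. The sketch below is pandas-only and assumes, as the later comments suggest, that the error comes from filling the missing values of an 'Int64' column with a string. The variable `error` is only there for illustration, and the exact message differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Same data as in the snippet above.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, np.nan]})

df_Int64 = df.copy()
df_Int64["col2"] = df_Int64["col2"].astype("Int64")

# Assumption: replacing the missing value with "" is the step that fails,
# because an Int64 column cannot hold a string.
try:
    df_Int64["col2"].fillna("")
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```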
Workaround for this is to call
The problem is that pandas v1.0.0 introduced a new dtype: Int64 (as opposed to int64). This dtype allows for Calling pygsheets/pygsheets/worksheet.py Line 1303 in 8a74911
My suggestion would be to drop the
If you want, I can pick up this issue and submit a PR.
do you mean
But NaN is not what I expected for this column in the gsheet. It would be better kept empty, like below.
The Int64 dtype should have been available since pandas 0.24.0 (Jan 2019). I tried it last year.
I think the general problem:
I think the option your suggestion
This isn't a problem within pygsheets, but instead a problem with pandas: pandas-dev/pandas#25288. A fix can be done by calling astype('object') before .fillna().
This is also a non-issue because the first version of the function worked (without the Int64 dtype). Once the data is uploaded to Google Sheets, just tell Google Sheets to display 0 decimal places for your whole table.
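A minimal sketch of the astype('object') suggestion, using pandas only. The name `cleaned` is made up for this example, and the approach is a user-side workaround rather than what pygsheets does internally.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, np.nan]})
df["col2"] = df["col2"].astype("Int64")

cleaned = df.copy()
# An object column can hold integers and strings side by side,
# so the missing value can be replaced with an empty string.
cleaned["col2"] = cleaned["col2"].astype(object).fillna("")

print(cleaned["col2"].tolist())
```

The resulting dataframe no longer contains missing values, so it could then be passed to wks.set_dataframe(cleaned, (1,1)) as in the snippets above.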
@keyapi are you still experiencing problems? The latest version should have it fixed.
Fixed
In a pandas dataframe, the dtype of a column containing NaN and integers is changed to 'float'. When the values must stay integers, it is useful to replace dtype 'float' with dtype 'Int64', see: https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html
I use wks.set_dataframe(df, (1,1), nan="") to transfer a dataframe to a gsheet worksheet, but a pandas column with dtype 'float' (a mixture of NaN and integers) can only be transferred as float to the gsheet. What I want for this column is only "" and integers, not floats.
I also tried changing the dtype of the column to 'Int64' first, e.g. df.column = df.column.astype('Int64'), and then calling set_dataframe(), but got "TypeError: <U1 cannot be converted to an IntegerDtype".
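The dtype behaviour described above can be checked with pandas alone. This is a small illustration with made-up data, not part of the original report.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, np.nan]})

# A NaN forces the whole column to float, so 3 is stored as 3.0.
float_dtype = str(df["col2"].dtype)

# The nullable 'Int64' dtype (capital I) keeps 3 as an integer
# and marks the missing entry as <NA>.
nullable = df["col2"].astype("Int64")
nullable_dtype = str(nullable.dtype)

print(float_dtype, nullable_dtype)
print(nullable)
```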
solution 1:
a new parameter in set_dataframe() to import float as int
solution 2:
a new parameter in set_dataframe() to deal with dtype 'Int64' and treat <NA> like a NaN
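Until such a parameter exists, both ideas can be approximated on the caller's side before the upload. The helper below is a hypothetical sketch: `blank_missing_integers` and its `nan` argument are invented for this example and are not part of pygsheets. It assumes that float columns holding only whole numbers should be shown as integers, and it does not handle infinite values.

```python
import numpy as np
import pandas as pd


def blank_missing_integers(df, nan=""):
    """Hypothetical helper: whole-number floats become integers,
    missing values become the `nan` placeholder."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if pd.api.types.is_float_dtype(col):
            present = col.dropna()
            # Convert only when every non-missing value is a whole number.
            if (present == present.round()).all():
                col = col.astype("Int64")
        if col.isna().any():
            # Object dtype can hold numbers and the placeholder together.
            col = col.astype(object).fillna(nan)
        out[name] = col
    return out


df = pd.DataFrame({"col1": [1, 2], "col2": [3, np.nan], "col3": [1.5, np.nan]})
result = blank_missing_integers(df)
print(result)
```

Columns with genuine fractions, such as 'col3' here, stay as floats and only get their missing values blanked.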