feat: is_nan #1583

Closed
MarcoGorelli opened this issue Dec 13, 2024 · 3 comments · Fixed by #1625
Labels
enhancement (New feature or request), good first issue (Good for newcomers, but anyone is welcome to submit a pull request!)

Comments

@MarcoGorelli
Member

MarcoGorelli commented Dec 13, 2024

Looks like this would be useful to Tubular: azukds/tubular#349 (comment)

The docstring would need a caveat that pandas doesn't distinguish between nan and null, whereas other libraries typically do.

This requires some attention to check that we get such nan vs null details correct:

@MarcoGorelli MarcoGorelli added enhancement New feature or request good first issue Good for newcomers, but anyone is welcome to submit a pull request! and removed good first issue Good for newcomers, but anyone is welcome to submit a pull request! labels Dec 13, 2024
@camriddell
Contributor

For a dedicated nan check, do you want to distinguish between Float64 and float64 in pandas? The latter would hold the traditional nan, whereas the former would coerce any nan present to NA.

from pandas import NA, Series
from numpy import nan

s = Series([0, NA, nan, 3], dtype='object')

print(
    s,                                     # both (object dtype)
    s.astype('Float64'),                   # pd.NA
    s.astype('Int64'),                     # pd.NA
    s.astype('Float64').astype('float64'), # np.nan
    sep=f'\n{"-" * 40}\n'
)

@MarcoGorelli
Member Author

Float64 only coerces to NA on IO, but nan can still arise:

In [4]: s = pd.Series([1, 0, None], dtype='Float64')

In [5]: s / s
Out[5]:
0     1.0
1     NaN
2    <NA>
dtype: Float64

So here the expected output of is_nan would be [False, True, NA].
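
The expected result can be sketched in plain Python. This is an illustration under stated assumptions, not the narwhals implementation: None stands in for a null (pd.NA), and is_nan_elementwise is a hypothetical helper name. Null inputs stay null, and only a genuine floating-point nan maps to True.

```python
import math


def is_nan_elementwise(values):
    """Hypothetical helper: three-valued nan check over a list.

    None models a null and is propagated unchanged, so the result
    keeps nan (True) and null (None) distinct.
    """
    return [None if v is None else math.isnan(v) for v in values]


# Mirror s / s from the session above: 1/1 = 1.0, 0/0 = nan, null/null = null.
quotient = [1.0, float("nan"), None]
result = is_nan_elementwise(quotient)
print(result)  # [False, True, None]
```

A check that returned True for the null as well would lose the distinction that the docstring caveat is meant to describe.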

@camriddell
Copy link
Contributor

Last question: is there any reason to prefer value != value over numpy.isnan for this operation?
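
For reference, the two checks can be compared on plain Python values. This sketch uses the standard library's math.isnan as a stand-in for numpy.isnan, an assumption made to keep the example dependency-free; NumPy's behaviour on arrays and object dtype may differ in detail. The names self_inequality and strict_isnan are hypothetical.

```python
import math


def self_inequality(value):
    """nan is the only float that compares unequal to itself."""
    return value != value


def strict_isnan(value):
    """math.isnan accepts real numbers only; report other inputs as an error."""
    try:
        return math.isnan(value)
    except TypeError:
        return "TypeError"


samples = [1.0, float("nan"), float("inf"), None, "text"]
comparison = [(self_inequality(v), strict_isnan(v)) for v in samples]
for sample, pair in zip(samples, comparison):
    print(repr(sample), pair)
```

On floats the two checks agree. They differ on non-numeric inputs such as None or strings, where the self-comparison quietly returns False and math.isnan raises a TypeError.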
