PERF: improve conversion to BooleanArray from int/float array #29838

Closed
jorisvandenbossche opened this issue Nov 25, 2019 · 5 comments · Fixed by #30095

@jorisvandenbossche
Member

Currently, the creation of a BooleanArray from an int/float array goes through a conversion to object dtype (to do it together with the generic conversion from any list-like):

else:
    # TODO conversion from integer/float ndarray can be done more efficiently
    # (avoid roundtrip through object)
    values_object = np.asarray(values, dtype=object)
    inferred_dtype = lib.infer_dtype(values_object, skipna=True)
    integer_like = ("floating", "integer", "mixed-integer-float")
    if inferred_dtype not in ("boolean", "empty") + integer_like:
        raise TypeError("Need to pass bool-like values")
    mask_values = isna(values_object)
    values = np.zeros(len(values), dtype=bool)
    values[~mask_values] = values_object[~mask_values].astype(bool)

For the specific case of an int/float ndarray, this could be optimized with a dedicated path that avoids the cast to an object array (probably just skipping the np.asarray(values, dtype=object) call when values is already a float/int ndarray will be enough).
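
To make the idea concrete, here is a minimal sketch of what such a fast path might look like. It is not the actual pandas implementation or the change that was eventually merged: the function name coerce_numeric_to_bool is hypothetical, it uses only NumPy, and it mirrors the astype(bool) behaviour of the snippet above (any non-zero value becomes True).

```python
import numpy as np


def coerce_numeric_to_bool(values):
    """Hypothetical helper: build (data, mask) from an int/float ndarray
    without first converting it to an object array."""
    # Only handle integer ("i", "u") and float ("f") ndarrays here; anything
    # else would still need the generic object-dtype path.
    if not (isinstance(values, np.ndarray) and values.dtype.kind in "iuf"):
        raise TypeError("expected an integer or float ndarray")

    if values.dtype.kind == "f":
        # Floats can hold NaN, which marks a missing value.
        mask = np.isnan(values)
    else:
        # Integer arrays cannot hold missing values.
        mask = np.zeros(values.shape, dtype=bool)

    # Same logic as the snippet above, applied directly to the numeric array.
    data = np.zeros(values.shape, dtype=bool)
    data[~mask] = values[~mask].astype(bool)
    return data, mask
```

For example, coerce_numeric_to_bool(np.array([1.0, 0.0, np.nan])) gives the data [True, False, False] and the mask [False, False, True], with no object array created along the way.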

@jorisvandenbossche jorisvandenbossche added good first issue ExtensionArray Extending pandas with custom dtypes or arrays. labels Nov 25, 2019
@jorisvandenbossche jorisvandenbossche added this to the Contributions Welcome milestone Nov 25, 2019
@leonicus

take

@ethanywang ethanywang removed their assignment Dec 4, 2019
@ethanywang

take

@leonicus

leonicus commented Dec 6, 2019

@ethanywang hi, you can't take the issue because it's already assigned (to me :)). I had some technical difficulties last week that stopped me from working on it, but I should submit a PR shortly. If you have a specific reason for wanting to take it, or want to discuss, feel free to comment.

@ethanywang

@leonicus Hi Leonicus, sorry about taking this issue without asking you first, but we have already submitted the PR and got a response from the maintainer... We saw that you claimed it a week ago without any visible progress, so we thought you were having difficulty working on it. We are actually doing this issue as a course project, and it would be quite difficult for us to change the chosen issue now, as we have already submitted it to our professor... So sorry about that...

@leonicus

leonicus commented Dec 6, 2019

@ethanywang that's OK, just please note that the "take" keyword doesn't really work on assigned issues (as you can see, it is still assigned to me as well). So next time, just write to the issue assignee to check.
N.B. I have now unassigned myself manually, but just FYI.

@leonicus leonicus removed their assignment Dec 6, 2019
@jorisvandenbossche jorisvandenbossche modified the milestones: Contributions Welcome, 1.0 Dec 9, 2019