-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(python): Clarify interaction between the CDeviceArray, the CArrayView, and the CArray #409
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
@@ -427,7 +427,7 @@ def c_array_view(obj, schema=None) -> CArrayView: | |||
if isinstance(obj, CArrayView) and schema is None: | |||
return obj | |||
|
|||
return CArrayView.from_array(c_array(obj, schema)) | |||
return c_array(obj, schema).view() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool!
python/tests/test_device.py
Outdated
assert darray.device_type == 1 | ||
assert darray.device_id == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, are there enums that can be used here instead of values 1, 0? Or do we need to wait for DLPack support.
python/tests/test_device.py
Outdated
assert darray.device_type == 1 | ||
assert darray.device_id == 0 | ||
assert darray.array.length == 3 | ||
assert "device_type: 1" in repr(darray) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are enums, it would be nice to print the name (e.g. device_type: 1 (CPU)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bit of a rabbit hole but a very good rabbit hole. There's no enum, but there is an ABI-stable set of defines that I turned into one and the result is much better!
cdef _set_device(self, ArrowDeviceType device_type, int64_t device_id): | ||
self._device_type = device_type | ||
self._device_id = device_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called frequently after initialization. Is it worth allowing __cinit__
to set device type/id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to move away from any arguments in __cinit__
in most cases because a user could theoretically call nanoarrow.CArray(...)
and get very strange errors. They should really all be ClassName._construct()
or something (but maybe in a future PR).
Apologies for the two additional changes, but:
|
When device support was first added, the
CArrayView
was device-aware but theCArray
was not. This worked well until it was clear that__arrow_c_array__
needed to error if it did not represent a CPU array (and theCArray
had no way to check). Now, theCArray
has adevice_type
anddevice_id
. A nice side-effect of this is that we get back theview()
method (whose removal @jorisvandenbossche had lamented!).This also implements the device array protocol to help test apache/arrow#40717 . This protocol isn't finalized yet and I could remove that part until it is (although it doesn't seem likely to change).
The non-cpu case is still hard to test without real-world CUDA support...this PR is just trying to get the right information in the right place as early as possible.