-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap GMT_Put_Strings to pass str columns into GMT C API directly #520
Changes from 9 commits
c811e65
5c5f152
d7c3053
fccedae
07b95ed
88eef3a
6390e04
42af00e
9ef04b0
ab1e041
01754fc
452d3d3
9c8dad9
6dff9ba
cfb9124
2485437
7fa1f0c
e8e7768
2f0c798
6d6ea24
9d063b3
2d81f76
95afa90
ed0dce6
aeabd8e
abcbcf7
129135e
4f1ccda
30c18eb
62469cb
26f51c9
726de44
4c99477
2f9d689
fe794c9
f59d970
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,7 +44,7 @@ | |
"GMT_IS_SURFACE", | ||
] | ||
|
||
MODES = ["GMT_CONTAINER_ONLY", "GMT_OUTPUT"] | ||
MODES = ["GMT_CONTAINER_ONLY", "GMT_IS_OUTPUT"] | ||
|
||
REGISTRATIONS = ["GMT_GRID_PIXEL_REG", "GMT_GRID_NODE_REG"] | ||
|
||
|
@@ -235,7 +235,7 @@ def __getitem__(self, name): | |
value = c_get_enum(session, name.encode()) | ||
|
||
if value is None or value == -99999: | ||
raise GMTCLibError("Constant '{}' doesn't exits in libgmt.".format(name)) | ||
raise GMTCLibError(f"Constant '{name}' doesn't exist in libgmt.") | ||
|
||
return value | ||
|
||
|
@@ -511,13 +511,13 @@ def create_data(self, family, geometry, mode, **kwargs): | |
---------- | ||
family : str | ||
A valid GMT data family name (e.g., ``'GMT_IS_DATASET'``). See the | ||
``data_families`` attribute for valid names. | ||
``FAMILIES`` attribute for valid names. | ||
geometry : str | ||
A valid GMT data geometry name (e.g., ``'GMT_IS_POINT'``). See the | ||
``data_geometries`` attribute for valid names. | ||
``GEOMETRIES`` attribute for valid names. | ||
mode : str | ||
A valid GMT data mode (e.g., ``'GMT_OUTPUT'``). See the | ||
``data_modes`` attribute for valid names. | ||
A valid GMT data mode (e.g., ``'GMT_IS_OUTPUT'``). See the | ||
``MODES`` attribute for valid names. | ||
dim : list of 4 integers | ||
The dimensions of the dataset. See the documentation for the GMT C | ||
API function ``GMT_Create_Data`` (``src/gmt_api.c``) for the full | ||
|
@@ -731,7 +731,7 @@ def put_vector(self, dataset, column, vector): | |
""" | ||
Attach a numpy 1D array as a column on a GMT dataset. | ||
|
||
Use this functions to attach numpy array data to a GMT dataset and pass | ||
Use this function to attach numpy array data to a GMT dataset and pass | ||
it to GMT modules. Wraps ``GMT_Put_Vector``. | ||
|
||
The dataset must be created by :meth:`~gmt.clib.Session.create_data` | ||
|
@@ -791,11 +791,61 @@ def put_vector(self, dataset, column, vector): | |
) | ||
) | ||
|
||
def put_strings(self, dataset, family, strings): | ||
""" | ||
Attach a numpy 1D array of dtype str as a column on a GMT dataset. | ||
|
||
Use this function to attach string type numpy array data to a GMT | ||
dataset and pass it to GMT modules. Wraps ``GMT_Put_Strings``. | ||
|
||
The dataset must be created by :meth:`~gmt.clib.Session.create_data` | ||
first. | ||
|
||
.. warning:: | ||
The numpy array must be C contiguous in memory. If it comes from a | ||
column slice of a 2d array, for example, you will have to make a | ||
copy. Use :func:`numpy.ascontiguousarray` to make sure your vector | ||
is contiguous (it won't copy if it already is). | ||
|
||
Parameters | ||
---------- | ||
dataset : :class:`ctypes.c_void_p` | ||
The ctypes void pointer to a ``GMT_Dataset``. Create it with | ||
:meth:`~gmt.clib.Session.create_data`. | ||
family : str | ||
The family type of the dataset. Can be either ``GMT_IS_VECTOR`` or | ||
``GMT_IS_MATRIX``. | ||
strings : numpy 1d-array | ||
The array that will be attached to the dataset. Must be a 1d C | ||
contiguous array. | ||
|
||
Raises | ||
------ | ||
GMTCLibError | ||
If given invalid input or ``GMT_Put_Strings`` exits with status != | ||
0. | ||
|
||
""" | ||
c_put_strings = self.get_libgmt_func( | ||
"GMT_Put_Strings", | ||
argtypes=[ctp.c_void_p, ctp.c_uint, ctp.c_void_p, ctp.c_void_p], | ||
restype=ctp.c_int, | ||
) | ||
|
||
strings_pointer = strings.ctypes.data_as(ctp.c_char_p) | ||
status = c_put_strings( | ||
self.session_pointer, self[family], dataset, strings_pointer | ||
) | ||
if status != 0: | ||
raise GMTCLibError( | ||
f"Failed to put strings of type {strings.dtype} into dataset" | ||
) | ||
|
||
def put_matrix(self, dataset, matrix, pad=0): | ||
""" | ||
Attach a numpy 2D array to a GMT dataset. | ||
|
||
Use this functions to attach numpy array data to a GMT dataset and pass | ||
Use this function to attach numpy array data to a GMT dataset and pass | ||
it to GMT modules. Wraps ``GMT_Put_Matrix``. | ||
|
||
The dataset must be created by :meth:`~gmt.clib.Session.create_data` | ||
|
@@ -856,12 +906,12 @@ def write_data(self, family, geometry, mode, wesn, output, data): | |
---------- | ||
family : str | ||
A valid GMT data family name (e.g., ``'GMT_IS_DATASET'``). See the | ||
``data_families`` attribute for valid names. Don't use the | ||
``FAMILIES`` attribute for valid names. Don't use the | ||
``GMT_VIA_VECTOR`` or ``GMT_VIA_MATRIX`` constructs for this. Use | ||
``GMT_IS_VECTOR`` and ``GMT_IS_MATRIX`` instead. | ||
geometry : str | ||
A valid GMT data geometry name (e.g., ``'GMT_IS_POINT'``). See the | ||
``data_geometries`` attribute for valid names. | ||
``GEOMETRIES`` attribute for valid names. | ||
mode : str | ||
How the data is to be written to the file. This option varies | ||
depending on the given family. See the GMT API documentation for | ||
|
@@ -1085,6 +1135,9 @@ def virtualfile_from_vectors(self, *vectors): | |
arrays = vectors_to_arrays(vectors) | ||
|
||
columns = len(arrays) | ||
if np.issubdtype(arrays[-1].dtype, np.str_): | ||
columns -= 1 | ||
|
||
rows = len(arrays[0]) | ||
if not all(len(i) == rows for i in arrays): | ||
raise GMTInvalidInput("All arrays must have same size.") | ||
|
@@ -1096,8 +1149,12 @@ def virtualfile_from_vectors(self, *vectors): | |
family, geometry, mode="GMT_CONTAINER_ONLY", dim=[columns, rows, 1, 0] | ||
) | ||
|
||
for col, array in enumerate(arrays): | ||
# Use put_vector for first n columns with numerical type data | ||
for col, array in enumerate(arrays[:columns]): | ||
self.put_vector(dataset, column=col, vector=array) | ||
# Use put_strings for last column with string type data | ||
for array in arrays[columns:]: | ||
self.put_strings(dataset, family="GMT_IS_VECTOR", strings=array) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These codes are incorrect. We can't call See the comments in #483 (comment), we have to combine all trailing string arrays into one array and call There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does just call put strings once, or zero times. If you examine the code before, the slice either returns a list with one item or a list with zero items. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean these lines? columns = len(arrays)
if np.issubdtype(arrays[-1].dtype, np.str_):
columns -= 1 It won't work for input with multiple trailing string arrays. For example, for the
We have to combine
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok I see, but do we want to do the concatenation:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Option 2 is easier and more intuitive for module wrapper developers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not so easy for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From what I now about Python and GMT, I think we may have to duplicate the memory in PyGMT, and then duplicate it again in GMT. |
||
|
||
with self.open_virtual_file( | ||
family, geometry, "GMT_IN|GMT_IS_REFERENCE", dataset | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I tried this following your datetime example, but I think (?) it's functionally equivalent to
strings.ctypes.data_as(ctp.c_char_p)
? Either way it runs, but the string data doesn't seem to get written to the virtualfile so the test fails. I've also triedstrings.ctypes.data_as(ctp.POINTER(ctp.c_char_p))
but it doesn't seem to make a difference. 😕There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strings.ctypes.data_as(ctp.c_char_p)
may not work.What I did is adding a print statement inside the GMT_Put_Strings function, printing the string arrays passed to it. Using
strings.ctypes.data_as(ctp.c_char_p)
crashes immediately for me, but using my code can print the strings correctly, but then crash 🤦♂️.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using:
Does it crash for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it still crashes, but I can see that the strings are correctly passed to the GMT_Put_Strings function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about
strings.ctypes.data_as(ctp.POINTER(ctp.c_char_p))
? Are the strings correctly passed in?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I was just taking a look at it (thanks for tracking down the bug by the way!). Sounds like we'll need to pin to GMT > 6.1.1 for the next release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I'm not 100% sure GenericMappingTools/gmt#3718 is related to the issue here. I just had the feeling that the string array may be freed by Python when GMT tries to read it.
Hopefully, GMT 6.1.1 can fix issue and #515.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, GenericMappingTools/gmt#3718 is already merged into GMT master, and will be backported to 6.1 branch soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I'll test it out when I get the time (need to prepare some stuff for an online conference next week). Still need to wait on the grid problem at #515 but that's a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just triggered the GMT master branch testing. Ideally, we should only see one failure from
test_put_strings
.