Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-33856: [C#] Implement C Data Interface for C# #35496
GH-33856: [C#] Implement C Data Interface for C# #35496
Changes from 5 commits
1b929bc
5f3013c
a0dc92b
366225b
c679e2e
67c5044
f740b36
2bcb347
bfcab3a
4eb9c1f
edc74af
ad7907b
8541069
f816507
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't mandate this, since the user can call this with a local uninitialized
struct ArrowArray
variable (if called from raw C rather than, say, Python).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following the existing pattern in CArrowSchemaExporter. But I also think that the documentation, in saying that "A released structure is indicated by setting its release callback to NULL. Before reading and interpreting a structure’s data, consumers SHOULD check for a NULL release callback and treat it accordingly (probably by erroring out)." suggests that this check is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then
CArrowSchemaExporter
should probably be modified as well.This is producer code, not consumer code. At this point, the structure is still uninitialized. "Uninitialized" in the C (or C++) sense, that is "may contain any arbitrary bytes", not "zero-initialized".
Producer code therefore shouldn't care about what is already in the structure.
cc @paleolimbot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of why C is awful ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed as part of #35996.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry I missed this and I see that it's been solved. In nanoarrow we definitely assume that pointer output arguments point to uninitialized memory (and strive to not touch that memory until failure is impossible). There are a few places where we do something like
...to simplify (to the extent that anything in C is simple) the error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do this another way. If you add
[UnmanagedCallersOnly]
to theprivate unsafe static void ReleaseArray(CArrowArray* cArray)
method, then this line should just be:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This applies for all the function pointers we need to set on these structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would still work correctly if an array were exported via the C API to another bit of managed code which consumed it (via the C API)?
Edited: the bigger problem is that I would still like to support netstandard20, where UnmanagedCallersOnly isn't available. Is it worth using conditional compilation to optimize for .NET 5+?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes because it is the
release
pointer is an unmanaged function pointer, and not a managed delegate. It is a bit hard to explain. But even though the caller is "managed" (i.e. .NET code), the way the method is invoked is through an unmanged function pointer. So it is still an "UnmanagedCaller".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe trying writing a benchmark test in https://github.com/apache/arrow/tree/main/csharp/test/Apache.Arrow.Benchmarks to compare the difference? If we see major differences, it may make sense to split the compilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at this deeper, the current code may have problems since the managed Delegate might be GC'd.
See https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.getfunctionpointerfordelegate?view=net-8.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you'd think I'd have remembered this given that I pointed it out in a private email last month :/.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've taken care of the lifetime issues and would probably file a work item to test for optimization.