Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctypes array cannot be returned by value on ARM64 #110190

Closed
pdegroote opened this issue Oct 1, 2023 · 24 comments
Closed

ctypes array cannot be returned by value on ARM64 #110190

pdegroote opened this issue Oct 1, 2023 · 24 comments
Labels
pending The issue will be closed if no feedback is provided topic-ctypes type-bug An unexpected behavior, bug, or error

Comments

@pdegroote
Copy link

pdegroote commented Oct 1, 2023

Bug report

Bug description:

Dear,

I'm wrapping a C-library with the ctypes module on Darwin (ARM64). I have a function that simply initializes a C-struct and returns the struct by value. The C-struct contains one simple C-array type. When using ArrayTypes in Python, it seems the struct is not returned properly.

On Windows and Linux platforms the same code does work as expected.

The strange thing is that if I lay out the struct explicitly with singular types in Python (as I expect the layout of the struct is the same), it does work as expected on all tried platforms.

To reproduce, I first create a simple .c file with the struct and the initializing function. In Python, I give code that I use to compile (with gcc), and then three different ways to wrap the struct and function in Python using ctypes. I believe the three ways should be equivalent (and they are on Windows and Linux), but not on Darwin (ARM64).

I saved this file as ctypesexample.c

struct example{double coords[3];};
struct example set_defaults(){
    struct example ex;
    ex.coords[0] = 1.0;
    ex.coords[1] = 2.0;
    ex.coords[2] = 3.0;
    return ex;}

I saved this file as reproduce.py

import os
import subprocess
import ctypes

def compile():
    """Compile the c file using gcc."""
    subprocess.call(["gcc","-c","-Wall","-Werror","-fpic","ctypesexample.c"])
    subprocess.call(["gcc","-shared","-o","libctypesexample","ctypesexample.o"])
    assert os.path.isfile('libctypesexample')

# Three ways to wrap the struct:
class Example1(ctypes.Structure):
    _fields_ = [
            ('coords', ctypes.c_double * 3)
               ]

class CoordType(ctypes.Array):
    _type_ = ctypes.c_double
    _length_ = 3

class Example2(ctypes.Structure):
    _fields_ = [('coords', CoordType)]
    
class Example3(ctypes.Structure):
    _fields_ = [
            ('x', ctypes.c_double),
            ('y', ctypes.c_double),
            ('z', ctypes.c_double)
               ]

def run():
    # Load the shared library
    so_location = os.path.abspath("libctypesexample")
    _lib = ctypes.cdll.LoadLibrary(so_location)
    _lib.set_defaults.restype = Example1
    o = _lib.set_defaults()
    print(f"Example1: ({o.coords[0]}, {o.coords[1]}, {o.coords[2]})")
    
    _lib.set_defaults.restype = Example2
    o = _lib.set_defaults()
    print(f"Example2: ({o.coords[0]}, {o.coords[1]}, {o.coords[2]})")

    _lib.set_defaults.restype = Example3
    o = _lib.set_defaults()
    print(f"Example3: ({o.x}, {o.y}, {o.z})")

if __name__ == "__main__":
    if not os.path.isfile('libctypesexample'):
        compile()
    run()

I tried this code on Linux (debian) -- both in Docker and WSL. And this works perfectly. Equivalent code in Windows is also working fine. In the past, I have run similar code as above on many different versions of Python (3.5-3.11) on both platforms for years (and still) without problems. For example on WSL with the following platform info

uname_result(system='Linux', release='5.15.90.1-microsoft-standard-WSL2', version='#1 SMP Fri Jan 27 02:56:13 UTC 2023', machine='x86_64')

and Python:

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux

I get this correct output:

Example1: (1.0, 2.0, 3.0)
Example2: (1.0, 2.0, 3.0)
Example3: (1.0, 2.0, 3.0)

While on Darwin (ARM) with the following platform info:

system='Darwin', release='22.1.0', version='Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:30 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8103', machine='arm64')

and Python

Python 3.11.4 (main, Jun 12 2023, 11:39:45) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin

I get the following output:

Example1: (3.025693809e-314, 3.0256938013e-314, 2.1481083817e-314)
Example2: (3.025693809e-314, 3.0256938013e-314, 2.1481083817e-314)
Example3: (1.0, 2.0, 3.0)

In summary, the struct doesn't seem to be properly returned as value using array types, though with a literal listing of the types, it seems there is no problem. So there is a workaround, but as we're using ctypes for automatic wrapping through reflection and the code works properly on Windows and Linux, it would be quite some work to implement the workaround.

I hope my bug report finds you without any errors, and many thanks for giving the opportunity to report.

Best regards

CPython versions tested on:

3.10, 3.11

Operating systems tested on:

Linux, macOS, Windows

Linked PRs

@pdegroote pdegroote added the type-bug An unexpected behavior, bug, or error label Oct 1, 2023
@sunmy2019
Copy link
Member

I looked into it. ctypes utilizes libffi to make foreign function calls, and ctypes generates wrong signatures.

Long words in short: stgdict->ffi_type_pointer is wrong. Current behavior would cause return types interpreted as ctypes.c_void_p * 3.

class Example1(ctypes.Structure):
    _fields_ = [("coords", ctypes.c_double * 3)]


class Example2(ctypes.Structure):
    _fields_ = [("coords", ctypes.c_void_p * 3)]


# Three ways to wrap the struct:
class Example3(ctypes.Structure):
    _fields_ = [("x", ctypes.c_void_p), ("y", ctypes.c_void_p), ("z", ctypes.c_void_p)]


def run():
    # Load the shared library
    so_location = os.path.abspath("libctypesexample")
    _lib = ctypes.cdll.LoadLibrary(so_location)

    _lib.set_defaults.restype = Example1
    o = _lib.set_defaults()
    print(
        f"Example1: ({float.hex(o.coords[0])}, {float.hex(o.coords[1])}, {float.hex(o.coords[2])})"
    )

    _lib.set_defaults.restype = Example2
    o = _lib.set_defaults()
    print(f"Example2: ({o.coords[0]:x}, {o.coords[1]:x}, {o.coords[2]})")

    _lib.set_defaults.restype = Example3
    o = _lib.set_defaults()
    print(f"Example3: ({o.x:x}, {o.y:x}, {o.z})")
Example1: (0x0.0ffffcc6b11d0p-1022, 0x0.0ffff95567ef8p-1022, 0x0.0p+0)
Example2: (ffffcc6b11d0, ffff95567ef8, None)
Example3: (ffffcc6b11d0, ffff95567ef8, None)

Notice the binary representation of Example1 has ffffcc6b11d0 which is same as Example2 and Example3.

@sunmy2019
Copy link
Member

sunmy2019 commented Oct 2, 2023

The special handling is here:

#define MAX_STRUCT_SIZE 16
if (arrays_seen && (size <= MAX_STRUCT_SIZE)) {
/*
* See bpo-22273. Arrays are normally treated as pointers, which is
* fine when an array name is being passed as parameter, but not when
* passing structures by value that contain arrays. On 64-bit Linux,
* small structures passed by value are passed in registers, and in
* order to do this, libffi needs to know the true type of the array
* members of structs. Treating them as pointers breaks things.
*

You seem to hit one special HVA calling convention of aarch64 which is currently not handled.
https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170#return-values

Types that are HFAs with four or fewer elements are returned in s0-s3, d0-d3, or v0-v3, as appropriate.

Details

Integral values are returned in x0.

Floating-point values are returned in s0, d0, or v0, as appropriate.

A type is considered to be an HFA or HVA if all of the following hold:

  • It's non-empty,
  • It doesn't have any non-trivial default or copy constructors, destructors, or assignment operators,
  • All of its members have the same HFA or HVA type, or are float, double, or neon types that match the other members' HFA or HVA types.

HVA values with four or fewer elements are returned in s0-s3, d0-d3, or v0-v3, as appropriate.

Types returned by value are handled differently depending on whether they have certain properties, and whether the function is a non-static member function. Types which have all of these properties,

  • they're aggregate by the C++14 standard definition, that is, they have no user-provided constructors, no private or protected - non-static data members, no base classes, and no virtual functions, and
  • they have a trivial copy-assignment operator, and
  • they have a trivial destructor,

and are returned by non-member functions or static member functions, use the following return style:

  • Types that are HFAs with four or fewer elements are returned in s0-s3, d0-d3, or v0-v3, as appropriate.
  • Types less than or equal to 8 bytes are returned in x0.
  • Types less than or equal to 16 bytes are returned in x0 and x1, with x0 containing the lower-order 8 bytes.
  • For other aggregate types, the caller shall reserve a block of memory of sufficient size and alignment to hold the result. The address of the memory block shall be passed as an additional argument to the function in x8. The callee may modify the result memory block at any point during the execution of the subroutine. The callee isn't required to preserve the value stored in x8.

All other types use this convention:

  • The caller shall reserve a block of memory of sufficient size and alignment to hold the result. The address of the memory block shall be passed as an additional argument to the function in x0, or x1 if $this is passed in x0. The callee may modify the result memory block at any point during the execution of the subroutine. The callee returns the address of the memory block in x0.

@sunmy2019
Copy link
Member

Now the cause is clear, ctypes did not handle one special rule of aarch64, but fixing it is beyond my ability.

@diegorusso
Copy link
Contributor

Hello, I'm looking at this issue and working on a fix.

diegorusso added a commit to diegorusso/cpython that referenced this issue Dec 1, 2023
Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.
@diegorusso
Copy link
Contributor

Hello, I've investigated this further and actually this was happening across Arm platforms including Linux aarch64 and aarch32 and not just on Darwin. I've raised the PR that fixes the issue providing information and context. Feel free to review it and any feedback is welcome.

ambv pushed a commit that referenced this issue Dec 5, 2023
Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.
diegorusso added a commit to diegorusso/cpython that referenced this issue Dec 5, 2023
…112604)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
diegorusso added a commit to diegorusso/cpython that referenced this issue Dec 5, 2023
…112604)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
ambv pushed a commit that referenced this issue Dec 6, 2023
…2767)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
ambv pushed a commit that referenced this issue Dec 6, 2023
…2766)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
@ambv
Copy link
Contributor

ambv commented Dec 6, 2023

This is now fixed in (to be released) 3.11.8, 3.12.1, and 3.13.0 alpha 3.
Thanks, Diego! ✨ 🍰 ✨

@ambv ambv closed this as completed Dec 6, 2023
@ambv ambv reopened this Dec 6, 2023
@ambv
Copy link
Contributor

ambv commented Dec 6, 2023

While this is fixed, it uncovered that a similar failure is happening on PPC64LE. The new tests introduced by Diego here fail on our PPC64LE buildbots as well. See for instance:
https://buildbot.python.org/all/#/builders/559/builds/4236/steps/6/logs/stdio

ambv added a commit to ambv/cpython that referenced this issue Dec 6, 2023
Yhg1s added a commit to ambv/cpython that referenced this issue Dec 7, 2023
ambv added a commit to ambv/cpython that referenced this issue Dec 7, 2023
ambv added a commit to ambv/cpython that referenced this issue Dec 7, 2023
Yhg1s pushed a commit that referenced this issue Dec 7, 2023
…n PPC64LE (GH-112818) (#112829)

(cherry picked from commit 9f67042)

Co-authored-by: Łukasz Langa <[email protected]>
ambv added a commit to ambv/cpython that referenced this issue Dec 7, 2023
kulikjak added a commit to kulikjak/cpython that referenced this issue Apr 24, 2024
@kulikjak
Copy link
Contributor

Hi. The problem is resolved, the main issue is indeed in the GCC code generator, which is breaking the SPARC ABI.

Still, even with that fixed, similarly to Windows ARM64 (#114753), MAX_STRUCT_SIZE should be adjusted to 32 in stgdict. I created #118233 for that.

@diegorusso
Copy link
Contributor

@kulikjak thanks for the update!

@thesamesam
Copy link
Contributor

Is there a link to the GCC bug? Thanks.

@kulikjak
Copy link
Contributor

Is there a link to the GCC bug? Thanks.

Sure, here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
…12959)

Fix the same issue of PR python#112604 on PPC64LE platform
Refactor tests to make easier to add more platfroms if needed.
@encukou
Copy link
Member

encukou commented Sep 17, 2024

I just merged the SPARC PR. Sorry for the delay.
As it's an unsupported/untested platform, I'd prefer if you carry backports downstream. (If that's an issue, let's talk after 3.13.0 comes out -- we definitely won't backport it in the release candidate phase.)

@encukou encukou closed this as completed Sep 17, 2024
@kulikjak
Copy link
Contributor

Thank you for the merge!

Since the sparc define check definitely won't break any supported platforms, I think it's safe to backport it to 3.13. That said, I completely understand that it's not something for the release candidate now (and we can certainly patch it ourselves).

gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…112604) (python#112766)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…honGH-112959) (python#113167)

Fix the same issue of PR python#112604 on PPC64LE platform
Refactor tests to make easier to add more platfroms if needed.

(cherry picked from commit 6644ca4)
Change-Id: I1ada30808c0d593a43eca3fa7a628c26bc276310
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…onGH-114753)

(cherry picked from commit a06b606)

Co-authored-by: Diego Russo <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…112604) (python#112766)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…honGH-112959) (python#113167)

Fix the same issue of PR python#112604 on PPC64LE platform
Refactor tests to make easier to add more platfroms if needed.

(cherry picked from commit 6644ca4)
Change-Id: I1ada30808c0d593a43eca3fa7a628c26bc276310
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…onGH-114753)

(cherry picked from commit a06b606)

Co-authored-by: Diego Russo <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…112604) (python#112766)

Set MAX_STRUCT_SIZE to 32 in stgdict.c when on Arm platforms.
This because on Arm platforms structs with at most 4 elements of any
floating point type values can be passed through registers. If the type
is double the maximum size of the struct is 32 bytes.
On x86-64 Linux, it's maximum 16 bytes hence we need to differentiate.

(cherry picked from commit bc68f4a)
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…honGH-112959) (python#113167)

Fix the same issue of PR python#112604 on PPC64LE platform
Refactor tests to make easier to add more platfroms if needed.

(cherry picked from commit 6644ca4)
Change-Id: I1ada30808c0d593a43eca3fa7a628c26bc276310
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
…onGH-114753)

(cherry picked from commit a06b606)

Co-authored-by: Diego Russo <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
gentoo-bot pushed a commit to gentoo/cpython that referenced this issue Sep 19, 2024
savannahostrowski pushed a commit to savannahostrowski/cpython that referenced this issue Sep 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending The issue will be closed if no feedback is provided topic-ctypes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

8 participants