ffi::cpython::unicodeobject::tests::{ascii,ucs4} unit tests segfault on s390x #1824
Comments
I think I might have found the source of this problem:
    impl PyASCIIObject {
        #[inline]
        pub fn interned(&self) -> c_uint {
            self.state & 3
        }
        #[inline]
        pub fn kind(&self) -> c_uint {
            (self.state >> 2) & 7
        }
        #[inline]
        pub fn compact(&self) -> c_uint {
            (self.state >> 5) & 1
        }
        #[inline]
        pub fn ascii(&self) -> c_uint {
            (self.state >> 6) & 1
        }
        #[inline]
        pub fn ready(&self) -> c_uint {
            (self.state >> 7) & 1
        }
    }
Here, |
This also seems to have a follow-up effect in other tests:
Additionally, This is the error message: Which is kind of obvious now, since the constructed enum variants don't match the expected values on big-endian architectures, and hence, the |
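To make the mismatch concrete, here is a minimal sketch of how the same logical fields can land at different bit positions. It assumes the common GCC/Clang convention of allocating bitfields from the least significant bit on little-endian targets and from the most significant bit on big-endian targets; the C standard leaves this implementation-defined, so the big-endian layout is an assumption. `StateFields`, `pack_lsb_first`, `pack_msb_first` and `unpack_lsb_first` are hypothetical names for illustration, not PyO3 API.

```rust
/// The five logical fields packed into the 32-bit `state` word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StateFields {
    pub interned: u32, // 2 bits
    pub kind: u32,     // 3 bits
    pub compact: u32,  // 1 bit
    pub ascii: u32,    // 1 bit
    pub ready: u32,    // 1 bit
}

/// Pack fields starting at the least significant bit
/// (assumed layout on little-endian targets).
pub fn pack_lsb_first(f: StateFields) -> u32 {
    (f.interned & 3)
        | ((f.kind & 7) << 2)
        | ((f.compact & 1) << 5)
        | ((f.ascii & 1) << 6)
        | ((f.ready & 1) << 7)
}

/// Pack fields starting at the most significant bit
/// (assumed layout on big-endian targets such as s390x).
pub fn pack_msb_first(f: StateFields) -> u32 {
    ((f.interned & 3) << 30)
        | ((f.kind & 7) << 27)
        | ((f.compact & 1) << 26)
        | ((f.ascii & 1) << 25)
        | ((f.ready & 1) << 24)
}

/// Same shifts and masks as the accessors quoted above:
/// only correct if the word was packed LSB-first.
pub fn unpack_lsb_first(state: u32) -> StateFields {
    StateFields {
        interned: state & 3,
        kind: (state >> 2) & 7,
        compact: (state >> 5) & 1,
        ascii: (state >> 6) & 1,
        ready: (state >> 7) & 1,
    }
}
```

Under these assumptions, a compact ASCII string (`kind = 1`, `compact = ascii = ready = 1`) packed MSB-first reads back through the LSB-first accessors with every field zero, so `kind` is no longer one of the 1/2/4 values the calling code expects.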
Yikes, thanks for reporting this. TBH, I had this concern in #1777 (comment) when this FFI code was added. In the end, CI passed on all platforms so I was reasonably convinced this would be ok. I didn't consider the point that all platforms tested were little-endian. From further reading just now, it appears to be a really hard problem for Rust to get interactions with C bitfields correct on all platforms: https://users.rust-lang.org/t/c-structs-with-bit-fields-and-ffi/1429 The URLO thread in particular suggests that linking in some C code to achieve this is probably the correct way to solve this problem. I agree; however, I don't really want to add complexity to PyO3's build scripts by requiring a C compiler for such a small feature. PyO3 contributors - what do you think we should do here? My personal feeling is that we probably ought to roll back the cc @indygreg - I'm concerned that PyOxidizer should probably not depend on this functionality as-implemented; it presumably causes nasty crashes on certain platforms... |
Yay for C being under-specified :/ Instead of deleting the feature globally, perhaps we could conditionally compile it for configurations where we know it to work using I'm a huge fan of perfect is the enemy of done and it seems that a working implementation on a subset of [very popular] platforms is strictly better than no implementation at all. I think it is also reasonable to escalate our concern to the CPython folks and ask them to expose proper C functions (not macros) for accessing the content behind bitfields so downstream consumers don't have to reason about under-defined C semantics. Unfortunately, it might be too late to get said APIs into CPython 3.10. Do you have a quick take on this @vstinner? Context here is the bitfield in |
FWIW my reading of the situation is that use of bitfields with However, it does appear that on x86_64 compilers that we have test coverage for, the But, all other architectures and compiler combinations are not yet known. We'd probably get lucky making the bit shifts endian aware. But this isn't guaranteed to work on all target arches or compilers. (Since Rust always uses LLVM this isn't a concern for PyO3 today: but it is a concern for whatever compiler built libpython and it could be a concern for future Rust implementations not using LLVM.) Since different compiler toolchains can't agree on the memory layout here, it technically isn't even safe to link libraries built with different compilers since their bit field layout could be different! CPython (or any C library for that matter) shouldn't expose any type with a bit field in its public API and should instead expose function-based APIs for field access and manipulation. How do you feel about making the bit shifts endian aware and moving the FFI definitions and At the end of the day, I don't need |
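For illustration, making the bit shifts endian-aware might look roughly like the sketch below. The big-endian offsets assume MSB-first bitfield allocation, which is what GCC-style compilers typically do but which no standard guarantees, so this is the "we'd probably get lucky" variant rather than a sound one. `Shifts`, `SHIFTS` and `AsciiState` are hypothetical names, not the PyO3 definitions.

```rust
/// Bit offset of each field inside the 32-bit `state` word.
pub struct Shifts {
    pub interned: u32,
    pub kind: u32,
    pub compact: u32,
    pub ascii: u32,
    pub ready: u32,
}

/// Offsets selected at compile time from the target's endianness.
/// The big-endian values are an assumption (MSB-first allocation).
pub const SHIFTS: Shifts = if cfg!(target_endian = "big") {
    Shifts { interned: 30, kind: 27, compact: 26, ascii: 25, ready: 24 }
} else {
    Shifts { interned: 0, kind: 2, compact: 5, ascii: 6, ready: 7 }
};

/// Thin wrapper around the raw `state` word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AsciiState(pub u32);

impl AsciiState {
    #[inline]
    pub fn interned(self) -> u32 {
        (self.0 >> SHIFTS.interned) & 3
    }
    #[inline]
    pub fn kind(self) -> u32 {
        (self.0 >> SHIFTS.kind) & 7
    }
    #[inline]
    pub fn compact(self) -> u32 {
        (self.0 >> SHIFTS.compact) & 1
    }
    #[inline]
    pub fn ascii(self) -> u32 {
        (self.0 >> SHIFTS.ascii) & 1
    }
    #[inline]
    pub fn ready(self) -> u32 {
        (self.0 >> SHIFTS.ready) & 1
    }
}
```

Because the offsets are fixed when the Rust code is compiled, this only helps if the compiler that built libpython happens to use the same layout; it does not remove the underlying dependence on implementation-defined behaviour.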
Thanks for looking into this!
My thoughts exactly :(
This sounds like a good idea, and probably the only "safe" way to handle this (since the compiler that's generating the accessor functions is then always the same as the one that's generating the bitfield). I've also pinged the Red Hat / Fedora CPython maintainers about this. They tell me it's way too late to get new API into Python 3.10 at this point, so it would happen for 3.11 at the earliest. In the meantime, limiting this code to |
Right, having gone away to think about this for a couple of days, I've decided that I'm ok to go with the I'm still a little uneasy about it, because C bitfields not being well-defined (even if they're in practice endian-specific) means this might still cause problems on other platforms. However, you folks as users seem to be ok with it, and ultimately PyO3 is built for its users. 😄 If we start getting further trouble reported from this API I might revisit this later on. Also, we should consider submitting a PR upstream to CPython adding function-based access for this so there's no need to depend on bitfields in the Python public interface. |
I filed https://bugs.python.org/issue45025 so the problems with bit fields in CPython's API are tracked with CPython. |
The 0.14.4 release is now live. How do you folks feel if I yank the 0.14.3 release? |
Looks like pyo3 0.14.4 is still missing some pieces for making the API unavailable on big-endian architectures:
At the very least, the following two things also need to be scoped to
|
I'm strongly against yanking because it breaks |
🤦 yikes, I'll fix and release 0.14.5 soon - #1850
Me too; my opinion is similarly that yanking is just for security / major soundness holes. As it's a new niche API which had issues, and they would probably lead to loud crashes in most cases, it's likely ok not to yank here. |
Which functions should be exposed? Some macros mostly only exist to be used by other macros. For example, PyUnicode_IS_COMPACT_ASCII() is mostly used by PyUnicode_WSTR_LENGTH() macro (which is now deprecated). |
The only ones we care about at the moment are the ones needed to peek at the underlying byte slices and interpret its storage. So PyUnicode_GET_LENGTH(), PyUnicode_DATA(), PyUnicode_KIND(). |
PyUnicode_GET_LENGTH(): you can use PyUnicode_GetLength(). |
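As a rough illustration of what "peek at the underlying byte slices and interpret its storage" involves once length, data and kind are known, here is a pure-Rust sketch. It assumes the kind value equals the number of bytes per code point (1, 2 or 4, matching `PyUnicode_1BYTE_KIND`, `PyUnicode_2BYTE_KIND` and `PyUnicode_4BYTE_KIND`) and that the storage uses native byte order. `decode_storage` is a hypothetical helper, not a PyO3 or CPython function, and unlike Python it rejects lone surrogates because Rust's `char` cannot hold them.

```rust
/// Decode raw string storage into a Rust `String`.
///
/// * `kind`   - bytes per code point (1, 2 or 4)
/// * `length` - number of code points
/// * `data`   - the raw storage, `length * kind` bytes in native byte order
///
/// Returns `None` for an unknown kind, a size mismatch, or a value that is
/// not a valid Unicode scalar (for example a lone surrogate).
pub fn decode_storage(kind: u32, length: usize, data: &[u8]) -> Option<String> {
    let width = match kind {
        1 | 2 | 4 => kind as usize,
        _ => return None,
    };
    if data.len() != length.checked_mul(width)? {
        return None;
    }
    data.chunks_exact(width)
        .map(|chunk| {
            // Widen each unit to a 32-bit code point.
            let code_point = match width {
                1 => u32::from(chunk[0]),
                2 => u32::from(u16::from_ne_bytes([chunk[0], chunk[1]])),
                _ => u32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]),
            };
            char::from_u32(code_point)
        })
        .collect()
}
```

The point of the sketch is that a wrong `kind` (as read through a misaligned bitfield accessor) changes `width`, so the same bytes are either rejected or decoded as entirely different text.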
Release 0.14.5 is live - hopefully Fedora is now able to package again on s390x @decathorpe... 🤞 |
Everything went well, update is submitted: https://bodhi.fedoraproject.org/updates/FEDORA-2021-290f13120c Thanks for your help! |
FWIW, you could also write your own C-FFI shims, compiled in |
3015: Implement wrapper for `PyASCIIObject.state` bitfield accesses r=davidhewitt a=decathorpe

This is a first draft of my attempt to fix #1824 "properly" by writing a C wrapper for the `PyASCIIObject.state` bitfield accesses, as proposed here: #1824 (comment)

---

The original argument for making these functions `unsafe` is still valid, though - bitfield memory layout is not guaranteed to be stable across different C compilers, as it is "implementation defined" in the C standard. However, short of having CPython upstream provide non-inlined public functions to access this bitfield, this is the next best thing, as far as I can tell.

I've removed the `#[cfg(target_endian = "little")]` attributes from all things that are un-blocked by fixing this issue on big-endian systems, except for three tests, which look like expected failures considering that they do not take bit/byte order into account (for example, when writing to the bitfield).

- `ffi::tests::ascii_object_bitfield`
- `types::string::tests::test_string_data_ucs2_invalid`
- `types::string::tests::test_string_data_ucs4_invalid`

All other tests now pass on both little-endian and big-endian systems.

---

I am aware that some parts of this PR are probably not in a state that's acceptable for merging as-is, which is why I'm filing this as a draft. Feedback about how to better integrate this change with pyo3-ffi would be great. :)

In particular, I'm unsure whether the `#include` statements in the C files are actually correct across different systems. I have only tested this on Fedora Linux so far.

I'm also open to changing the names of the C functions that are implemented in the wrapper. For now I chose the most obvious names that shouldn't cause collisions with other symbols.

Co-authored-by: Fabio Valentini <[email protected]>
Building pyo3 0.14.3 on s390x on Fedora Rawhide, the following two unit tests crash:
ffi::cpython::unicodeobject::tests::ascii
ffi::cpython::unicodeobject::tests::ucs4
Because they crash the test runner process, there's no output from those tests, except that they fail.
There are similar issues in the cpython crate: dgrunwald/rust-cpython#265
Looks like PyUnicode_KIND is a bitfield that is different depending on system endianness (s390x is the only big-endian architecture I have access to).
🌍 Environment
rustc --version: 1.54.0 stable on s390x-unknown-linux-gnu
version = "0.x.y" with git = "https://github.com/PyO3/pyo3")?: no
💥 Reproducing
Running cargo test --release on an s390x machine causes this issue.