Use vectorcall (where possible) when calling Python functions #4456

Merged
merged 4 commits into from
Aug 25, 2024

ChayimFriedman2
Contributor

This works without any changes to user code.

The way it works is by creating methods on IntoPy to call functions, and specializing them for tuples.

This currently supports only calls without kwargs for methods, and supports kwargs for functions through a somewhat slow approach (converting from a PyDict). This can be improved, but that will require additional API.

We may consider adding more impls of IntoPy<Py<PyTuple>> that specialize (for example, for arrays and Vec), but this is a good start.
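
The "default method, specialized for tuples" idea can be sketched without PyO3. This is a toy model only: `Tuple`, `Path`, `Callable`, `CallArgs`, and `call` are hypothetical names for illustration, not the PR's actual code. The generic default builds a tuple object first, while the impl for Rust tuples overrides the call method to pass a flat argument slice, which is the shape vectorcall expects.

```rust
/// Hypothetical stand-in for a heap-allocated Python tuple (not a PyO3 type).
#[derive(Debug, PartialEq)]
struct Tuple(Vec<i64>);

/// Records which calling path was taken.
#[derive(Debug, PartialEq)]
enum Path {
    TupleCall,
    VectorCall,
}

/// Hypothetical callable that sums its arguments.
struct Callable;

impl Callable {
    // Slower path: needs an already-built tuple object.
    fn call_with_tuple(&self, args: &Tuple) -> (Path, i64) {
        let total: i64 = args.0.iter().sum();
        (Path::TupleCall, total)
    }

    // Faster path: takes a flat slice of arguments, vectorcall-style.
    fn call_with_slice(&self, args: &[i64]) -> (Path, i64) {
        let total: i64 = args.iter().sum();
        (Path::VectorCall, total)
    }
}

trait CallArgs: Sized {
    fn into_tuple(self) -> Tuple;

    // Default helper: convert to a tuple, then call through the slower path.
    fn call_on(self, f: &Callable) -> (Path, i64) {
        f.call_with_tuple(&self.into_tuple())
    }
}

// An existing tuple object keeps the default helper.
impl CallArgs for Tuple {
    fn into_tuple(self) -> Tuple {
        self
    }
}

// A Rust tuple overrides the helper and skips building a tuple object.
impl CallArgs for (i64, i64) {
    fn into_tuple(self) -> Tuple {
        Tuple(vec![self.0, self.1])
    }

    fn call_on(self, f: &Callable) -> (Path, i64) {
        let args = [self.0, self.1];
        f.call_with_slice(&args)
    }
}

// The user-facing entry point is the same for both kinds of argument.
fn call<A: CallArgs>(f: &Callable, args: A) -> (Path, i64) {
    args.call_on(f)
}

fn main() {
    let f = Callable;
    println!("{:?}", call(&f, (1i64, 2i64)));
    println!("{:?}", call(&f, Tuple(vec![1, 2])));
}
```

Because the dispatch happens inside the trait, callers keep writing `call(&f, args)` and pick up the faster path automatically, which is why no user code has to change.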

What category should I use for the newsfragment? There is no perf category.

@ChayimFriedman2 ChayimFriedman2 force-pushed the call-vectorcall branch 6 times, most recently from b4ff834 to 8c6658e Compare August 19, 2024 20:49
Member

@davidhewitt davidhewitt left a comment

Thanks for this, this is a nice win. `changed` is the right choice of newsfragment 👍

I think given the upcoming changes for the IntoPyObject trait we need to adjust how this is implemented slightly, so we might as well figure that out in this PR.

Aside from that, I have just a few suggestions which might help with clarity for myself and future readers :)

It would be nice to also be able to use vectorcall for keyword arguments, though that needs a new API designed as per #4414 or similar. Can leave for the future.

Comment on lines +177 to +183
// The following methods are helpers to use the vectorcall API where possible.
// They are overridden on tuples to perform a vectorcall.
// Be careful when you're implementing these: they can never refer to `Bound` call methods,
// as those refer to these methods, so this will create an infinite recursion.
#[doc(hidden)]
#[inline]
fn __py_call_vectorcall1<'py>(
Member

Rather than adding to this trait, we should look at its upcoming replacement IntoPyObject and consider how to slot these methods on there or a companion trait. We need to migrate the IntoPy<Py<PyTuple>> bound on the .call functions anyway, so this is a good time to bring this up cc @Icxolu.

Contributor Author

I don't know what you expect the new trait(s) to look like, but it shouldn't be hard to migrate. I believe it is out of scope for this PR though.

Contributor Author

Also, I think IntoPyObject has a worse API (in some part): it cannot convert one Rust type to multiple Python types, which can especially hurt calls (for example, because it prevents supporting calls with arrays or Vec without an inefficient conversion).

This has advantages (fewer type annotations), but I think these traits can coexist (with calls using IntoPy).
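
The trade-off between the two trait shapes can be illustrated with a toy model. Everything here is hypothetical (`List`, `Tuple`, `ConvertTo`, `ConvertObject`, `call_args`); it only mirrors the difference between a target given as a trait parameter (as with `IntoPy<T>`) and a target given as an associated type (as with `IntoPyObject::Target`).

```rust
// Hypothetical stand-ins for two Python object types (not PyO3 types).
#[derive(Debug, PartialEq)]
struct List(Vec<i64>);
#[derive(Debug, PartialEq)]
struct Tuple(Vec<i64>);

// Style 1: the target is a trait parameter, so one Rust type
// may implement the trait several times with different targets.
trait ConvertTo<T> {
    fn convert(self) -> T;
}

impl ConvertTo<List> for Vec<i64> {
    fn convert(self) -> List {
        List(self)
    }
}

impl ConvertTo<Tuple> for Vec<i64> {
    fn convert(self) -> Tuple {
        Tuple(self)
    }
}

// Style 2: the target is an associated type, so each Rust type
// has exactly one target and callers need no annotation.
trait ConvertObject {
    type Target;
    fn convert_object(self) -> Self::Target;
}

impl ConvertObject for Vec<i64> {
    type Target = List;
    fn convert_object(self) -> List {
        List(self)
    }
}

// A call API bounded on style 1 can ask for a tuple directly.
fn call_args<A: ConvertTo<Tuple>>(args: A) -> Tuple {
    args.convert()
}

fn main() {
    // Style 1 used for calls: the bound selects the tuple impl.
    println!("{:?}", call_args(vec![1i64, 2]));
    // Style 1 used on its own needs a type annotation.
    let as_list: List = vec![1i64, 2].convert();
    println!("{:?}", as_list);
    // Style 2 infers the single target.
    println!("{:?}", vec![1i64, 2].convert_object());
}
```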

Contributor

Hey, thanks for the ping David! I have to say upfront I'm not really familiar with these different calling conventions.

> Also, I think IntoPyObject has a worse API (in some part): it cannot convert one Rust type to multiple Python types,

Together with fallibility, I would consider this one of the two major advantages of the new API. During the experimentation phase we concluded that there is generally a clear Python target type for any Rust type. The additional complexity would make this less ergonomic overall while bringing little benefit in general.

> This has advantages (fewer type annotations), but I think these traits can coexist (with calls using IntoPy).

IMO we should not keep IntoPy around. It has clear problems regarding fallible conversions. Also, there should really be one trait responsible for converting Rust values into Python objects. Everything else is way harder to explain and to maintain. For example, the implementations could get out of sync, and the same value in Rust would be converted differently depending on which API it is given to. (This can already happen with ToPyObject and IntoPy currently, and I think we should get rid of it and not introduce a new form here.)

If I understood correctly, the problem is that we also want to convert arrays, Vecs, ... to a PyTuple while they normally convert into a PyList. I think we can support that special casing with IntoPyObject as well, using another method that converts Self into a PyTuple "args" object. A quick sketch below with my limited understanding.

pub trait IntoPyObject<'py>: Sized {
    ....
    
    #[doc(hidden)]
    /// Turn `Self` into callable args, can be specialized for tuples, array, ...
    fn into_args(self, py: Python<'py>, _: private::Token) -> PyResult<Bound<'py, PyTuple>>
    where
        PyErr: From<Self::Error>,
    {
        (self,).into_pyobject(py) // for tuples this can then be `self.into_pyobject(py)`
    }

    #[doc(hidden)]
    /// Call `function` with `obj` as `arg`; can use specialized calling conventions
    fn vectorcall(
        obj: Self,
        py: Python<'py>,
        function: Borrowed<'_, 'py, PyAny>,
        token: private::Token,
    ) -> PyResult<Bound<'py, PyAny>>
    where
        PyErr: From<Self::Error>,
    {
        #[inline]
        fn inner<'py>(
            py: Python<'py>,
            function: Borrowed<'_, 'py, PyAny>,
            args: Bound<'py, PyTuple>,
        ) -> PyResult<Bound<'py, PyAny>> {
            use crate::ffi_ptr_ext::FfiPtrExt;
            unsafe {
                ffi::PyObject_Call(function.as_ptr(), args.as_ptr(), std::ptr::null_mut())
                    .assume_owned_or_err(py)
            }
        }
        // make this use `into_args`
        inner(py, function, obj.into_args(py, token)?.into_bound())
    }

}

If I got something wrong, or overlooked something, let me know, but in general I think it should be possible to support this with IntoPyObject as well.

Contributor Author

I believe it is indeed possible to support this with IntoPyObject.

If we are already making a breaking change, I think a better path than adding methods on IntoPyObject is to use another trait for calls, say PyCallArgs. This has the following advantages:

  • Assuming we seal PyCallArgs, this will allow us to easily enable future possibilities, even ones that we cannot predict, around performance and beyond.
  • If you take IntoPyObject, you have to check that you actually got a tuple. The overhead can be mitigated for known tuples by specializing methods on them, but it is still not the best API, since it does not prevent non-tuples at compile time and doesn't even signal to the user that their code is going to fail.
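
Both points can be sketched with the usual sealed-trait pattern. This is a minimal sketch, not PyO3 code: `CallArgs` is a hypothetical stand-in for the proposed PyCallArgs, and `arg_count` and `call` are placeholder names for illustration.

```rust
mod sealed {
    // Downstream crates cannot name this trait, so they cannot
    // add their own impls of `CallArgs`.
    pub trait Sealed {}
}

/// Hypothetical stand-in for the proposed `PyCallArgs` trait.
pub trait CallArgs: sealed::Sealed {
    /// Placeholder method; a real trait would perform the call.
    fn arg_count(&self) -> usize;
}

impl sealed::Sealed for () {}
impl CallArgs for () {
    fn arg_count(&self) -> usize {
        0
    }
}

impl<A> sealed::Sealed for (A,) {}
impl<A> CallArgs for (A,) {
    fn arg_count(&self) -> usize {
        1
    }
}

impl<A, B> sealed::Sealed for (A, B) {}
impl<A, B> CallArgs for (A, B) {
    fn arg_count(&self) -> usize {
        2
    }
}

/// Only tuple-shaped arguments satisfy the bound.
pub fn call<A: CallArgs>(args: A) -> usize {
    args.arg_count()
}

fn main() {
    println!("{}", call(()));
    println!("{}", call((1,)));
    println!("{}", call((1, "a")));
    // `call(5)` is rejected at compile time: `i32` does not implement `CallArgs`.
}
```

Since nobody outside the crate can implement the sealed trait, methods can later be added or changed on it without breaking downstream code.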

Anyway, this is unrelated to this PR. We can land it now, and I expect any changes around calling can be adjusted fairly trivially.

Contributor Author

An additional reason I find the separate-trait approach tempting is that it enables an approach to kwargs that is both more convenient and more performant, even without waiting for a pycall! macro: if we choose this path, then instead of taking kwargs: Option<PyDict> we can take a generic type that can convert to a dict.

For example,

fn call<Args, Kwargs>(&self, args: Args, kwargs: Kwargs)
where
    (Args, Kwargs): PyCallArgs
{ ... }

That already means people can use kwargs more nicely, with syntax like call((arg1, arg2, ...), [("a", 1), ("b", 2), ...]). In addition, we may specialize the impls to use the vectorcall API directly instead of converting to a PyDict.
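
A toy model of that idea, under loud assumptions: `FlatCall`, `CallArgs`, and `call` are hypothetical, arguments are plain `i64` values, and only two positional arguments are supported. It flattens the pair into the layout CPython's vectorcall convention uses, where keyword values follow the positional ones and the keyword names travel separately, so no dict is built.

```rust
/// Hypothetical flattened call layout: positional values first,
/// then keyword values, with keyword names kept in a parallel list.
#[derive(Debug, PartialEq)]
struct FlatCall {
    values: Vec<i64>,
    kwnames: Vec<&'static str>,
}

/// Hypothetical trait implemented on the `(Args, Kwargs)` pair.
trait CallArgs {
    fn flatten(self) -> FlatCall;
}

// Two positional arguments plus any number of keyword pairs.
impl<const K: usize> CallArgs for ((i64, i64), [(&'static str, i64); K]) {
    fn flatten(self) -> FlatCall {
        let ((a, b), kwargs) = self;
        let mut values = vec![a, b];
        let mut kwnames = Vec::with_capacity(K);
        for (name, value) in kwargs {
            kwnames.push(name);
            values.push(value);
        }
        FlatCall { values, kwnames }
    }
}

// Same shape as the signature sketched above, minus the receiver.
fn call<Args, Kwargs>(args: Args, kwargs: Kwargs) -> FlatCall
where
    (Args, Kwargs): CallArgs,
{
    (args, kwargs).flatten()
}

fn main() {
    let args: (i64, i64) = (1, 2);
    let kwargs: [(&str, i64); 2] = [("a", 10), ("b", 20)];
    println!("{:?}", call(args, kwargs));
}
```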

Contributor

> @Icxolu, what do you think of doing that (i.e. release 0.23 now as an interim towards a complete switchover for 0.24)?

Generally I'm open to that. I guess that depends a little on how we want to structure/explain the migration. I guess the current state is fairly minimal in the amount of actual breakage. My proposal for the trait bounds migration would have been to provide the blanket impl<'a, 'py, T> IntoPyObject<'py> for &'a T where T: ToPyObject {}, since the vast majority of the APIs are generic over ToPyObject. I would hope that would still keep breakage low, but it's probably going to be higher than now. So if you prefer, we can definitely delay that to 0.24.

On a different note, there is still a bit of Bound API cleanup left that I think we should finish before 0.23, and I think we can also put #4449 in 0.23.

Member

> My proposal for the trait bounds migration would have been to provide impl<'a, 'py, T> IntoPyObject<'py> for &'a T where T: ToPyObject {}

Hmm, interesting. So I played around with this (and ideas like a blanket-impl of ToPyObject from IntoPyObject, i.e. the reverse direction). TBH, neither felt great. For example implementing IntoPyObject for &'a T where T: ToPyObject will only help when users pass references for their custom types. Having the blanket might just be more confusion.
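
The limitation can be seen in a toy model. `OldConvert` and `NewConvert` are hypothetical stand-ins for ToPyObject and IntoPyObject, and `UserType` and `new_api` are placeholder names: the blanket impl covers `&T` only, so an API bounded on the new trait accepts a reference to a user type but not the value itself.

```rust
/// Hypothetical stand-in for the by-reference `ToPyObject`.
trait OldConvert {
    fn to_obj(&self) -> String;
}

/// Hypothetical stand-in for the by-value `IntoPyObject`.
trait NewConvert {
    fn into_obj(self) -> String;
}

// The blanket impl under discussion: references to any
// `OldConvert` type get the new trait for free.
impl<'a, T: OldConvert> NewConvert for &'a T {
    fn into_obj(self) -> String {
        self.to_obj()
    }
}

/// A user type that has only implemented the old trait.
struct UserType(i32);

impl OldConvert for UserType {
    fn to_obj(&self) -> String {
        format!("UserType({})", self.0)
    }
}

/// An API that has migrated its bound to the new trait.
fn new_api<T: NewConvert>(value: T) -> String {
    value.into_obj()
}

fn main() {
    let user = UserType(7);
    // Works: `&UserType` is covered by the blanket impl.
    println!("{}", new_api(&user));
    // `new_api(user)` does not compile: `UserType` itself lacks `NewConvert`.
}
```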

Having looked at that more, I think that in 0.23 we should just go for it and migrate all trait bounds without a blanket and commit to the bigger breakage. While it's a big (ish) breakage, I think it's actually the easiest state for users to understand, and I think we can make the migration easier for users by adding the derive proposed in #4458. (They might then just be able to switch to the derive and delete code in a lot of cases).

That said, I think we need to cut a 0.22.3 release to resolve #4452 and ship #4396, so I am open to the idea of merging this PR as-is and cherry-picking it as a perf enhancement in 0.22.3. @ChayimFriedman2, if we did that, would you be willing to help work on the follow-up to move this off IntoPy and onto new traits?

Contributor Author

> @ChayimFriedman2, if we did that, would you be willing to help work on the follow-up to move this off IntoPy and onto new traits?

Yes. Ping me when you need my help.

I'm actually working now on a pycall!() draft, which will be the most performant, most capable, and most convenient way to call a Python method. Let's see where this takes us (it is still worth landing this PR because it benefits users who haven't migrated).

Contributor

> For example implementing IntoPyObject for &'a T where T: ToPyObject will only help when users pass references for their custom types. Having the blanket might just be more confusion.

That's true, I hadn't thought of that. In that case I tend to agree: providing any blanket will probably make it worse.

> Having looked at that more, I think that in 0.23 we should just go for it and migrate all trait bounds without a blanket and commit to the bigger breakage. While it's a big (ish) breakage, I think it's actually the easiest state for users to understand, and I think we can make the migration easier for users by adding the derive proposed in #4458.

Sure thing, I'll prepare the PR with the trait bounds change and afterwards look into the derive macro.

@ChayimFriedman2
Contributor Author

Done using the compat functions.

Member

@davidhewitt davidhewitt left a comment

Thanks, as agreed let's merge this as is. I'll pick it into 0.22.3 and then let's work out new trait bounds for 0.23.

@davidhewitt
Member

Ah, needs a conflict resolved. Sorry for the delay on my part.

@ChayimFriedman2
Contributor Author

@davidhewitt Resolved the conflict.

@davidhewitt davidhewitt enabled auto-merge August 24, 2024 18:47
@davidhewitt davidhewitt added this pull request to the merge queue Aug 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Aug 24, 2024
@@ -1516,9 +1514,8 @@ impl<T> Py<T> {
     ) -> PyResult<PyObject>
     where
         N: IntoPyObject<'py, Target = PyString>,
-        A: IntoPyObject<'py, Target = PyTuple>,
+        A: IntoPy<Py<PyTuple>>,
Member

@Icxolu I reverted the bounds here and on the other call functions, given the likely plan is to move these bounds to a separate trait.

@davidhewitt davidhewitt enabled auto-merge August 24, 2024 21:53
auto-merge was automatically disabled August 25, 2024 02:30

Head branch was pushed to by a user without write access

@davidhewitt davidhewitt added this pull request to the merge queue Aug 25, 2024
Merged via the queue into PyO3:main with commit 2e891d0 Aug 25, 2024
43 checks passed
davidhewitt added a commit that referenced this pull request Sep 3, 2024
* Use vectorcall (where possible) when calling Python functions

This works without any changes to user code.

The way it works is by creating methods on `IntoPy` to call functions, and specializing them for tuples.

This currently supports only calls without kwargs for methods, and supports kwargs for functions through a somewhat slow approach (converting from a PyDict). This can be improved, but that will require additional API.

We may consider adding more impls of IntoPy<Py<PyTuple>> that specialize (for example, for arrays and `Vec`), but this is a good start.

* Add vectorcall benchmarks

* Fix Clippy (elide a lifetime)

---------

Co-authored-by: David Hewitt
davidhewitt added a commit that referenced this pull request Sep 3, 2024
davidhewitt added a commit that referenced this pull request Sep 15, 2024