ENH: Pairwise method or symmetric keyword in DataFrame.corr #44671

Closed
maurosilber opened this issue Nov 29, 2021 · 2 comments
Labels: Closing Candidate, Enhancement, Needs Triage

Comments

@maurosilber
Contributor

Is your feature request related to a problem?

I wish I could use pandas to do pairwise computations.

Describe the solution you'd like

The DataFrame.corr method accepts a callable to compute a user-specified correlation function. Its docstring says:

callable: callable with input two 1d ndarrays and returning a float. Note that the returned matrix from corr will have 1 along the diagonals and will be symmetric regardless of the callable's behavior.
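To make the limitation concrete, here is a minimal sketch of what that docstring describes. The function `mean_difference` and the sample data are made up for illustration; only `DataFrame.corr(method=...)` is existing pandas API.

```python
import numpy as np
import pandas as pd


def mean_difference(a: np.ndarray, b: np.ndarray) -> float:
    """Hypothetical asymmetric function: f(a, b) == -f(b, a)."""
    return float(np.mean(a - b))


df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 9.0]})

# corr evaluates the callable once per pair of columns and mirrors the value,
# so the asymmetry of mean_difference is lost in the result.
result = df.corr(method=mean_difference)

# The diagonal is set to 1 without calling mean_difference at all.
print(result)
```

Called directly, `mean_difference` gives values of opposite sign for the two argument orders, but the matrix returned by `corr` contains the same value in both off-diagonal cells.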

I'd like to compute pairwise calculations which are not symmetric. I can think of two possible solutions:

  1. Add a symmetric keyword-only argument, changing the signature to:

     def corr(
         self,
         method: str | Callable[[np.ndarray, np.ndarray], float] = "pearson",
         min_periods: int = 1,
         *,
         symmetric=True,
     ) -> DataFrame:

This keeps the current behaviour by default, but allows asymmetric callables to be evaluated for both orderings of each pair of columns. It could also change how the diagonal is computed: it is currently set to 1 regardless of the function provided.

  2. Add a new pairwise method.
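As a rough illustration of what I mean by either option, here is a user-level sketch. The helper `pairwise` is hypothetical (it is not part of pandas), and `mean_difference` is just an example of an asymmetric function.

```python
from typing import Callable

import numpy as np
import pandas as pd


def pairwise(
    df: pd.DataFrame,
    func: Callable[[np.ndarray, np.ndarray], float],
) -> pd.DataFrame:
    """Hypothetical helper: evaluate func on every ordered pair of columns.

    Unlike DataFrame.corr, nothing is mirrored and the diagonal is computed
    by calling func, not forced to 1. No NaN handling or min_periods logic.
    """
    cols = df.columns
    values = np.array(
        [[func(df[a].to_numpy(), df[b].to_numpy()) for b in cols] for a in cols],
        dtype=float,
    )
    return pd.DataFrame(values, index=cols, columns=cols)


def mean_difference(a: np.ndarray, b: np.ndarray) -> float:
    """Example asymmetric function: f(a, b) == -f(b, a)."""
    return float(np.mean(a - b))


df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 9.0]})
out = pairwise(df, mean_difference)
print(out)
```

Here `out.loc["x", "y"]` and `out.loc["y", "x"]` differ in sign, and the diagonal is 0 because it is actually computed. The double loop calls `func` once per ordered pair, so it is quadratic in the number of columns.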

API breaking implications

The first option shouldn't change the API, as it is adding a keyword-only argument which keeps current behaviour as default.

The second option might not be backwards-compatible with code where a column is named pairwise and accessed as an attribute (df.pairwise).

Additional context

A similar issue was raised here: #25726

@maurosilber maurosilber added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 29, 2021
@jreback
Contributor

jreback commented Nov 29, 2021

See that issue and the discussion in the PR: #43569

@jreback jreback added the Closing Candidate May be closeable, needs more eyeballs label Nov 29, 2021
@mroeschke
Member

Thanks for the suggestion, but given the discussion in #43569, agreed that expanding the API of the corr function does not have much support. Closing
