
Numpy 2.0 #689

Draft · wants to merge 12 commits into base: main
Conversation

Armavica
Member

@Armavica Armavica commented Apr 3, 2024

Description

Ruff suggested a few things to change, but I cannot test them because I cannot install both numpy=2.0.0rc1 and numba together (any numba version, including the most recent 0.59.1, is incompatible due to the Python ABI).

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@ricardoV94 ricardoV94 force-pushed the numpy2 branch 3 times, most recently from 28bb96a to 2808634 Compare April 5, 2024 10:50
@ricardoV94
Member

ricardoV94 commented Apr 5, 2024

Removed the custom Complex C types... I have no context as to why they were necessary, so that definitely needs careful review.

This is code that seems to have been there mostly from the beginning.

Edit: they seem to be needed to define operations like * :(

@ricardoV94 ricardoV94 added major NumPy compatibility dependencies Pull requests that update a dependency file C-backend labels Apr 5, 2024
@ricardoV94 ricardoV94 force-pushed the numpy2 branch 2 times, most recently from d83b37d to eca375b Compare April 5, 2024 11:10
@Armavica
Member Author

Armavica commented Apr 6, 2024

Edit seems to be needed to define operations like * :(

In what context? If it is in array indexing, then this syntax was only added in Python 3.11 (I only found out about this recently).

@ricardoV94
Member

ricardoV94 commented Apr 6, 2024

No, this is C(++) code, multiplication of complex scalars.

Maybe @michaelosthege finds this interesting xD?

These are the breaking changes: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#complex-types-underlying-type-changes

maresb added a commit to maresb/pytensor that referenced this pull request Apr 6, 2024
This will help with publishing a repodata patch for conda-forge as per
<pymc-devs#688 (comment)>.
This will no longer be necessary when we are finished with
<pymc-devs#689>.
@maresb maresb mentioned this pull request Apr 6, 2024
ricardoV94 pushed a commit that referenced this pull request Apr 6, 2024
This will help with publishing a repodata patch for conda-forge as per
<#688 (comment)>.
This will no longer be necessary when we are finished with
<#689>.
@maresb
Contributor

maresb commented Apr 6, 2024

Friendly note that it will be necessary to revert 1235d08 on the next rebase in order to remove the numpy<2 pin.

@brendan-m-murphy

Maybe the answer to this last problem and the overflow problem are in NEP 50: https://numpy.org/neps/nep-0050-scalar-promotion.html

One other part of the tests with many failures is the rewriting tests for ops involving random variables (e.g. test_DimShuffle_lift). I haven't found any obvious problems here. When the rtol required to pass is too high, it is usually the smaller values causing the problem. It's possible this is also a precision problem, but I haven't looked closely enough to know.
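(For context, the small-value sensitivity is inherent to relative tolerances rather than anything numpy-2-specific; a minimal sketch with made-up values:)

```python
import numpy as np

# Near zero, a purely relative tolerance is extremely strict: the allowed
# error shrinks with the magnitude of the values themselves.
a, b = 1e-12, 1.1e-12

# |a - b| = 1e-13, but rtol * |b| is only ~1.1e-17, so this is "not close":
assert not np.isclose(a, b, rtol=1e-5, atol=0.0)

# A nonzero atol absorbs absolute error near zero:
assert np.isclose(a, b, rtol=1e-5, atol=1e-8)
```

This is why relaxing rtol alone often doesn't help for tests whose expected values are near zero.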

@ricardoV94
Member

Do you have a branch with the most recent changes? We can start triaging the failing tests and see what's problematic vs inconsequential

@brendan-m-murphy

Here is the most recent version: brendan-m-murphy#2

@brendan-m-murphy

brendan-m-murphy commented Aug 7, 2024

I'll make a list of failing tests here (as I look into them):

  1. tests/tensor/special.py several failing tests; the first is TestSoftmax.test_perform. I fixed the error raised by using np.MAXDIMS, but now the values found for axis=None don't agree with those computed by scipy. This test case passes if the function is compiled with the "c" or "py" linkers, but fails on the "cvm" (default) linker. (Although if linker="c", then the test fails for axis = -1. For some reason, when axis=-1, the sums along axis=-1 aren't 1, but if you divide by the sums along axis -1, then you get the correct result.) UPDATE 1: now this fails for the default linker and the c linker for axis = -1. All other tests pass. The change was to update numpy to the latest version from the pinned version. UPDATE 2: the python code (scipy code written in terms of numpy) seems to be faster on a 200 x 300 matrix, although maybe there is an advantage to the C code if many ops are compiled together. UPDATE 3: This might be related to the problem where NPY_RAVEL_AXIS is -1, which should be fixed in the next numpy release.
  2. tests/scalar/test_loop.py: test_inner_composite fails because of an overflow when computing exp of a float16. This particular failure isn't important, but might be a symptom of numpy's changes to casting rules.
  3. tests/scan/test_basic.py: TestScan class, methods test_grad_one_output, test_grad_multiple_outs, test_grad_multiple_outs_taps, test_grad_multiple_outs_taps_backwards
  4. tests/sparse/test_var.py: TestSparseVariable::test_unary[argmax/argmin-DenseTensorType-None-None], numpy AxisError, probably related to change from NPY_MAXDIMS to NPY_RAVEL_AXIS to indicate axis = None
  5. tests/sparse/test_var.py: TestSparseVariable::test_unary[cumsum/cumprod-DenseTensorType-None-None], CumOp attribute error for "c_axis". Update: fixed the "c_axis" problem, but now CumSum and CumProd fail due to an "OutputGuard" optimization (according to DebugMode). The error message is "ValueError: provided out is the wrong size for the accumulation". This problem occurs for dense tensors, so it's probably not to do with the sparse code. Update: the code to fix "c_axis" wasn't producing the correct value of NPY_RAVEL_AXIS, so I changed the c_code to use NPY_RAVEL_AXIS if self.axis is None. Now there is an error coming from a ufunc for accumulating sums... the error is due to the value of a variable that numpy sets. I have a (slightly ugly) workaround, and I've raised an issue with NumPy. Update: there is a typo in numpy that sets NPY_RAVEL_AXIS to -1, which is causing this problem.
  6. tests/tensor/test_extra_ops.py::TestCumOp:test_grad/test_infer_shape/test_cum_op attribute error, missing "c_axis". Update: fixed "c_axis" error, now have same ValueError as case 5.
  7. tests/tensor/random/rewriting/test_basic.py: for test_Subtensor_lift params 25 and 28 TypeError __str__ returned an AxisError; for params 17, 18, 19, 20, 21, 22, 23, TypeError due to casting (cannot store 1e-6 as float32); for test_DimShuffle_lift, several failing parametric tests
  8. tests/tensor/random: test_random_maker_op in test_op.py, TestRandomGeneratorType in test_type.py has two failing methods, TestSharedRandomStream::test_tutorial is failing in test_utils.py
  9. tests/tensor/test_math_scipy: some scipy ufuncs are now returning float64 instead of float32. I can't find where this change is documented. But the tests that fail here only fail because the expected dtype is now float64 and the result dtype is still float32.
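Regarding the axis=None vs axis=-1 confusion in item 1: a pure-numpy reference softmax (a sketch, not the Op's actual C code) makes the intended semantics explicit — axis=None means "over the flattened array", which is what NPY_RAVEL_AXIS encodes at the C level:

```python
import numpy as np

def softmax(x, axis=None):
    """Numerically stable reference softmax; axis=None operates on the flat array."""
    x = np.asarray(x, dtype=np.float64)
    if axis is None:
        e = np.exp(x - x.max())      # one normalization over all elements
        return e / e.sum()
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

x = np.arange(6.0).reshape(2, 3)
assert np.isclose(softmax(x).sum(), 1.0)                   # one distribution overall
assert np.allclose(softmax(x, axis=-1).sum(axis=-1), 1.0)  # one distribution per row
```

The failing C-linker case described above (sums along axis=-1 not equal to 1) looks consistent with NPY_RAVEL_AXIS accidentally being treated as axis=-1.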

@brendan-m-murphy

brendan-m-murphy commented Aug 29, 2024

Do you have a branch with the most recent changes? We can start triaging the failing tests and see what's problematic vs inconsequential

I think based on how the new numpy scalar promotion rules work, if config.cast_policy == "custom" then autocast_float will always return the value as float32. If the value is too big to be represented as float32, there will be an overflow error in the loop on lines 186-189 in scalar/basic.py. I think to get the old behavior back, x would need to be cast to float64 for the comparison of _x and x.

I'm not sure how much this matters. For the loop test, I just increased the precision from 16 to 32, since it didn't matter for that test. And for the autocaster tests, I've just changed the expected value in one case to match what happens with the new numpy scalar promotion rules (https://numpy.org/neps/nep-0050-scalar-promotion.html#impact-on-operators-involving-python-int-float-and-complex). I could revert these and change the autocaster logic to do what happened before numpy 2.0.

UPDATE: this seems to be causing problems elsewhere, since pt.as_tensor_variable(x), where x is a python float, will always return float32 unless floatX is float16. So in tests where a float is converted to a tensor variable x_pt = pt.as_tensor_variable(x).type("x_0") and then given a test value x_pt.tag.test_value = x, there could be an error because x_pt is float32 and x's "true value" will be its float64 value, and this will be compared to its converted value as a float32, since that is x_pt's type.

I think I will change how the autocaster makes comparisons to match how TensorType.filter makes comparisons. This should also restore the old behavior of the numpy autocaster.
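Both behaviours can be reproduced with plain numpy; here is a small version-gated sketch (the legacy float16 result is from memory, so the assertion only checks the numpy >= 2 side):

```python
import numpy as np

NUMPY_2 = int(np.__version__.split(".")[0]) >= 2

# NEP 50: Python scalars are "weak". Under numpy >= 2, uint8 + 1.0 promotes to
# float64 (under the legacy value-based rules it gave float16, I believe).
dt = (np.uint8(1) + 1.0).dtype
if NUMPY_2:
    assert dt == np.float64

# A Python float too large for float32 overflows to inf when cast, so a
# round-trip equality check like the one in scalar/basic.py can never pass;
# comparing through float64 restores the old behaviour.
x = 1e300
with np.errstate(over="ignore"):
    x32 = np.float32(x)
assert np.isinf(x32)
assert np.float64(x) == x
```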

@brendan-m-murphy

brendan-m-murphy commented Sep 2, 2024

Here is the status of the tests on my fork:

For python 3.10:

  1. tests/scan has four failing tests: test_grad_one_output, test_grad_multiple_outs, test_grad_multiple_outs_taps, test_grad_multiple_outs_taps_backwards. I think the issue (for the first failing test) is that the cost function is regenerating the random samples each time it is called, so the gradient computed by finite differences is huge.
  2. tests/tensor/random/rewriting/test_basic: test_DimShuffle_lift fails for a few cases. The actual and desired answers are sometimes close, although when the values are very small the relative errors can be quite big. I need to investigate further.
  3. tests/tensor/random: test_random_maker_op, TestRandomGeneratorType, TestSharedRandomStream. I haven't looked into these yet.
  4. tests/tensor/test_extra_ops.py::TestUnique::test_infer_shape: it looks like a matrix should (not?) have been flattened. I haven't looked into this yet.
  5. tests/tensor/test_math_scipy.py: some tests are failing because the expected data type has changed. Some functions in scipy are now returning float64 with uint8 inputs; they used to return float32. The pytensor functions that wrap these scipy functions cast the return values to float32. I'm not really sure why the scipy functions have changed. The fix here is to either downcast the expected values computed by scipy, or change how pytensor implements the scipy functions.
  6. numba and jax tests I haven't looked at yet

For python 3.12:

  1. tests/test_config.py::test_config_pickling: it looks like a logging/error message has changed, but I need to check that that is the only problem.
  2. tests/tensor/test_math.py::TestNegBroadcast::test_good: overflow error since -1 is no longer converted to 255 as uint8 by numpy. (Not sure why this isn't a problem for python 3.10)
  3. all of the tests that fail for python 3.10 also fail for python 3.12
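Item 2 matches a documented numpy 2.0 change (out-of-bounds Python integers are rejected rather than wrapped when converting to a dtype); a version-gated sketch:

```python
import warnings

import numpy as np

NUMPY_2 = int(np.__version__.split(".")[0]) >= 2

# numpy 2.0 refuses out-of-bounds Python ints in dtype conversions
# (numpy 1.x wrapped -1 to 255, with a DeprecationWarning in late 1.x).
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    try:
        np.array(-1, dtype=np.uint8)
        raised = False
    except OverflowError:
        raised = True
if NUMPY_2:
    assert raised

# An explicit (unsafe) cast still wraps in all versions:
assert np.array(-1).astype(np.uint8) == np.uint8(255)
```

So tests relying on the old wrap-around need an explicit astype (or an xfail, as suggested below).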

I had to pin numpy to 2.0.2, since there are new failing tests with 2.1. The NPY_RAVEL_AXIS bug/typo was fixed in numpy 2.0.2, which fixed a lot of the tests I listed in a previous comment.

@ricardoV94
Member

@brendan-m-murphy Thanks so much for the work and sorry for the delay. It sounds like some promising progress. To your last round of observations:

I think the issue (for the first failing test) is that the cost function is regenerating the random samples each time it is called, so the gradient computed by finite differences is huge.

It doesn't make sense that the test would be passing before and failing now?

The actual and desired answers are sometimes close, although when the values are very small the relative errors can be quite big. I need to investigate further.

That sounds fine, let's set a nonzero atol

some tests are failing because the expected data type has changed. Some functions in scipy are now returning float64 with uint8 inputs; they used to return float32.

I think this was addressed in #972 (Have you rebased on top of it?)

it looks like a logging/error message has changed, but I need to check that that is the only problem.

There might have been a recent PR that fixed this?

overflow error since -1 is no longer converted to 255 as uint8 by numpy. (Not sure why this isn't a problem for python 3.10)

These sort of changes are some of the trickiest, but tbh also unlikely to affect any users whatsoever. Feel free to mark it as xfail and open a separate issue for us to figure out what should be done.

it looks like a matrix should (not?) have been flattened. I haven't looked into this yet.

Did the behavior of numpy.unique change? I think when you do unique of a matrix you get a vector back. I didn't find the test in your log (just skimmed). I'll look at it again when the noise is removed from addressing the previous points.

@brendan-m-murphy

@ricardoV94 Sorry for the slow response. This is still on my todo list, but I'm pretty busy at the moment. I'll try to rebase my branch onto main soon and try your suggestions.

@ricardoV94
Member

Thanks @brendan-m-murphy !

@brendan-m-murphy

brendan-m-murphy commented Nov 11, 2024

@ricardoV94 I finally got some time to do the rebase this weekend. It didn't go perfectly, so I had to make some fixes afterwards.

Are we still aiming for compatibility with numpy < 2? I think we'll need conditional import statements to do that. For instance, numpy.core is now numpy._core, and some functions like normalize_axis_tuple have been moved to numpy.lib.array_utils: numpy/numpy#24540. I'm pretty sure I can make the complex scalar stuff backwards compatible. There are a few other places where there is numpy 2.0 specific C code; it's possible the numpy compatibility header file has definitions that will work with older numpy versions though.
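A version-guarded import along these lines should work (the exact pre-2.0 location of normalize_axis_tuple is an assumption on my part):

```python
# Sketch of a conditional import for a helper that moved in numpy 2.0.
try:  # numpy >= 2.0
    from numpy.lib.array_utils import normalize_axis_tuple
except ImportError:  # numpy < 2.0
    from numpy.core.numeric import normalize_axis_tuple

# Behaves the same either way: negative axes are normalized against ndim.
print(normalize_axis_tuple(-1, 2))  # (1,)
```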

There are still a few failing tests (e.g. TestScan and TestSharedRandomStream) where I'm not really sure what is going on, and I probably won't have time to look into them soon. If I can make a backwards compatible PR, maybe it makes sense to put in the fixes I've made so far, and isolate the remaining problems. I will try to relax tolerances for the dim shuffle tests though, and see if anything else looks like a quick fix.

Here is the latest: https://github.com/brendan-m-murphy/pytensor/pull/2/checks. (Some of the mypy failures are related to the import statements I mentioned earlier.)

@ricardoV94
Member

Yes backward compat would be nice to have

@brendan-m-murphy

brendan-m-murphy commented Jan 27, 2025

@ricardoV94 Sorry for the delay. I started looking at this again last Friday. I'm not sure how I missed this before, but the failing test tests/tensor/random/test_utils:TestSharedRandomStream.test_basics must mean something is wrong with RandomStream, and that is probably causing a lot of the other failures like test_scan.py.

On my branch, if I run this test with numpy 1.26, it passes, and if I run it with numpy 2.0, it fails (when I use pytest to run the test). If I run the code in a script, then it passes. It's a bit weirder than that, because for this test, np.allclose(fn_val0, numpy_val1) is true with numpy 2.0... so the random stream is discarding the first four random values (or something like that).

Anyway... I will try to find the exact numpy commit where this happens. If you have any ideas, let me know.

@brendan-m-murphy

brendan-m-murphy commented Jan 28, 2025

An update: this is the numpy PR that first causes this behaviour to occur: numpy/numpy@44ba7ca

I don't know why this is happening, but using numpy after this PR causes compute_test_value to increment the state of the bit generator spawned by RandomStream when we call fn = function([], random.uniform(...), ...). Thus the first value generated by fn is thrown away in this step.

Prior to this PR, this doesn't happen: the state of the bit generator when compute_test_value is called is the same as the state when fn is called for the first time.

When running the same code from a script, it seems that config.compute_test_value == "off", while using pytest to run the test we have config.compute_test_value == "warn".

In more detail, with random = RandomStream(...), calling random.uniform(...) spawns a new RNG rng from the random stream, computes out = UniformRV(..., rng=rng).

Prior to the PR I linked above, the state of rng before and after the line out = UniformRV(...) is the same.

However, after that PR, if a test value is computed during out = UniformRV(...), then the state of rng will be different after this call.

It seems that replacing rng = copy(rng) with rng = deepcopy(rng) in RandomVariable.perform fixes some of the failing tests. (This was a problem elsewhere, but it was more obvious in that case. Here is some related discussion: numpy/numpy#24086.) For instance, this fixes the failing tests in test_scan.py and the tests in random/rewriting/test_basic.py. (Again, I don't know exactly why this fix works, since I don't totally understand how the state is saved before computing the test value/how the copies of the RNG are updated correctly through the shared variable mechanism.)
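For what it's worth, the deepcopy side of this can be checked independently of PyTensor: under any recent numpy, deepcopy yields a fully independent Generator with identical state, so drawing from the original does not disturb the copy:

```python
from copy import deepcopy

import numpy as np

rng = np.random.default_rng(123)
rng_dc = deepcopy(rng)   # independent Generator with identical state

first = rng.uniform()    # advances only the original generator
assert rng_dc.bit_generator.state != rng.bit_generator.state
assert np.isclose(rng_dc.uniform(), first)   # the copy replays the same draw
```

With plain copy the equivalent check depends on the numpy version, which is consistent with the behaviour change described above.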

Hopefully there won't be too much left to fix now.

@ricardoV94
Member

ricardoV94 commented Jan 28, 2025

Re RNGs, we also had some issues with new numpy stuff on PyMC that @lucianopaz looked at. Not sure if it's the same, but it sounds like it might be.

Hopefully there won't be too much left to fix now.

Awesome. Let us know if you're blocked somewhere!

@brendan-m-murphy

@ricardoV94 There are two numba tests related to RNGs that are failing.

The two tests are test_rng_copy and test_rng_non_default_update. I modified the Jax DeepCopyOp to make both tests run for Jax; the first one passes for Jax, but the second fails. Both pass with mode=None.

Before I changed copy to deepcopy in the RandomVariable.perform method, the same issue happened with mode=None. (That is, if you do the first example in the docs https://pytensor.readthedocs.io/en/latest/tutorial/prng.html, then the two calls f(rng_val), f(rng_val) will produce different values.)

I'm having trouble figuring out how to get numba to do a similar copy. The impression I got was that, by the time the function is compiled, it is too late to copy the RNG. I don't know much about numba, so maybe there is a way to make the copy happen as python code. I tried to git bisect numpy using the test_rng_copy test to select good and bad commits, but I was getting problems from scipy trying to import from the wrong numpy locations when using the numpy 2.0.0dev commits, so I couldn't figure out what caused this change for numba.

If I run these tests with mode="JAX", the first one (test_rng_copy) works, and the second (test_rng_non_default_update) raises an error because the Jax DeepCopyOp can't copy dictionaries. If I modify the Jax DeepCopyOp to copy dictionaries by applying jnp_safe_copy to the values, then the second test fails for Jax, with different values produced in the second and third function calls (so the last assert statement fails).

When running the first test with Jax, it raises a warning saying it will use a copy of the shared RNG, which presumably is what we want to happen with numba. Maybe the code in the Jax linker that replaces shared RNG inputs could be extracted into a helper function (in JITLinker, for instance) and applied to the numba linker as well.

I might try this, but other ideas/advice would be appreciated.

Besides that, the only failing tests are the numba benchmarks, due to a change in the latest numba release (the _mpm_cheap attribute has been removed from CPUContext)

This is the latest run of the tests: https://github.com/brendan-m-murphy/pytensor/actions/runs/13055274796/job/36424753373?pr=2

@ricardoV94
Member

That's incredibly close. Perhaps @aseyboldt or @kc611 know how we can get the deepcopy of the RNG state to work with numba

@ricardoV94
Member

Actually we're just using obj mode for the copy in Numba?

@overload(copy)
def copy_NumPyRandomGenerator(rng):
    def impl(rng):
        # TODO: Open issue on Numba?
        with numba.objmode(new_rng=types.npy_rng):
            new_rng = copy(rng)
        return new_rng

    return impl

Can't you swap that to use deepcopy instead?

@ricardoV94
Member

ricardoV94 commented Jan 31, 2025

@brendan-m-murphy just to see if I understand, the problem is that while this was true in numpy<2, it is no longer true now?

from copy import copy

import numpy as np

rng = np.random.default_rng(123)
rng_copy = copy(rng)

assert np.isclose(rng.uniform(), rng_copy.uniform())

@brendan-m-murphy

brendan-m-murphy commented Jan 31, 2025

@brendan-m-murphy just to see if I understand, the problem is that while this was true in numpy<2, it is no longer true now?

from copy import copy

import numpy as np

rng = np.random.default_rng(123)
rng_copy = copy(rng)

assert np.isclose(rng.uniform(), rng_copy.uniform())

Yes exactly. I'm not sure they intended this change, but it seems like copy uses some methods used by pickle, and there were some changes to those methods in numpy 2.0 due to a pickling issue (pickling and unpickling an RNG caused a change in state before numpy 2.0, I think).
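The pickle side of that is easy to sanity-check: a pickle round-trip of a Generator preserves its state and yields an independent object (whether copy.copy goes through exactly the same machinery is the part I'm less sure about):

```python
import pickle

import numpy as np

rng = np.random.default_rng(123)
rt = pickle.loads(pickle.dumps(rng))   # round-trip through pickle

# The round-tripped Generator has the same state but is a separate object,
# so both produce the same next draw independently.
assert rt.bit_generator.state == rng.bit_generator.state
assert np.isclose(rt.uniform(), rng.uniform())
```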

@brendan-m-murphy

Actually we're just using obj mode for the copy in Numba?

@overload(copy)
def copy_NumPyRandomGenerator(rng):
    def impl(rng):
        # TODO: Open issue on Numba?
        with numba.objmode(new_rng=types.npy_rng):
            new_rng = copy(rng)
        return new_rng

    return impl

Can't you swap that to use deepcopy instead?

Awesome, that works!

@brendan-m-murphy

That's incredibly close. Perhaps @aseyboldt or @kc611 know how we can get the deepcopy of the RNG state to work with numba

Can we ignore the benchmarks for now? I can try to fix it, but it seems like it would be better for that to be a different PR, since I don't think it depends on the numpy version.

I haven't implemented backwards compatibility everywhere, but I can work on it. I made some of the imports backwards compatible and added a warning if the old imports are being used (mostly for debugging, but it would be a hint to anyone who happens to upgrade pytensor without upgrading numpy).

Logistically, should I force push to this PR? Or open a new one? Also the commit history is a bit of a mess on my branch; I can try to clean it up by rebasing.

@ricardoV94
Member

New PR sounds like the best. We can definitely ignore benchmarks, but are they failing to run or just slow?

@brendan-m-murphy

brendan-m-murphy commented Jan 31, 2025

New PR sounds like the best. We can definitely ignore benchmarks, but are they failing to run or just slow?

They're raising errors due to pytensor.link.numba.dispatch.basic.use_optimized_cheap_pass. The type of the context variable there no longer has a _mpm_cheap attribute.

@ricardoV94
Member

ricardoV94 commented Jan 31, 2025

New PR sounds like the best. We can definitely ignore benchmarks, but are they failing to run or just slow?

They're raising errors due to pytensor.link.numba.dispatch.basic.use_optimized_cheap_pass. The type of the context variable there no longer has a _mpm_cheap attribute.

Ah, that's not related to your changes. We should see them in our CI. Unless, because of our numpy pin, we also end up pinning numba to a lower version?

@ricardoV94
Member

@brendan-m-murphy the numba incompat should be fixed by #1186

@brendan-m-murphy

brendan-m-murphy commented Feb 5, 2025

Okay, PR is open here: #1194!

@ricardoV94 I added numpy 1.26 and 2.1 to CI, so the workflow needs approval.

Successfully merging this pull request may close these issues.

Test on numpy 2.0
7 participants