
Make PyTensor compatible with numpy 2.0 #1194

Open · wants to merge 20 commits into main
Conversation


@brendan-m-murphy brendan-m-murphy commented Feb 5, 2025

Description

This PR makes PyTensor compatible with numpy versions >= 1.26 and < 2.2. (Numba requires numpy < 2.2.)

These changes include:

  • Unpinning numpy in environment files and pyproject
  • Adding a numpy 1.26 test job to CI
    • runs on python 3.10, float32 0, fast-compile 0
    • skips doctests, since it doesn't seem easy to make these conditional on numpy version, and the numpy repr for numerical types has changed (e.g. in numpy < 2.0, 3 will print, but in numpy >= 2.0, np.int64(3) will print)
  • General numpy deprecations and namespace changes:
    • updated imports due to changed name spaces (e.g. numpy.core is now numpy._core, and many functions have been moved from core to new public locations like numpy.lib; also AxisError needs to be imported from numpy.exceptions now). These changes are conditional on numpy version number, except the AxisError import, which is compatible with numpy 1.26.
    • deprecations: the main change is that np.cast is deprecated; its replacement is np.asarray(..., dtype=...)
  • The return value of the inverse indices from np.unique has changed when axis is None; this required changes in the Unique Op. (This change is conditional on numpy version number.)
  • Changes in how overflows and type conversions are handled: to explicitly change the type of a numpy array, you must use .astype. Out-of-bounds conversions are no longer handled automatically; for instance, np.asarray(-1, dtype="uint8") will raise an OverflowError.
    • TensorType.filter uses this new conversion method if allow_downcast is true, which preserves the existing behavior
    • Several tests have been changed to either expect an OverflowError (for numpy >= 2.0; TypeError for numpy < 2.0), or to use equivalent but valid values (e.g. 255 for a uint8, instead of -1).
  • Changes to python type promotion (NEP 50)
    • NEP 50 outlines changes in how python types are compared with and converted to numpy types. These changes were optional before numpy 2.0, but are now default. Essentially, the new rule is: if a python float is used in an operation with a numpy float, the type of the numpy float will always be used.
    • The NumpyAutocaster has been changed to explicitly convert values to numpy types using np.asarray, which preserves the existing behavior. (The reason this preserves the behavior is that this is how the comparison is done in TensorType.filter, where np.asarray(data) is compared to converted_data = np.asarray(data, self.dtype).)
  • Changes to random number generators:
    • The numpy PR numpy/numpy@44ba7ca
      changed methods of numpy.random.Generator that are used by copy and pickle.
    • To get a copy of a numpy Generator with independent state, you must use deepcopy now, instead of copy
    • The same numpy PR changed the return value of Generator.__getstate__() to None. To get the state now, you must use Generator.bit_generator.state.
  • Changes due to changes in the numpy C-API
    • Some minor changes with straightforward updates, e.g. replace ->elsize by PyArray_ITEMSIZE
    • Changes to complex scalars in ScalarType. The numpy implementation of complex numbers has changed from a struct with real and imaginary members to the native C99 complex types. On disk these are equivalent, but the real and imaginary parts of C99 complex types cannot be accessed through pointers. Numpy provides macros that make accessing the real and imaginary parts uniform across pre- and post-2.0 versions. Since these are implemented in terms of the types npy_cfloat, npy_cdouble, and npy_clongdouble, some generic functions were added to the C code so that we do not need to explicitly translate bit-size aliases like npy_complex64 to these types.
    • The constant np.MAXDIMS was removed from the public API. This value was commonly used as a flag to indicate that axis=None had been passed. There is now an explicit flag, NPY_RAVEL_AXIS. How this change was implemented varies a bit across the affected code. A compatibility header was added to pytensor/npy_2_compat.py to make NPY_RAVEL_AXIS available for numpy < 2.0.
    • MapIter was removed from the public numpy C-API, so it was not possible to adapt the C-code for AdvancedIncSubtensor1; instead a NotImplementedError is raised, so this Op defaults to the python implementation, which uses np.add.at.
  • Dropped support for Python 2 in C code.
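
The conversion change above can be sketched as follows (a minimal illustration of the behavior described in this PR; the OverflowError branch assumes numpy >= 2.0):

```python
import numpy as np

# Explicit downcast with .astype always works, wrapping out-of-range values:
wrapped = np.asarray(-1).astype("uint8")  # 255 on all supported numpy versions

# Directly requesting an out-of-range conversion raises on numpy >= 2.0:
try:
    np.asarray(-1, dtype="uint8")
    overflowed = False  # numpy < 2.0 silently converted to 255
except OverflowError:
    overflowed = True   # numpy >= 2.0
```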

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pytensor--1194.org.readthedocs.build/en/1194/

@brendan-m-murphy brendan-m-murphy mentioned this pull request Feb 5, 2025
11 tasks

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 95.03106% with 8 lines in your changes missing coverage. Please review.

Project coverage is 82.32%. Comparing base (4ac1e63) to head (c489137).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
pytensor/link/numba/dispatch/elemwise.py 40.00% 3 Missing ⚠️
pytensor/sparse/rewriting.py 75.00% 2 Missing ⚠️
pytensor/link/numba/dispatch/random.py 50.00% 1 Missing ⚠️
pytensor/tensor/basic.py 87.50% 1 Missing ⚠️
pytensor/tensor/type.py 85.71% 0 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (95.03%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1194      +/-   ##
==========================================
+ Coverage   82.29%   82.32%   +0.03%     
==========================================
  Files         186      187       +1     
  Lines       48034    48122      +88     
  Branches     8629     8639      +10     
==========================================
+ Hits        39528    39615      +87     
- Misses       6348     6350       +2     
+ Partials     2158     2157       -1     
Files with missing lines Coverage Δ
pytensor/link/c/basic.py 87.78% <100.00%> (+0.30%) ⬆️
pytensor/link/jax/dispatch/random.py 93.49% <100.00%> (ø)
pytensor/npy_2_compat.py 100.00% <100.00%> (ø)
pytensor/scalar/basic.py 80.63% <100.00%> (+0.09%) ⬆️
pytensor/sparse/basic.py 82.57% <100.00%> (ø)
pytensor/tensor/blas.py 73.95% <100.00%> (ø)
pytensor/tensor/blas_headers.py 70.78% <100.00%> (ø)
pytensor/tensor/conv/abstract_conv.py 76.07% <100.00%> (+0.02%) ⬆️
pytensor/tensor/einsum.py 97.11% <100.00%> (+0.10%) ⬆️
pytensor/tensor/elemwise.py 89.01% <100.00%> (+0.04%) ⬆️
... and 14 more

@ricardoV94
Member

Numba requires numpy < 2.2.

Numba is an optional dependency for us so no need to pin, we can just make sure the CI runs with that version for the numba + benchmark tests (they are already separate)

@ricardoV94
Member

PS: this is amazing 😍 I'll try to review soon

@ricardoV94
Member

ricardoV94 commented Feb 5, 2025

We'll need to advertise slower advanced indexing operation on the C backend. I think that's the only regression?

@brendan-m-murphy
Author

We'll need to advertise slower advanced indexing operation on the C backend. I think that's the only regression?

Yes, np.add.at is possibly slower in some cases: aesara-devs/aesara#1506. There is an open issue tracking enhancements: numpy/numpy#23176

I haven't looked at any benchmarks for this change, so I'm not sure what the impact is.

I'm making some changes to unpin numpy and make sure the CI runs with 2.1 for numba, I'll push those today.

brendan-m-murphy and others added 19 commits February 6, 2025 11:03
Also added ruff numpy2 transition rule.
Remaining tests now run on latest numpy,
except for Numba jobs, which need numpy 2.1.0
- replaced np.AxisError with np.exceptions.AxisError
- the `numpy.core` submodule has been renamed to `numpy._core`
- some parts of `numpy.core` have been moved to `numpy.lib.array_utils`

Except for `AxisError`, the updated imports are conditional on
the version of numpy, so the imports should work for numpy >= 1.26.
- Replace np.cast with np.asarray: in numpy 2.0,
  `np.cast[new_dtype](arr)` is deprecated.
  The literal replacement is `np.asarray(arr, dtype=new_dtype)`.

- Replace np.sctype2char and np.obj2sctype.
  Added try/except to handle change in behavior
  of `np.dtype`

- Replace np.find_common_type with np.result_type
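
The replacements above, sketched side by side (np.cast and np.find_common_type are the deprecated forms):

```python
import numpy as np

arr = np.asarray([1.5, 2.5])

# np.cast["int64"](arr) is deprecated; the literal replacement is:
converted = np.asarray(arr, dtype="int64")

# np.find_common_type is replaced by np.result_type:
common = np.result_type(np.float32, np.int64)
```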
Some macros were removed from npy_3k_compat.h.
Following numpy, I updated the affected functions
to the Python 3 names, and removed support for Python 2.

Also updated lazylinker_c version to indicate substantial changes to the
C code.
- replace `->elsize` by `PyArray_ITEMSIZE`
- don't use deprecated PyArray_MoveInto
This is done using C++ generic functions
to get/set the real/imag parts of complex numbers.

This gives us an easy way to support Numpy v < 2.0,
and allows the type underlying the bit width types,
like pytensor_complex128, to be correctly inferred
from the numpy complex types they inherit from.

Updated pytensor_complex struct to use get/set real/imag
aliases defined above. Also updated operators such as
`Abs` to use get_real, get_imag.

Note: redefining the complex arithmetic here means that we
aren't treating NaNs and infinities as carefully as the C99
standard suggests (see Appendix G of the standard).

The code has been like this since it was added to Theano,
so we're keeping the existing behavior.
MapIter was removed from the public numpy C-API in
version 2.0, so we raise a NotImplementedError to
default to the python code for AdvancedIncSubtensor1.

The python version, defined in `AdvancedIncSubtensor1.perform`,
calls `np.add.at`, which uses `MapIter` behind the scenes.
There is active development on Numpy to improve the efficiency
of `np.add.at`.

To skip the C implementation and use the Python implementation,
we raise a NotImplementedError for this op's c code if numpy>=2.0.
This was done for the python linker and numba linker.

deepcopy seems to be the recommended method for
copying a numpy Generator.

After this numpy PR:
numpy/numpy@44ba7ca
`copy` didn't seem to actually make an independent copy of
the `np.random.Generator` objects spawned by `RandomStream`.

This was causing the "test values" computed by e.g.
`RandomStream.uniform` to increment the RNG state, which
was causing tests that rely on `RandomStream` to fail.

Here is some related discussion:
numpy/numpy#24086

I didn't see any official documentation about
a change in numpy that would make copy stop
working.
numpy.random.Generator.__getstate__()
now returns None; to see the state of
the bit generator, you need to use
Generator.bit_generator.state.

This change affects `RandomGeneratorType`, and
several of the random tests (including some for Jax.)
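
Both RNG changes can be sketched together (a minimal illustration, not PyTensor's actual code):

```python
from copy import deepcopy
import numpy as np

rng = np.random.default_rng(42)

# deepcopy yields a generator with independent state; after the numpy
# change, a shallow copy no longer guarantees this:
rng_copy = deepcopy(rng)
rng.standard_normal()  # advances only the original

# Generator.__getstate__() returns None on numpy >= 2.0, so inspect the
# state through the bit generator instead:
state = rng.bit_generator.state
```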
`np.MAXDIMS` was removed from the public API and
no replacement is given in the migration docs.

In numpy <= 1.26, the value of `np.MAXDIMS` was 32.
This was often used as a flag to mean `axis=None`.

In numpy >= 2.0, the maximum number of dims of an
array has been increased to 64; simultaneously, a
constant `NPY_RAVEL_AXIS` was added to the C-API to
indicate that `axis=None`.

In most cases, the use of `np.MAXDIMS` to
check for `axis=None` can be replaced by the
new constant `NPY_RAVEL_AXIS`.

To make this constant accessible when using numpy <= 1.26,
I added a function to insert `npy_2_compat.h` into the support
code for the affected ops.
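
The fix itself lives in the C support code, but the version switch behind it can be sketched in Python (the value 32 for NPY_RAVEL_AXIS is an assumption for illustration, chosen because it matches the old np.MAXDIMS):

```python
import numpy as np
from numpy.lib import NumpyVersion

if NumpyVersion(np.__version__) >= "2.0.0rc1":
    # numpy >= 2.0: a dedicated C-API flag marks axis=None; its value
    # (32, the old MAXDIMS) is assumed here for illustration
    ravel_axis_flag = 32
else:
    # numpy < 2.0: np.MAXDIMS (== 32) doubled as the axis=None flag
    ravel_axis_flag = np.MAXDIMS
```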
In numpy 2.0, -1 as uint8 is out of bounds, whereas
previously it would be converted to 255.

This affected the test helper function `reduced_bitwise_and`.
The helper function was changed to use 255 instead of -1 if
the dtype was uint8, since this is what is needed to match the
behavior of the "bitwise and" op.

`reduced_bitwise_and` was only used by `TestCAReduce`
in `tests/tensor/test_elemwise.py`, so it was moved
there from `tests/tensor/test_math.py`
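
The substitution can be sketched directly: for uint8, 255 is the all-ones value that -1 used to convert to implicitly.

```python
import numpy as np

x = np.asarray([0b1010, 0b0110], dtype="uint8")

# 255 (0b11111111) is the identity for bitwise and on uint8, so using it
# as the initial value leaves x unchanged:
identity = np.uint8(255)
anded = np.bitwise_and(x, identity)
```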
1. Changed autocaster due to new promotion rules

With "weak promotion" of python types in Numpy 2.0,
the statement `1.1 == np.asarray(1.1).astype('float32')` is True,
whereas in Numpy 1.26, it was false.

However, in numpy 1.26, `1.1 == np.asarray([1.1]).astype('float32')`
was true, so the scalar behavior and array behavior are the same
in Numpy 2.0, while they were different in numpy 1.26.

Essentially, in Numpy 2.0, if python floats are used in operations
with numpy floats or arrays, then the type of the numpy object will
be used (i.e. the python value will be treated as the type of the numpy
objects).

To preserve the behavior of `NumpyAutocaster` from numpy <= 1.26, I've
added an explicit conversion of the value to be converted to a numpy
type using `np.asarray` during the check that decides what dtype to
cast to.

2. Updates due to new numpy conversion rules for out-of-bounds python
ints

In numpy 2.0, out of bounds python ints will not be automatically
converted, and will raise an `OverflowError` instead.
For instance, converting 255 to int8 will raise an error, instead of
returning -1.

To explicitly force conversion, we must use
`np.asarray(value).astype(dtype)`, rather than
`np.asarray(value, dtype=dtype)`.

The code in `TensorType.filter` has been changed to the new recommended
way to downcast, and the error type caught by some tests has been
changed to OverflowError from TypeError
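
The comparisons described above, as a sketch (the scalar result differs by numpy version):

```python
import numpy as np

# Scalar comparison: True under NEP 50 (numpy >= 2.0) because 1.1 is
# treated as float32; False on numpy 1.26:
scalar_cmp = bool(1.1 == np.asarray(1.1).astype("float32"))

# Array comparison: True on both versions:
array_cmp = bool(1.1 == np.asarray([1.1]).astype("float32"))
```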
I was getting a NameError from the list
comprehensions saying that e.g. `pytensor_scalar`
was not defined. I'm not sure why, but this is another
(more verbose) way to do the same thing.
From numpy PR numpy/numpy#22449,
the repr of scalar values has changed, e.g. from "1" to
"np.int64(1)", which caused two doctests to fail.

The doctest for tensor.extra_ops.Unique was failing because
the output shape for the inverse indices has changed when
axis is None: numpy/numpy#20638
In numpy 2.0, if axis=None, then np.unique
does not flatten the inverse indices returned
if return_inverse=True

The output shape has been changed to match numpy,
dependent on the version of numpy.
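
The version difference can be smoothed over like this (a sketch of the compatibility idea, not the actual helper in npy_2_compat.py):

```python
import numpy as np

a = np.asarray([[1, 2], [2, 3]])
uniq, inverse = np.unique(a, return_inverse=True)  # axis is None

# numpy < 2.0 returns flat inverse indices; numpy >= 2.0 returns them
# shaped like `a` when axis is None. Reconstructing through the input
# shape works either way:
reconstructed = uniq[inverse].reshape(a.shape)
```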
Due to changes in numpy conversion rules (NEP 50),
overflows are not ignored; in particular, negating
an unsigned int causes an overflow error.

The test for `neg` has been changed to check that
this error is raised.
I split this test up to test uint64 separately, since
this is the case discussed in Issue pymc-devs#770. I also added
a test for the exact example used in that issue.

The uint dtypes with lower precision should pass.

The uint64 case started passing for me locally on Mac OSX,
but still fails on CI. I'm not sure why this is, but at
least the test will be more specific now if it fails in the future.
@brendan-m-murphy
Author

brendan-m-murphy commented Feb 6, 2025

@ricardoV94 Okay, unpinned numpy and changed the CI to match. I ran the tests on my fork and they passed. There was some random mypy issue, so I've fixed that. Not sure why it was raised now.

@metrizable

@brendan-m-murphy Thanks for this pull request! Looking forward to it!

Member

@ricardoV94 ricardoV94 left a comment

This looks great, some small comments / questions

.github/workflows/test.yml (resolved)
pytensor/npy_2_compat.py (resolved)
@@ -664,7 +763,7 @@ def c_init_code(self, **kwargs):
return ["import_array();"]

def c_code_cache_version(self):
return (13, np.__version__)
return (18, np.version.git_revision)
Member

why .git_revision?

Author

That was so I could do a git bisect against numpy. I'll change it to (14, np.__version__).

Author

Incidentally, should c_code_cache_version have Hashable as a return type? (Or maybe tuple[Hashable, ...]?). My editor complains about adding strings to the c_code_cache_version.

Member

Yes to the better type hint

@@ -3125,7 +3224,7 @@ def L_op(self, inputs, outputs, gout):
else:
return [x.zeros_like()]

return (gz * exp2(x) * log(np.cast[x.type](2)),)
return (gz * exp2(x) * log(np.asarray(2, dtype=x.type)),)
Member

@ricardoV94 ricardoV94 Feb 11, 2025

Should all these be np.array(...) not np.asarray(...) since we know they are not arrays to start with?

Author

Yes you're right. Those might have been ruff auto-fixes for np.cast. I'll fix them.

Author

I've noticed that some of the code like this is redundant now. E.g. x * np.array(math.log(2.0)).astype(x.dtype) is now the same as x * math.log(2.0) because numpy now uses the type of x here. For backwards compatibility, maybe we should leave the astype conversion for now, but once numpy 1.26 support is dropped, this won't be needed.
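
A sketch of the redundancy being described (the dtype of `bare` depends on the numpy version):

```python
import math
import numpy as np

x = np.float32(3.0)

explicit = x * np.asarray(math.log(2.0)).astype(x.dtype)  # float32 everywhere
bare = x * math.log(2.0)
# Under NEP 50 (numpy >= 2.0) `bare` is also float32, making the explicit
# cast redundant; on numpy 1.26 `bare` is promoted to float64.
```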

Comment on lines +19 to +24
try:
from numpy.lib.array_utils import normalize_axis_index, normalize_axis_tuple
except ModuleNotFoundError:
# numpy < 2.0
from numpy.core.multiarray import normalize_axis_index
from numpy.core.numeric import normalize_axis_tuple
Member

Can we put this in pytensor.utils and import from there to avoid defining it every time?

Author

Yes that's a good idea.

Author

The module docstring for pytensor.utils says its for functions that only depend on the standard library. I could put this in pytensor.npy_2_compat instead.

outputs.append(typ())
if NumpyVersion(np.__version__) >= "2.0.0rc1":
if axis is None:
inverse_shape = TensorType(dtype="int64", shape=x.type.shape)
Member

nit: name this inverse_type not inverse_shape

Author

Okay, I'll change that.

if self.axis is not None:
shape = (x_shape[axis],)
elif NumpyVersion(np.__version__) >= "2.0.0rc1":
shape = x_shape
Member

Worried about the behavior changing depending on the numpy version. Can we keep the old 1.x behavior (which means we need to ravel this output in perform?). We can open an issue to add the 2.x behavior, which I think should be done with a sentinel flag that the user has to opt-in, at least until we deprecate numpy < 2.0 support

Author

Okay, I'll change it to keep the old behavior (I think you're right that it should just require ravelling the output). Should I rebase + force push with these changes? If so, maybe I'll put the 2.x behavior changes on another branch before I do that.

Member

Hmm is the git history clean in that you would suggest we merge without squash? If not, don't worry, if yes a rebase and clean stuff would be great.

Author

I have been rebasing to try to keep all related changed in one commit, but there is still about 20 commits, and some are kind of small like "Fix for NameError in test" and "Mypy fix". I could squash the smaller bits into a "misc" commit.

This is what it looks like now (without the changes for the review):

8c17e93 * origin/numpy-2-updates Mypy fix
631f397 * Split up TestMinMax::test_uint
7090c1c * Backwards compat for complex types C code
cb4dec7 * Fix test for neg on unsigned
c926d41 * Updated Unique inverse output shape
b4ad058 * Updated doctests
bc719e2 * Fix for NameError in test
f641835 * Changes due to new numpy scalar promotion rules
b694013 * Fixed failed test due to uint8 overflow
625dd67 * Replace use of np.MAXDIMS
279e3cf * Change rng.getstate to rng.bit_generator.state
3e77bd6 * Changed copy to deepcopy for rng
256f31d * Use Python implementation for AdvancedInSubtensor1
e27fe32 * Make complex scalars work with numpy 2.0
838d4d6 * Changes for deprecations in numpy 2.0 C-API
ca2e26e * Updated lazylinker C code
8e9e3e1 * Changes for numpy 2.0 deprecations
c759a0b * Update numpy deprecated imports
3b904d6 * Added numpy 1.26.* to CI

I could probably make some smaller groups like "numpy unpin + deprecations", "numpy C code changes", "rng changes", and "numpy casting/overflow changes".

The code I normally work on has a pretty messy commit history, so I don't really know what is ideal. I can definitely squash the changes for the review so that the commit history looks about the same as the one I copy-pasted above.

Contributor

I don't speak for @ricardoV94, but I think your commit style is absolutely magnificent! I wish all the PRs I review were structured this nicely. There's a lot of info in the extended commit messages. I think it should go in without squashing.

Author

@brendan-m-murphy brendan-m-murphy Feb 13, 2025

@maresb thanks!

@ricardoV94 This ended up being slightly more complicated than ravelling, so I added a helper function to pytensor/npy_2_compat.py to mimic the old behavior of np.unique.

Also, apparently, the behavior for np.unique was different for all values of axis in numpy 2.0.0, but from numpy 2.0.1 on, the inverse indices will have the same shape as the input array only when axis is None. So, perhaps when numpy < 2.0 support is deprecated, the minimum version of numpy should be at least 2.0.1, or else there will be more cases to handle for Unique.

@@ -101,7 +101,7 @@ def __init__(
if str(dtype) == "floatX":
self.dtype = config.floatX
else:
if np.obj2sctype(dtype) is None:
if np.dtype(dtype).type is None:
Member

This needs a try/except now? np.dtype does not return None for invalid types, it raises

Author

Yes, it looks like this was done in one case, but I missed this one.

Author

In addition to adding the try/except, I changed this around a bit, since np.dtype(dtype).type is a subclass of np.generic, while the HasDataType protocol says that self.dtype should be a string. The value self.dtype = str(np.dtype(dtype)) works the same as np.dtype(dtype).type, so I've switched it to use that.
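
The pattern under discussion, as a sketch (validate_dtype is an illustrative name, not the PR's actual code):

```python
import numpy as np

def validate_dtype(dtype):
    # np.dtype raises TypeError for unknown specs instead of returning
    # None the way np.obj2sctype did, so wrap it in try/except:
    try:
        return str(np.dtype(dtype))
    except TypeError:
        raise TypeError(f"Unsupported dtype: {dtype!r}")
```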

assert max(n).dtype == dtype
i_max = eval_outputs(max(n))
assert i_max == itype.max

@pytest.mark.xfail(reason="Fails due to #770")
Member

you can have it combined with the parametrize above and skip just uint64 if you use pytest.params(...). You'll find other uses in the codebase

Author

Ah, I didn't know you could do that.
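
The suggestion can be sketched like this (the dtype list and test name are illustrative, not the PR's actual test):

```python
import pytest

UINT_DTYPES = [
    "uint8",
    "uint16",
    "uint32",
    pytest.param("uint64", marks=pytest.mark.xfail(reason="Fails due to #770")),
]

@pytest.mark.parametrize("dtype", UINT_DTYPES)
def test_uint_max(dtype):
    ...  # body elided; only the parametrization pattern matters here
```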

assert i_max == data.max()

@pytest.mark.xfail(reason="Fails due to #770")
def test_uint64_special_value(self):
Member

Context for this new test?

Author

This is the actual example from issue #770. I split up the previous test because it was only failing for uint64, but I wanted to check if the example from that issue was failing as well. I can add a docstring to the test, or remove it.

Member

Feel free to keep it. This doesn't make sense to merge with a pytest.param because it's built differently than the previous tests right?

@ricardoV94 ricardoV94 changed the title Numpy 2.0 updates Make PyTensor compatible with numpy 2.0 Feb 11, 2025
Successfully merging this pull request may close these issues.

Test on numpy 2.0
5 participants