Incorrect answers given for ranks of large matrices. #35885

elarson314 · 2023-07-02T22:10:52Z

Is there an existing issue for this?

I have searched the existing issues for a bug report that matches the one I want to file, without success.

Did you read the documentation and troubleshoot guide?

I have read the documentation and troubleshoot guide

Environment

- **OS**: RedHat EL 7.7
- **Sage Version**: 9.5

Steps To Reproduce

On a machine with ample RAM, compute the rank of a matrix with more than 2^32 entries. For example, a Vandermonde matrix like so:

def vandermonde_rank(x, y):
   F = GF(8388593)
   M = matrix(F, x, y)

   for i in range(x):
      row = []
      i = F(i)
      ij = F(1)
      for j in range(y):
         row.append(ij)
         ij *= i
      M[i] = row

   M.echelonize()
   return M.rank()

print(vandermonde_rank(65537, 65537))

Expected Behavior

The rank should be computed correctly (or, failing that, an error should be raised saying that the matrix is too large).

Actual Behavior

The rank is computed incorrectly once there are more than 2^32 entries. For example, the above code prints 65536 for the rank of the 65537 x 65537 Vandermonde matrix. The user is not warned that the answer might not be correct.

Additional Information

No response

The text was updated successfully, but these errors were encountered:

jhpalmieri · 2023-10-19T21:51:29Z

Here's something:

sage: M = identity_matrix(GF(8388593), 10)
sage: type(M.rank())
<class 'sage.rings.integer.Integer'>
sage: M.echelonize()
sage: type(M.rank())
<class 'int'>

The echelonize method caches the rank as part of its computations, but it caches a Python int rather than a Sage Integer.

schmittj · 2025-03-11T11:31:47Z

In joint work with S. Canning and H. Larson we also ran into the issue above. Investigating a bit, I do have some more evidence on what is happening. To see the issue, run the following modified code, which just returns the Vandermonde matrix:

def vandermonde_rank(x, y):
   F = GF(8388593)
   M = matrix(F, x, y)

   for i in range(x):
      print(i)
      row = []
      i = F(i)
      ij = F(1)
      for j in range(y):
         row.append(ij)
         ij *= i
      M[i] = row
   return M

M = vandermonde_rank(65537, 65537)

Then if we just print a few entries of M, we can see something strange happening:

sage: M
65537 x 65537 dense matrix over Finite Field of size 8388593 (use the '.str()' method to see the entries)
sage: indexes = range(5)
sage: for i in indexes:
....:     for j in indexes:
....:         print(i, j, M[i,j])
....:
0 0 1
0 1 0
0 2 0
0 3 0
0 4 0
1 0 65536
1 1 7680
1 2 900
1 3 262249
1 4 6912000
2 0 1
2 1 2
2 2 4
2 3 8
2 4 16
3 0 1
3 1 3
3 2 9
3 3 27
3 4 81
4 0 1
4 1 4
4 2 16
4 3 64
4 4 256
sage: max(j for j in range(65537) if M[1,j] != 1)
65535
sage: M[1,65535]
8388589
sage: M[1,65536]
1
sage: max(j for j in range(65537) if M[1,j] != 1)
65535
sage: M[1,65535]
8388589
sage: M[1,65536]
1
sage: GF(8388593)(65536)^2
7680
sage: GF(8388593)(65536)^65536
8388589

It appears that entries of the matrix M get overwritten (essentially the last row of the matrix gets written into its row with index 1). Possibly the problem is related to using C int variables inside
https://github.com/sagemath/sage/blob/develop/src/sage/matrix/matrix0.pyx

I am not super familiar with Cython, but that's what Claude seems to think: https://claude.ai/share/7eb823ff-44da-4a6a-b155-2495a0de77e1

@videlec Do you know what the best way is to proceed? This seems like a rather fundamental issue.

schmittj · 2025-03-11T11:35:54Z

Oh, for a minimal example of the error (not waiting an hour for the Vandermonde), just run the following code:

sage: F = GF(8388593)
sage: M = matrix(F, 65537, 65537)
sage: M[1,0]
0
sage: M[65536,1]=1
sage: M[1,0]
1

videlec · 2025-03-11T11:45:07Z

Indeed, there is possibly a too naive cast in __getitem__ lines 948-949

        cdef int nrows = self._nrows
        cdef int ncols = self._ncols

Both _nrows and _ncols attributes are of type Py_ssize_t which is supposed to be safer with respect to overflow. Though it feels very strange to me that on your machine the C type int can not handle numbers larger than 65536=2^16 (which is furthermore a super weird limit for a signed integer). My feeling is that it might rather be a problem coming from the underlying C implementation of matrices over GF(8388593).

Though, I can not reproduce your example on my machine where sage 10.5 is installed. I got instead the following error message

sage: F = GF(8388593)
sage: M = matrix(F, 65537, 65537)
Traceback (most recent call last):
...
MemoryError: failed to allocate 4295098369 * 8 bytes

schmittj · 2025-03-11T11:48:39Z

About your experiment: you need a machine with 34 GB of free RAM, the above just says that you don't have enough memory available I think.

What Claude was suspecting is that some get_unsafe or set_unsafe method essentially calls something like

return underlying_C_array[i * ncols + j]

If the number i*ncols + j is a C int and greater than 2^32, this would create problems. I did not immediately find where the code of get_unsafe or set_unsafe for these types of matrices is stored.

videlec · 2025-03-11T16:13:15Z

There is such a call in matrix_generic_dense.pyx lines 99-100, though there does not seem to be dangerous casts here and I don't think your matrix is of this type. Could you share the output of

sage: type(matrix(GF(8388593), 65537, 65537))

videlec · 2025-03-11T16:18:03Z

If your matrix happen to be a Matrix_modn_dense, one possible culprit is the snippet at 452-457 of sage/matrix/matrix_modn_dense_template.pxi

        cdef unsigned int k
        cdef Py_ssize_t i
        k = 0
        for i in range(self._nrows):
            self._matrix[i] = self._entries + k
            k = k + self._ncols

The k should have been declared as something safer than unsigned int (and actually there is no need for k at all here). I can provide a patch but I have no easy way to test it. Do you know how to compile a sage branch locally?

schmittj · 2025-03-11T16:38:15Z

Hey, here is the output of the requested command:

sage: sage: F = GF(8388593)
sage: sage: M = matrix(F, 65537, 65537)
sage: sage: type(matrix(GF(8388593), 65537, 65537))
<class 'sage.matrix.matrix_modn_dense_double.Matrix_modn_dense_double'>

I have compiled a sage branch from scratch before - if you provide a patch branch I can probably manage again. The large server I have access to runs a recent version of Fedora, so probably that should be manageable. So if you provide the fix, I'll try to test it!

I guess you no longer have access to the MPI-Bonn servers?

videlec · 2025-03-11T16:50:19Z

The PR is at #39671 (I do not know yet whether it compiles, so maybe you can wait for that).

sagemathgh-39671: fix overflow in Matrix_modn_dense Replace use of `unsigned int` in matrix indices in favor of `Py_ssize_t` to avoid overflow. Fixes sagemath#35885 URL: sagemath#39671 Reported by: Vincent Delecroix Reviewer(s): Dima Pasechnik, schmittj, Vincent Delecroix

elarson314 added the t: bug label Jul 2, 2023

elarson314 changed the title ~~<title>~~ Incorrect answers given for ranks of large matrices. Jul 2, 2023

mkoeppe added c: linear algebra c: finite rings labels Mar 2, 2024

videlec mentioned this issue Mar 11, 2025

fix overflow in Matrix_modn_dense #39671

Merged

vbraun closed this as completed in 0d90b5c Mar 22, 2025

vbraun closed this as completed in #39671 Mar 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect answers given for ranks of large matrices. #35885

Incorrect answers given for ranks of large matrices. #35885

elarson314 commented Jul 2, 2023 •

edited

Loading

jhpalmieri commented Oct 19, 2023

schmittj commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

videlec commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

Incorrect answers given for ranks of large matrices. #35885

Incorrect answers given for ranks of large matrices. #35885

Comments

elarson314 commented Jul 2, 2023 • edited Loading

Is there an existing issue for this?

Did you read the documentation and troubleshoot guide?

Environment

Steps To Reproduce

Expected Behavior

Actual Behavior

Additional Information

jhpalmieri commented Oct 19, 2023

schmittj commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

videlec commented Mar 11, 2025

schmittj commented Mar 11, 2025

videlec commented Mar 11, 2025

elarson314 commented Jul 2, 2023 •

edited

Loading