Skip to content

Incorrect answers given for ranks of large matrices. #35885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
elarson314 opened this issue Jul 2, 2023 · 9 comments · Fixed by #39671
Closed
2 tasks done

Incorrect answers given for ranks of large matrices. #35885

elarson314 opened this issue Jul 2, 2023 · 9 comments · Fixed by #39671

Comments

@elarson314
Copy link
Contributor

elarson314 commented Jul 2, 2023

Is there an existing issue for this?

  • I have searched the existing issues for a bug report that matches the one I want to file, without success.

Did you read the documentation and troubleshoot guide?

  • I have read the documentation and troubleshoot guide

Environment

- **OS**: RedHat EL 7.7
- **Sage Version**: 9.5

Steps To Reproduce

On a machine with ample RAM, compute the rank of a matrix with more than 2^32 entries. For example, a Vandermonde matrix like so:

def vandermonde_rank(x, y):
   F = GF(8388593)
   M = matrix(F, x, y)

   for i in range(x):
      row = []
      i = F(i)
      ij = F(1)
      for j in range(y):
         row.append(ij)
         ij *= i
      M[i] = row

   M.echelonize()
   return M.rank()

print(vandermonde_rank(65537, 65537))

Expected Behavior

The rank should be computed correctly (or, failing that, an error should be raised saying that the matrix is too large).

Actual Behavior

The rank is computed incorrectly once there are more than 2^32 entries. For example, the above code prints 65536 for the rank of the 65537 x 65537 Vandermonde matrix. The user is not warned that the answer might not be correct.

Additional Information

No response

@elarson314 elarson314 changed the title <title> Incorrect answers given for ranks of large matrices. Jul 2, 2023
@jhpalmieri
Copy link
Member

Here's something:

sage: M = identity_matrix(GF(8388593), 10)
sage: type(M.rank())
<class 'sage.rings.integer.Integer'>
sage: M.echelonize()
sage: type(M.rank())
<class 'int'>

The echelonize method caches the rank as part of its computations, but it caches a Python int rather than a Sage Integer.

@schmittj
Copy link
Contributor

In joint work with S. Canning and H. Larson we also ran into the issue above. Investigating a bit, I do have some more evidence on what is happening. To see the issue, run the following modified code, which just returns the Vandermonde matrix:

def vandermonde_rank(x, y):
   F = GF(8388593)
   M = matrix(F, x, y)

   for i in range(x):
      print(i)
      row = []
      i = F(i)
      ij = F(1)
      for j in range(y):
         row.append(ij)
         ij *= i
      M[i] = row
   return M

M = vandermonde_rank(65537, 65537)

Then if we just print a few entries of M, we can see something strange happening:

sage: M
65537 x 65537 dense matrix over Finite Field of size 8388593 (use the '.str()' method to see the entries)
sage: indexes = range(5)
sage: for i in indexes:
....:     for j in indexes:
....:         print(i, j, M[i,j])
....:
0 0 1
0 1 0
0 2 0
0 3 0
0 4 0
1 0 65536
1 1 7680
1 2 900
1 3 262249
1 4 6912000
2 0 1
2 1 2
2 2 4
2 3 8
2 4 16
3 0 1
3 1 3
3 2 9
3 3 27
3 4 81
4 0 1
4 1 4
4 2 16
4 3 64
4 4 256
sage: max(j for j in range(65537) if M[1,j] != 1)
65535
sage: M[1,65535]
8388589
sage: M[1,65536]
1
sage: max(j for j in range(65537) if M[1,j] != 1)
65535
sage: M[1,65535]
8388589
sage: M[1,65536]
1
sage: GF(8388593)(65536)^2
7680
sage: GF(8388593)(65536)^65536
8388589

It appears that entries of the matrix M get overwritten (essentially the last row of the matrix gets written into its row with index 1). Possibly the problem is related to using C int variables inside
https://github.com/sagemath/sage/blob/develop/src/sage/matrix/matrix0.pyx

I am not super familiar with Cython, but that's what Claude seems to think: https://claude.ai/share/7eb823ff-44da-4a6a-b155-2495a0de77e1

@videlec Do you know what the best way is to proceed? This seems like a rather fundamental issue.

@schmittj
Copy link
Contributor

Oh, for a minimal example of the error (not waiting an hour for the Vandermonde), just run the following code:

sage: F = GF(8388593)
sage: M = matrix(F, 65537, 65537)
sage: M[1,0]
0
sage: M[65536,1]=1
sage: M[1,0]
1

@videlec
Copy link
Contributor

videlec commented Mar 11, 2025

Indeed, there is possibly a too naive cast in __getitem__ lines 948-949

        cdef int nrows = self._nrows
        cdef int ncols = self._ncols

Both _nrows and _ncols attributes are of type Py_ssize_t which is supposed to be safer with respect to overflow. Though it feels very strange to me that on your machine the C type int can not handle numbers larger than 65536=2^16 (which is furthermore a super weird limit for a signed integer). My feeling is that it might rather be a problem coming from the underlying C implementation of matrices over GF(8388593).

Though, I can not reproduce your example on my machine where sage 10.5 is installed. I got instead the following error message

sage: F = GF(8388593)
sage: M = matrix(F, 65537, 65537)
Traceback (most recent call last):
...
MemoryError: failed to allocate 4295098369 * 8 bytes

@schmittj
Copy link
Contributor

About your experiment: you need a machine with 34 GB of free RAM, the above just says that you don't have enough memory available I think.

What Claude was suspecting is that some get_unsafe or set_unsafe method essentially calls something like

return underlying_C_array[i * ncols + j]

If the number i*ncols + j is a C int and greater than 2^32, this would create problems. I did not immediately find where the code of get_unsafe or set_unsafe for these types of matrices is stored.

@videlec
Copy link
Contributor

videlec commented Mar 11, 2025

There is such a call in matrix_generic_dense.pyx lines 99-100, though there does not seem to be dangerous casts here and I don't think your matrix is of this type. Could you share the output of

sage: type(matrix(GF(8388593), 65537, 65537))

@videlec
Copy link
Contributor

videlec commented Mar 11, 2025

If your matrix happen to be a Matrix_modn_dense, one possible culprit is the snippet at 452-457 of sage/matrix/matrix_modn_dense_template.pxi

        cdef unsigned int k
        cdef Py_ssize_t i
        k = 0
        for i in range(self._nrows):
            self._matrix[i] = self._entries + k
            k = k + self._ncols

The k should have been declared as something safer than unsigned int (and actually there is no need for k at all here). I can provide a patch but I have no easy way to test it. Do you know how to compile a sage branch locally?

@schmittj
Copy link
Contributor

Hey, here is the output of the requested command:

sage: sage: F = GF(8388593)
sage: sage: M = matrix(F, 65537, 65537)
sage: sage: type(matrix(GF(8388593), 65537, 65537))
<class 'sage.matrix.matrix_modn_dense_double.Matrix_modn_dense_double'>

I have compiled a sage branch from scratch before - if you provide a patch branch I can probably manage again. The large server I have access to runs a recent version of Fedora, so probably that should be manageable. So if you provide the fix, I'll try to test it!

I guess you no longer have access to the MPI-Bonn servers?

@videlec
Copy link
Contributor

videlec commented Mar 11, 2025

The PR is at #39671 (I do not know yet whether it compiles, so maybe you can wait for that).

vbraun pushed a commit to vbraun/sage that referenced this issue Mar 19, 2025
sagemathgh-39671: fix overflow in Matrix_modn_dense
    
Replace use of `unsigned int` in matrix indices in favor of `Py_ssize_t`
to avoid overflow.

Fixes sagemath#35885
    
URL: sagemath#39671
Reported by: Vincent Delecroix
Reviewer(s): Dima Pasechnik, schmittj, Vincent Delecroix
@vbraun vbraun closed this as completed in 0d90b5c Mar 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants