Skip to content

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
user202729 opened this issue Dec 25, 2024 · 10 comments · Fixed by #2144 or sagemath/sage#39204
Closed

Comments

@user202729
Copy link
Contributor

user202729 commented Dec 25, 2024

Use flint's algorithm through SageMath's interface (which uses fmpq_mat_rref under the hood).

sage: entry_size = 10000
....: num_col = 20
....: num_row = 20
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
1.3925843238830566
sage: entry_size = 10000
....: num_col = 40
....: num_row = 40
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
0.022356271743774414

Why is it that in the first case with a 20 × 20 matrix it takes >1 second while in the second case it is instant?

In both cases the matrix is invertible, thus the echelon form is identity.

(larger context: sagemath/sage#39197 , if flint is the fastest in all cases it would be easiest to just switch to flint, but this is not the case at the moment)

@albinahlback
Copy link
Collaborator

Can you replicate this in C, that is, make a MWE? I am not familiar with how Python works under the hood.

@albinahlback
Copy link
Collaborator

Perhaps the tuning parameters could be changed for fmpz_mat_rref. Nonetheless, they should probably be tuned on a hardware-level.

@user202729
Copy link
Contributor Author

Sure:

#include <flint/flint.h>
#include <flint/fmpz.h>
#include <flint/fmpz_mat.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void random_fmpz_mat(fmpz_mat_t mat, slong entry_size) {
    slong i, j;
    flint_rand_t state;
    flint_randinit(state);
    for (i = 0; i < fmpz_mat_nrows(mat); i++) {
        for (j = 0; j < fmpz_mat_ncols(mat); j++) {
            fmpz* tmp = fmpz_mat_entry(mat, i, j);
            fmpz_randbits(tmp, state, entry_size);
            fmpz_add_ui(tmp, tmp, 1);
        }
    }
}

int main() {
    slong entry_size = 10000;
    slong num_col = 20, num_row = 20;
    //slong num_col = 40, num_row = 40;

    fmpz_mat_t A;
    fmpz_mat_init(A, num_row, num_col);

    random_fmpz_mat(A, entry_size);

    clock_t start = clock();
    fmpz den;
    fmpz_init(&den);
    fmpz_mat_rref(A, &den, A);
    clock_t end = clock();

    double time_taken = (double)(end - start) / CLOCKS_PER_SEC;
    printf("Time taken: %f seconds\n", time_taken);

    fmpz_mat_clear(A);

    return 0;
}

Change the parameter between 20 and 40 at the commented out line.

@albinahlback
Copy link
Collaborator

I would guess fmpz_mat_rref is suboptimal in the sense that it does not take the entry sizes into consideration. I don't know about the complexity of the operation, but it looks like the the entry sizes where not taken into consideration.

@albinahlback
Copy link
Collaborator

I suppose @fredrik-johansson is the expert here. Perhaps he can answer when he is back from vacation and has time.

@fredrik-johansson
Copy link
Collaborator

The algorithm selection in fmpz_mat_rref for switching between rref_fflu and rref_mul currently only looks at the dimension of the matrix and doesn't account for big input entries. More importantly, it doesn't account for small output, e.g. the identity matrix, where a single mod p suffices to determine the rref. Note that if you take 41 instead of 40 columns so that the rref is nontrivial, things are much slower.

@user202729
Copy link
Contributor Author

(for what it's worth, the context is the current algorithm in SageMath is slow in certain cases sagemath/sage#39204 and I think the easiest way is to switch to flint entirely — but flint's algorithm is slower sometimes)

@edgarcosta
Copy link
Member

I have the idea that one of the best approaches for rref over the rationals (or a number field) is a multimodular one.
This might be easier to prototype in sage than in flint.

@fredrik-johansson
Copy link
Collaborator

fmpz_mat_rref_mul is a multimodular algorithm.

@edgarcosta
Copy link
Member

oopsidaisy :)

Nonetheless, they should probably be tuned on a hardware-level.

While we should certainly try to provide this option I think most users using flint use it via some distribution system, and thus one can't tune the parameters depending on the hardware of the final user.

vbraun pushed a commit to vbraun/sage that referenced this issue Apr 4, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 5, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 7, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 10, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 13, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 13, 2025
sagemathgh-39733: Make rational matrix rref default to flint_multimodular, add suboptions for flint algorithm
    
Because one of the algorithms used by flint is multimodular, it ought to
be faster than the implementation in Python.

At least after we upgrade to a version after
flintlib/flint#2129 .

(p/s: if someone uses the old version, the current choice of flint might
be slower in some cases, see the linked issue. An alternative which is
likely always faster is to explicitly use the multimodular algorithm in
flint. Do you think the current implementation is fine, or should we
provide an explicit `flint_multimodular` option instead?)

Fixes sagemath#39197

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->


sagemath#39204
    
URL: sagemath#39733
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 18, 2025
sagemathgh-39204: Speed up multimodular algorithm in bad case
    
**Edit**: Now flint has fixed
flintlib/flint#2129 , it should be better to
just switch to flint entirely — according to
flintlib/flint#2129 (comment) ,
one of the possible algorithms by flint is multimodular, which should be
faster than or equal to what we're having now. If it is slower in any
case, bug can be reported upstream.

------

Related to sagemath#39197.

Technically the algorithm doesn't deviate from @williamstein  's
original book; however the original book doesn't say *how many*
additional primes to add each time.

The original implementation roughly consider 3 more primes each time.
This can be highly inefficient when there are more columns than rows,
which makes the result's height much higher than the guess.

This increases the length of `M` by roughly a factor of `1.2` each time.
Worst case it makes the algorithm slower by a (hopefully small) constant
factor.

For the added test case, it appears to improve the performance.
(Originally takes 40s, now takes <10s on my machine)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39204
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this issue Apr 18, 2025
sagemathgh-39733: Make rational matrix rref default to flint_multimodular, add suboptions for flint algorithm
    
Because one of the algorithms used by flint is multimodular, it ought to
be faster than the implementation in Python.

At least after we upgrade to a version after
flintlib/flint#2129 .

(p/s: if someone uses the old version, the current choice of flint might
be slower in some cases, see the linked issue. An alternative which is
likely always faster is to explicitly use the multimodular algorithm in
flint. Do you think the current implementation is fine, or should we
provide an explicit `flint_multimodular` option instead?)

Fixes sagemath#39197

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->


sagemath#39204
    
URL: sagemath#39733
Reported by: user202729
Reviewer(s): Travis Scrimshaw
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants