Computing echelon form of smaller matrix takes longer than larger matrix? #2129

user202729 · 2024-12-25T07:10:42Z

Use flint's algorithm through SageMath's interface (which uses fmpq_mat_rref under the hood).

sage: entry_size = 10000
....: num_col = 20
....: num_row = 20
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
1.3925843238830566
sage: entry_size = 10000
....: num_col = 40
....: num_row = 40
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
0.022356271743774414

Why is it that in the first case with a 20 × 20 matrix it takes >1 second while in the second case it is instant?

In both cases the matrix is invertible, thus the echelon form is identity.

(larger context: sagemath/sage#39197 , if flint is the fastest in all cases it would be easiest to just switch to flint, but this is not the case at the moment)

The text was updated successfully, but these errors were encountered:

albinahlback · 2024-12-25T17:06:49Z

Can you replicate this in C, that is, make a MWE? I am not familiar with how Python works under the hood.

albinahlback · 2024-12-25T17:10:48Z

Perhaps the tuning parameters could be changed for fmpz_mat_rref. Nonetheless, they should probably be tuned on a hardware-level.

user202729 · 2024-12-26T02:27:22Z

Sure:

#include <flint/flint.h>
#include <flint/fmpz.h>
#include <flint/fmpz_mat.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void random_fmpz_mat(fmpz_mat_t mat, slong entry_size) {
    slong i, j;
    flint_rand_t state;
    flint_randinit(state);
    for (i = 0; i < fmpz_mat_nrows(mat); i++) {
        for (j = 0; j < fmpz_mat_ncols(mat); j++) {
            fmpz* tmp = fmpz_mat_entry(mat, i, j);
            fmpz_randbits(tmp, state, entry_size);
            fmpz_add_ui(tmp, tmp, 1);
        }
    }
}

int main() {
    slong entry_size = 10000;
    slong num_col = 20, num_row = 20;
    //slong num_col = 40, num_row = 40;

    fmpz_mat_t A;
    fmpz_mat_init(A, num_row, num_col);

    random_fmpz_mat(A, entry_size);

    clock_t start = clock();
    fmpz den;
    fmpz_init(&den);
    fmpz_mat_rref(A, &den, A);
    clock_t end = clock();

    double time_taken = (double)(end - start) / CLOCKS_PER_SEC;
    printf("Time taken: %f seconds\n", time_taken);

    fmpz_mat_clear(A);

    return 0;
}

Change the parameter between 20 and 40 at the commented out line.

albinahlback · 2024-12-26T16:28:56Z

I would guess fmpz_mat_rref is suboptimal in the sense that it does not take the entry sizes into consideration. I don't know about the complexity of the operation, but it looks like the the entry sizes where not taken into consideration.

albinahlback · 2024-12-26T16:30:04Z

I suppose @fredrik-johansson is the expert here. Perhaps he can answer when he is back from vacation and has time.

fredrik-johansson · 2024-12-30T11:19:53Z

The algorithm selection in fmpz_mat_rref for switching between rref_fflu and rref_mul currently only looks at the dimension of the matrix and doesn't account for big input entries. More importantly, it doesn't account for small output, e.g. the identity matrix, where a single mod p suffices to determine the rref. Note that if you take 41 instead of 40 columns so that the rref is nontrivial, things are much slower.

user202729 · 2025-01-02T02:48:37Z

(for what it's worth, the context is the current algorithm in SageMath is slow in certain cases sagemath/sage#39204 and I think the easiest way is to switch to flint entirely — but flint's algorithm is slower sometimes)

edgarcosta · 2025-01-06T18:39:57Z

I have the idea that one of the best approaches for rref over the rationals (or a number field) is a multimodular one.
This might be easier to prototype in sage than in flint.

fredrik-johansson · 2025-01-06T20:16:38Z

fmpz_mat_rref_mul is a multimodular algorithm.

edgarcosta · 2025-01-06T21:43:35Z

oopsidaisy :)

Nonetheless, they should probably be tuned on a hardware-level.

While we should certainly try to provide this option I think most users using flint use it via some distribution system, and thus one can't tune the parameters depending on the hardware of the final user.

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

sagemathgh-39733: Make rational matrix rref default to flint_multimodular, add suboptions for flint algorithm Because one of the algorithms used by flint is multimodular, it ought to be faster than the implementation in Python. At least after we upgrade to a version after flintlib/flint#2129 . (p/s: if someone uses the old version, the current choice of flint might be slower in some cases, see the linked issue. An alternative which is likely always faster is to explicitly use the multimodular algorithm in flint. Do you think the current implementation is fine, or should we provide an explicit `flint_multimodular` option instead?) Fixes sagemath#39197 ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    sagemath#39204 URL: sagemath#39733 Reported by: user202729 Reviewer(s): Travis Scrimshaw

@williamstein

sagemathgh-39204: Speed up multimodular algorithm in bad case **Edit**: Now flint has fixed flintlib/flint#2129 , it should be better to just switch to flint entirely — according to flintlib/flint#2129 (comment) , one of the possible algorithms by flint is multimodular, which should be faster than or equal to what we're having now. If it is slower in any case, bug can be reported upstream. ------ Related to sagemath#39197. Technically the algorithm doesn't deviate from @williamstein 's original book; however the original book doesn't say *how many* additional primes to add each time. The original implementation roughly consider 3 more primes each time. This can be highly inefficient when there are more columns than rows, which makes the result's height much higher than the guess. This increases the length of `M` by roughly a factor of `1.2` each time. Worst case it makes the algorithm slower by a (hopefully small) constant factor. For the added test case, it appears to improve the performance. (Originally takes 40s, now takes <10s on my machine) ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    URL: sagemath#39204 Reported by: user202729 Reviewer(s): Travis Scrimshaw

sagemathgh-39733: Make rational matrix rref default to flint_multimodular, add suboptions for flint algorithm Because one of the algorithms used by flint is multimodular, it ought to be faster than the implementation in Python. At least after we upgrade to a version after flintlib/flint#2129 . (p/s: if someone uses the old version, the current choice of flint might be slower in some cases, see the linked issue. An alternative which is likely always faster is to explicitly use the multimodular algorithm in flint. Do you think the current implementation is fine, or should we provide an explicit `flint_multimodular` option instead?) Fixes sagemath#39197 ### 📝 Checklist  - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies    sagemath#39204 URL: sagemath#39733 Reported by: user202729 Reviewer(s): Travis Scrimshaw

fredrik-johansson mentioned this issue Jan 10, 2025

Improve algorithm selection in fmpz_mat_rref #2144

Merged

fredrik-johansson closed this as completed in #2144 Jan 10, 2025

user202729 mentioned this issue Feb 13, 2025

Speed up multimodular algorithm in bad case sagemath/sage#39204

Merged

5 tasks

user202729 mentioned this issue Mar 18, 2025

Make rational matrix rref default to flint_multimodular, add suboptions for flint algorithm sagemath/sage#39733

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

user202729 commented Dec 25, 2024 •

edited

Loading

albinahlback commented Dec 25, 2024

albinahlback commented Dec 25, 2024

user202729 commented Dec 26, 2024

albinahlback commented Dec 26, 2024

albinahlback commented Dec 26, 2024

fredrik-johansson commented Dec 30, 2024

user202729 commented Jan 2, 2025

edgarcosta commented Jan 6, 2025

fredrik-johansson commented Jan 6, 2025

edgarcosta commented Jan 6, 2025

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

Comments

user202729 commented Dec 25, 2024 • edited Loading

albinahlback commented Dec 25, 2024

albinahlback commented Dec 25, 2024

user202729 commented Dec 26, 2024

albinahlback commented Dec 26, 2024

albinahlback commented Dec 26, 2024

fredrik-johansson commented Dec 30, 2024

user202729 commented Jan 2, 2025

edgarcosta commented Jan 6, 2025

fredrik-johansson commented Jan 6, 2025

edgarcosta commented Jan 6, 2025

user202729 commented Dec 25, 2024 •

edited

Loading