Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve brillig execution speed by unrolling small loops #6470

Closed
TomAFrench opened this issue Nov 7, 2024 · 0 comments · Fixed by #6505
Closed

Improve brillig execution speed by unrolling small loops #6470

TomAFrench opened this issue Nov 7, 2024 · 0 comments · Fixed by #6505
Assignees

Comments

@TomAFrench
Copy link
Member

TomAFrench commented Nov 7, 2024

unconstrained fn __validate_gt_remainder(a_u60: [u64; 6]) -> [u64; 6] {
    let mut result_u60: [u64; 6] = [0; 6];

    for i in 0..6 {
        result_u60[i] = a_u60[i] + 1;
    }

    result_u60
}

Here we've got a for-loop with a small and simple body, this function compiles to:

brillig(inline) fn __validate_gt_remainder f2 {
  b0(v0: [u64; 6]):
    inc_rc v0
    inc_rc [u64 0, u64 0, u64 0, u64 0, u64 0, u64 0]
    v4 = allocate
    store [u64 0, u64 0, u64 0, u64 0, u64 0, u64 0] at v4
    jmp b1(u32 0)                                                 // loop boilerplate
  b1(v1: u32):
    v7 = lt v1, u32 6                                             // loop boilerplate
    jmpif v7 then: b3, else: b2                                   // loop boilerplate
  b3():
    v9 = load v4                                                  // loop boilerplate
    v10 = array_get v0, index v1                                  // logic
    v12 = add v10, u64 1                                          // logic
    v13 = array_set v9, index v1, value v12                       // logic
    v15 = add v1, u32 1                                           // loop boilerplate
    store v13 at v4                                               // loop boilerplate
    v16 = add v1, u32 1                                           // duplicate instruction
    jmp b1(v16)                                                   // loop boilerplate
  b2():
    v8 = load v4                                                  // loop boilerplate
    dec_rc v0
    return v8
}

We've then got 3 instructions of actual logic and 8 of boilerplate and when it comes to execution we have 3*num_iterations instructions of actual logic and 2+6*num_iterations of boilerplate. This gives us a tradeoff between bytecode size and execution speed.

We should aim to unroll any loops where loop_instructions * num_iterations <= loop_instructions + 8 => loop_instructions * (num_iterations - 1) <= 8 as this will reduce bytecode sizes. There's a decent chance that we'd get benefits for unrolling larger loops as well due to constant folding.

@github-project-automation github-project-automation bot moved this to 📋 Backlog in Noir Nov 7, 2024
@github-project-automation github-project-automation bot moved this from 📋 Backlog to ✅ Done in Noir Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants