Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Horner evaluation ops #1656

Open
wants to merge 10 commits into
base: next
Choose a base branch
from
Open

Conversation

Al-Kindi-0
Copy link
Collaborator

Closes #1600

Copy link
Contributor

@plafer plafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Left nits & questions


![rcomb_base](../../assets/design/stack/crypto_ops/RCOMBBASE.png)
![horner_eval_base](../../assets/design/stack/crypto_ops/HORNERBASE.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file doesn't exist yet right? I still see RCOMBASE.png but no HORNERBASE.png

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added them now

Comment on lines +264 to +279
#rcomb_base
push.0 drop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you'll update this after we merge #1658?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now working, in a separate PR against this branch, on updating Falcon and it should be done today. I think it is better if we hold off on merging this one until then.

@@ -68,7 +68,7 @@ impl AuxTraceBuilder {

debug_assert_eq!(*t_chip.last().unwrap(), E::ONE);
// TODO: Fix and re-enable after testing with miden-base
// debug_assert_eq!(*b_chip.last().unwrap(), E::ONE);
debug_assert_eq!(*b_chip.last().unwrap(), E::ONE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean miden-base tests now pass? Was the bug only with the RCOMBBASE instruction then?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests passed with the above line uncommented for RCOMBBASE and I uncommented that line to test for the correctness of the implementation for HORNER* ops.
I will put the comment back as I haven't tested with miden-base

processor/src/operations/horner_ops.rs Outdated Show resolved Hide resolved
processor/src/operations/horner_ops.rs Outdated Show resolved Hide resolved
docs/src/design/stack/crypto_ops.md Outdated Show resolved Hide resolved
The effect on the rest of the stack is:
* **No change.**

> TODO: add detailed constraint descriptions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing this in this PR?

Copy link
Collaborator Author

@Al-Kindi-0 Al-Kindi-0 Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1660
I am thinking this might be implemented directly in AirScript

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would remove this TODO from the docs, as they tend to stick around, and it makes for cleaner docs not to have any in there

processor/src/operations/horner_ops.rs Outdated Show resolved Hide resolved
processor/src/chiplets/aux_trace/mod.rs Outdated Show resolved Hide resolved
processor/src/operations/horner_ops.rs Outdated Show resolved Hide resolved
@Al-Kindi-0 Al-Kindi-0 force-pushed the al-introduce-horner-eval-ops branch from 86cddd7 to 3c6f139 Compare February 11, 2025 13:02
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left some a couple of questions and a few small comments inline.

Comment on lines +42 to +51
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+
/// | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | - | - | - | - | - |alpha_addr| acc1 | acc0 |
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+
///
///
/// Output:
///
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+
/// | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | - | - | - | - | - |alpha_addr| acc1' | acc0' |
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: the order of coefficients seems to be reversed here as compared as how we usually put data on the stack (i.e., least significant element is deep in the stack) - does this matter here?

Specifically, let's say we have memory populated as follows starting at address 0:

[c0, c1, c2, ... c15]

Thus, c15 is the value stored at memory address 15.

So, executing mem_stream operation would give us the following results

# => empty stack

mem_stream
# => [c7, c6, c5, c4, c3, c2, c1, c0, 0, 0, 0, 0, 8, ...]

mem_stream
# => [c15, c14, c13, c12, c11, c10, c9, c8, 0, 0, 0, 0, 16, ...]

Does this arrangement still work fine if c0, ..., c15 contain polynomial coefficients (probably in reverse order)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this here and in other places, but the general idea is that for Horner evaluation to work we have to have the coefficients in reverse order. Using your example, and for both possible layouts for each batch of 8 base coefficients, there is no way, to me at least, to make a single instruction working on chunks of 8 coefficients that can compute the evaluation of that polynomial in the natural order.

The solution we came up with is to accept that in almost all situations the order of data is [c0, c1, c2, ... c15] , and hence not the right order for Horner evaluation of the polynomial $P(X) = c_{15} X^{15} + \cdots + c_0$, but it is the right Horner-order to evaluate $P(X) = c_{0} X^{15} + \cdots + c_{15}$ (at $\alpha^{-1}$).

The cost for this mismatch is very minimal, a multiplication and an inversion. For an example of this, see the Falcon improvement PR using horner_eval_base.

Comment on lines +110 to +119
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+
/// | c0_1 | c0_0 | c1_1 | c1_0 | c2_1 | c2_0 | c3_1 | c3_0 | - | - | - | - | - |alpha_addr| acc1 | acc0 |
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+
///
///
/// Output:
///
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+
/// | c0_1 | c0_0 | c1_1 | c1_0 | c2_1 | c2_0 | c3_1 | c3_0 | - | - | - | - | - |alpha_addr| acc1' | acc0' |
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+
Copy link
Contributor

@bobbinth bobbinth Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above re the order of elements on the stack.

Comment on lines 194 to 195
/// Returns randomness.
fn get_evaluation_point(&mut self) -> Result<QuadFelt, ExecutionError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably change the comment here because evaluation point does not need to be a random value (e.g., for some use cases, we may be just evaluating a polynomial at non-random values).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed that in the last feedback round. Fixed now

Comment on lines 65 to 86
// --- read the values of the coefficients, over the base field, from the stack -----------
let coef = self.get_coeff_as_base_elements();

// --- read the evaluation point alpha from memory ----------------------------------------
let alpha = self.get_evaluation_point()?;

// --- read the accumulator values from stack ---------------------------------------------
let acc_old = self.get_accumulator();

// --- compute the updated accumulator values ---------------------------------------------
let acc_new =
coef.iter().rev().fold(acc_old, |acc, coef| QuadFelt::from(*coef) + alpha * acc);

// --- copy the stack ---------------------------------------------------------
self.stack.copy_state(0);

// --- update the accumulators ------------------------------------------------------------
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]);
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]);

// --- set the helper registers -----------------------------------------------------------
self.populate_helper_registers(alpha);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: given the relative simplicity of this function, I probably wouldn't use dashes in the comments and maybe would even skip some of the trivial ones. For example, it could look something like this:

// read the values of the coefficients, over the base field, from the stack
let coef = self.get_coeff_as_base_elements();

let alpha = self.get_evaluation_point()?;
let acc_old = self.get_accumulator();

// compute the updated accumulator values
let acc_new =
    coef.iter().rev().fold(acc_old, |acc, coef| QuadFelt::from(*coef) + alpha * acc);

// copy over the stack state to the next cycle changing only the accumulator values
self.stack.copy_state(0);
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]);
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]);

// set the helper registers
self.populate_helper_registers(alpha);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Comment on lines 133 to 153
// --- read the values of the coefficients, over the extension field, from the stack ------
let coef = self.get_coeff_as_quad_ext_elements();

// --- read the evaluation point alpha from memory ----------------------------------------
let alpha = self.get_evaluation_point()?;

// --- read the accumulator values from stack ---------------------------------------------
let acc_old = self.get_accumulator();

// --- compute the updated accumulator values ---------------------------------------------
let acc_new = coef.iter().rev().fold(acc_old, |acc, coef| *coef + alpha * acc);

// --- copy the stack ---------------------------------------------------------
self.stack.copy_state(0);

// --- update the accumulators ------------------------------------------------------------
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]);
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]);

// --- set the helper registers -----------------------------------------------------------
self.populate_helper_registers(alpha);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar nit re simplify the comments as above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

alphas,
row,
eval_point_ptr,
[eval_point_0, eval_point_1, ZERO, ZERO],
Copy link
Contributor

@plafer plafer Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An insight gained from #1664 is that the RCOMBBASE request would fail to match the response by the memory chiplet when the positions 2 and 3 here are not ZERO, ZERO. This happened specifically in the falcon signature code. I didn't investigate more, but my assumption is that memory was reused (probably with the use of procedure locals), and that there are leftover garbage values in those positions.

Hence the fix is to fully read the memory word in the helper registers and include it in the bus message. This learning applies generally too: when we read a word, we must read it entirely and include it all in the bus message, since the "zeroed-out memory" assumption doesn't always hold.

Ideally, we'd add a test also to check that when there are non-zero values in positions 2 & 3, that the bus doesn't fail.

EDIT: Upon closer examination, I don't believe the values in memory from the miden-base are garbage. Since this code is changing with #1661, I will stop investigating; we can see if somehow the problem was fixed when switching to horner base eval.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, thinking more, the larger problem in miden-base seems more to be that the data in the advice provider is wrong, such that the pipe doesn't insert 0s where there should be some.

Regardless, it's probably safer to fully read the word anyway instead of assuming the memory was zero'd in those positions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants