-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Horner evaluation ops #1656
base: next
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Left nits & questions
|
||
![rcomb_base](../../assets/design/stack/crypto_ops/RCOMBBASE.png) | ||
![horner_eval_base](../../assets/design/stack/crypto_ops/HORNERBASE.png) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file doesn't exist yet right? I still see RCOMBASE.png
but no HORNERBASE.png
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added them now
#rcomb_base | ||
push.0 drop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming you'll update this after we merge #1658?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am now working, in a separate PR against this branch, on updating Falcon and it should be done today. I think it is better if we hold off on merging this one until then.
@@ -68,7 +68,7 @@ impl AuxTraceBuilder { | |||
|
|||
debug_assert_eq!(*t_chip.last().unwrap(), E::ONE); | |||
// TODO: Fix and re-enable after testing with miden-base | |||
// debug_assert_eq!(*b_chip.last().unwrap(), E::ONE); | |||
debug_assert_eq!(*b_chip.last().unwrap(), E::ONE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean miden-base
tests now pass? Was the bug only with the RCOMBBASE
instruction then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests passed with the above line uncommented for RCOMBBASE
and I uncommented that line to test for the correctness of the implementation for HORNER*
ops.
I will put the comment back as I haven't tested with miden-base
docs/src/design/stack/crypto_ops.md
Outdated
The effect on the rest of the stack is: | ||
* **No change.** | ||
|
||
> TODO: add detailed constraint descriptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we doing this in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1660
I am thinking this might be implemented directly in AirScript
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I would remove this TODO
from the docs, as they tend to stick around, and it makes for cleaner docs not to have any in there
86cddd7
to
3c6f139
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you! I left some a couple of questions and a few small comments inline.
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+ | ||
/// | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | - | - | - | - | - |alpha_addr| acc1 | acc0 | | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+ | ||
/// | ||
/// | ||
/// Output: | ||
/// | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+ | ||
/// | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | - | - | - | - | - |alpha_addr| acc1' | acc0' | | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: the order of coefficients seems to be reversed here as compared as how we usually put data on the stack (i.e., least significant element is deep in the stack) - does this matter here?
Specifically, let's say we have memory populated as follows starting at address 0:
[c0, c1, c2, ... c15]
Thus, c15
is the value stored at memory address 15.
So, executing mem_stream
operation would give us the following results
# => empty stack
mem_stream
# => [c7, c6, c5, c4, c3, c2, c1, c0, 0, 0, 0, 0, 8, ...]
mem_stream
# => [c15, c14, c13, c12, c11, c10, c9, c8, 0, 0, 0, 0, 16, ...]
Does this arrangement still work fine if c0, ..., c15
contain polynomial coefficients (probably in reverse order)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we discussed this here and in other places, but the general idea is that for Horner evaluation to work we have to have the coefficients in reverse order. Using your example, and for both possible layouts for each batch of 8 base coefficients, there is no way, to me at least, to make a single instruction working on chunks of 8 coefficients that can compute the evaluation of that polynomial in the natural order.
The solution we came up with is to accept that in almost all situations the order of data is [c0, c1, c2, ... c15]
, and hence not the right order for Horner evaluation of the polynomial
The cost for this mismatch is very minimal, a multiplication and an inversion. For an example of this, see the Falcon improvement PR using horner_eval_base
.
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+ | ||
/// | c0_1 | c0_0 | c1_1 | c1_0 | c2_1 | c2_0 | c3_1 | c3_0 | - | - | - | - | - |alpha_addr| acc1 | acc0 | | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+------+------+ | ||
/// | ||
/// | ||
/// Output: | ||
/// | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+ | ||
/// | c0_1 | c0_0 | c1_1 | c1_0 | c2_1 | c2_0 | c3_1 | c3_0 | - | - | - | - | - |alpha_addr| acc1' | acc0' | | ||
/// +------+------+------+------+------+------+------+------+---+---+---+---+---+----------+-------+-------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above re the order of elements on the stack.
/// Returns randomness. | ||
fn get_evaluation_point(&mut self) -> Result<QuadFelt, ExecutionError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably change the comment here because evaluation point does not need to be a random value (e.g., for some use cases, we may be just evaluating a polynomial at non-random values).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed that in the last feedback round. Fixed now
// --- read the values of the coefficients, over the base field, from the stack ----------- | ||
let coef = self.get_coeff_as_base_elements(); | ||
|
||
// --- read the evaluation point alpha from memory ---------------------------------------- | ||
let alpha = self.get_evaluation_point()?; | ||
|
||
// --- read the accumulator values from stack --------------------------------------------- | ||
let acc_old = self.get_accumulator(); | ||
|
||
// --- compute the updated accumulator values --------------------------------------------- | ||
let acc_new = | ||
coef.iter().rev().fold(acc_old, |acc, coef| QuadFelt::from(*coef) + alpha * acc); | ||
|
||
// --- copy the stack --------------------------------------------------------- | ||
self.stack.copy_state(0); | ||
|
||
// --- update the accumulators ------------------------------------------------------------ | ||
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]); | ||
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]); | ||
|
||
// --- set the helper registers ----------------------------------------------------------- | ||
self.populate_helper_registers(alpha); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: given the relative simplicity of this function, I probably wouldn't use dashes in the comments and maybe would even skip some of the trivial ones. For example, it could look something like this:
// read the values of the coefficients, over the base field, from the stack
let coef = self.get_coeff_as_base_elements();
let alpha = self.get_evaluation_point()?;
let acc_old = self.get_accumulator();
// compute the updated accumulator values
let acc_new =
coef.iter().rev().fold(acc_old, |acc, coef| QuadFelt::from(*coef) + alpha * acc);
// copy over the stack state to the next cycle changing only the accumulator values
self.stack.copy_state(0);
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]);
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]);
// set the helper registers
self.populate_helper_registers(alpha);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
// --- read the values of the coefficients, over the extension field, from the stack ------ | ||
let coef = self.get_coeff_as_quad_ext_elements(); | ||
|
||
// --- read the evaluation point alpha from memory ---------------------------------------- | ||
let alpha = self.get_evaluation_point()?; | ||
|
||
// --- read the accumulator values from stack --------------------------------------------- | ||
let acc_old = self.get_accumulator(); | ||
|
||
// --- compute the updated accumulator values --------------------------------------------- | ||
let acc_new = coef.iter().rev().fold(acc_old, |acc, coef| *coef + alpha * acc); | ||
|
||
// --- copy the stack --------------------------------------------------------- | ||
self.stack.copy_state(0); | ||
|
||
// --- update the accumulators ------------------------------------------------------------ | ||
self.stack.set(ACC_HIGH_INDEX, acc_new.to_base_elements()[1]); | ||
self.stack.set(ACC_LOW_INDEX, acc_new.to_base_elements()[0]); | ||
|
||
// --- set the helper registers ----------------------------------------------------------- | ||
self.populate_helper_registers(alpha); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar nit re simplify the comments as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
alphas, | ||
row, | ||
eval_point_ptr, | ||
[eval_point_0, eval_point_1, ZERO, ZERO], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An insight gained from #1664 is that the RCOMBBASE
request would fail to match the response by the memory chiplet when the positions 2 and 3 here are not ZERO, ZERO
. This happened specifically in the falcon signature code. I didn't investigate more, but my assumption is that memory was reused (probably with the use of procedure locals), and that there are leftover garbage values in those positions.
Hence the fix is to fully read the memory word in the helper registers and include it in the bus message. This learning applies generally too: when we read a word, we must read it entirely and include it all in the bus message, since the "zeroed-out memory" assumption doesn't always hold.
Ideally, we'd add a test also to check that when there are non-zero values in positions 2 & 3, that the bus doesn't fail.
EDIT: Upon closer examination, I don't believe the values in memory from the miden-base
are garbage. Since this code is changing with #1661, I will stop investigating; we can see if somehow the problem was fixed when switching to horner base eval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, thinking more, the larger problem in miden-base
seems more to be that the data in the advice provider is wrong, such that the pipe
doesn't insert 0
s where there should be some.
Regardless, it's probably safer to fully read the word anyway instead of assuming the memory was zero'd in those positions.
Closes #1600