Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stage 0 public reference #2556

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Stage 0 public reference #2556

wants to merge 14 commits into from

Conversation

qwang98
Copy link
Collaborator

@qwang98 qwang98 commented Mar 17, 2025

This PR is a step towards passing stage-0 public witgen to the backend. It focuses on modifying the pipeline to separate witness values from public values. Additional changes to pass stage-1 public witgen is needed before public values are fully passed to the backend.

@qwang98 qwang98 requested a review from georgwiese March 17, 2025 08:57
@georgwiese
Copy link
Collaborator

I think the reason is that the second-stage witgen is run by SecondStageMachine which currently ignores updated_data.publics? If you put a panic here and run the test with RUST_BACKTRACE=1, you'll see that it finds the public while running SecondStageMachine::process. So the solution would be to collect the public there and return it in take_public_values I think, like is already done for block machine and dynamic machine.

But one other thing to note is that this code is only run in this path, when we use automatic witgen for the second phase. In practice (when we use the bus linker and are on Goldilocks or BabyBear), it would use the hand-written bus witgen, which would have to implement extracting any second-stage publics again.

So maybe the first step would be to pass stage-0 publics to the backend first?

@qwang98
Copy link
Collaborator Author

qwang98 commented Mar 18, 2025

this path

I implemented passing stage-0 publics to backend in #2502, which unfortunately can't pass CI due to some stage-1 publics tests.

For extracting second stage publics in the non-bus linker case, I assume that I would need to:

  1. Append hand-written bus witgen to stage-0 witness (as is already done).
  2. Pass this new witness to witness generator.
  3. Generate the publics using the witness generator. (There should be no new stage-1 witgen as it's just done by the hand-written bus witgen).

Am I understanding it correctly?

@georgwiese
Copy link
Collaborator

I'm a bit confused. Do you mean "second stage publics in he bus linker case" (not non-bus)?

There are two paths see this switch:

  • We are using the bus linker and are not on BN254: This is the "hand-written" bus witgen. I'm not sure if we currently even have publics in that case. But in principle, in that case, we should be returning the last row of the accumulator column as a second-stage public (so that the verifier can check whether they sum to 0 across machines). So, I think in that path we'd just add to the hand-written witgen to also return the publics.
  • For the other path, I think you're on a good track with this PR, but you'll need to do something similar to what's currently happening in block machines (for example) in SecondStageMachine.

Let me know if you think a call would be helpful :)

@qwang98
Copy link
Collaborator Author

qwang98 commented Mar 18, 2025

see this switch

Ahh yes. Sorry I meant the opposite. Your description makes sense so I'll take a stab tomorrow and request a call if I get stuck. :) Thanks!

Comment on lines 247 to +249
// Run main machine and extract columns from all machines.
let columns = MutableState::new(machines.into_iter(), &self.query_callback).run();
let (columns, _publics) =
MutableState::new(machines.into_iter(), &self.query_callback).run();
Copy link
Collaborator Author

@qwang98 qwang98 Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that _publics is not passed to the backend yet till I also implement stage-1 publics. However, the pipeline for passing publics to the backend is ready (rest of the PR).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this confuses me still: run() returns some publics, but then they are ignored and instead we use extract_publics(&columns, self.analyzed).

Is it because at this point we'd expect them to be the same?

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: f2d01a8 Previous: f5b059b Ratio
executor-benchmark/keccak 25304196723 ns/iter (± 340555011) 20837156242 ns/iter (± 342822186) 1.21

This comment was automatically generated by workflow using github-action-benchmark.

@qwang98 qwang98 changed the title Stage 1 public reference Stage 0 public reference Mar 20, 2025
@qwang98 qwang98 marked this pull request as ready for review March 20, 2025 03:35
Copy link
Collaborator

@georgwiese georgwiese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Seems like a bit of an intermediate step, but I think it makes sense and looks like we're quite close to making the switch to scalar publics everywhere.

match self {
KnownMachine::BlockMachine(m) => m.take_public_values(),
KnownMachine::DynamicMachine(m) => m.take_public_values(),
_ => BTreeMap::new(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to enumerate all machines here? And maybe change the default implementation to return an empty map?

@@ -538,6 +545,8 @@ impl<T: FieldElement> Pipeline<T> {
Pipeline {
artifact: Artifacts {
witness: Some(Arc::new(witness)),
// need to set publics to Some, or `compute_witness` will run auto witgen
publics: Some(Arc::new(BTreeMap::new())),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we can change the function so that both needs to be provided, right? Seems reasonable that anyone who knows the witness should also know the publics.

@@ -520,6 +525,8 @@ impl<T: FieldElement> Pipeline<T> {
Ok(Pipeline {
artifact: Artifacts {
witness: Some(Arc::new(witness)),
// need to set publics to Some, or `compute_witness` will run auto witgen
publics: Some(Arc::new(BTreeMap::new())),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess we need to export publics along with the witness then? Problem for a later time, but good to keep in mind.

Comment on lines +53 to +54
pub type Witness<T> = Vec<(String, Vec<T>)>;
pub type Publics<T> = BTreeMap<String, Option<T>>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it would make sense to introduce a struct for both? Whenever you need the witness, you'll also need the publics. So then the pipeline would have a field witness_and_publics: Option<Arc<WitnessAndPublics<T>>> instead of two fields.

Comment on lines 247 to +249
// Run main machine and extract columns from all machines.
let columns = MutableState::new(machines.into_iter(), &self.query_callback).run();
let (columns, _publics) =
MutableState::new(machines.into_iter(), &self.query_callback).run();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this confuses me still: run() returns some publics, but then they are ignored and instead we use extract_publics(&columns, self.analyzed).

Is it because at this point we'd expect them to be the same?

acc_witness
.entry(key)
.or_insert_with(Vec::new)
.append(&mut vec_vals);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to concatenate columns of different machines! I think this should be a assert!(acc_witness.set(key, vec_vals).is_none()).

})
.unwrap();
let columns = machine.take_witness_col_values(&self).into_iter();
let (witness_columns, public_values, _processed) = self.machines.iter().fold(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, might be a matter of taste, but I find the fold API a bit confusing, and it seems like this works as well:

        let witness_columns = {
            // We keep the already processed machines mutably borrowed so that
            // "later" machines do not try to create new rows in already processed
            // machines.
            let mut processed = vec![];
            self.machines
                .iter()
                .flat_map(|machine| {
                    let mut machine = machine.try_borrow_mut().unwrap_or_else(|_| {
                        panic!("Recursive machine dependencies while finishing machines.");
                    });
                    let columns = machine.take_witness_col_values(&self).into_iter();
                    processed.push(machine);
                    columns
                })
                .collect()
        };

        let public_values = self
            .machines
            .iter()
            .flat_map(|machine| {
                let mut machine = machine.try_borrow_mut().unwrap_or_else(|_| {
                    panic!("Recursive machine dependencies while finishing machines.");
                });
                machine.take_public_values().into_iter()
            })
            .collect();

It's less of a change to before and keeps the processed shenanigans local to the call to take_witness_col_values (which can in principle call other machines, whereas take_public_values can't, because it doesn't have access to MutableState).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants