Perform BarrierBeforeFinalMeasurements analysis in parallel (#13411)
* Use OnceLock instead of OnceCell

OnceLock is a thread-safe version of OnceCell that enables us to use
PackedInstruction from a threaded environment. There is some overhead
associated with this, primarily in memory, as the OnceLock is a larger
type than a OnceCell. But the tradeoff is worth it to start leveraging
multithreading for circuits.

Fixes #13219
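
As a rough illustration of why the swap matters, here is a minimal sketch using only the standard library. `CachedInstruction` and `read_from_threads` are hypothetical names invented for this example and are not Qiskit types; the point is only that a struct holding a `OnceLock` can be shared by reference across threads, which a `OnceCell` field would not allow.

```rust
use std::sync::OnceLock;
use std::thread;

/// Hypothetical stand-in for a type with a lazily computed, cached field.
struct CachedInstruction {
    name: String,
    // OnceLock<T> is Sync when T is Send + Sync, so a shared reference to
    // this struct may be handed to several threads at once.
    cached_len: OnceLock<usize>,
}

impl CachedInstruction {
    fn new(name: &str) -> Self {
        CachedInstruction {
            name: name.to_string(),
            cached_len: OnceLock::new(),
        }
    }

    /// Initializes the cache on first use; later calls reuse the stored value.
    fn name_len(&self) -> usize {
        *self.cached_len.get_or_init(|| self.name.len())
    }
}

/// Reads the cached value from `n_threads` scoped threads sharing one reference.
fn read_from_threads(inst: &CachedInstruction, n_threads: usize) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..n_threads)
            .map(|_| s.spawn(|| inst.name_len()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Replacing the field with `std::cell::OnceCell<usize>` should make `read_from_threads` fail to compile, because the struct would no longer be `Sync`.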

* Update twirling too

* Perform BarrierBeforeFinalMeasurements analysis in parallel

With #13410 removing the non-threadsafe structure from our circuit
representation, we're now able to read and iterate over a DAGCircuit from
multiple threads. This commit is the first small piece doing this: it
moves the analysis portion of the BarrierBeforeFinalMeasurements pass to
execute in parallel. The pass checks every node to ensure all its
descendants are either a measure or a barrier before reaching the end of
the circuit. This commit iterates over all the nodes and does the check
in parallel.
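
The per-node check described above can be sketched on a toy graph. This is an assumption-laden illustration, not the pass itself: the `Op` enum, the adjacency-list layout, and `only_final_descendants` are all hypothetical, and the output node is simply treated as acceptable.

```rust
use std::collections::HashSet;

/// Hypothetical node kinds for a toy circuit graph.
#[derive(Clone, Copy)]
enum Op {
    Gate,
    Measure,
    Barrier,
    Output,
}

/// Returns true when every descendant of `start` is a measure, a barrier,
/// or an output node. `successors[i]` lists the children of node `i`.
fn only_final_descendants(ops: &[Op], successors: &[Vec<usize>], start: usize) -> bool {
    let mut stack = successors[start].clone();
    let mut seen: HashSet<usize> = HashSet::new();
    while let Some(node) = stack.pop() {
        // Skip nodes that were already visited through another path.
        if !seen.insert(node) {
            continue;
        }
        match ops[node] {
            Op::Measure | Op::Barrier | Op::Output => {
                stack.extend(successors[node].iter().copied());
            }
            // Any other operation downstream means `start` is not final.
            Op::Gate => return false,
        }
    }
    true
}
```

Because the function only reads the graph, calls for different start nodes are independent of each other, which is what makes running the check from several threads possible.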

* Remove allocation for node scan

* Refactor pass to optimize search and set parallel threshold

This commit updates the logic in the pass to simplify the search
algorithm and improve its overall efficiency. Previously the pass would
search the entire dag for all barriers and measurements and then do a
BFS from each found node to check that all descendants are either
barriers or measurements. Then, with the set of nodes matching that
condition, a full topological sort of the dag was run, and the
topologically ordered nodes were filtered for the matching set. That
sorted set was then used for filtering.

This commit refactors this to do a reverse search from the output
nodes, which reduces the complexity of the algorithm. This new algorithm
is also conducive to parallel execution because it does a search
starting from each qubit's output node. In a test with quantum
volume circuits from 10 to 1000 qubits, which scale linearly in depth
and number of qubits, a crossover point between the parallel and serial
implementations was found around 150 qubits.
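
A simplified sketch of the reverse search and the size-based dispatch follows. It assumes each qubit is modeled as a flat list of operations (the real pass walks a DAG and also verifies descendants), and it uses scoped standard-library threads in place of rayon. `Op`, `trailing_final_ops`, and `final_ops_per_wire` are hypothetical names; only the threshold of 150 comes from the commit.

```rust
use std::thread;

const PARALLEL_THRESHOLD: usize = 150;

/// Hypothetical operation kinds on a single qubit wire.
#[derive(Clone, Copy)]
enum Op {
    Gate,
    Measure,
    Barrier,
}

/// Walks backwards from the end of a wire and collects the positions of the
/// trailing measures and barriers, returned in forward order.
fn trailing_final_ops(wire: &[Op]) -> Vec<usize> {
    let mut found = Vec::new();
    for (idx, op) in wire.iter().enumerate().rev() {
        match op {
            Op::Measure | Op::Barrier => found.push(idx),
            Op::Gate => break,
        }
    }
    found.reverse();
    found
}

/// Runs the per-wire search serially for small inputs and on scoped threads
/// once the number of wires reaches the threshold.
fn final_ops_per_wire(wires: &[Vec<Op>]) -> Vec<Vec<usize>> {
    if wires.len() >= PARALLEL_THRESHOLD {
        let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
        // Ceiling division so that every wire lands in some chunk.
        let chunk_len = (wires.len() + workers - 1) / workers;
        thread::scope(|s| {
            let handles: Vec<_> = wires
                .chunks(chunk_len)
                .map(|chunk| {
                    s.spawn(move || {
                        chunk
                            .iter()
                            .map(|w| trailing_final_ops(w))
                            .collect::<Vec<_>>()
                    })
                })
                .collect();
            // Joining in spawn order keeps results in wire order.
            handles
                .into_iter()
                .flat_map(|h| h.join().unwrap())
                .collect()
        })
    } else {
        wires.iter().map(|w| trailing_final_ops(w)).collect()
    }
}
```

Both branches return the same result; the threshold only decides whether the cost of spawning threads is likely to pay off.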

* Update crates/circuit/src/dag_circuit.rs

Co-authored-by: Raynel Sanchez <[email protected]>

* Rework logic to check using StandardInstruction

* Add comments explaining the search function

* Update crates/circuit/src/dag_circuit.rs

Co-authored-by: Raynel Sanchez <[email protected]>

---------

Co-authored-by: Raynel Sanchez <[email protected]>
mtreinish and raynelfss authored Feb 12, 2025
1 parent fb7648e commit 9ca951c
Showing 2 changed files with 129 additions and 32 deletions.
135 changes: 106 additions & 29 deletions crates/accelerate/src/barrier_before_final_measurement.rs
@@ -10,17 +10,17 @@
 // copyright notice, and modified files need to carry a notice indicating
 // that they have been altered from the originals.

-use hashbrown::HashSet;
 use pyo3::prelude::*;
+use rayon::prelude::*;
 use rustworkx_core::petgraph::stable_graph::NodeIndex;

 use qiskit_circuit::circuit_instruction::ExtraInstructionAttributes;
 use qiskit_circuit::dag_circuit::{DAGCircuit, NodeType};
-use qiskit_circuit::operations::{Operation, StandardInstruction};
+use qiskit_circuit::operations::{OperationRef, StandardInstruction};
 use qiskit_circuit::packed_instruction::{PackedInstruction, PackedOperation};
 use qiskit_circuit::Qubit;

-static FINAL_OP_NAMES: [&str; 2] = ["measure", "barrier"];
+const PARALLEL_THRESHOLD: usize = 150;

 #[pyfunction]
 #[pyo3(signature=(dag, label=None))]
@@ -29,39 +29,116 @@ pub fn barrier_before_final_measurements(
     dag: &mut DAGCircuit,
     label: Option<String>,
 ) -> PyResult<()> {
-    let is_exactly_final = |inst: &PackedInstruction| FINAL_OP_NAMES.contains(&inst.op.name());
-    let final_ops: HashSet<NodeIndex> = dag
-        .op_nodes(true)
-        .filter_map(|(node, inst)| {
-            if !is_exactly_final(inst) {
-                return None;
-            }
-            dag.bfs_successors(node)
-                .all(|(_, child_successors)| {
-                    child_successors.iter().all(|suc| match dag[*suc] {
-                        NodeType::Operation(ref suc_inst) => is_exactly_final(suc_inst),
+    // Get a list of the node indices which are final measurement or barriers that are ancestors
+    // of a given qubit's output node.
+    let find_final_nodes = |[_in_index, out_index]: &[NodeIndex; 2]| -> Vec<NodeIndex> {
+        // Next nodes is the stack of parent nodes to investigate. It starts with any predecessors
+        // of a qubit's output node that are Barrier or Measure
+        let mut next_nodes: Vec<NodeIndex> = dag
+            .quantum_predecessors(*out_index)
+            .filter(|index| {
+                let node = &dag[*index];
+                match node {
+                    NodeType::Operation(inst) => {
+                        if let OperationRef::StandardInstruction(op) = inst.op.view() {
+                            if matches!(
+                                op,
+                                StandardInstruction::Measure | StandardInstruction::Barrier(_)
+                            ) {
+                                dag.bfs_successors(*index).all(|(_, child_successors)| {
+                                    child_successors.iter().all(|suc| match &dag[*suc] {
+                                        NodeType::Operation(suc_inst) => match suc_inst.op.view() {
+                                            OperationRef::StandardInstruction(suc_op) => {
+                                                matches!(
+                                                    suc_op,
+                                                    StandardInstruction::Measure
+                                                        | StandardInstruction::Barrier(_)
+                                                )
+                                            }
+                                            _ => false,
+                                        },
+                                        _ => true,
+                                    })
+                                })
+                            } else {
+                                false
+                            }
+                        } else {
+                            false
+                        }
+                    }
+                    _ => false,
+                }
+            })
+            .collect();
+        let mut nodes: Vec<NodeIndex> = Vec::new();
+        // Reverse traverse the dag from next nodes until we encounter no more barriers or measures
+        while let Some(node_index) = next_nodes.pop() {
+            // If node on the stack is a barrier or measure we can add it to the output list
+            if node_index != *out_index
+                && dag.bfs_successors(node_index).all(|(_, child_successors)| {
+                    child_successors.iter().all(|suc| match &dag[*suc] {
+                        NodeType::Operation(suc_inst) => match suc_inst.op.view() {
+                            OperationRef::StandardInstruction(suc_op) => matches!(
+                                suc_op,
+                                StandardInstruction::Measure | StandardInstruction::Barrier(_)
+                            ),
+                            _ => false,
+                        },
                         _ => true,
                     })
                 })
-                .then_some(node)
-        })
-        .collect();
+            {
+                nodes.push(node_index);
+            }
+            // For this node if any parent nodes are barrier or measure add those to the stack
+            for pred in dag.quantum_predecessors(node_index) {
+                match &dag[pred] {
+                    NodeType::Operation(inst) => {
+                        if let OperationRef::StandardInstruction(op) = inst.op.view() {
+                            if matches!(
+                                op,
+                                StandardInstruction::Measure | StandardInstruction::Barrier(_)
+                            ) {
+                                next_nodes.push(pred)
+                            }
+                        }
+                    }
+                    _ => continue,
+                }
+            }
+        }
+        nodes.reverse();
+        nodes
+    };
+
+    let final_ops: Vec<NodeIndex> =
+        if dag.num_qubits() >= PARALLEL_THRESHOLD && crate::getenv_use_multiple_threads() {
+            dag.qubit_io_map()
+                .par_iter()
+                .flat_map(find_final_nodes)
+                .collect()
+        } else {
+            dag.qubit_io_map()
+                .iter()
+                .flat_map(find_final_nodes)
+                .collect()
+        };
+
     if final_ops.is_empty() {
         return Ok(());
     }
-    let ordered_node_indices: Vec<NodeIndex> = dag
-        .topological_op_nodes()?
-        .filter(|node| final_ops.contains(node))
-        .collect();
-    let final_packed_ops: Vec<PackedInstruction> = ordered_node_indices
+    let final_packed_ops: Vec<PackedInstruction> = final_ops
         .into_iter()
-        .map(|node| {
-            let NodeType::Operation(ref inst) = dag[node] else {
-                unreachable!()
-            };
-            let res = inst.clone();
-            dag.remove_op_node(node);
-            res
+        .filter_map(|node| match dag.dag().node_weight(node) {
+            Some(weight) => {
+                let NodeType::Operation(_) = weight else {
+                    return None;
+                };
+                let res = dag.remove_op_node(node);
+                Some(res)
+            }
+            None => None,
         })
         .collect();
     let qargs: Vec<Qubit> = (0..dag.num_qubits() as u32).map(Qubit).collect();
26 changes: 23 additions & 3 deletions crates/circuit/src/dag_circuit.rs
@@ -60,8 +60,8 @@ use rustworkx_core::petgraph::visit::{
 };
 use rustworkx_core::petgraph::Incoming;
 use rustworkx_core::traversal::{
-    ancestors as core_ancestors, bfs_successors as core_bfs_successors,
-    descendants as core_descendants,
+    ancestors as core_ancestors, bfs_predecessors as core_bfs_predecessors,
+    bfs_successors as core_bfs_successors, descendants as core_descendants,
 };

 use std::cmp::Ordering;
@@ -4830,6 +4830,12 @@ def _format(operand):
 }

 impl DAGCircuit {
+    /// Returns an immutable view of the qubit io map
+    #[inline(always)]
+    pub fn qubit_io_map(&self) -> &[[NodeIndex; 2]] {
+        &self.qubit_io_map
+    }
+
     /// Returns an immutable view of the inner StableGraph managed by the circuit.
     #[inline(always)]
     pub fn dag(&self) -> &StableDiGraph<NodeType, Wire> {
@@ -5639,7 +5645,11 @@ impl DAGCircuit {
     /// Remove an operation node n.
     ///
     /// Add edges from predecessors to successors.
-    pub fn remove_op_node(&mut self, index: NodeIndex) {
+    ///
+    /// # Returns
+    ///
+    /// The removed [PackedInstruction] is returned
+    pub fn remove_op_node(&mut self, index: NodeIndex) -> PackedInstruction {
         let mut edge_list: Vec<(NodeIndex, NodeIndex, Wire)> = Vec::new();
         for (source, in_weight) in self
             .dag
@@ -5664,6 +5674,7 @@ impl DAGCircuit {
             Some(NodeType::Operation(packed)) => {
                 let op_name = packed.op.name();
                 self.decrement_op(op_name);
+                packed
             }
             _ => panic!("Must be called with valid operation node!"),
         }
@@ -5688,6 +5699,15 @@ impl DAGCircuit {
         core_bfs_successors(&self.dag, node).filter(move |(_, others)| !others.is_empty())
     }

+    /// Returns an iterator of tuples of (DAGNode, [DAGNodes]) where the DAGNode is the current node
+    /// and [DAGNode] is its predecessors in BFS order.
+    pub fn bfs_predecessors(
+        &self,
+        node: NodeIndex,
+    ) -> impl Iterator<Item = (NodeIndex, Vec<NodeIndex>)> + '_ {
+        core_bfs_predecessors(&self.dag, node).filter(move |(_, others)| !others.is_empty())
+    }
+
     fn pack_into(&mut self, py: Python, b: &Bound<PyAny>) -> Result<NodeType, PyErr> {
         Ok(if let Ok(in_node) = b.downcast::<DAGInNode>() {
             let in_node = in_node.borrow();