# Running Benchmarks
Tessera ships with a benchmark suite that compares its transpiler output against Qiskit's across a set of standard circuits. It measures gate count, circuit depth, transpile time, and simulation correctness. Results are tracked in `benchmark_store.json`, and historical runs are documented in `benchmarks/benchmarks.md`.
## Prerequisites
Benchmarks require the dev dependencies, including `qiskit-aer` for simulation. If you haven't already:

```shell
pip install -e .[dev]
```

## Running the Benchmarks
Dry run. View results without writing anything:
```shell
python benchmarks/benchmarks.py
```

Write results to `benchmark_store.json`:
```shell
python benchmarks/benchmarks.py --write
```

This refuses to write if any metric on any circuit regressed against the stored ceiling, or if any simulation distribution mismatched Qiskit's.
Write results even if some metrics regressed:
```shell
python benchmarks/benchmarks.py --allow-loosen
```

This accepts intentional regressions, for example trading gate count for circuit depth. The `best_max_*` watermarks are still preserved as monotonic minimums, so the prior best is never lost.
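The interplay between the ceiling and the watermark can be sketched roughly as follows. This is a hypothetical helper for illustration, not the actual code in `benchmarks.py`, and it shows only the gate-count metric:

```python
def update_store_entry(entry, current_gates, allow_loosen=False):
    """Sketch of how --write / --allow-loosen might update one metric.

    entry: dict with 'max_gates' (ceiling) and 'best_max_gates' (watermark).
    Returns False if the write is refused because the metric regressed.
    """
    regressed = current_gates > entry["max_gates"]
    if regressed and not allow_loosen:
        return False  # plain --write refuses regressions
    # The ceiling tracks the latest accepted run (it may loosen upward).
    entry["max_gates"] = current_gates
    # The watermark only ever ratchets downward, regardless of flags.
    entry["best_max_gates"] = min(entry["best_max_gates"], current_gates)
    return True
```

Note that even when `--allow-loosen` raises the ceiling, the `min()` on the last line keeps the best-ever value intact.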
## Understanding the Output
Per-circuit results table:
```
Metric              Tessera   Qiskit
---------------------------------------------
Gate Count          31        27
Circuit Depth       15        11
Transpile Time (s)  0.0011    0.0060
Simulation Match    True
```

Summary table: After all circuits run, a summary table shows all four circuits side by side with gate count, depth, and simulation match at a glance.
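The Simulation Match metric compares the measurement-count distributions of Tessera's and Qiskit's transpiled circuits. A common way to do such a check is the total variation distance between the two count dictionaries; this sketch (the tolerance and exact method in `benchmarks.py` may differ) assumes raw `counts` dicts as returned by an Aer run:

```python
def distributions_match(counts_a, counts_b, shots, tol=0.05):
    """Compare two measurement-count dicts by total variation distance.

    Returns True when the distributions agree within tolerance `tol`.
    """
    keys = set(counts_a) | set(counts_b)
    # TVD = half the sum of absolute probability differences per outcome.
    tvd = 0.5 * sum(
        abs(counts_a.get(k, 0) - counts_b.get(k, 0)) / shots for k in keys
    )
    return tvd <= tol
```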
Diff chart: Compares this run against the stored ceilings and best-seen watermarks in `benchmark_store.json`:

```
Circuit      Metric   Current   Stored   Best   Delta   Status
----------------------------------------------------------------------
QFT-like     Gates    31        31       31     0       same
Stress Test  Gates    35        35       35     0       same
```

Status values:
| Status | Meaning |
|---|---|
| same | Matches the stored ceiling exactly |
| tightened | Better than the ceiling but not a new best |
| NEW BEST! | Better than the stored best watermark |
| REGRESSED | Worse than the stored ceiling |
| no baseline | No entry in `benchmark_store.json` yet |
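The five statuses above can be derived from three numbers per metric. A minimal sketch of that classification, assuming lower is better for every metric (this helper is hypothetical, not the actual implementation):

```python
def classify(current, stored_max, stored_best):
    """Map a metric's current value against its ceiling and watermark."""
    if stored_max is None:
        return "no baseline"   # nothing stored for this circuit yet
    if current > stored_max:
        return "REGRESSED"     # worse than the ceiling
    if current < stored_best:
        return "NEW BEST!"     # beats the all-time watermark
    if current < stored_max:
        return "tightened"     # better than the ceiling, not a new best
    return "same"              # matches the ceiling exactly
```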
## benchmark_store.json
`benchmarks/benchmark_store.json` tracks two values per metric per circuit:

- `max_gates` / `max_depth`: the current ceiling. `--write` updates these to the latest run's numbers; `--allow-loosen` updates them even if they went up.
- `best_max_gates` / `best_max_depth`: the best ever seen. These only ever ratchet downward and are never overwritten with a worse value, regardless of which flag you use.
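Based on the keys described above, a single entry in the store could plausibly look like this (the exact layout and these numbers are illustrative assumptions, not the file's guaranteed schema):

```json
{
  "Bell State": {
    "max_gates": 4,
    "max_depth": 3,
    "best_max_gates": 4,
    "best_max_depth": 3
  }
}
```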
If you need to fully reset the store, for example after fixing a bug that changes the baseline, clear the file to `{}` and rerun with `--write`. The store will be repopulated fresh from the current run, with no prior history to compare against.
## Benchmark Circuits
The suite currently runs four circuits against the `FakeNairobiV2` IBM backend (7 qubits):
| Circuit | Qubits | Gates In | Description |
|---|---|---|---|
| Bell State | 2 | 4 | H + CX + measure. Baseline sanity check. |
| GHZ State | 3 | 6 | H + chain of CX gates. Tests linear routing. |
| QFT-like | 5 | 22 | Rotation-heavy with intentional duplicate Rz gates to exercise MergeRotationsPass. |
| Stress Test | 5 | 18 | Mixed gate set with frequent non-adjacent interactions. Exercises the full pipeline under load. |
## Adding a New Benchmark Circuit
Open `benchmarks/benchmarks.py` and add a circuit builder function following the same pattern as the existing ones:
```python
def make_my_circuit():
    qc = QuantumCircuit(3, 3)
    qc.h(0)
    qc.cx(0, 1)
    qc.cx(1, 2)
    qc.measure(list(range(3)), list(range(3)))
    return qc
```

Then add it to the `circuits` list in the main block:
```python
circuits = [
    ("Bell State", make_bell_state()),
    ("GHZ State", make_ghz_state()),
    ("QFT-like", make_qft_like()),
    ("Stress Test", make_stress_test()),
    ("My Circuit", make_my_circuit()),  # add here
]
```

Run once without flags to see the results, then use `--write` to add it to the store. Document it in `benchmarks/benchmarks.md` under a new run section.
## Documenting a Run
After any run that changes the stored ceilings, add a new entry to `benchmarks/benchmarks.md` under a new `### Run N` section. Include the date, what changed since the last run, the full results table, and any relevant notes. See the existing run entries in that file for the format to follow.
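The authoritative format lives in `benchmarks/benchmarks.md` itself; as a rough template only (headings, columns, and placeholders here are illustrative, not the file's required layout), an entry might look like:

```markdown
### Run N — YYYY-MM-DD

Changes since Run N-1: (what changed, e.g. a new pass or circuit).

| Circuit | Gates | Depth | Sim Match |
|---|---|---|---|
| Bell State | ... | ... | ... |

Notes: anything surprising about this run.
```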