Hardware Review | Part 2 of 6

The Night We Woke Up a Dragon: Benchmarking Wukong

A 30-day deep benchmark of Origin Quantum's 72-qubit superconducting processor — finding 48 golden qubits and building a 3-layer error mitigation stack.

ALLONE Lab

Founder & Lead Researcher

February 24, 2026 | 15 min read

First Contact with Real Quantum Hardware

After establishing the thermodynamic case for quantum AI in Chapter 1, the next question was practical: what hardware do we actually use? Google's Willow and IBM's Heron lead in raw performance, but they're gated behind enterprise agreements and US export considerations. For a research lab in Tbilisi, we needed something accessible.

Origin Quantum's Wukong — a 72-qubit superconducting processor available through their cloud platform — became our machine. What followed was a 30-day benchmark campaign that taught us more about quantum noise than any textbook could.

The 72-Qubit Landscape

Wukong's specifications read well on paper:

Qubit Count: 72 computational qubits
Coherence (T2): 2.23 microseconds (NISQ baseline)
Gate Fidelity: ~99.2% single-qubit, ~97.8% two-qubit CNOT
Connectivity: Heavy-hexagonal topology with nearest-neighbor coupling

But specifications and reality diverge on quantum hardware. The T2 time of 2.23 microseconds is an average: individual qubits ranged from 0.8 to 3.1 microseconds. At 97.8% two-qubit fidelity, a circuit with 10 two-qubit gates succeeds with probability of about 0.978^10 ≈ 0.80, so roughly 20% error probability has already accumulated. At 50 gates the success probability falls to about 33%, and your signal is buried in noise.
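The arithmetic behind those numbers can be sketched in a few lines. This is a simplified model, not a description of how Wukong errors actually compose: it assumes every two-qubit gate fails independently and ignores single-qubit gates, idle decoherence, and readout error. The function name is illustrative.

```python
# Assumption: each two-qubit gate succeeds independently with probability f,
# so a circuit with n such gates is error-free with probability f ** n.
TWO_QUBIT_FIDELITY = 0.978  # CNOT fidelity quoted in the spec list


def circuit_success_probability(n_gates: int, fidelity: float = TWO_QUBIT_FIDELITY) -> float:
    """Probability that none of the n two-qubit gates introduces an error."""
    return fidelity ** n_gates


for n in (1, 10, 50):
    p = circuit_success_probability(n)
    print(f"{n:>2} gates: success {p:.1%}, accumulated error {1 - p:.1%}")
```

Under this model, 10 gates leave about 80% success probability and 50 gates leave about 33%.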

Finding the Golden Qubits

Our first major finding: not all qubits are equal. We ran Bell state circuits, the simplest possible entanglement test, across every qubit pair on the chip. The ideal result is a clean 50/50 split between the |00⟩ and |11⟩ outcomes, with nothing in |01⟩ or |10⟩. What we measured ranged from near-perfect (49.2/50.8) on the best pairs to severely degraded (38/34/14/14 across the four outcomes) on the worst.
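One simple way to score a pair from its measurement counts is the fraction of shots that land in the two correlated outcomes. The sketch below assumes the four figures in the worst case are ordered |00⟩, |11⟩, |01⟩, |10⟩ and rescales the quoted percentages to a hypothetical 1000 shots; neither detail is stated in the benchmark itself.

```python
def bell_correlation(counts: dict[str, int]) -> float:
    """Fraction of shots in the correlated outcomes '00' and '11'."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no shots recorded")
    return (counts.get("00", 0) + counts.get("11", 0)) / total


def bell_imbalance(counts: dict[str, int]) -> float:
    """Absolute difference between the '00' and '11' frequencies."""
    total = sum(counts.values())
    return abs(counts.get("00", 0) - counts.get("11", 0)) / total


# Hypothetical shot counts built from the percentages quoted in the text.
best_pair = {"00": 492, "11": 508}
worst_pair = {"00": 380, "11": 340, "01": 140, "10": 140}

for name, counts in (("best", best_pair), ("worst", worst_pair)):
    print(f"{name}: correlation {bell_correlation(counts):.2f}, "
          f"imbalance {bell_imbalance(counts):.3f}")
```

A computational-basis score like this only ranks pairs by correlation. It does not by itself certify entanglement, which would need measurements in a second basis as well.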

Readout error rates varied from 1.2% to 4.7% across the chip, with edge qubits consistently showing higher noise. Our calibration protocol identified 48 "golden qubits" with T2 above 2.0 microseconds and readout error below 2.5%. These form the usable subset for any serious experiment.
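The selection rule is just a filter over per-qubit calibration data. Here is a minimal sketch: the two thresholds come from the text, while the `QubitCalibration` record and the sample values are hypothetical and do not reflect Origin Quantum's actual calibration format.

```python
from dataclasses import dataclass

T2_MIN_US = 2.0            # threshold from the text: T2 above 2.0 microseconds
READOUT_ERROR_MAX = 0.025  # threshold from the text: readout error below 2.5%


@dataclass(frozen=True)
class QubitCalibration:
    """Hypothetical per-qubit calibration record."""
    index: int
    t2_us: float
    readout_error: float


def is_golden(q: QubitCalibration) -> bool:
    """A qubit qualifies only if it passes both thresholds."""
    return q.t2_us > T2_MIN_US and q.readout_error < READOUT_ERROR_MAX


def select_golden(calibrations: list[QubitCalibration]) -> list[int]:
    """Return the indices of all qubits that pass both thresholds."""
    return [q.index for q in calibrations if is_golden(q)]


# Made-up values spanning the ranges reported in the benchmark.
sample = [
    QubitCalibration(0, 3.1, 0.012),  # passes both thresholds
    QubitCalibration(1, 0.8, 0.047),  # fails both
    QubitCalibration(2, 2.4, 0.031),  # T2 is fine, readout too noisy
    QubitCalibration(3, 1.9, 0.015),  # readout is fine, T2 too short
]
print(select_golden(sample))
```

Because calibration drifts, a filter like this would presumably be rerun before each experiment rather than applied once.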

"Quantum computing in 2026 is like aviation in 1910 — the hardware works, but you need to understand every rivet in the engine."

The Error Mitigation Stack

The mitigation stack became our most important engineering contribution. Raw Wukong output is too noisy for meaningful computation beyond ~12 circuit layers. We built a three-layer mitigation stack that extended the usable depth to 18 effective layers:

Layer 1 — Readout Error Mitigation (REM): We calibrate per-qubit measurement error by preparing known states (|0⟩ and |1⟩) and measuring the flip rates. This builds a confusion matrix per qubit, which we invert and apply to all subsequent measurements. Improvement: +15.2% fidelity on average.
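For a single qubit, the confusion-matrix inversion can be written out by hand. The sketch below is illustrative only: the flip rates are made-up values within the 1.2% to 4.7% range reported earlier, the function names are hypothetical, and clipping negative results to zero is one common convention rather than something the stack is confirmed to do.

```python
def apply_readout_noise(true_probs, p01, p10):
    """Simulate noisy readout.

    p01 = P(read 1 | prepared 0), p10 = P(read 0 | prepared 1).
    Confusion matrix M = [[1 - p01, p10], [p01, 1 - p10]]; measured = M @ true.
    """
    t0, t1 = true_probs
    return ((1 - p01) * t0 + p10 * t1, p01 * t0 + (1 - p10) * t1)


def mitigate_readout(measured_probs, p01, p10):
    """Apply the inverse confusion matrix to measured probabilities."""
    det = 1 - p01 - p10  # determinant of M
    if abs(det) < 1e-12:
        raise ValueError("confusion matrix is singular")
    m0, m1 = measured_probs
    t0 = ((1 - p10) * m0 - p10 * m1) / det
    t1 = (-p01 * m0 + (1 - p01) * m1) / det
    # Shot noise can push the inverse slightly negative; clip and renormalise.
    t0, t1 = max(t0, 0.0), max(t1, 0.0)
    norm = t0 + t1
    return (t0 / norm, t1 / norm)


# Hypothetical flip rates measured from the |0> and |1> calibration circuits.
P01, P10 = 0.02, 0.04
noisy = apply_readout_noise((0.5, 0.5), P01, P10)
print("measured :", noisy)
print("mitigated:", mitigate_readout(noisy, P01, P10))
```

If readout errors are uncorrelated between qubits, the multi-qubit correction is the tensor product of these per-qubit inverses, which keeps the calibration cost linear in the number of qubits.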

Layer 2 — Dynamical Decoupling (XY4): When qubits sit idle while other qubits are being operated on, they decohere. XY4 dynamical decoupling inserts X-Y-X-Y gate sequences on idle qubits, which compose to identity but suppress dephasing noise. On Wukong's T2=2.23μs qubits, this extended effective coherence by 2-3x.
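The cancellation can be checked numerically in a toy model. The sketch assumes purely quasi-static dephasing: each idle interval applies the same unknown Z rotation exp(-i·phi·Z/2). It also assumes ideal, instantaneous X and Y pulses. Real noise fluctuates in time, so the suppression on hardware is partial. In this model the XY4 sequence reduces to the identity up to a global phase of -1, whatever the value of phi.

```python
import cmath

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
IDENTITY = [[1, 0], [0, 1]]
MINUS_IDENTITY = [[-1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def idle(phi):
    """Idle interval under static dephasing: exp(-i * phi * Z / 2)."""
    return [[cmath.exp(-0.5j * phi), 0], [0, cmath.exp(0.5j * phi)]]


def compose(*ops):
    """Compose operators given in time order (later ops multiply on the left)."""
    result = IDENTITY
    for op in ops:
        result = matmul(op, result)
    return result


def xy4(phi):
    """Four idle intervals interleaved with X, Y, X, Y pulses."""
    u = idle(phi)
    return compose(u, X, u, Y, u, X, u, Y)


def free_evolution(phi):
    """The same four idle intervals with no pulses applied."""
    u = idle(phi)
    return compose(u, u, u, u)


def max_abs_diff(a, b):
    """Largest element-wise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


phi = 0.3  # arbitrary dephasing angle per idle interval
print("XY4 deviation from -I :", max_abs_diff(xy4(phi), MINUS_IDENTITY))
print("free deviation from I :", max_abs_diff(free_evolution(phi), IDENTITY))
```

Without the pulses, the accumulated phase 4·phi stays on the qubit. With them, each X or Y flips the sign of the phase picked up in the next interval, so the contributions cancel in pairs.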

Layer 3 — Zero-Noise Extrapolation (ZNE): Using Mitiq, we run each circuit at 3 artificially inflated noise levels (1x, 2x, 3x), then extrapolate back to zero noise using Richardson extrapolation. This costs 3x the circuit evaluations but recovers signal that would otherwise be lost.
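Richardson extrapolation over three noise levels amounts to fitting the unique quadratic through the three measured points and reading off its value at zero noise. The sketch below implements that directly in plain Python to show the mechanics. It does not call Mitiq, which ships its own inference routines, and the expectation values are invented for illustration.

```python
def richardson_extrapolate(scale_factors, expectation_values):
    """Evaluate the interpolating polynomial through the points at scale 0.

    Uses Lagrange weights; for scale factors (1, 2, 3) they are (3, -3, 1).
    """
    if len(scale_factors) != len(expectation_values):
        raise ValueError("need one expectation value per scale factor")
    if len(set(scale_factors)) != len(scale_factors):
        raise ValueError("scale factors must be distinct")
    estimate = 0.0
    for i, (xi, yi) in enumerate(zip(scale_factors, expectation_values)):
        weight = 1.0
        for j, xj in enumerate(scale_factors):
            if j != i:
                weight *= (0.0 - xj) / (xi - xj)
        estimate += weight * yi
    return estimate


# Hypothetical expectation values measured at 1x, 2x and 3x noise.
scales = (1.0, 2.0, 3.0)
noisy_values = (0.73, 0.62, 0.57)
print("zero-noise estimate:", richardson_extrapolate(scales, noisy_values))
```

The alternating weights (3, -3, 1) also amplify statistical noise in the measured values, which is one reason the extrapolated estimate needs more shots than a single unmitigated run to reach the same precision.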

How Wukong Compares

Metric	Wukong	Google Willow	IBM Heron
T2 (avg)	2.23 μs	~100 μs	~200 μs
2Q Fidelity	97.8%	99.7%	99.5%
Access	Free cloud	Enterprise	Enterprise
Framework	QPanda	Cirq	Qiskit

Wukong trails in coherence and fidelity, but the free cloud access and QPanda framework made it uniquely accessible for our research timeline. For prototyping hybrid quantum-classical models, the platform proved sufficient — especially with our error mitigation stack compensating for the hardware gaps.

What We Learned

The most important lesson: quantum hardware in 2026 is a noise management problem, not a qubit count problem. A 72-qubit count means little when, as on Wukong, only 48 of those qubits pass calibration. The teams that will win are not the ones with the most qubits, but the ones who best understand and mitigate their noise.

This insight directly shaped our subsequent work on tensor compression (Chapter 4) and Born machines (Chapter 3) — we designed all circuits to stay within the 12-18 layer sweet spot we identified here.

Founder of ALLONE, quantum AI researcher from Tbilisi. Building the bridge between quantum physics and practical AI.