1 of 38

Mantra: Qubit Movement-Optimized Program Generation on Zoned Neutral Atom Processors

Enhyeok Jang (Yonsei University), Youngmin Kim (Yonsei University), Hyungseok Kim (Yonsei University), Seungwoo Choi (Yonsei University), Yipeng Huang (Rutgers University), and Won Woo Ro (Yonsei University)

E-mail: {enhyeok.jang, wro}@yonsei.ac.kr, yipeng.huang@rutgers.edu

Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, March 1-5, 2025, Las Vegas, NV, USA

Session: Quantum Computing (2) at Room Willow, 11:40 – 12:00, Tuesday, Mar 4th, 2025

2 of 38

Contents

  • Background and Motivation
  • Mantra Design
  • Experimental Methodology
  • Results and Analyses
  • Conclusion

7 of 38

[Target Architecture]

  • Zoned Architecture [1] (Isolating Gate Area)
    • Long coherence times (~100s)
    • High Reconfigurability

[Potential Concerns]

  • 1Q and 2Q gates are frequently interleaved
    • Atoms must move between zones!
  • Trap speed of neutral atoms: 0.55 m/s (for trapped ions: 10 - 30 m/s)
  • Inter-zone travel overhead: 78% of runtime on average, up to 89%

[Focus]

  • Reducing Inter-zone movements of atoms

[1] Dolev Bluvstein et al. "Logical quantum processor based on reconfigurable atom arrays." Nature 626.7997 (2024): 58-65.

< Zoned Architecture >

< Runtime Breakdown on Zoned Arch. >

8 of 38

[Atom movement is not completely free.]

< Figure: logical qubits labeled 0L and 1L >

10 of 38

[Let us SWAP 2 Logical Qubits in Moving Traps …]

< Figure: SWAP of logical qubits 0L and 1L held in moving traps >

Timing annotations:

  • 150 us
  • (40 um) / (0.55 m/s) ≈ 72.7 us
  • 150 us
  • (20 um) / (0.55 m/s) ≈ 36.4 us
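The timing annotations on this slide can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the 150 us entries are AOD/SLM trap transfers and that the four steps run back to back. The names `move_time_us`, `steps_us`, and `total_us` are hypothetical and not taken from Mantra.

```python
# Illustrative timing model. The step order and the sequential-sum
# assumption are ours, not taken from the Mantra implementation.
TRAP_SPEED_M_PER_S = 0.55   # trap speed quoted on the slides
TRANSFER_US = 150.0         # assumed to be the AOD <-> SLM transfer time


def move_time_us(distance_um: float) -> float:
    """Travel time in microseconds for a move of `distance_um` micrometres."""
    # um / (m/s) = 1e-6 s = us, so no extra unit conversion is needed.
    return distance_um / TRAP_SPEED_M_PER_S


# Assumed sequence: transfer, 40 um move, transfer, 20 um move.
steps_us = [TRANSFER_US, move_time_us(40), TRANSFER_US, move_time_us(20)]
total_us = sum(steps_us)

print([round(t, 1) for t in steps_us], round(total_us, 1))
```

Under these assumptions the two 150 us entries account for most of the total, which is why avoiding trap transfers matters more than shortening the travel distance.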

12 of 38

[Atom movement is not completely free.]

  • Moving traps cannot cross each other [2].
  • A transversal CZ is realized between two atoms placed very close together.
    • One atom is in a moving trap (AOD) and the other is in a static trap (SLM) [3].
  • Atom transfer between AOD and SLM is slow: 150 us (roughly 400 times the 2Q gate pulse of 380 ns) [4].

[These constraints may be difficult to address sufficiently by low-level compilation alone.]

=> Rewrite quantum programs so that these constraints arise less often in the first place.

[2] Chen Huang et al. "ZAP: Zoned Architecture and Parallelizable Compiler for Field Programmable Atom Array." arXiv preprint arXiv:2411.14037 (2024).

[3] Wan-Hsuan Lin, Daniel Bochen Tan, and Jason Cong. "Reuse-aware compilation for zoned quantum architectures based on neutral atoms." arXiv preprint arXiv:2411.11784 (2024).

[4] Jixuan Ruan et al. "PowerMove: Optimizing Compilation for Neutral Atom Quantum Computers with Zoned Architecture." arXiv preprint arXiv:2411.12263 (2024).

14 of 38

[Rewriting Motivations and Designs]

  • Exploiting high flexibility of unitary decomposition (e.g., fermionic system simulation kernels).

V-chain

  • is typically adopted for superconducting qubits.
  • frequently interleaves 1Q and 2Q gates in its CZ-based decomposition.

=> requires frequent inter-zone movement when running on ZA. ☹

Fountain-Shaped CZ-Tree Chain

ZZ-Interaction Protocol w/o 1Q Gate

Preemptive Identical-Zoned Gate Sched.

< Existing Simulation Kernel and Its Translation >

15 of 38

[Rewriting Motivations and Designs]

  • Exploiting high flexibility of unitary decomposition (e.g., fermionic system simulation kernels).

Rewrite it as a Fountain-chain

  • Hadamard is self-adjoint.
  • Applying it an even number of times => identity operation
  • Gate cancellation => interleaving between CZs is resolved
  • Only one qubit (0L) is moved to compute all CZs.
  • No trap transfer required.

  • Decompose the Hamiltonian carefully so that inter-zone movements are infrequent from the beginning ☺.
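The Hadamard cancellation behind the Fountain-chain can be checked numerically. The sketch below is a minimal illustration, not Mantra's code. It assumes a chain of CX gates that all target qubit 0 (written as H, CZ, H on the target) and verifies that the inner Hadamards cancel, leaving one Hadamard before and one after a block of back-to-back CZs. All helper names are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
P0 = np.diag([1.0, 0.0])  # projector |0><0|
P1 = np.diag([0.0, 1.0])  # projector |1><1|


def kron_all(ops):
    """Tensor product of a list of 2x2 operators (qubit 0 is the leftmost factor)."""
    return reduce(np.kron, ops)


def single(n, q, gate):
    """`gate` acting on qubit `q` of an n-qubit register."""
    ops = [I2] * n
    ops[q] = gate
    return kron_all(ops)


def controlled(n, ctrl, tgt, gate):
    """Controlled-`gate` with control `ctrl` and target `tgt`."""
    off = [I2] * n
    off[ctrl] = P0
    on = [I2] * n
    on[ctrl] = P1
    on[tgt] = gate
    return kron_all(off) + kron_all(on)


def fountain_cx(n):
    """CX(1->0), CX(2->0), ..., CX(n-1->0): every CX targets qubit 0."""
    u = np.eye(2 ** n)
    for c in range(1, n):
        u = controlled(n, c, 0, X) @ u
    return u


def fountain_cz(n):
    """One H on qubit 0, all CZs back to back, one closing H on qubit 0."""
    u = single(n, 0, H)
    for c in range(1, n):
        u = controlled(n, c, 0, Z) @ u
    return single(n, 0, H) @ u


print(np.allclose(H @ H, I2), np.allclose(fountain_cx(4), fountain_cz(4)))
```

Because the CZs end up adjacent and all share qubit 0, the sketch is consistent with the slide's point that a single qubit can be moved to compute all of them.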

< Existing Simulation Kernel and Its Translation >

< Proposed Simulation Kernel and Its Translation >

16 of 38

[Inter-Zone Movement Complexity]

V-chain

  • 4 * ((# of Qubits) – 1) * (# of Paulis)

Fountain-chain

  • 4 * 4 * (# of Paulis)
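The two movement counts can be compared directly. The sketch below simply evaluates the formulas on this slide; the function names are hypothetical.

```python
def v_chain_moves(num_qubits: int, num_paulis: int) -> int:
    """Inter-zone movement count for the V-chain, as given on the slide."""
    return 4 * (num_qubits - 1) * num_paulis


def fountain_chain_moves(num_paulis: int) -> int:
    """Inter-zone movement count for the Fountain-chain, as given on the slide."""
    return 4 * 4 * num_paulis


# Example values only; 100 Pauli strings is an arbitrary choice.
for qubits in (4, 5, 8, 16):
    print(qubits, v_chain_moves(qubits, 100), fountain_chain_moves(100))
```

By these formulas the two counts coincide at 5 qubits. Beyond that, the V-chain count grows with the number of qubits while the Fountain-chain count does not.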

< Proposed Simulation Kernel and Its Translation >

< Existing Simulation Kernel and Its Translation >

18 of 38

[Rewriting Motivations and Designs]

  • Exploiting native Rydberg gates
  • Circuits built from many RZZ gates
    • Cost Hamiltonian in QAOA
    • QNN Ansatz
    • New Gate Protocol
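For context, the sketch below checks the textbook CX-based decomposition of RZZ and its CZ-based form. It shows where 1Q gates end up between the two 2Q gates. It does not implement the new gate protocol, whose details are not given on the slides. The convention RZZ(θ) = exp(-i θ/2 Z⊗Z) and all function names are assumptions of this example.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CX = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)  # control = qubit 0, target = qubit 1
CZ = np.diag([1, 1, 1, -1]).astype(complex)


def rz(theta):
    """RZ(theta) = exp(-i theta/2 Z)."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])


def rzz(theta):
    """RZZ(theta) = exp(-i theta/2 Z(x)Z), diagonal in the computational basis."""
    zz = np.array([1, -1, -1, 1])
    return np.diag(np.exp(-1j * theta / 2 * zz))


def rzz_via_cx(theta):
    """CX, RZ on the target, CX."""
    return CX @ np.kron(I2, rz(theta)) @ CX


def rzz_via_cz(theta):
    """Same circuit with each CX written as H, CZ, H on the target."""
    h_t = np.kron(I2, H)
    return h_t @ CZ @ h_t @ np.kron(I2, rz(theta)) @ h_t @ CZ @ h_t


theta = 0.7
print(np.allclose(rzz_via_cx(theta), rzz(theta)),
      np.allclose(rzz_via_cz(theta), rzz(theta)))
```

In the CZ-based form, the two CZs are separated by single-qubit gates, which is the interleaving pattern that forces zone changes on ZA.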

< Existing Simulation Kernel and Its Translation >

19 of 38

[Inter-Zone Movement Complexity]

CX-Based RZZ Decomp.

  • 2-Local: 8 * (# of QAOA Layers)
  • SK: 4 * (# of Qubits) * (# of QAOA Layers)

Proposed Protocol

  • 2 * (# of QAOA Layers), No Trap Transfer
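The three counts on this slide can be evaluated side by side. This is only a restatement of the formulas as code; the function names and the example sizes are hypothetical.

```python
def cx_rzz_moves_two_local(layers: int) -> int:
    """CX-based RZZ decomposition, 2-local case (formula from the slide)."""
    return 8 * layers


def cx_rzz_moves_sk(qubits: int, layers: int) -> int:
    """CX-based RZZ decomposition, SK model (formula from the slide)."""
    return 4 * qubits * layers


def proposed_moves(layers: int) -> int:
    """Proposed protocol (formula from the slide)."""
    return 2 * layers


layers, qubits = 3, 20  # arbitrary example sizes
print(cx_rzz_moves_two_local(layers),
      cx_rzz_moves_sk(qubits, layers),
      proposed_moves(layers))
```

By these formulas the proposed count is 4 times lower than the 2-local baseline and 2 * (# of qubits) times lower than the SK baseline.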

< Existing Simulation Kernel and Its Translation >

20 of 38

[Rewriting Motivations and Designs]

  • Intentionally execute some gates earlier/later
  • Basic principle: greedily execute all gates that can be processed in the zone where the qubits are currently located.
  • Gates have computational dependencies on each other.

=> Executing as many gates as possible at each moment can ensure optimal execution globally as well.
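The greedy principle can be sketched as a small scheduler. This is a simplified illustration under stated assumptions, not Mantra's scheduler. It assumes exactly two zones, places 1Q gates in "storage" and CZ gates in "entangling", treats the whole program as sitting in one zone at a time, and derives dependencies from shared qubits. All names are hypothetical.

```python
def build_deps(gates):
    """For each gate, indices of the most recent earlier gates sharing a qubit."""
    last_on_qubit, deps = {}, []
    for idx, (_name, _zone, qubits) in enumerate(gates):
        deps.append({last_on_qubit[q] for q in qubits if q in last_on_qubit})
        for q in qubits:
            last_on_qubit[q] = idx
    return deps


def program_order_switches(gates, start_zone="storage"):
    """Zone switches when gates run strictly in program order."""
    zone, switches = start_zone, 0
    for _name, gate_zone, _qubits in gates:
        if gate_zone != zone:
            zone, switches = gate_zone, switches + 1
    return switches


def greedy_zone_schedule(gates, start_zone="storage"):
    """Run every ready gate of the current zone before switching zones.

    Each gate's zone must be "storage" or "entangling".
    """
    deps = build_deps(gates)
    done, order, switches, zone = set(), [], 0, start_zone
    while len(done) < len(gates):
        progressed = True
        while progressed:  # repeat until no more gates are ready in this zone
            progressed = False
            for idx, (name, gate_zone, _qubits) in enumerate(gates):
                if idx not in done and gate_zone == zone and deps[idx] <= done:
                    done.add(idx)
                    order.append(name)
                    progressed = True
        if len(done) < len(gates):
            zone = "entangling" if zone == "storage" else "storage"
            switches += 1
    return order, switches


gates = [
    ("H q0", "storage", (0,)), ("CZ q0 q1", "entangling", (0, 1)),
    ("H q2", "storage", (2,)), ("CZ q2 q3", "entangling", (2, 3)),
    ("H q4", "storage", (4,)), ("CZ q4 q5", "entangling", (4, 5)),
]
order, switches = greedy_zone_schedule(gates)
print(program_order_switches(gates), switches, order)
```

In this toy program the three Hadamards are independent, so the greedy pass runs them all in the storage zone first and needs one zone switch instead of five.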

< Applying Gate Sched. on GHZ w/ Parallel CXs >

< Applying Gate Sched. on GHZ w/ Fountain CXs >

22 of 38

Modeling Zoned Architecture

    • Hardware Parameters from ZA [1]

    • Additional Execution Policies
      • Transferring Traps in Advance: While Q_a is coming from the storage zone, Q_b is pre-transferred to the SLM trap.
      • Placing Qubits near Zone Borders

    • Program Run Scenarios
      • Standard (No Opt.)
      • NALAC [7]
      • Mantra (Proposed)

[1] Dolev Bluvstein et al. "Logical quantum processor based on reconfigurable atom arrays." Nature 626.7997 (2024): 58-65.

[7] Yannick Stade et al. 2024. “An Abstract Model and Efficient Routing for Logical Entangling Gates on Zoned Neutral Atom Architectures.” arXiv preprint arXiv:2405.08068 (2024).
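The effect of the "Transferring Traps in Advance" policy on this slide can be illustrated with a toy latency model. The sketch assumes that the pre-transfer of Q_b overlaps completely with the travel of Q_a, so the setup time is the longer of the two rather than their sum. This overlap model, the function name, and the example travel time are assumptions of this example.

```python
def cz_setup_time_us(move_us: float, transfer_us: float = 150.0,
                     prefetch: bool = True) -> float:
    """Time until both atoms are in place for a CZ (toy model).

    With prefetch, the trap transfer is assumed to run fully in parallel
    with the inter-zone move; without it, the two are serialized.
    """
    if prefetch:
        return max(move_us, transfer_us)
    return move_us + transfer_us


move_us = 72.7  # example travel time for a 40 um move at 0.55 m/s
print(cz_setup_time_us(move_us, prefetch=False),
      cz_setup_time_us(move_us, prefetch=True))
```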

25 of 38

Molecular Simulation Circuits (UCCSD Ansatz)

    • NALAC vs. Standard: 36% lower total runtime, 33% lower LD/ST time
    • Mantra vs. Standard: 79% lower total runtime, 86% lower LD/ST time
    • LD/ST time as a share of total runtime
      • Standard (92%), NALAC (96%), Mantra (61%)
    • Mantra: constant LD/ST count per Pauli string, regardless of the molecule.
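The LD/ST shares on this slide follow from the two improvement figures if those figures are read as reductions relative to Standard. The sketch below checks that reading; the function name is hypothetical, and the 92% baseline share is taken from the slide.

```python
def ldst_share(total_reduction: float, ldst_reduction: float,
               baseline_share: float = 0.92) -> float:
    """LD/ST share of runtime after applying both reductions to Standard."""
    remaining_total = 1.0 - total_reduction
    remaining_ldst = baseline_share * (1.0 - ldst_reduction)
    return remaining_ldst / remaining_total


nalac = ldst_share(0.36, 0.33)   # NALAC: 36% total, 33% LD/ST reduction
mantra = ldst_share(0.79, 0.86)  # Mantra: 79% total, 86% LD/ST reduction
print(round(100 * nalac), round(100 * mantra))
```

NALAC cuts LD/ST time slightly less than it cuts total runtime, so its LD/ST share rises above the Standard share. Mantra cuts LD/ST time more than total runtime, so its share falls.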

26 of 38

QAOA Circuits (Maximum Cut Problem)

    • PL: Improvement of Mantra over NALAC: 79% on average (62%-89%)
      • Runtime scaling with # of qubits: nearly linear
      • Runtime scaling with # of QAOA layers: linear
    • SK: Improvement of Mantra over NALAC: 86% on average (75%-91%)
      • Runtime scaling with # of qubits: nearly quadratic
      • Runtime scaling with # of QAOA layers: linear

< Power-Law Graph >

< SK-model Graph >

28 of 38

Conclusion

    • ZA is an emerging architecture that realizes high-precision quantum computing.

    • If existing program structures are applied naively to ZA, most of the execution time may be spent traveling between zones.

    • Existing CX-based representations for static topologies can cause frequent gate interleaving on ZA.

    • To reduce 1Q/2Q gate interleaving, we propose Mantra.
      • Fountain-Shaped Tree Chain => Good for Inter-Zone Moves, Trap Transfer
      • ZZ-Interaction Protocol w/o 1Q Gate => Exploiting Native Ops of ZA
      • Preemptive Identical-Zoned Gate Sched. => Greedy Run for Same-Zone Gates

    • Without hardware modifications, Mantra reduces inter-zone moves by 68% and the # of gates by 35%, and improves fidelity by 17%.

29 of 38

Mantra Contributors

Enhyeok Jang

Seungwoo Choi

Youngmin Kim

Hyungseok Kim

Prof. Yipeng Huang

Prof. Won Woo Ro

30 of 38

Thank You for Your Time!

Q & A

31 of 38

Supplementary

    • Logical Qubit Preparation

32 of 38

Gate-Based SWAP in ZA

34 of 38

[Previous Neutral Atom Compilers]

=> Not effective enough for zoned architectures.

  • Exploiting long-distance entangling [5]: but ZA adopts transversal CZ.
  • Reducing intra-zone movements efficiently ☺ [6]: but not designed to reduce inter-zone movements.
  • Designed for ZA [7]: but could not reduce inter-zone movements.

=> Existing compilers optimize movements within zones, but do not reduce movements between zones.

Different optimizations than before are needed.

[5] Jonathan Baker et al. 2021. “Exploiting long-distance interactions and tolerating atom loss in neutral atom quantum architectures.” In 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). IEEE, 818–831.

[6] Hanrui Wang et al. 2024. “Atomique: A quantum compiler for reconfigurable neutral atom arrays.” In 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA). IEEE, 293–309.

[7] Yannick Stade et al. 2024. “An Abstract Model and Efficient Routing for Logical Entangling Gates on Zoned Neutral Atom Architectures.” arXiv preprint arXiv:2405.08068 (2024).

35 of 38

[Which GHZ Structure is good for ZA?]

GHZ w/ Fountain CXs

  • Depth Complexity: (Q + 1) ☹
  • LD/ST Complexity: 2 ☺

GHZ w/ Parallel CXs

  • Depth Complexity: Log(Q) ☺
  • LD/ST Complexity: ~ Log(Q) ☹

< Figure: LD/ST moves between the storage zone and the entangling zone >

38 of 38

[Hardware Design Considerations]

Square Array

  • Good flexibility of intra-zone movements ☺
  • LD/ST overhead is hard to scale. ☹

Thin-Rectangular Array

  • Low flexibility of intra-zone movements ☹
  • LD/ST overhead is constant. ☺

3D-Stacked Square Array

  • Good flexibility of intra-zone movements ☺
  • LD/ST overhead is constant. ☺

Efficient Design & Placement of Zones

< Figure: LD/ST moves for each array layout >