1 of 16

CSLib: A Foundation for �Computer Science in Lean 4

Clark Barrett

Stanford & Amazon

Swarat Chaudhuri

Google DeepMind & UT Austin

Leonardo de Moura

LeanFRO & Amazon

Jim Grundy

Amazon

Pushmeet Kohli

Google DeepMind

2 of 16

What is CSLib?

CSLib: An open-source Lean 4 library, built through a community effort, formalizing a significant subset of undergraduate-level computer science.

Educators who want to inject formal methods into computer science curricula.

Target users

AI researchers who want to build AI tools for programming and reasoning

Formal methods researchers who want to verify algorithms and systems in Lean

*All clip art generated with Gemini

3 of 16

CSLib ecosystem

1. Boole: A verification-friendly intermediate language, embedded as a DSL in Lean

  • Member of Amazon’s MIDI language family
  • Comes with semantics of resource consumption (time and space complexity)
  • Can be extended in principle to handle concurrency, semantics of heaps, etc.

2. CSLib: Boole and Lean formalizations of all undergraduate-level CS knowledge

  • Algorithms and data structures in Boole, specs and proofs in Lean
  • CS-specific mathematics (e.g., complexity theory) in Lean

3. Language embeddings: Trusted translations from Rust, Python, C,... to Boole.

All code released under a permissive open-source license

4 of 16

Usage scenario: Adding to CS foundations

Theorems

Lean formalization

Lean �proof

Theorem-proving

Formalization

Models of computation, CS-relevant math, �metatheorems about classes of algorithms,...

CSLib API

5 of 16

Usage scenario: Reasoning about code

Boole code

Lean specifications

Lean �proof

Theorem-proving

Programming &

specification

Data structures, algorithms, theorems about correctness, �theorems about resource use,...

CSLib API

6 of 16

Usage scenario: Reasoning about code

Lean �proof

Python/Rust/… code + assertions

Trusted translation

Boole code

Lean verification conditions

Verification condition generation�

Theorem-proving

Programming &�specification

CSLib API

Data structures, algorithms, theorems about �correctness, theorems about resource use,...

7 of 16

CSLib knowledge base: Organization

  • All directories set up under the LeanFRO Github repo
  • Subdirectories for all main undergraduate subjects

MIT’s undergrad CS requirements

Sorting and order statistics

Data structures:

  • Elementary: Trees, queues, hash tables, BSTs,...
  • Advanced: B-trees, Fibonacci heaps,...

Dynamic programming

Greedy algorithms

Graph algorithms

Advanced topics:

  • Multithreading, Matrix algorithms, number-theoretic algorithms, approximation algorithms,...

CLRS: Partial table of contents

  • First key milestone: Formalize most of the CLRS text on algorithms and data structures

8 of 16

Example: Dijkstra's algorithm in Boole

var v : Vertex;

var Visited : [Vertex] bool;

var SP, oldSP : [Vertex] int;

var maxD: int; // Max distance

var nc: int; // Node count

const initSP: int = nc * (maxD + 1);

// Initialize `SP`.

for (x : Vertex) {

  if (x = Source) then

    SP[x] := 0

  else

    SP[x] := initSP;

}

 

// Initialize `Visited`.

for (x : Vertex) {

  Visited[x] := false;

}

 

while (∃x:Vertex : ¬Visited[x] ∧ SP[x] < initSP) {

  // Pick an unvisited vertex `v` that has the

// smallest distance from the source.

  havoc v;

  assume ¬Visited[v];

  assume SP[v] < initSP;

  assume (∀ x : Vertex :: ¬Visited[x] 

==> SP[v] ≤ SP[x]);

  Visited[v] := true;

  oldSP := SP; 

  // Update distances for neighbors of `v`.

  for (u : Vertex) {

    if (Graph[v,u] < maxDist ∧ 

oldSP[v] + Graph[v,u] < oldSP[u]) then

      SP[u] := oldSP[v] + Graph[v,u]

    else

      SP[u] := oldSP[u];

  }

}

Credit: Joe Hendrix and Shilpi Goel

9 of 16

Dijkstra's algorithm: Lean Specification

structure Graph where

/-- Node count -/

nc : Nat

/-- Maximum allowed distance -/

maxDist : Nat

/– Edge distances. Any edge greater than `maxDist` � is treated as not present.

-/

mat : Vector (Vector Nat nc) nc

deriving Inhabited

abbrev Graph.Vertex (g : Graph):= Fin g.nc

variable {g : Graph}

def Graph.weight (u v : g.Vertex) : Nat := g.mat[u][v]

notation "d[" i ", " j "]" => Graph.weight i j

/-- Constant guaranteed to be larger than any maximum �edge path. -/

def Graph.inf (g : Graph) : Nat := g.maxDist + 1

/-- Constant guaranteed to be larger than any shortest path �in the graph -/

def Graph.infPath (g : Graph) : Nat := g.nc * g.inf

variable (hg₀ : 0 < g.nc)

variable (hg₁ : ∀ i j : g.Vertex, i ≠ j → d[i, j] > 0)

variable (hg₂ : ∀ i j : g.Vertex, i = j → d[i, j] = 0)

variable (SOURCE : g.Vertex)

structure ShortestPath (g : Graph) where

vec : Vector Nat g.nc

deriving Inhabited

def ShortestPath.get (SP : @ShortestPath g) (v : g.Vertex) := SP.vec[v.val]

variable (SP : ShortestPath g)

notation SP "[" v "]" => ShortestPath.get SP v

class ShortestPath.WellFormed where

source_correct : SP[SOURCE] = 0

is_metric : ∀ z y : g.Vertex,

(SP[y] < g.infPath ∧ d[y, z] < g.inf) →

SP[z] ≤ SP[y] + d[y, z]

is_consistent :

∀ z : g.Vertex, z ≠ SOURCE ∧ SP[z] < g.infPath →

∃ y : g.Vertex, y ≠ z ∧ SP[z] = SP[y] + d[y, z]

10 of 16

CSLib and AI

Autoformalization

“Dijkstra’s shortest-path�Algorithm in Lean 4”

Early access to Alphaproof and related tools from GDM

Proof synthesis

def is_dijkstra_solution (sp_map : Vertex → ℝ) : Prop :=

(sp_map SOURCE = 0) ∧

(∀ z y : Vertex, (sp_map y < INF ∧ d y z < INF) → � sp_map z ≤ sp_map y + d y z) ∧

(∀ z : Vertex, z ≠ SOURCE ∧ sp_map z < INF → � ∃ y : Vertex, y ≠ z ∧ sp_map z = sp_map y + d y z)

theorem dijkstra_implementation_is_correct

(h_nonneg_weights : ∀ u v, d u v ≥ 0) :

is_dijkstra_solution(run_dijkstra d SOURCE) := sorry

proof

11 of 16

AI-guided workflow

CSLib

  • Use CSLib to seed AI tools for formalization and proof
  • Use formalization and proof tools to grow CSLib
  • Expert human vetting to ensure quality at all points.

12 of 16

Community engagement

We aim to build a community that brings together…

  • Industry researchers
  • Professors and students
  • National labs
  • Lean, formal methods, and AI safety enthusiasts.

First steps

  • Start conversations on the CSLib Zulip
  • Onboard two technical leads (postdoc-level scientists) who will drive the development
  • Create a detailed set of milestones for the next six months
  • Create working groups that focus on specific subtasks

13 of 16

Funding

Amazon and Deepmind have expressed interest in supporting two technical leads for CSLib.

  • Hosted at academic labs/nonprofits
  • Tasks: coding up CSLib, collaborating with the SC and participants across the world
  • Selection:
    • Host PIs send short nomination letters and candidate resumes to the CSLib SC by July 31, 2025
    • Selection through a review by the SC and interviews
    • Public call for technical leads soon.

In the medium term, fundraising from government agencies, philanthropists, and numerous companies.

14 of 16

Collaboration possibilities

CSLib for teaching

CSLib for research

New techniques for autoformalization, proof synthesis, proof and formalization debugging, proof optimization

Scaling up formal verification to large real-world systems

Automating theoretical computer science

Grounding AI reasoning in formal methods

HCI research on productivity boosts from automation in verification

CS ed research on the effectiveness of formal methods in teaching

15 of 16

Proposed timeline

July: Onboarding of academic and industrial partners

Early August: Definition of initial version of Boole

Mid-August: Coding kickoff!

January 2026: Release of CSLib v0.1.

16 of 16

Discussion!