1 of 39

Equational Theories Project

Daniel Weber, Vlad Tsyrklevich

Jan 14, 2025

2 of 39

Contributors

3 of 39

Origin of the project

  • Sept 25, 2024: Terence Tao announces a new project on his blog:

4 of 39

Definitions

  • A magma is a set M with a closed binary operation ◇ : M×M → M
  • We consider magmas with an additional constraint, an equational law such as ∀ x, y ∈ M : x ◇ y = y ◇ x
  • The project studies magmas satisfying a single equational law
  • Some example equational laws:
    • Equation 1: x = x
    • Equation 2: x = y
    • Equation 46: x ◇ y = z ◇ w
    • Equation 4512: (x ◇ y) ◇ z = x ◇ (y ◇ z)
  • We say Equation A implies Equation B, if all models of A are also models of B
  • Otherwise Equation A refutes Equation B

5 of 39

Example

  • The model on the right satisfies Equation 3: x = x ◇ x
    • 0 = 0 ◇ 0
    • 1 = 1 ◇ 1
  • The model does not satisfy Equation 4: x = x ◇ y
    • 0 0 ◇ 1
  • Hence, Equation 3 refutes Equation 4

  • However, Equation 4 implies Equation 3:
    • Proof: If x = x ◇ y for all y, set y = x to derive x = x ◇ x

0

1

0

0

1

1

1

1

6 of 39

Goal of the project

  • Goal: Determine the implication graph for all equational laws up to 4 operations
  • There are 4,694 equations with up to 4 operations (up to normalization)
  • 4694² ≈ 22 million total implications/refutations between equations of order 4
  • In practice, we require far fewer proofs to resolve the full graph
    • Many equations are equivalent, so we can reason about ~1.4k equivalence classes
    • Furthermore, our list include dual equations, which can be ‘translated’ by switching the order of the operation
    • The full implication graph is determined by ~10k theorems

7 of 39

Project organization

  • Used familiar tools: Zulip, GitHub, CI, blueprint
  • Terence Tao’s personal log
  • Project management tooling by Pietro Monticone and Shreyas Srinivas
  • @[equational_result] attribute to track results
  • Implication status: unknown → conjecture → theorem
  • Some custom tools developed for the project:

8 of 39

Dashboard

9 of 39

Equation

Explorer

10 of 39

Graphiti

11 of 39

12 of 39

Results

  • Formal proof for all implications completed on Day 9 (Oct 4)
  • Refutations were informally resolved on Day 57 (Nov 21)
  • Implication graph is conjecturally, e.g. informally, complete
    • Awaiting 3 formalizations

13 of 39

Finite graph

  • Some implications/refutations depend on whether the underlying magma is finite or infinite
  • Original aim of the project: Determine the implication graph for all magmas
  • Also possible to study the implication graph for finite magmas
  • The graphs are almost identical: <1k differences
    • All differences are where a finite implication is an infinite refutation
  • Finding finite implications is no longer first order
    • Most finite implications automatically proven using injectivity ⇔ surjectivity
    • The only implications that required human proofs were finite, using the fact that finite functions must be eventually periodic

14 of 39

Finite graph

  • A number of refutations originally determined using infinite constructions had finite counter-examples
    • Some difficult cases that required new methods to construct finite counter-examples
    • The largest finite counter-example found the project, of order 232, was originally resolved using infinite methods

15 of 39

Results: finite graph

Open question:

Does Equation 677 x = y ◇ (x ◇ ((y ◇ x) ◇ y))�imply or refute Equation 255 x = ((x ◇ x) ◇ x) ◇ x ?

16 of 39

Results

  • A paper is in the works
  • Project participants also studied a number of other related questions:
    • Which equations are equivalent to the Higman-Neumann equation characterizing division in groups?
    • Which equations have satisfying finite models of every cardinality?
    • Which equations of order 5 have satisfying finite/infinite models?
    • and more

17 of 39

18 of 39

Infrastructure

Goal: Maintaining all possible implications, and their status.

Automatically use transitivity:

  • 𝑎→𝑏 and 𝑏→𝑐 implies 𝑎→𝑐.
  • 𝑎→𝑏 and ¬(𝑎→𝑐) implies ¬(𝑏→𝑐).
  • 𝑏→𝑐 and ¬(𝑎→𝑐) implies ¬(𝑎→𝑏).

19 of 39

Infrastructure – continued

  • Positive implications: transitive closure
  • For anti-implications, we build a representation graph
    • Two copies of each law, 𝑥 and 𝑥′
    • 𝑥→𝑦 becomes 𝑦→𝑥 and 𝑦′→𝑥′
    • ¬(𝑥→𝑦) becomes the edge 𝑥→𝑦′
  • Transitive closure of the representation graph
    • A path 𝑥→𝑦′ is an anti-implication ¬(𝑥→𝑦)
    • A path 𝑦→𝑥 is the positive implication 𝑥→𝑦

20 of 39

Infrastructure – Lean Technicalities

  • Mark results relevant to the implication graph using the attribute @[equational_result].
  • Many anti-implications from a single magma.
  • Custom Facts syntax Facts M list1 list2
    • Satisfies the laws in list1
    • Refutes the laws in list2
  • Transitive closure with a dummy node
    • 𝑥→𝑀 for laws in list1
    • 𝑀→𝑥′ for laws in list2
  • Anti-implications can be proven in multiple magmas - match up magmas and anti-implications to minimize verification work.

21 of 39

Proof Automation

  • Low hanging fruit without much automation
    • Simple rewrites
    • Fixed proof scripts
    • Developed concurrently with the infrastructure
  • Two types of proof automation
    • Lean tactics: tightly integrated, produce proofs in Lean
    • ATPs: independent of Lean, soundness risk
  • Translating proofs from ATPs to Lean

22 of 39

Proof Automation – Classification

  1. ATP directly in Lean as a tactic
  2. A Lean tactic which uses an external ATP, and translates its output to a Lean proof
  3. An external script which uses the ATP to produce Lean code.
  4. Most used for new automations in this project, less user-friendly

23 of 39

Proof Automation – Vampire

Vampire is a state-of-the-art automated theorem prover for first order logic (Kovács, L., Voronkov, A.. First-Order Theorem Proving and Vampire, CAV 2013). It implements superposition calculus, a generalization of resolution for equational logic. Its main deductive step is

where 𝜃 is the most general unifier of 𝑙, 𝑠.

24 of 39

Proof Automation – Example Interaction With Vampire

  • problem.tptp

fof(lhs, axiom, X = mul(X, mul(mul(X, X), Y))).

fof(rhs, conjecture, X = mul(X, mul(X, X))).

danielweber:~$ vampire problem.tptp

1. ! [X0] : ! [X1] : mul(X0,mul(mul(X0,X0),X1)) = X0 [input]

2. ! [X0] : mul(X0,mul(X0,X0)) = X0 [input]

3. ~! [X0] : mul(X0,mul(X0,X0)) = X0 [negated conjecture 2]

4. ! [X0,X1] : mul(X0,mul(mul(X0,X0),X1)) = X0 [flattening 1]

5. ? [X0] : mul(X0,mul(X0,X0)) != X0 [ennf transformation 3]

6. ? [X0] : mul(X0,mul(X0,X0)) != X0 => sK0 != mul(sK0,mul(sK0,sK0)) [choice axiom]

7. sK0 != mul(sK0,mul(sK0,sK0)) [skolemisation 5,6]

8. mul(X0,mul(mul(X0,X0),X1)) = X0 [cnf transformation 4]

9. sK0 != mul(sK0,mul(sK0,sK0)) [cnf transformation 7]

10. mul(X0,mul(X0,X0)) = X0 [superposition 8,8]

11. sK0 != sK0 [superposition 9,10]

13. $false [trivial inequality removal 11]

100 𝑥 = 𝑥 ◇ ((𝑥 ◇ 𝑥) ◇ 𝑦)→8 𝑥 = 𝑥 ◇ (𝑥 ◇ 𝑥)

25 of 39

Proof Automation – Example Implication

  1. Suppose by contradiction (1) 𝑥 ◇ ( (𝑥 ◇ 𝑥) ◇ 𝑦 ) = 𝑥 and (∗)𝑥0 ≠ 𝑥0 ◇ (𝑥0 ◇ 𝑥0).
  2. Superpose equation (1) with itself and get (2) 𝑥 ◇ ( 𝑥 ◇ 𝑥 ) = 𝑥.
    1. 𝑥
    2. = 𝑥 ◇ ((𝑥 ◇ 𝑥) ◇ 𝑦)
    3. = 𝑥 ◇ (𝑥’ ◇ ( (𝑥’ ◇ 𝑥’) ◇ 𝑦’ )) (setting 𝑥′ ≔ 𝑥 ◇ 𝑥,𝑦 ≔ (𝑥’ ◇ 𝑥’) ◇ 𝑦’)
    4. = 𝑥 ◇ 𝑥’ (by equation (1))
    5. = 𝑥 ◇ ( 𝑥 ◇ 𝑥 )
  3. Rewrite with (2) at (∗) and get 𝑥0 ≠ 𝑥0.
  4. Contradiction.

26 of 39

Proof Automation – Superposition in Lean

To justify superposition in Lean, we need to show it from simple substitution.

  • Given (by Vampire) an equation 𝑎 which is superposed at 𝑏 to give 𝑐, we will match up 𝑏 and 𝑐, using 𝑎 or its inverse to resolve leftover equalities.
  • This can be approximated by convert b .. <;> (first | apply a | apply (a ..).symm).
  • convert can sometimes assign the metavariables incorrectly
    • In the previous example the simple approach would’ve tried to set y := x and prove 𝑥 ◇ 𝑥 = 𝑥
  • We use a custom backtracking implementation
  • There can also be leftover unassigned metavariables, which were erased by 𝑎:
    • Rewriting 𝑥 ◇ (𝑥’ ◇ ( (𝑥’ ◇ 𝑥’) ◇ 𝑦’ )) with 𝑥’ ◇ ( (𝑥’ ◇ 𝑥’) ◇ 𝑦’ ) = 𝑥’ erases 𝑦.
    • Assigned using assumption, which works because we used contradiction to get some 𝑥0,… which don’t satisfy the target equation.

27 of 39

Counterexamples

  • Vampire resolved all positive implications
  • Finite counterexamples from Vampire’s finite model building
  • We also used Vampire to find infinite models
  • Vampire’s algorithm is complete - when it finishes without a proof there is no proof. The refutation can be seen as finding a confluent set of rewriting rules for an equation, which can resolved all implications from that equation.
    • Rewriting rules: directed equalities
    • Confluence: every value reduces to a unique normal form by greedily applying rewriting rules
    • Wasn’t done automatically

28 of 39

Asterix Equation

  • Equation 65: x = y ◇ (x ◇ (y ◇ x)) - The “Asterix” equation
  • Equation 1491: x = (y ◇ x) ◇ (y ◇ (y ◇ x)) - The “Obelix” equation
  • Asterix implies Obelix for finite magmas.
  • Does it imply it for infinite magmas?

29 of 39

Ruleset Approach

  • A new approach for deciding equational theories modulo an equation
  • Define a rule: if (some list of hypotheses of the form aᵢ ◇ aⱼ = aₖ) then (either a conclusion of the form aᵢ ◇ aⱼ = aₖ or aᵢ = aⱼ).
  • A ruleset is valid if all finite partial binary operators ◇ which satisfy all the rules can be completed to a total binary operator which satisfies all of the rules (possibly on a larger base set).

30 of 39

Ruleset Approach – Continued

  • A ruleset is necessary for a given equation if the equation implies all of the rules.
  • A ruleset is complete for an equation 𝑒 if it’s valid, necessary for it, and implies it for total operators.
  • We also add a rule for ◇ being a single-valued - if a₁ ◇ a₂ = a₃ and a₁ ◇ a₂ = a₄ then a₃ = a₄, because we represent a partial function as a predicate ◇(a₁, a₂, a₃).

31 of 39

Ruleset Approach – Decision Procedure

A complete ruleset gives a decision procedure - assign a variable for each intermediate value in the equation, and repeatedly apply the rules to figure out the operation on these values, and which ones are equal.

If the two sides of the equation become equal, then by necessity it’s implied by our equation, and if they don’t by validity it can be completed to a counterexample.

32 of 39

Validity from Greedy Extension

  • The method we used for showing validity of rulesets is greedy extension
  • The construction is over an arbitrary countable set
  • At each step the partial function is finite, so we can find a “fresh” value 𝑐 which isn’t mentioned at all.
  • Choose some a ◇ b which is unassigned and set a ◇ b = 𝑐
  • Make sure other rules are satisfied
    • Apply one iteration of the rules, making sure they’re satisfied when the hypotheses use the old operation and the conclusion uses the new operation.
    • Only assign x ◇ y = z if one of x,y,z is 𝑐
  • Prove that the rules are still preserved

33 of 39

Greedy Extension – Example

Consider for example Equation 1648 - x = (x ◇ y) ◇ ((x ◇ y) ◇ y).

As a rule it becomes x ◇ y = a, a ◇ y = b -> a ◇ b = x.

Adding the rule for left injectivity, a ◇ x = y, b ◇ x = y -> a = b, this becomes a complete ruleset for equation 1648 via greedy extension.

34 of 39

Greedy Extension – Visualization

x ◇ b = a, a ◇ b = c -> a ◇ c = x

Refutes 3253 x ◇ x = x ◇ (x ◇ (x ◇ x))

1

2

3

4

5

6

7

8

9

10

1

2

3

3

7

2

2

4

5

8

1

3

6

9

1

4

10

2

5

6

7

8

9

1

2

3

1

2

3

3

2

3

1

2

3

4

1

2

3

3

2

4

1

3

4

1

2

3

4

5

1

2

3

3

2

4

5

1

3

4

5

1

2

3

4

5

6

1

2

3

3

2

4

5

1

3

6

4

5

6

1

2

3

4

5

6

7

1

2

3

3

7

2

2

4

5

1

3

6

4

5

6

7

1

2

3

4

5

6

7

8

1

2

3

3

7

2

2

4

5

8

1

3

6

4

5

6

7

8

1

2

3

4

5

6

7

8

9

1

2

3

3

7

2

2

4

5

8

1

3

6

9

1

4

5

6

7

8

9

1

2

3

4

5

6

7

8

9

10

1

2

3

3

7

2

2

4

5

8

1

3

6

9

1

4

10

2

5

6

7

8

9

35 of 39

Proof Automation – Ruleset Construction

  • Maintain a ruleset
  • Use Vampire to prove the rules are preserved when defining the new operation
  • If they aren’t, find a finite counterexample.
  • Either we need some x ◇ y = z but it’s false, or we need some x = y but they’re different.
    • We can add a new rule to make sure this rule will be satisfied if we were to do this again.
      • If the violation doesn’t involve 𝑐, we can add the rule with the assumptions from the finite counterexample.
      • If it’s 𝑐 = z we change the conclusion to a ◇ b = z.
      • Otherwise, we also need to add the assumption that a ◇ b = 𝑐.
  • Continue until we can prove the rules are preserved (or there are too many rules for it to either prove or find a counterexample in reasonable time)

36 of 39

Ruleset Construction – Example

Initial rules:

  1. a ◇ b = c, a ◇ b = d -> c = d
  2. x ◇ b = a, a ◇ b = c -> a ◇ c = x

a

b

x

c

a

c

b/x

b

a

a

a

x

a

a

a

c

New rule:

3. b ◇ a = a, b ◇ b = a, b ◇ x = a, x ◇ a = a, x ◇ b = a, x ◇ x = a -> b = x

Problem – excessively specific rule

a

b

x

a

b

a

a

a

x

a

a

a

37 of 39

Proof Automation – Rule minimization

Vampire sometimes finds counterexample which aren’t minimal, in terms of the rules they produce.

Sometimes hypotheses can be dropped, or there are values which are equal but don’t necessarily need to be.

To make the code more efficient, before we add a rule to our ruleset, we minimize it - attempt to drop hypotheses, and attempt to make different occurrences of a variable different. We use Vampire to check that the minimized rules remain consequences of the equation.

In the previous example, the rule b ◇ a = a, b ◇ b = a, b ◇ x = a, x ◇ a = a, x ◇ b = a, x ◇ x = a -> b = x would be minimized to b ◇ a = a, x ◇ a = a -> b = x by dropping hypotheses, and then to b ◇ a = y, x ◇ a = y -> b = x by making occurrences different.

38 of 39

Theoretical Tools

In addition to what was discussed in here, many approaches were used in the project for handling specific equations, for example:

  • Linear operations: x ◇ y = a x + b y
  • Translation invariant magmas: x ◇ y = x + f(y-x)
  • Magma twisting: x ◇’ y = Tx ◇ Uy
  • Magma cohomology, similar to group cohomology.

39 of 39

Conclusion

Proof assistants like Lean let us run mathematical projects at a large scale, allowing for contribution from anybody, with a high guarantee of correctness.