Premise Selection in Natural Language:
Finding supporting facts for mathematical statements
(Work in Progress)
Deborah Ferreira
deborah.ferreira@manchester.ac.uk
University of Manchester
Motivational Example
?
Proposition...
Premise
?
Proposition...
E=mc²
Why not use ATPs?
Non trivial!
Starting question...
Can we adapt Natural Language Processing techniques in order to understand corpora containing mathematical text?
Focus: Natural Language Premise Selection
Natural Language Premise Selection
Definition (Natural Language Premise Selection): Given a set of premises (or supporting facts) P in a mathematical corpus (containing both natural language and mathematical elements) and a new conjecture c proposed by a user, predict those premises from P that will most likely be useful for generating a proof for c.
Natural Language Premise Selection
Mathematical KB
Conjecture
Premise 1
Premise 2
Premise n
...
Model
Natural Language premise selection: Example
Premises
Natural Language premise selection
Data Source: ProofWiki
Reframing as binary classification problem
From a set of possible premises of size n + 1, select the one that is useful for proving a conjecture c.
Conjecture
Premise
Not Premise
n=1
Premise
Not Premise
n=2
Premise
Not Premise
n=5
Premise
Not Premise
n=10
Not Premise
Not Premise
Not Premise
Not Premise
Not Premise
Not Premise
Not Premise
Not Premise
….
Reframing as binary classification problem
Conjecture - Premise Pairs.
Goal: Classify as True or False.
Selection of Distracting Premises:
Number of distractors
Handling Mathematical Text
Statement Representation
(STAR)
Combining Representations
Is B a premise to A?
Our Baseline:
Selecting Random Distracting Premises
Our Approach
Baseline
Selecting Similar Distracting Premises
Our Approach
Baseline
Transferring Knowledge between different areas
AA: Abstract Algebra
ST: Set Theory
TP: Topology
Our Approach
Baseline
Qualitative Analysis
Comparison with other baselines
Conclusion
Thanks for your attention!
Questions?