Evaluating Equivariance for Reconstruction
Savannah Thais, Columbia University
Daniel Murnane, Lawrence Berkeley National Lab
Why do some ML models outperform others?
We often heavily fine-tune models and attribute their performance to particular design choices: size, input features, hyperparameters, network design, etc.
Physical Inductive Biases
Data Structure: relational structure, ordering, feature selection, pre-processing, etc.
Model Constraints: restricting model weights, the learned function, propagated information, etc.
Task Formulation: physics-informed neural networks, incorporating conservation laws or equations through loss-function design, etc.
Symmetries are (potentially) powerful physical inductive biases
[Equivariance diagram: image from arXiv:2102.09844]
Equivariant networks are popular, partially because they are (in some formulations) easy to implement
[Diagram from Daniel Murnane]
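As a concrete check, equivariance f(g·x) = g·f(x) can be verified numerically. A minimal NumPy sketch; the toy layer and the rotation test are illustrative, not taken from any of the models discussed here:

```python
import numpy as np

def rotation_2d(theta):
    """2D rotation matrix: the group action we test against."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def toy_equivariant_layer(x):
    """A trivially rotation-equivariant map: pull each point toward
    the centroid of the cloud. Built only from vectors that transform
    like the input, so it commutes with rotations by construction."""
    centroid = x.mean(axis=0, keepdims=True)
    return x + 0.5 * (centroid - x)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 2))           # 10 points in the plane
R = rotation_2d(0.7)

lhs = toy_equivariant_layer(x @ R.T)   # f(g . x)
rhs = toy_equivariant_layer(x) @ R.T   # g . f(x)
assert np.allclose(lhs, rhs), "layer is not equivariant"
```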
Potential Benefits of Equivariance
Model Efficiency
Accuracy
Generalizability
Data Efficiency
Physics as an ML Sandbox
Particle Tracking
Jet Tagging
[Particle tracking image from Nelson; jet tagging image from arXiv:2109.12636]
Baseline Tagging Models
ParticleNet
ResNeXt
Particle Flow Network (PFN)
Energy Flow Polynomials (EFP)
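For intuition, a PFN is essentially a deep-sets model: a per-particle network, a permutation-invariant sum, and a jet-level classifier. A minimal PyTorch sketch, with layer sizes and feature counts as illustrative assumptions rather than the published architecture:

```python
import torch
import torch.nn as nn

class ParticleFlowSketch(nn.Module):
    """Deep-sets-style jet classifier: per-particle MLP Phi,
    permutation-invariant sum over particles, jet-level MLP F."""
    def __init__(self, n_features=4, latent=64, n_classes=2):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, latent), nn.ReLU(),
        )
        self.f = nn.Sequential(
            nn.Linear(latent, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, particles, mask):
        # particles: (batch, n_particles, n_features)
        # mask: (batch, n_particles, 1), zero for padded entries
        per_particle = self.phi(particles) * mask
        jet = per_particle.sum(dim=1)   # order-independent pooling
        return self.f(jet)

model = ParticleFlowSketch()
jets = torch.randn(8, 30, 4)    # 8 jets, up to 30 particles each
mask = torch.ones(8, 30, 1)
logits = model(jets, mask)      # (8, 2)
```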
Equivariant Tagging Models
LorentzNet: Message passing GNN with Lorentz-equivariant messages built from Minkowski inner products, on particle graph
Lorentz Group Network (LGN): NN with CG-layers that take tensor products and decompose into irreps using Clebsch-Gordan maps, on particle features
PELICAN: Deep-set-esque network using all totally symmetric Lorentz invariants and the full set of 15 rank-2 to rank-2 maps as aggregators
VecNet: Message passing GNN with Lorentz-equivariant message and (optionally) unconstrained message, on particle graph
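A common thread in these models is building messages from Lorentz scalars, which are invariant by construction. A minimal PyTorch sketch of pairwise Minkowski invariants as edge features; this is illustrative, not the published code of any model above:

```python
import torch

# Minkowski metric with signature (+, -, -, -)
METRIC = torch.diag(torch.tensor([1.0, -1.0, -1.0, -1.0]))

def minkowski_dot(p, q):
    """Lorentz-invariant inner product <p, q> = E_p E_q - p . q
    for four-momenta with components (E, px, py, pz)."""
    return torch.einsum("...i,ij,...j->...", p, METRIC, q)

def pairwise_invariants(p):
    """Edge features <p_i, p_j> and <p_i - p_j, p_i - p_j> for all
    particle pairs. A message network fed only these features is
    invariant under boosts and rotations of the whole event."""
    diff = p[:, None, :] - p[None, :, :]
    return torch.stack(
        [minkowski_dot(p[:, None, :], p[None, :, :]),
         minkowski_dot(diff, diff)],
        dim=-1,
    )

p = torch.randn(16, 4)                 # 16 particles, toy momenta
edge_feats = pairwise_invariants(p)    # (16, 16, 2)
```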
Tracking Models
EuclidNet: Message passing GNN with SO(2)-equivariant message construction, on hit graph (with physics-based edge construction)
Interaction Network: Message passing GNN with node and edge updates, on hit graph (with physics-based edge construction)
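The same recipe gives SO(2) equivariance in the spirit of EuclidNet: message weights depend only on rotation invariants, so the vector update transforms with the input. A NumPy sketch in which the weight function is a hypothetical stand-in for a learned MLP:

```python
import numpy as np

def so2_equivariant_update(x, weight_fn):
    """Equivariant hit update x_i' = x_i + sum_j w_ij (x_i - x_j),
    where w_ij depends only on rotation-invariant distances. Rotating
    all hits rotates the output identically, since the update is a
    weighted sum of vectors that co-rotate with the input."""
    diff = x[:, None, :] - x[None, :, :]      # (N, N, 2)
    dist = np.linalg.norm(diff, axis=-1)      # invariant inputs
    w = weight_fn(dist)[..., None]            # (N, N, 1)
    return x + (w * diff).sum(axis=1)

theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.random.default_rng(1).normal(size=(20, 2))  # 20 hits in the plane
f = lambda d: np.exp(-d)                           # stand-in for a learned MLP

lhs = so2_equivariant_update(x @ R.T, f)
rhs = so2_equivariant_update(x, f) @ R.T
assert np.allclose(lhs, rhs)
```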
Evaluating Equivariance
Accuracy
Model Efficiency
Tagging:

| Model | Accuracy | AUC | Parameters | Ant Factor |
|---|---|---|---|---|
| ResNeXt | 0.936 | 0.984 | 1.46M | 4.28 |
| ParticleNet | 0.938 | 0.985 | 498k | 13.4 |
| PFN | 0.932 | 0.982 | 82k | 67.8 |
| EFP | 0.932 | 0.980 | 1k | 5000 |
| LGN | 0.929 | 0.964 | 4.5k | 617 |
| VecNet.1 | 0.935 | 0.984 | 633k | 9.87 |
| VecNet.2 | 0.931 | 0.981 | 15k | 350 |
| PELICAN | 0.943 | 0.987 | 45k | 171 |
| LorentzNet | 0.942 | 0.9868 | 220k | 35 |
Tracking:

| Model | N Hidden | AUC | Parameters | Ant Factor |
|---|---|---|---|---|
| EuclidNet | 8 | 0.9913 | 967 | 11887 |
| InteractionNet | 8 | 0.9849 | 1432 | 4625 |
| EuclidNet | 16 | 0.9932 | 2580 | 5700 |
| InteractionNet | 16 | 0.9932 | 4392 | 3348 |
| EuclidNet | 32 | 0.9941 | 4448 | 3811 |
| InteractionNet | 32 | 0.9978 | 6448 | 7049 |
Ant factor = 10^5 / [(1 − AUC) × N_params], where N_params is the number of trainable parameters
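The figure of merit above as a one-line function, spot-checked against the PELICAN row of the tagging table:

```python
def ant_factor(auc, n_params):
    """Ant factor = 1e5 / [(1 - AUC) * N_params]:
    a rough performance-per-parameter figure of merit."""
    return 1e5 / ((1.0 - auc) * n_params)

# PELICAN: AUC 0.987 with 45k parameters.
print(round(ant_factor(0.987, 45_000)))   # -> 171, matching the table
```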
Evaluating Equivariance
Generalizability
Data Efficiency
Tagging (data efficiency):

| Model | Training fraction | Accuracy | AUC |
|---|---|---|---|
| LorentzNet | 0.5% | 0.932 | 0.9793 |
| ParticleNet | 0.5% | 0.913 | 0.9687 |
| LorentzNet | 1% | 0.932 | 0.9812 |
| ParticleNet | 1% | 0.919 | 0.9734 |
| LorentzNet | 5% | 0.937 | 0.9839 |
| ParticleNet | 5% | 0.931 | 0.9839 |
[Generalizability plots: tagging and tracking]
So, What Now?
What kinds of inductive biases are useful? How are they useful?
Over-constraint?
Is full equivariance the right approach for HEP tasks?
Expressivity?
Equivariance is a somewhat ill-defined model characteristic
Fundamental work on network expressivity indicates that feature choice and message construction largely determine a model's expressive power
Transformers?
Transformers provide excellent performance with minimal or no inductive bias…
In a many-parameter, big-data regime, does physics matter at all?
Three Takeaways
Consider the Physics Goals
There is no one-size-fits-all solution to building the optimal model for a physics task. Trade-offs between model size, compute resources, data availability, robustness, etc. are key
Explore Other Inductive Biases
Consider inductive biases that modify data structures or task design, rather than constraining the optimization space
More Systematic Studies
Conduct apples-to-apples model comparisons and carefully designed ablation studies that better isolate the impact of individual design choices
What’s Next?
Let’s discuss!
st3565@columbia.edu
@basicsciencesav
Our paper: Equivariance Is Not All You Need