Differentiation of blackbox combinatorial solvers
March 9, 2020
M. Rolínek
Autonomous Learning, MPI-IS Tübingen
BASED ON
Review scores: 8, 8, 8
Review scores: strong accept, strong accept, strong accept
1.
THE TALE OF TWO WORLDS
Combinatorial optimization
Deep Learning
Standpoint: Fancy combinatorial optimization solvers are ridiculously strong...
...to the point that they compromise the practical meaning of NP-hardness
Mission: Turn SOTA combinatorial optimization techniques into building blocks of neural networks.
...with minimal compromises! (one hyperparameter)
IN PARTICULAR
WHY BOTHER?
We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities.
[Battaglia et al., Relational inductive biases, deep learning, and graph networks]
ARCHITECTURES FOR COMPOSITE PROBLEMS
2.
WHAT’S THE PROBLEM?
WHAT’S A SOLVER (FOR US)?
continuous input w (e.g. edge weights)
discrete output y (e.g. an optimal path)
linear cost c(w, y) = w · y, minimized by the solver over the feasible set
arbitrary constraints on that feasible set (see the sketch below)
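A minimal sketch of this input/output contract on a toy shortest-path instance. The graph, EDGES, is_path and shortest_path_solver are illustrative names invented for the sketch, not from the talk; brute-force enumeration stands in for a real solver (Dijkstra, an ILP, ...), since only the contract matters.

```python
import itertools

import numpy as np

# Toy instance: pick the cheapest path from node 0 to node 3.
# A solution is encoded as a 0/1 indicator vector y over the edges.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

def is_path(y: np.ndarray, s: int = 0, t: int = 3) -> bool:
    """Feasibility check: do the selected edges form a simple walk from s to t?"""
    chosen = [e for e, used in zip(EDGES, y) if used]
    node = s
    while node != t:
        nxt = [b for (a, b) in chosen if a == node]
        if len(nxt) != 1:
            return False
        chosen = [(a, b) for (a, b) in chosen if a != node]
        node = nxt[0]
    return not chosen

def shortest_path_solver(w: np.ndarray) -> np.ndarray:
    """Blackbox solver: continuous edge weights w -> discrete indicator y.

    Returns the argmin over feasible y of the linear cost c(w, y) = <w, y>.
    """
    best_y, best_cost = None, np.inf
    for mask in itertools.product([0, 1], repeat=len(EDGES)):
        y = np.array(mask, dtype=float)
        if not is_path(y):
            continue
        cost = float(w @ y)
        if cost < best_cost:
            best_y, best_cost = y, cost
    return best_y

w = np.array([1.0, 4.0, 0.5, 3.0, 1.0])
y = shortest_path_solver(w)      # [1, 0, 1, 0, 1]: the path 0 -> 1 -> 2 -> 3
```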
EXAMPLE SOLVERS
THE GRADIENT EXISTS!
A small change of the input (edge weights) typically means zero change in the output (optimal path), so the gradient is zero almost everywhere and carries no learning signal (checked numerically below).
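The point can be verified numerically, reusing the hypothetical shortest_path_solver from the sketch above: the solver output is piecewise constant in w, so coordinate-wise finite differences vanish.

```python
w = np.array([1.0, 4.0, 0.5, 3.0, 1.0])
y = shortest_path_solver(w)

# Perturb each edge weight a little: the optimal path never changes, so every
# finite-difference entry of the "Jacobian" dy/dw is exactly zero here.
eps = 1e-3
for i in range(len(w)):
    y_perturbed = shortest_path_solver(w + eps * np.eye(len(w))[i])
    assert np.allclose(y, y_perturbed)   # zero gradient -> no learning signal
```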
IMPORTANT NON-SOLUTIONS
3.
WHAT DO WE DO?
THE ALGORITHM
Return the gradient of a function L_λ(w), a continuous “interpolation” of L(w).
The hyperparameter λ > 0 controls the “locality” of the interpolation (code sketch below)
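A minimal PyTorch-style sketch of the scheme, reusing the hypothetical shortest_path_solver above; BlackboxSolverFn is an illustrative name and details such as batching are omitted, so this is a sketch rather than the official implementation. The forward pass calls the solver as-is; the backward pass perturbs the weights by λ times the incoming gradient, calls the solver once more, and returns the finite-difference gradient of the interpolated loss.

```python
import torch

class BlackboxSolverFn(torch.autograd.Function):
    """Any combinatorial solver as a layer: exact forward, informative backward."""

    @staticmethod
    def forward(ctx, weights, solver, lambda_val):
        w = weights.detach().cpu().numpy()
        y = solver(w)                                   # ordinary solver call
        ctx.solver, ctx.lambda_val, ctx.w, ctx.y = solver, lambda_val, w, y
        return torch.as_tensor(y, dtype=weights.dtype, device=weights.device)

    @staticmethod
    def backward(ctx, grad_output):
        grad = grad_output.detach().cpu().numpy()
        # Perturb the weights in the direction of the incoming gradient,
        # solve once more, and return the finite-difference gradient of the
        # lambda-interpolated loss.
        y_lambda = ctx.solver(ctx.w + ctx.lambda_val * grad)
        grad_w = -(ctx.y - y_lambda) / ctx.lambda_val
        grad_w = torch.as_tensor(grad_w, dtype=grad_output.dtype,
                                 device=grad_output.device)
        return grad_w, None, None          # no gradients for solver, lambda

# Usage with the toy solver above: a Hamming loss against a target path.
weights = torch.tensor([1.0, 4.0, 0.5, 3.0, 1.0], requires_grad=True)
target = torch.tensor([1.0, 0.0, 0.0, 1.0, 0.0])   # ground-truth path 0 -> 1 -> 3
path = BlackboxSolverFn.apply(weights, shortest_path_solver, 20.0)  # lambda = 20, hand-picked
torch.abs(path - target).sum().backward()
print(weights.grad)   # nonzero: pushes the weights towards producing the target path
```

Note that training only costs one extra solver call per backward pass, and the solver itself is never modified.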
2D EXAMPLE
THE ALGORITHM
BEHIND THE SCENES II
4.
EXPERIMENTS
STAGE I - PROOF OF CONCEPT
Synthetic problems. Combinatorial optimization on raw images (pipeline sketched below).
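One way such a "combinatorics on raw images" pipeline can be wired up, sketched under assumptions: CostNet, grid_shortest_path and all shapes are hypothetical, and BlackboxSolverFn is the sketch from above. A CNN maps the raw image to a grid of traversal costs, the solver layer turns the costs into a discrete path, and a Hamming-style loss compares it to the ground-truth path.

```python
import torch
import torch.nn as nn

class CostNet(nn.Module):
    """Raw image in, a grid of traversal costs out (sizes are made up)."""

    def __init__(self, grid: int = 12):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(grid),        # one cost per grid cell
            nn.Conv2d(16, 1, kernel_size=1),
        )

    def forward(self, image):                  # image: (B, 3, H, W)
        return self.backbone(image).squeeze(1)   # costs: (B, grid, grid)

model = CostNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(image, true_path, grid_shortest_path, lambda_val=20.0):
    """One end-to-end step: image -> costs -> discrete path -> Hamming loss.

    grid_shortest_path is an assumed helper: any (batched) shortest-path solver
    mapping a cost grid to a 0/1 path-indicator grid of the same shape.
    """
    costs = model(image)
    path = BlackboxSolverFn.apply(costs, grid_shortest_path, lambda_val)
    loss = torch.abs(path - true_path).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```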
RESULTS
[Differentiation of Blackbox Combinatorial Solvers, to appear at ICLR 2020]
STAGE II - LIGHTWEIGHT VISION
[Optimizing Rank-Based Metrics with Blackbox Differentiation, may appear at CVPR 2020]
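Ranking fits the same mold: the rank vector is the argmin of a linear cost over permutation vectors, hence piecewise constant in the scores, so the same backward pass applies. A hedged sketch reusing BlackboxSolverFn from above; rank and the mean-rank-of-positives loss are illustrative stand-ins, not necessarily the exact loss used in the paper.

```python
import numpy as np
import torch

def rank(scores: np.ndarray) -> np.ndarray:
    """Ranking as a solver: rk(s) = argmin over permutation vectors y of <s, y>,
    i.e. the highest score gets rank 0.  Piecewise constant in the scores."""
    order = np.argsort(-scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(len(scores), dtype=float)
    return ranks

# Illustrative rank-based loss: the mean rank of the relevant ("positive") items.
scores = torch.tensor([2.1, 0.3, 1.7, 0.9], requires_grad=True)
is_positive = torch.tensor([0.0, 1.0, 0.0, 1.0])        # hypothetical relevance labels

ranks = BlackboxSolverFn.apply(scores, rank, 10.0)
loss = (ranks * is_positive).sum() / is_positive.sum()
loss.backward()
print(scores.grad)   # positive entries for negatives, negative for positives:
                     # gradient descent pushes relevant items towards the top
```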
STAGE III - MIDDLEWEIGHT VISION
SUMMARY
WORDS OF CAUTION