1 of 31

Neural networks course

Recitation 4: SVMs, interpretability

2 of 31

Motivation

  • For linearly separable data, the perceptron converges to a solution that is optimal on the training set
    • Not necessarily optimal with regard to generalization (test set performance)
  • Idea: In order to generalize, find a separating hyperplane that maximizes the distance to the closest training points

3 of 31

4 of 31

5 of 31

6 of 31

7 of 31

8 of 31

Formulation

9 of 31

Formulation

  • Scaling w doesn’t change the distance, so we can rescale w such that $y_i\, w^\top x_i = 1$ for the training point closest to the hyperplane (the i that minimizes $y_i\, w^\top x_i$)
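
The rescaling argument can be checked numerically. The following is a minimal sketch, assuming a hyperplane through the origin (no bias term, as in the constraint above); the toy data, the starting vector `w`, and the helper `min_distance` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: rows of X are points, y holds labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Any w that separates the data through the origin (assumed, not learned here).
w = np.array([4.0, 4.0])

def min_distance(w, X, y):
    """Smallest signed distance from a training point to the hyperplane w.x = 0."""
    return np.min(y * (X @ w)) / np.linalg.norm(w)

# Rescale w so that the closest point satisfies y_i * (w . x_i) = 1.
functional_margin = np.min(y * (X @ w))
w_scaled = w / functional_margin

# The geometric distance is unchanged by the rescaling ...
print(min_distance(w, X, y), min_distance(w_scaled, X, y))
# ... and after rescaling it equals 1 / ||w||.
print(1.0 / np.linalg.norm(w_scaled))
```

After this normalization, maximizing the margin is the same as minimizing the norm of w, which is what makes the problem a quadratic program.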

10 of 31

11 of 31

12 of 31

Dual representation

  • We have a quadratic minimization problem with linear constraints
  • This can be solved efficiently with Lagrange multipliers

Dual problem

Dual function

Lagrangian
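
For reference, the three terms above can be written in their standard generic form. The symbols $f$ (objective) and $c_i$ (constraint functions) are notation assumed here, not taken from the slide; $g$ denotes the dual function.

```latex
\begin{aligned}
\text{Primal problem:}\quad & \min_{w} f(w) \quad \text{s.t. } c_i(w) \le 0,\ i = 1, \dots, n \\
\text{Lagrangian:}\quad & L(w, \alpha) = f(w) + \sum_{i=1}^{n} \alpha_i\, c_i(w), \quad \alpha_i \ge 0 \\
\text{Dual function:}\quad & g(\alpha) = \min_{w} L(w, \alpha) \\
\text{Dual problem:}\quad & \max_{\alpha \ge 0} g(\alpha)
\end{aligned}
```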

13 of 31

Dual representation for SVMs
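
The equations of this slide are not preserved, so the following is a sketch of the standard derivation under one assumption: a hard-margin SVM without a bias term, matching the constraint $y_i\, w^\top x_i \ge 1$ used earlier.

```latex
\begin{aligned}
& \min_{w} \tfrac{1}{2}\lVert w \rVert^2 \quad \text{s.t. } y_i\, w^\top x_i \ge 1 \\
& L(w, \alpha) = \tfrac{1}{2}\lVert w \rVert^2 - \sum_i \alpha_i \left( y_i\, w^\top x_i - 1 \right), \quad \alpha_i \ge 0 \\
& \nabla_w L = 0 \ \Rightarrow\ w = \sum_i \alpha_i y_i x_i \\
& g(\alpha) = \sum_i \alpha_i - \tfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j\, x_i^\top x_j
\end{aligned}
```

If a bias term is included, the same steps additionally yield the constraint $\sum_i \alpha_i y_i = 0$.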

14 of 31

We will not show the solution to the maximization of g. It turns out that all $\alpha_i$ are zero except for those belonging to the closest points: the support vectors.
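
This sparsity can be observed on a small example. The sketch below maximizes the dual function with projected gradient ascent, which is a simple stand-in chosen for illustration and not necessarily the solver used in the course. It assumes a hard-margin SVM without a bias term, so the only constraint is $\alpha_i \ge 0$; the toy data are hypothetical.

```python
import numpy as np

# Hypothetical toy data (linearly separable through the origin).
X = np.array([[1.0, 0.0], [0.0, 1.0], [-3.0, -3.0], [2.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, 1.0])

# Q[i, j] = y_i y_j x_i . x_j, so g(alpha) = sum(alpha) - 0.5 * alpha^T Q alpha.
Z = y[:, None] * X
Q = Z @ Z.T

# Projected gradient ascent on g subject to alpha >= 0.
step = 1.0 / np.linalg.eigvalsh(Q).max()   # step size bounded by the largest curvature
alpha = np.zeros(len(y))
for _ in range(5000):
    grad = 1.0 - Q @ alpha                 # gradient of g at the current alpha
    alpha = np.maximum(alpha + step * grad, 0.0)

w = (alpha * y) @ X                        # w = sum_i alpha_i y_i x_i
print(np.round(alpha, 3))                  # nonzero entries mark the support vectors
print(np.round(w, 3), np.round(y * (X @ w), 3))
```

In this example only the first two points end up with a nonzero multiplier; they are the ones whose margin $y_i\, w^\top x_i$ equals 1, while the two farther points have a larger margin and a multiplier of zero.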

15 of 31

Classification: weight all training examples
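
The formula on this slide is not preserved. A standard form of the decision rule, assuming no bias term and using $w = \sum_i \alpha_i y_i x_i$, is:

```latex
\hat{y}(x) = \operatorname{sign}\left( w^\top x \right)
           = \operatorname{sign}\left( \sum_i \alpha_i y_i\, x_i^\top x \right)
```

Every training example appears in the sum, but only those with $\alpha_i > 0$ contribute to the result.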

16 of 31

Non-linearly separable data

17 of 31

18 of 31

Non-linear data

19 of 31

The dual representation

20 of 31

Non-linearly separable data

  • Prediction requires calculating the dot product between the new example x and every training example $x_i$!

  • The kernel trick computes the dot product in the feature space implicitly
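
A minimal sketch of prediction in the dual form, where the dot product is replaced by a pluggable kernel function. The function names, the multiplier values, and the data are hypothetical; no bias term is assumed.

```python
import numpy as np

def decision_value(x, X_train, y_train, alpha, kernel):
    """sum_i alpha_i * y_i * K(x_i, x); examples with alpha_i == 0 are skipped."""
    support = alpha > 0
    k = np.array([kernel(xi, x) for xi in X_train[support]])
    return float(np.sum(alpha[support] * y_train[support] * k))

def linear_kernel(u, v):
    """Plain dot product, i.e. the feature space is the input space."""
    return float(np.dot(u, v))

# Hypothetical values: two support vectors (alpha = 1) and two other points (alpha = 0).
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [-3.0, -3.0], [2.0, 2.0]])
y_train = np.array([1.0, 1.0, -1.0, 1.0])
alpha = np.array([1.0, 1.0, 0.0, 0.0])

x_new = np.array([0.5, -2.0])
score = decision_value(x_new, X_train, y_train, alpha, linear_kernel)
print(score, np.sign(score))
```

Swapping `linear_kernel` for another kernel function changes the feature space without changing the prediction code.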

21 of 31

The kernel trick

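  • We can often calculate the product $\varphi(x) \cdot \varphi(y)$ without having to calculate $\varphi(x)$ and $\varphi(y)$
  • Consider the following feature map: $\varphi((a, b)) = (a, b, a^2 + b^2)$

  • Naive calculation:
    • $\varphi((a, b)) \cdot \varphi((c, d)) = (a, b, a^2 + b^2) \cdot (c, d, c^2 + d^2) = ac + bd + (a^2 + b^2)(c^2 + d^2) = ac + bd + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$
  • Kernel trick: directly use $K((a, b), (c, d)) = ac + bd + (a^2 + b^2)(c^2 + d^2) = ac + bd + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$, computed from the original coordinates without building the feature vectors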
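
The equality between the two routes can be checked directly. This is a small sketch; the function names `phi` and `kernel` and the test points are chosen here for illustration.

```python
import numpy as np

def phi(x):
    """Explicit feature map from the slide: (a, b) -> (a, b, a^2 + b^2)."""
    a, b = x
    return np.array([a, b, a**2 + b**2])

def kernel(x, y):
    """Same value computed in the input space: x.y + |x|^2 * |y|^2."""
    return np.dot(x, y) + np.dot(x, x) * np.dot(y, y)

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Naive route (map, then dot product) versus kernel route (no mapping).
print(np.dot(phi(x), phi(y)), kernel(x, y))
```

Here the saving is small, but for feature maps with many (or infinitely many) dimensions only the kernel route is practical.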

22 of 31

23 of 31

Polynomial Kernel
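
The content of this slide is not preserved. As a hedged illustration, a common definition of the polynomial kernel is $K(x, y) = (x^\top y + c)^d$. For $d = 2$, $c = 0$ and two-dimensional inputs it corresponds to the feature map $(x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)$. The names `poly_kernel` and `phi_poly2` below are hypothetical.

```python
import numpy as np

def poly_kernel(x, y, degree=2, c=0.0):
    """Polynomial kernel (x.y + c)^degree, computed in the input space."""
    return (np.dot(x, y) + c) ** degree

def phi_poly2(x):
    """Explicit feature map for degree 2, c = 0, two-dimensional input."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Both routes give the same value.
print(poly_kernel(x, y), np.dot(phi_poly2(x), phi_poly2(y)))
```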

24 of 31

25 of 31

Kernels

  • Theorem: if K is symmetric and PSD (its Gram matrix is positive semi-definite for every finite set of points), then it is a valid kernel (i.e. it corresponds to a dot product in some feature space)
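
The PSD condition can be probed numerically by looking at the eigenvalues of a Gram matrix. The sketch below is a spot check on one random set of points, not a proof; the helper names are hypothetical. It compares $K(x, y) = x^\top y + \lVert x \rVert^2 \lVert y \rVert^2$, which is a valid kernel, with the squared distance, which is not.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Matrix of kernel values K(p, q) for all pairs of points."""
    return np.array([[kernel(p, q) for q in points] for p in points])

def dot_plus_norms(x, y):
    """x.y + |x|^2 * |y|^2, the dot product under (a, b) -> (a, b, a^2 + b^2)."""
    return np.dot(x, y) + np.dot(x, x) * np.dot(y, y)

def squared_distance(x, y):
    """Not a valid kernel: its Gram matrix has zero trace, so it cannot be PSD."""
    return np.dot(x - y, x - y)

rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))

min_eig = {}
for k in (dot_plus_norms, squared_distance):
    min_eig[k.__name__] = np.linalg.eigvalsh(gram_matrix(k, points)).min()
    print(k.__name__, min_eig[k.__name__])
```

A smallest eigenvalue that is non-negative up to rounding is consistent with a valid kernel; a clearly negative one rules the function out.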

26 of 31

RBF kernel
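
The slide's formula is not preserved. A common parametrization of the RBF (Gaussian) kernel is $K(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$; the parameter name `gamma` and its default value are assumptions made here for illustration.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * |x - y|^2): 1 for identical points, decaying with distance."""
    diff = np.asarray(x) - np.asarray(y)
    return np.exp(-gamma * np.dot(diff, diff))

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Similarity of a point with itself, then with a distant point for two widths.
print(rbf_kernel(x, x), rbf_kernel(x, y), rbf_kernel(x, y, gamma=0.1))
```

A smaller `gamma` makes the kernel wider, so distant points still count as similar; a larger `gamma` makes each training point influence only its immediate neighbourhood.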

27 of 31

28 of 31

29 of 31

30 of 31

31 of 31

Interpretability: Deep Dream