Newton’s method

Line search improves convergence, but does not solve all convergence problems

Last time, we saw that gradient descent with line search fails to converge in reasonable time for some simple functions

Sometimes step direction is the problem!

Example: if the function is imbalanced, the gradient is almost orthogonal to the direction of the minimum

[Figure: the –gradient direction compared with the direction toward the min]
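A small numerical sketch of this situation (the specific function f(x, y) = 0.5*(x^2 + 100*y^2) and the starting point are illustrative assumptions, not the slides' example): the negative gradient is nearly orthogonal to the direction of the minimum, and gradient descent with exact line search makes slow progress.

import numpy as np

def f(p):
    x, y = p
    return 0.5 * (x**2 + 100.0 * y**2)

def grad(p):
    x, y = p
    return np.array([x, 100.0 * y])

p = np.array([10.0, 1.0])
g = grad(p)
to_min = -p / np.linalg.norm(p)        # unit vector from p toward the minimum at the origin
step_dir = -g / np.linalg.norm(g)      # unit vector along the negative gradient
cos_angle = float(np.clip(step_dir @ to_min, -1.0, 1.0))
print("angle between -gradient and direction to min: %.1f degrees"
      % np.degrees(np.arccos(cos_angle)))   # close to 90 degrees

# Gradient descent with exact line search on this quadratic: progress is slow
# because each step is nearly orthogonal to where we actually want to go.
H = np.array([[1.0, 0.0], [0.0, 100.0]])
for k in range(10):
    g = grad(p)
    t = (g @ g) / (g @ H @ g)          # exact line-search step length for a quadratic
    p = p - t * g
print("after 10 line-search steps:", p, "f =", f(p))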

We can improve the step direction by taking curvature into account

Example: if curvature is strong in one direction, don’t step too far in that direction because the function will increase again

Newton’s method addresses this by choosing the step that minimizes a local quadratic approximation of the function
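Concretely, the local quadratic approximation around the current point x is m(p) = f(x) + ∇f(x)^T p + (1/2) p^T ∇²f(x) p, and its minimizer is the Newton step p = −[∇²f(x)]^(−1) ∇f(x). Below is a minimal sketch of the resulting iteration on the same imbalanced quadratic as above (an illustrative assumption, not the slides' example); for a quadratic, a single Newton step lands exactly on the minimum.

import numpy as np

def grad(p):
    x, y = p
    return np.array([x, 100.0 * y])

def hess(p):
    # Hessian of f(x, y) = 0.5*(x**2 + 100*y**2); constant for this quadratic
    return np.array([[1.0, 0.0], [0.0, 100.0]])

p = np.array([10.0, 1.0])
for k in range(5):
    g = grad(p)
    if np.linalg.norm(g) < 1e-10:
        break
    step = np.linalg.solve(hess(p), g)   # solve H * step = gradient rather than forming H^-1
    p = p - step                         # Newton step: minimizer of the quadratic model
print("Newton iterate:", p)              # reaches the minimum (0, 0) after one step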

What to know for next time

Topic: Loss functions and regression