1 of 17

Optimization for Deep Learning

Prof. Seungchul Lee

Industrial AI Lab.


Why Optimization Matters

  • An important tool in
    1. Engineering problem solving
    2. Decision science

  • In neural networks, optimization is the key to tuning the weights, i.e., finding the best model parameters


The Basics of Optimization

  • 3 key components
    1. Objective function
    2. Decision variable or unknown
    3. Constraints

  • Procedures
    • The process of identifying the objective function, variables, and constraints for a given problem is known as "modeling"
    • Once the model has been formulated, an optimization algorithm can be used to find its solution
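As a toy illustration of the three components (a hypothetical problem, not one from the slides): minimize f(x) = (x − 3)² subject to 0 ≤ x ≤ 2, solved here by brute-force search for clarity.

```python
# Hypothetical toy problem: minimize f(x) = (x - 3)^2 subject to 0 <= x <= 2.

def objective(x):
    # 1. objective function
    return (x - 3.0) ** 2

def feasible(x):
    # 3. constraints (a simple box constraint)
    return 0.0 <= x <= 2.0

# 2. decision variable: x, searched here by brute force over a grid
candidates = [i / 1000.0 for i in range(0, 2001)]
best = min((c for c in candidates if feasible(c)), key=objective)
print(best)  # the constrained minimizer, x = 2.0
```

Because f decreases toward x = 3 but the constraint caps x at 2, the constrained minimizer sits on the boundary.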



Optimization: Mathematical Model

  • In mathematical expression

        minimize    f(x)
        subject to  g_i(x) ≤ 0,   i = 1, …, m
                    h_j(x) = 0,   j = 1, …, p

    where x is the decision variable, f is the objective function, and g_i, h_j define the inequality and equality constraints

  • Remark: maximization and minimization are equivalent, since maximizing f(x) is the same as minimizing −f(x)
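The equivalence of maximization and minimization, max f(x) = −min(−f(x)), can be checked numerically; the concave function below is a made-up example.

```python
# Made-up concave function with maximum value 4 at x = 1
def f(x):
    return -(x - 1.0) ** 2 + 4.0

grid = [i / 100.0 for i in range(-300, 301)]
max_f = max(f(x) for x in grid)        # solve the maximization
min_negf = min(-f(x) for x in grid)    # solve the equivalent minimization
print(max_f, -min_negf)  # identical: 4.0 4.0
```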


Solving Optimization Problems




Descent Direction (1D)

  • In 1D, stepping against the sign of the derivative decreases the function; this motivates the gradient descent algorithm, which repeatedly takes steps in the direction of the negative gradient
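A quick numerical check of this idea on a hypothetical 1-D function: a small step along the negative gradient decreases f.

```python
# Made-up 1-D function and its derivative
def f(x):
    return x ** 2 + 2.0 * x

def grad(x):
    return 2.0 * x + 2.0

x = 3.0                      # current point (grad(x) = 8 > 0)
x_new = x - 0.1 * grad(x)    # small step along the negative gradient
print(f(x_new) < f(x))       # True: the step decreased f
```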


Gradient Descent


Gradient Descent in Higher Dimension


Gradient Descent

  • Update rule:

        x ← x − α ∇f(x)

    where α > 0 is the learning rate (step size)
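A minimal sketch of the update rule x ← x − α∇f(x) on a toy quadratic (the function and step size are illustrative choices, not from the slides):

```python
# Gradient descent on the toy quadratic f(x) = (x - 2)^2
def grad(x):
    return 2.0 * (x - 2.0)   # derivative of f

x = 0.0        # initial guess
alpha = 0.1    # learning rate
for _ in range(200):
    x = x - alpha * grad(x)  # update rule: x <- x - alpha * grad f(x)
print(round(x, 6))  # converges to the minimizer, 2.0
```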


Learning Rate

  • The learning rate α controls the step size: too small and convergence is slow; too large and the iterates can overshoot or diverge
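The effect of the learning rate can be seen on f(x) = x² (an illustrative example, with rates chosen for the demonstration): a small rate converges, while an overly large one diverges.

```python
# Gradient descent on f(x) = x^2 with different learning rates
def run(alpha, steps=50, x0=1.0):
    x = x0
    for _ in range(steps):
        x = x - alpha * (2.0 * x)  # gradient of x^2 is 2x
    return x

print(abs(run(0.1)))  # small rate: iterates shrink toward 0
print(abs(run(1.1)))  # too-large rate: iterates blow up
```

With α = 0.1 each step multiplies x by 0.8; with α = 1.1 each step multiplies it by −1.2, so the magnitude grows.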


Where Will We Converge?

  • Random initialization: gradient descent may converge to a different local minimum depending on where it starts
  • Multiple trials: run from several random starting points and keep the best result
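A sketch of random initialization with multiple trials on a made-up non-convex function, f(x) = x⁴ − 2x², which has two local minima at x = ±1 (the step size and trial count are illustrative):

```python
import random

# Non-convex f(x) = x^4 - 2x^2 with local minima at x = -1 and x = +1
def grad(x):
    return 4.0 * x ** 3 - 4.0 * x

def descend(x, alpha=0.01, steps=2000):
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x

random.seed(0)
# Run gradient descent from 10 random starting points in [-2, 2]
minima = {round(descend(random.uniform(-2.0, 2.0)), 3) for _ in range(10)}
print(sorted(minima))  # different starts reach different minima
```

Starts to the left of 0 slide into x = −1, starts to the right into x = +1, so the trials discover both minima.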

Practically Solving Optimization Problems

  • The good news: for many classes of optimization problems, people have already done the "hard work" of developing numerical algorithms
    • A wide range of tools can take optimization problems in "natural" form and compute a solution

  • Gradient descent
    • Easy to implement
    • Very general: can be applied to any differentiable loss function
    • Requires less memory and computation (for stochastic variants)
    • Widely used in neural networks / deep learning (e.g., in TensorFlow)
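The low memory and computation cost of stochastic variants comes from updating with one sample at a time. Below is a minimal stochastic gradient descent sketch on a made-up least-squares problem (plain Python, not TensorFlow code; the data and rate are illustrative):

```python
import random

# Fit w so that w * x_i ~= y_i, updating from one random sample per step.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]  # true slope is 3

random.seed(1)
w, alpha = 0.0, 0.05
for _ in range(500):
    x, y = random.choice(data)       # one random sample per step
    grad_w = 2.0 * (w * x - y) * x   # gradient of (w*x - y)^2 w.r.t. w
    w = w - alpha * grad_w
print(round(w, 3))  # recovers the slope, 3.0
```

Each update touches a single data point, so the memory footprint is constant in the dataset size.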
