
Optimization for Deep Learning

Prof. Seungchul Lee

Industrial AI Lab.


Optimization

  • Optimization is a mathematical discipline that focuses on finding the best solution to a problem within a defined set of constraints

  • It involves maximizing or minimizing an objective function, which represents the goal of the optimization process, such as minimizing costs, maximizing efficiency, or achieving the best performance in a system


Optimization

  • Three key components
    1. Objective function
    2. Decision variables (unknowns)
    3. Constraints

  • Procedure
    • Modeling: the process of identifying the objective function, variables, and constraints for a given problem
    • Solving: once the model has been formulated, an optimization algorithm can be used to find a solution
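As a concrete sketch of this two-step procedure, the snippet below states an objective, decision variables, and one constraint, then hands them to a generic solver. The quadratic objective and the choice of SciPy's solver are illustrative assumptions, not from the slides.

```python
from scipy.optimize import minimize  # assumes SciPy is available

# Modeling: objective function, decision variables, and constraints
objective = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2      # what to minimize
x0 = [0.0, 0.0]                                              # decision variables (initial guess)
cons = [{"type": "ineq", "fun": lambda x: 1 - x[0] - x[1]}]  # constraint: x0 + x1 <= 1

# Solving: hand the formulated model to an optimization algorithm
result = minimize(objective, x0, constraints=cons)
```

Here the unconstrained minimizer (2, 1) violates the constraint, so the solver returns the closest feasible point, (1, 0).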



Optimization: Mathematical Expression

  • In mathematical expression, the standard form is

      minimize f(x)
      subject to gᵢ(x) ≤ 0,  i = 1, …, m

  • Remark: other forms are equivalent to this one, since maximizing f(x) is the same as minimizing −f(x), and a constraint g(x) ≥ 0 can be rewritten as −g(x) ≤ 0


Solving Optimization Problems


Descent Direction (1D)

  • It motivates the gradient descent algorithm, which repeatedly takes steps in the direction of the negative gradient

  • Sign of the derivative: positive → shift to the left; negative → shift to the right
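The 1D rule above can be sketched in a few lines. The example function, step size, and tolerance are illustrative assumptions, not from the slides.

```python
# Minimal 1D gradient descent sketch
def gradient_descent_1d(df, x0, alpha=0.1, tol=1e-6, max_iter=1000):
    """Minimize a 1D function given its derivative df."""
    x = x0
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < tol:      # stop when the gradient is sufficiently small
            break
        x = x - alpha * g     # positive slope -> move left; negative -> move right
    return x

# Example: f(x) = (x - 3)^2, so f'(x) = 2(x - 3); the minimum is at x = 3
x_star = gradient_descent_1d(lambda x: 2 * (x - 3), x0=0.0)
```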


Stopping Criteria

  • When the gradient is sufficiently small

  • When the change in the function value between iterations is very small

  • When the update step is very small

  • When the number of iterations hits a hard limit

Learning Rate

  • Too small: converges very slowly

  • Too large: overshoots and may even diverge

  • A common remedy: reduce the step size over time
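The effect of the step size can be seen on a toy quadratic (an illustrative example, not from the slides): for f(x) = x², the update x ← x − α·2x = (1 − 2α)x converges only when |1 − 2α| < 1.

```python
# Run gradient descent on f(x) = x^2, whose derivative is f'(x) = 2x
def run(alpha, x0=1.0, steps=50):
    x = x0
    for _ in range(steps):
        x = x - alpha * 2 * x   # x <- (1 - 2*alpha) * x
    return x

small = run(0.01)   # converges, but very slowly
good = run(0.4)     # converges quickly
big = run(1.1)      # overshoots and diverges
```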


Where Will We Converge?


  • Random initialization

  • Multiple trials

  • Convex: any local minimum is a global minimum

  • Non-convex: multiple local minima may exist
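Random initialization with multiple trials might be sketched as follows; the non-convex example function f(x) = x⁴ − 2x², with local minima at x = ±1, is an illustrative assumption, not from the slides.

```python
import random

def gd(df, x0, alpha=0.01, steps=2000):
    """Plain gradient descent from a given starting point."""
    x = x0
    for _ in range(steps):
        x = x - alpha * df(x)
    return x

f = lambda x: x**4 - 2 * x**2      # non-convex: minima at x = -1 and x = 1
df = lambda x: 4 * x**3 - 4 * x    # its derivative

# Different random starts can land in different local minima,
# so run multiple trials and keep the best result found
random.seed(0)
trials = [gd(df, random.uniform(-2, 2)) for _ in range(5)]
best = min(trials, key=f)          # keep the lowest objective value
```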


Gradient Descent in High Dimension
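In higher dimensions the scalar derivative is replaced by the gradient vector, and the same update applies componentwise. A minimal sketch, assuming an example quadratic objective (the function and parameters are illustrative, not from the slides):

```python
import numpy as np

def gradient_descent(grad, x0, alpha=0.1, tol=1e-8, max_iter=10000):
    """Gradient descent for a vector-valued decision variable."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stop when the gradient norm is small
            break
        x = x - alpha * g             # step along the negative gradient vector
    return x

# Example: f(x, y) = (x - 1)^2 + 2*(y + 2)^2, minimized at (1, -2)
grad = lambda v: np.array([2 * (v[0] - 1), 4 * (v[1] + 2)])
x_min = gradient_descent(grad, [0.0, 0.0])
```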


Practically Solving Optimization Problems

  • The good news: for many classes of optimization problems, people have already done all the “hard work” of developing numerical algorithms
    • A wide range of tools can take optimization problems in “natural” forms and compute a solution

  • Gradient descent
    • Easy to implement
    • Very general: can be applied to any differentiable loss function
    • Requires less memory and computation (for stochastic methods)
    • Widely used for neural networks/deep learning (e.g., in TensorFlow)
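A minimal TensorFlow sketch of the same idea, assuming TensorFlow 2.x is installed; the toy loss (w − 3)² and the learning rate are illustrative assumptions, not from the slides.

```python
import tensorflow as tf  # assumes TensorFlow 2.x

# Minimizing the toy loss f(w) = (w - 3)^2 with autodiff and SGD
w = tf.Variable(0.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

for _ in range(100):
    with tf.GradientTape() as tape:     # record operations for autodiff
        loss = (w - 3.0) ** 2
    grads = tape.gradient(loss, [w])    # compute dloss/dw
    opt.apply_gradients(zip(grads, [w]))  # gradient descent step
```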


Gradient Descent

  • Update rule: repeat until a stopping criterion is met

      x ← x − α ∇f(x)

    where α > 0 is the learning rate (step size)