
Welcome to TensorFlow!

CS 20: TensorFlow for Deep Learning Research

Lecture 1

1/12/2018


Thanks Danijar Hafner for the logo!



Agenda

Welcome

Overview of TensorFlow

Graphs and Sessions


What’s TensorFlow™?

“Open source software library for numerical computation using data flow graphs”


Launched Nov 2015


Why TensorFlow?

  • There are many machine learning libraries out there, so why this one?

  • Flexibility + Scalability: originally developed by Google as a single infrastructure for machine learning in both production and research

  • Popularity


Companies using TensorFlow


Demand for tutorials on TensorFlow


Some cool projects using TensorFlow


Classify skin cancer


Dermatologist-level classification of skin cancer with deep neural networks (Esteva et al., Nature 2017)


WaveNet: Text to Speech


WaveNet: A Generative Model for Raw Audio (van den Oord et al., 2016)

It takes several hours to synthesize 1 second of audio!


Drawing


Draw Together with a Neural Network (Ha et al., 2017)


Neural Style Transfer


Image Style Transfer Using Convolutional Neural Networks (Gatys et al., 2016)

TensorFlow adaptation by Cameron Smith (cysmith@github)


I hope that this class will give you the tools to build cool projects like those!


Goals

  • Understand TF’s computation graph approach
  • Explore TF’s built-in functions and classes
  • Learn how to build and structure models best suited for a deep learning project


CS20


Staff

Chip Huyen (huyenn@stanford.edu)

Michael Straka (mstraka2@stanford.edu)

Pedro Garzon (pgarzon@stanford.edu)


Logistics

  • Piazza: piazza.com/stanford/winter2018/cs20
  • Staff email: cs20-win1718-staff@lists.stanford.edu
  • Students mailing list: cs20-win1718-students
  • Guests mailing list: cs20-win1718-guests


Grading

  • Assignments (3)
  • Participation
  • Check-in


Resources

  • The official documentation
  • TensorFlow’s official sample models
  • Stack Overflow should be your first port of call when you run into a bug
  • Books
    • Aurélien Géron’s Hands-On Machine Learning with Scikit-Learn and TensorFlow (O’Reilly, March 2017)
    • François Chollet’s Deep Learning with Python (Manning Publications, November 2017)
    • Nishant Shukla’s Machine Learning with TensorFlow (Manning Publications, January 2018)
    • Hope et al.’s Learning TensorFlow: A Guide to Building Deep Learning Systems (O’Reilly, August 2017)


Permission Number


Many of you are ahead of me in academia, so I probably need your help more than you need mine.


Getting Started


import tensorflow as tf
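A quick sanity check that your install works (a minimal sketch; this class assumes a TensorFlow 1.x release):

import tensorflow as tf
print(tf.__version__)  # should print a 1.x version for the code in this class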


Graphs and Sessions


Data Flow Graphs

TensorFlow separates the definition of computations from their execution:

Phase 1: assemble a graph.

Phase 2: use a session to execute operations in the graph.

(This might change in the future with eager mode!)

Graph from TensorFlow for Machine Intelligence



What’s a tensor?

An n-dimensional array

0-d tensor: scalar (number)

1-d tensor: vector

2-d tensor: matrix

and so on
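For instance, a minimal sketch (assuming TensorFlow 1.x, as in the rest of this lecture):

import tensorflow as tf

scalar = tf.constant(3)                 # 0-d tensor, shape ()
vector = tf.constant([1, 2, 3])         # 1-d tensor, shape (3,)
matrix = tf.constant([[1, 2], [3, 4]])  # 2-d tensor, shape (2, 2)

print(scalar.shape, vector.shape, matrix.shape)  # >> () (3,) (2, 2)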


Data Flow Graphs

import tensorflow as tf
a = tf.add(3, 5)


Visualized by TensorBoard


Why x, y?

TF automatically names the nodes when you don’t explicitly name them: here x = 3 and y = 5.
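You can also name nodes yourself via the name argument (a minimal sketch; the name "my_add" is just for illustration):

import tensorflow as tf

a = tf.add(3, 5, name="my_add")  # the node shows up as "my_add" in TensorBoard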


[Graph: constant nodes 3 and 5 flow into node a]

Nodes: operators, variables, and constants

Edges: tensors


Tensors are data.

TensorFlow = tensor + flow = data + flow

(I know, mind=blown)


import tensorflow as tf
a = tf.add(3, 5)
print(a)


>> Tensor("Add:0", shape=(), dtype=int32)

(Not 8)

5

3

a

38 of 65

How to get the value of a?

Create a session, assign it to variable sess so we can call it later

Within the session, evaluate the graph to fetch the value of a


import tensorflow as tf
a = tf.add(3, 5)
sess = tf.Session()
print(sess.run(a))
sess.close()


The session will look at the graph, thinking: hmm, how can I get the value of a? Then it computes all the nodes that lead to a.


>> 8


Better: use the session as a context manager, so it is closed automatically.

import tensorflow as tf
a = tf.add(3, 5)
with tf.Session() as sess:
    print(sess.run(a))

>> 8


tf.Session()

A Session object encapsulates the environment in which Operation objects are executed and Tensor objects are evaluated.


A session will also allocate memory to store the current values of variables.
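A minimal sketch of what that means (assuming TF 1.x; variables are covered properly next lecture): each session keeps its own copy of a variable’s value.

import tensorflow as tf

w = tf.Variable(10)        # a variable node in the graph
add_one = w.assign_add(1)  # an op that increments w

sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(w.initializer)   # each session initializes...
sess2.run(w.initializer)   # ...and stores its own value for w
print(sess1.run(add_one))  # >> 11
print(sess2.run(w))        # >> 10 (sess2’s copy is untouched)
sess1.close()
sess2.close()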


More graphs

import tensorflow as tf

x = 2
y = 3
op1 = tf.add(x, y)       # 5
op2 = tf.multiply(x, y)  # 6
op3 = tf.pow(op2, op1)   # 6 ** 5
with tf.Session() as sess:
    op3 = sess.run(op3)  # 7776


Visualized by TensorBoard


Subgraphs

import tensorflow as tf

x = 2
y = 3
add_op = tf.add(x, y)
mul_op = tf.multiply(x, y)
useless = tf.multiply(x, add_op)
pow_op = tf.pow(add_op, mul_op)
with tf.Session() as sess:
    z = sess.run(pow_op)


Because we only fetch the value of pow_op, and pow_op doesn’t depend on useless, the session won’t compute the value of useless

→ saves computation


x = 2
y = 3
add_op = tf.add(x, y)
mul_op = tf.multiply(x, y)
useless = tf.multiply(x, add_op)
pow_op = tf.pow(add_op, mul_op)
with tf.Session() as sess:
    z, not_useless = sess.run([pow_op, useless])


tf.Session.run(fetches, feed_dict=None, options=None, run_metadata=None)

fetches is a list of tensors whose values you want.
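feed_dict lets you feed values into the graph at run time (a hedged sketch; tf.placeholder is covered properly later in the course):

import tensorflow as tf

p = tf.placeholder(tf.int32)  # a node whose value is supplied at run time
q = tf.add(p, 5)
with tf.Session() as sess:
    print(sess.run(q, feed_dict={p: 3}))  # >> 8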


Subgraphs

It’s possible to break graphs into several chunks and run them in parallel across multiple CPUs, GPUs, TPUs, or other devices.

Example: AlexNet

Graph from Hands-On Machine Learning with Scikit-Learn and TensorFlow


Distributed Computation

To put part of a graph on a specific CPU or GPU:

# Creates a graph.
with tf.device('/gpu:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name='b')
    c = tf.multiply(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))
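Note that '/gpu:2' refers to the machine’s third GPU, and the run fails if that device doesn’t exist. A hedged sketch of the usual escape hatch, allow_soft_placement, which lets TF fall back to another device:

sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True,   # fall back if /gpu:2 is unavailable
    log_device_placement=True))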


What if I want to build more than one graph?


You can, but you don’t need more than one graph.

The session runs the default graph
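A quick sketch to see this for yourself (sess.graph is the graph the session runs):

import tensorflow as tf

a = tf.add(3, 5)  # added to the default graph
sess = tf.Session()
print(sess.graph is tf.get_default_graph())  # >> True
sess.close()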


But what if I really want to?


URGH, NO


  • Multiple graphs require multiple sessions, each of which will try to use all available resources by default
  • You can’t pass data between them without passing it through Python/NumPy, which doesn’t work in distributed mode
  • It’s better to have disconnected subgraphs within one graph


BUG ALERT!


I insist ...


tf.Graph()

To create a graph:

g = tf.Graph()


To add operators to a graph, set it as the default:

g = tf.Graph()
with g.as_default():
    x = tf.add(3, 5)

with tf.Session(graph=g) as sess:
    sess.run(x)


To get a handle to the default graph:

g = tf.get_default_graph()


Do not mix the default graph and user-created graphs:

g = tf.Graph()

# add ops to the default graph
a = tf.constant(3)

# add ops to the user created graph
with g.as_default():
    b = tf.constant(5)


Prone to errors


g1 = tf.get_default_graph()
g2 = tf.Graph()

# add ops to the default graph
with g1.as_default():
    a = tf.constant(3)

# add ops to the user created graph
with g2.as_default():
    b = tf.constant(5)


Better

But still not good enough, because you shouldn’t use more than one graph!




Why graphs

  • Save computation. Only run subgraphs that lead to the values you want to fetch.
  • Break computation into small, differentiable pieces to facilitate auto-differentiation (see the sketch after the figure below).
  • Facilitate distributed computation: spread the work across multiple CPUs, GPUs, TPUs, or other devices.
  • Many common machine learning models are taught and visualized as directed graphs.

A neural net graph from Stanford’s CS224N course
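A minimal sketch of graph-based auto-differentiation (assuming TF 1.x; tf.gradients walks the graph backwards to build the derivative):

import tensorflow as tf

x = tf.Variable(2.0)
y = x ** 2 + 3 * x          # y is built from small, differentiable ops
grads = tf.gradients(y, x)  # graph ops computing dy/dx

with tf.Session() as sess:
    sess.run(x.initializer)
    print(sess.run(grads))  # >> [7.0], since dy/dx = 2x + 3 = 7 at x = 2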


Next class

Basic operations

Constants and variables

Data pipeline

Fun with TensorBoard

Feedback: huyenn@stanford.edu

Thanks!
