TensorFlow Tutorial [part 2]

Aram, Greg, Sami

CS 699. Representation Learning. Fall 2019

Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.


Goal from this material

The goal is not to tour the different TensorFlow frameworks (Keras, etc).

Instead, the goal is to give technical insight into ...

… how to re-build these frameworks from scratch

Useful?

Of course:

  • For your own research! Make your computation faster. Write less code.
  • Adding new TensorFlow features (e.g. making a graph algorithm, such as max-bipartite matching, able to pass gradients) is publishable [with some sales pitch]!


Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.


Python VS C++

  • TensorFlow has Python APIs but executes in C++.
  • In many cases, a task can be accomplished with some combination of Python and C++:
    • Computation: inference/prediction (forward pass) and gradients (backward pass) should be done in C++.
    • Graph definition should be done in Python.
    • What about the rest?
      • Data manipulation [data on disk is *not* exactly what gets fed to the model]. Examples:
        • Processing text [maybe you want to map “words” onto IDs/one-hot vectors, or use a pre-trained embedding like word2vec].
        • Data augmentation: sometimes flip an image, stretch it, or apply any affine transformation. Dropout on large (sparse) inputs, e.g. documents.
      • Meta-Optimization
        • Optimizing a neural network as part of alternating minimization, e.g. with a quadratic solver.
      • Should these be in C++ or in Python?


Python or C++?

  • Roughly, moving more logic to C++ implies faster execution.
  • If there is something you do a lot (e.g. image pre-processing), chances are, someone contributed a TensorFlow implementation.
  • In general, with small effort, you can run most operations in the C++ engine.
    • This works when the operation can be described easily using tensor transformations.
    • If not (e.g. I/O, like reading an image/video), someone has likely contributed it to TF.
  • Question we will answer:
    • Where to do data manipulation? [for-loops over the dataset, loading images from disk]
      • In C++ or Python?
        • Recommendation: rely on C++ whenever possible! (see the sketch below)
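To make that recommendation concrete, here is a minimal sketch of data augmentation done entirely with graph ops, so it runs in the C++ engine instead of a Python for-loop. The file names, image size, and the load_and_augment name are made up for illustration:

import tensorflow as tf  # TF 1.x style, as in the rest of this tutorial

filenames = ['/tmp/img_0.jpg', '/tmp/img_1.jpg']  # hypothetical image paths

def load_and_augment(path):
  # Every call below builds a graph op; no Python runs per example at training time.
  image = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
  image = tf.image.random_flip_left_right(image)
  image = tf.image.resize_image_with_crop_or_pad(image, 224, 224)
  return tf.cast(image, tf.float32) / 255.0

dataset = tf.data.Dataset.from_tensor_slices(filenames).map(load_and_augment)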


Data Manipulation:

  • tf.data: this module contains many data readers.
  • tf.data.Dataset class:
    • An instance points to some data source (e.g. a list of images on disk).
      • Provides iterators over the source, usable in the TF graph as Tensors.
      • E.g. calling sess.run() on an iterator’s tensor returns a numpy array of one example.
    • The class is mature. Existing functionality covers many of our use-cases (e.g. files from disk).
    • Follows the Builder Pattern (each method returns a new Dataset with modified behavior):
      • shuffle(buffer_size) # returns the same data source, with shuffled order.
      • map(fn) # takes a function that transforms every data element.
      • batch(batch_size) # iterators built afterwards return an extra (batch) dimension of the given size.

dataset = … # some images dataset.

batched_train = dataset.map(random_flip).map(random_crop).batch(20)

iter = batched_train.make_initializable_iterator()
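Continuing the snippet above, a minimal sketch of how the iterator is consumed (random_flip and random_crop are assumed to be user-defined functions that map one image to one image):

init_op = iter.initializer            # op that (re)starts the iterator
next_batch = iter.get_next()          # Tensor with an extra batch dimension of 20

with tf.Session() as sess:
  sess.run(init_op)
  while True:
    try:
      images = sess.run(next_batch)   # numpy array, one batch per call
    except tf.errors.OutOfRangeError:
      break                           # one full pass over the data is done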


tf.data.Dataset

  • A number of useful concrete implementations, e.g. for CSV data.

Code below (based on https://www.tensorflow.org/api_docs/python/tf/data/experimental/CsvDataset)
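A minimal sketch in the spirit of the linked CsvDataset docs; the file name and the column defaults (two float features, one int label) are made up:

record_defaults = [tf.float32, tf.float32, tf.int32]
dataset = tf.data.experimental.CsvDataset(
    'my_data.csv', record_defaults, header=True)

it = dataset.make_one_shot_iterator()
x1, x2, y = it.get_next()             # one parsed row per evaluation
with tf.Session() as sess:
  print(sess.run([x1, x2, y]))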


Other Datasets (we won’t cover)

  • File-list datasets tf.data.Dataset.list_files(pattern):
    • Given a directory pattern (‘/home/users/.../*.jpg’), returns a dataset whose iterator tensor evaluates to one filename per evaluation (sess.run).
  • TFRecordDataset (to read TF Records)
    • A compact record format, with TF APIs for reading and writing the files (see tutorial).
  • TextLineDataset


Datasets: Common Usage Pattern

More or less, 95%+ of data pipelines in TensorFlow follow this pattern (sketched in code below):

  • Construct Dataset object, pointing to your data.
  • Manipulate your data (get it ready for your pipeline).
  • Combine data sources (if applicable) [a.k.a Dataset zip].
  • Shuffle.
  • Batch.
  • Make Iterator.
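Putting the whole pattern together, a minimal sketch with made-up in-memory data (in practice the two sources would point to files on disk):

# 1. Point: two parallel sources (features and labels), combined with zip.
features = tf.data.Dataset.from_tensor_slices(tf.random_normal([1000, 8]))
labels = tf.data.Dataset.from_tensor_slices(
    tf.random_uniform([1000], maxval=10, dtype=tf.int32))
dataset = tf.data.Dataset.zip((features, labels))

# 2./3. Manipulate & combine are done; 4. Shuffle; 5. Batch.
dataset = dataset.shuffle(buffer_size=1000).batch(32)

# 6. Make iterator.
it = dataset.make_initializable_iterator()
x, y = it.get_next()

with tf.Session() as sess:
  sess.run(it.initializer)
  x_np, y_np = sess.run([x, y])       # one batch of (features, labels)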


Try out Datasets in Terminal

  • Recall: Point → Map / Zip → Shuffle → Batch → Iterate


Variables

Variables are Tensors!

  • Like placeholders:
    • Have a type & shape. Usable in TF operations (tf.matmul) and they overload Python operators: +, -, /, *.
  • Unlike placeholders:
    • Shape must be fully defined (dynamic [None] dimensions are not allowed).
    • Each variable occupies a memory footprint in the TensorFlow session (in C++).
    • Has an initializer: an assign op that, when run (via sess.run), reserves memory in C++ and sets the variable to its initial value.
    • Offers an assign(tensor) function. It returns an assign op; when run (via sess.run), the memory content of the variable is set to the value of the tensor.

Walk through docs: https://www.tensorflow.org/api_docs/python/tf/Variable
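A minimal sketch of the points above (the variable name and shape are made up):

w = tf.Variable(tf.zeros([3, 3]), name='w')   # type & shape fully defined
y = tf.matmul(w, w) + 1.0                     # usable like any other Tensor

with tf.Session() as sess:
  sess.run(w.initializer)                     # reserves memory, sets initial value
  sess.run(w.assign(tf.ones([3, 3])))         # assign returns an op; running it overwrites w
  print(sess.run(y))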


Switch to Terminal to try things out!

  • Try out variables (a rough sketch follows this list). Noting:
    • tf.global_variables()
    • tf.trainable_variables()
    • assign, assign_add, assign_sub.
    • Initializers.
    • Two different sessions.
    • Optimization.
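A rough sketch of those items (the variable names and numbers are made up):

v = tf.Variable(0.0, name='counter')
x = tf.Variable(1.0, name='scale', trainable=False)

print(tf.global_variables())      # [v, x]
print(tf.trainable_variables())   # [v] only; x was created with trainable=False

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(v.assign_add(2.0))     # v is now 2.0
  sess.run(v.assign_sub(0.5))     # v is now 1.5

# A tiny optimization: minimize (v - 3)^2 over the trainable variables.
loss = tf.square(v - 3.0)
step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
with tf.Session() as sess:        # a second session: fresh memory, must re-initialize
  sess.run(tf.global_variables_initializer())
  for _ in range(100):
    sess.run(step)
  print(sess.run(v))              # close to 3.0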


Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.


Layers

  • There are two kinds of layer implementations: one in contrib (tf.contrib.layers), another in Keras (tf.keras.layers). They achieve similar functionality.
    • They have what we want (batch norm layer, fully-connected/dense/matmul layer, convolution, dropout, ...)
  • They differ on:
    • Keras layers have “python class” implementations. Can be useful for Object-Oriented Programmers (e.g. override a layer class to do something special, useful for me when doing Meta-Learning).
    • tf.contrib is function based.
    • Contrib layers take an “is_training” tensor, whereas Keras layers tend not to (both styles are sketched below).
      • Using the first, you write “one model” for both training and testing: when training, feed is_training as True; when testing, feed False.
    • Lots of times, I mix-and-match Keras + tf.contrib.
    • Keras offers .fit() and .predict(), like sklearn. It does *not* modify loss collections.
    • Show & Explain MNIST in Keras: https://www.tensorflow.org/tutorials
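A minimal sketch contrasting the two styles; the layer sizes are made up, and the is_training placeholder follows the contrib convention described above:

x = tf.placeholder(tf.float32, [None, 784])
is_training = tf.placeholder(tf.bool, [])

# contrib: function-based; batch_norm behaves differently at train vs. test time.
h = tf.contrib.layers.fully_connected(x, 128, activation_fn=tf.nn.relu)
h = tf.contrib.layers.batch_norm(h, is_training=is_training)
logits_contrib = tf.contrib.layers.fully_connected(h, 10, activation_fn=None)

# Keras: class-based; each layer is an object that you call on a tensor.
dense = tf.keras.layers.Dense(128, activation='relu')
logits_keras = tf.keras.layers.Dense(10)(dense(x))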


Switch to Terminal to try things out!

  • Try different layers [contrib VS keras]
    • Noticing Collections (a sketch follows below).
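For the collections part, a minimal sketch of how regularized layers and tf.losses populate the loss collections (the regularization strength and layer sizes are made up):

x = tf.placeholder(tf.float32, [None, 784])
labels = tf.placeholder(tf.int64, [None])

# A regularized layer: the L2 term is added to the REGULARIZATION_LOSSES collection.
h = tf.contrib.layers.fully_connected(
    x, 128, weights_regularizer=tf.contrib.layers.l2_regularizer(1e-4))
logits = tf.contrib.layers.fully_connected(h, 10, activation_fn=None)

# tf.losses.* functions add their result to the LOSSES collection.
tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

print(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))  # [the L2 tensor]
total_loss = tf.losses.get_total_loss()  # data losses + regularization losses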


Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.


Optimization Loop

In Keras, it is automatic (model.fit()). In barebones TensorFlow:

opt = tf.train.AdamOptimizer(learning_rate=0.01)

train_op = tf.contrib.training.create_train_op(tf.losses.get_total_loss(), opt)

# repeatedly call (in for-loop):

sess.run(train_op, { … })

What is train_op?

  • Roughly, a group of many var.assign(var - lr * grad_loss_var) ops, one per trainable variable.
    • Plus update ops, e.g. from Adam (its moment estimates) and BatchNorm (its moving averages). A sketch follows.
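A hedged sketch of roughly what create_train_op builds under the hood (simplified; the real implementation also handles global_step, gradient clipping, etc.):

opt = tf.train.AdamOptimizer(learning_rate=0.01)
loss = tf.losses.get_total_loss()

grads_and_vars = opt.compute_gradients(loss)              # [(grad_loss_var, var), ...]
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)   # e.g. BatchNorm moving averages
with tf.control_dependencies(update_ops):                 # run the updates with each step
  train_op = opt.apply_gradients(grads_and_vars)          # the var.assign(...)-style updates

# then, as above: sess.run(train_op, { ... }) inside the training for-loop.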


Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.


Saving Model

  • You can save the parameters like:

import pickle
var_dict = {v.name: v for v in tf.global_variables()}
pickle.dump(sess.run(var_dict),
            open('trained_vars.pkl', 'wb'))

  • And restore like:

import pickle
var_values = pickle.load(
    open('trained_vars.pkl', 'rb'))
assign_ops = [v.assign(var_values[v.name])
              for v in tf.global_variables()]
sess.run(assign_ops)
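The standard alternative is TensorFlow's own checkpoint mechanism; a minimal sketch (the checkpoint path is made up):

saver = tf.train.Saver()                    # covers all variables by default
saver.save(sess, '/tmp/my_model.ckpt')      # write a checkpoint
saver.restore(sess, '/tmp/my_model.ckpt')   # later (possibly in a new session): load it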


Why not Keras-all-the-way?

  • On your assignments, you can use Keras or not, TensorFlow or not, for the most part (though some specific questions on HW2 require TensorFlow).
  • Keras provides a lot of easy-to-use functionality (I use it, too, but not everywhere), especially for “linked-list” feed-forward computation.
  • We want to teach barebones TensorFlow to allow us to do computation that is not trivially supported by Keras, including Graph Convolution.


Things we do not cover in detail

  • Distributed computation / Shared Parameter Servers.
    • https://www.tensorflow.org/guide/distribute_strategy
    • Useful if you have many computers and very large data, to train shared model (e.g. using Downpour SGD; Dean et al, NeurIPS 2012).
  • TensorBoard (visualizing training curves and the graph).
  • Scoping
    • Only adds a prefix to the names of variables created under the scope.
    • [try out in terminal; a sketch follows this list]
  • Program control flow: tf.while_loop, tf.cond.
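For the scoping point, a minimal sketch (the scope and variable names are made up):

with tf.variable_scope('encoder'):
  w = tf.get_variable('w', shape=[3, 3])
  b = tf.Variable(tf.zeros([3]), name='b')

print(w.name)   # 'encoder/w:0'
print(b.name)   # 'encoder/b:0'  -- the scope only prefixes names, nothing else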


Agenda: TensorFlow tutorial

  • Datasets, zip, iterators, batching & shuffling.
  • Variables, assignment.
  • Layers.
  • Collections, Regularization, Losses.
  • Optimization Loops.
  • Saving, loading.
