TensorFlow Haskell API

BayHac 2017

Judah Jacobson

Frederick Mayle

Greg Steuck

github.com/tensorflow/haskell

TensorFlow

- An open-source library for Machine Learning
- Developed by Google, open-sourced in Nov 2015; release 1.0 in Feb 2017
- Graph-based computational framework to distribute computation between different devices (CPU, GPU, ASIC) and multiple machines
- Mainly used to build, train and evaluate neural networks (research and production)
- Used for: image classification, speech recognition, beating humans at Go, ...
- The most developed/supported API is in Python; other bindings exist for C++, Java, Rust, Go, Haskell.

Why Haskell?

- Use Haskell types to model TensorFlow’s runtime semantics
- Catch errors earlier: at compile time, rather than (for example) after scheduling a job on a remote machine
- Refactor code more safely
- Distinguish “pure” and “stateful” bits of code

Why TensorFlow?

- Computational kernels are already optimized (CPU, GPU, ...)
- Distributed computation between machines
- Existing tooling for monitoring, serving
- Designed for interop with multiple languages

Outline of Talk

- Introduction to TensorFlow using the Haskell bindings
- Details of Haskell APIs, and benefits of type safety
- Graph construction with a hybrid pure/monadic API
- Composable type constraints for tensor operations
- Code walkthrough and live demo

Constructing and Running Graphs

main = runSession $ do
    let node1 = vector [3::Float, 4]
    let node2 = vector [1, 1]
    let node3 = node1 + node2
    result <- run node3
    liftIO $ print $ Vector.toList result

>> [4.0,5.0]

[Diagram: TensorFlow graph in which two Const nodes (node1, node2) feed an Add node (node3)]

Graph of “nodes”: each node takes zero or more tensors as inputs, and produces zero or more outputs

“Tensor” here is a handle to a node in the graph. A tensor represents a multidimensional array.

Note: run is overloaded; here it evaluates node3 and returns a Vector Float.
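The idea that arithmetic on tensors builds graph nodes, and that nothing is computed until the graph is run, can be sketched with a toy model. This is not the tensorflow-haskell implementation: the Node type, eval and broadcast are names assumed purely for illustration, and "running" is plain evaluation on lists.

```haskell
module Main where

-- Toy graph node (hypothetical, for illustration only): either constant
-- data or an op applied to the outputs of other nodes.
data Node
  = Const [Float]
  | Add Node Node
  | Mul Node Node
  deriving (Show, Eq)

-- Numeric literals and operators build nodes instead of computing values.
instance Num Node where
  (+) = Add
  (*) = Mul
  fromInteger n = Const [fromInteger n]
  negate x = Mul (Const [-1]) x
  abs = error "abs: not modelled"
  signum = error "signum: not modelled"

-- "Running" the toy graph: elementwise evaluation, broadcasting scalars.
eval :: Node -> [Float]
eval (Const xs) = xs
eval (Add a b) = broadcast (+) (eval a) (eval b)
eval (Mul a b) = broadcast (*) (eval a) (eval b)

broadcast :: (Float -> Float -> Float) -> [Float] -> [Float] -> [Float]
broadcast f [x] ys = map (f x) ys
broadcast f xs [y] = map (`f` y) xs
broadcast f xs ys = zipWith f xs ys

main :: IO ()
main = do
  let node1 = Const [3, 4]
      node2 = Const [1, 1]
      node3 = node1 + node2
  print node3             -- the graph: Add (Const [3.0,4.0]) (Const [1.0,1.0])
  print (eval node3)      -- [4.0,5.0]
  print (eval (3 * node3)) -- [12.0,15.0]
```

In this sketch node3 is only a description of the computation; eval plays the role that run plays in the slides.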

Constructing and Running Graphs

main = runSession $ do  -- runSession :: Session a -> IO a
                        -- instance Monad Session
    let node1 = vector [3::Float, 4] :: Tensor Build Float
    let node2 = vector [1, 1] :: Tensor Build Float
    -- instance ... => Num (Tensor Build a)
    let node3 = node1 + node2
    -- run :: Tensor Build Float -> Session (Vector Float)
    result <- run node3
    liftIO $ print $ Vector.toList result

Note: run is overloaded; the signature in the comment above is the type it has in this program.

Composition of Ops

main = runSession $ do
    let node1 = vector [3::Float, 4]
    let node2 = vector [1, 1]
    let node3 = node1 + node2
    let node4 = 3 * node3
    result <- run node4
    liftIO $ print $ Vector.toList result

>> [12.0,15.0]

[Diagram: TensorFlow graph in which Const nodes node1 and node2 feed Add (node3); node3 and a Const node holding 3 feed Mul (node4)]

Compile-Time Type Checking

main = runSession $ do
    let node1 = vector [3::Float, 4]
    let node2 = vector [1::Int32, 1]
    let node3 = node1 + node2
    ...

Couldn't match type ‘Int32’ with ‘Float’
Expected type: Tensor Build Float
  Actual type: Tensor Build Int32
In the second argument of ‘(+)’, namely ‘node2’
In the expression: node1 + node2

Note: we only check that value types match, not sizes/dimensions.

TensorFlow has an underlying type system. The Python, C++, Java, and Go APIs all catch this error only at *runtime*. We (and perhaps also the Rust bindings) catch it at compile time, which is much better when, for example, you’re about to schedule jobs on multiple remote machines.

And not only do we check the type of node2, we can also infer it automatically without any additional type signatures.
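Both points, the mismatch being rejected by the compiler and the element type of node2 being inferred, can be reproduced without TensorFlow. The sketch below assumes a hypothetical stand-in type T and function addT; it only mirrors the shape of the real (+) on tensors.

```haskell
module Main where

-- Hypothetical stand-in for Tensor Build a: a list tagged with its element type.
newtype T a = T [a] deriving (Show, Eq)

-- Both arguments must share the element type a, as with (+) on tensors.
addT :: Num a => T a -> T a -> T a
addT (T xs) (T ys) = T (zipWith (+) xs ys)

example :: T Float
example =
  let node1 = T [3 :: Float, 4]
      node2 = T [1, 1]          -- no annotation: resolved to T Float by addT
  in addT node1 node2

main :: IO ()
main = do
  print example
  -- Uncommenting the next line makes compilation fail with a mismatch
  -- between Int and Float, before anything runs:
  -- print (addT (T [3 :: Float, 4]) (T [1 :: Int, 1]))
```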

Language Bindings: How Do They Work?

[Diagram: layered architecture]

- Kernel Implementations (C++): Array Ops, Math Ops, State Ops, ...
- TensorFlow Runtime (C++): graph creation, graph execution, device management
- Low-level C API
- Language APIs: Haskell API, Python API, Java API, ...

Haskell API: Structure

- TensorFlow.Core: Low-level API for graph construction and execution
- TensorFlow.GenOps.Core: Auto-generated wrappers for (nearly) all kernels
- Combinators on top of the original kernels
- Future work: higher-level combinators for Machine Learning and Neural Networks (tf.contrib.learn, Keras, etc.)

Haskell API: Session

data Session a -- A monad for building and running graphs

runSession :: Session a -> IO a

instance Monad Session

instance MonadIO Session

Session keeps track of: TensorFlow C API state, the current graph of nodes, the next unused node name identifier, ...

Construct graph nodes:

“Add”, “Const”, “Const”, ...

Assign unique names to the nodes:

“Add_1”, “Const_2”, “Const_3”, ...

Add those nodes to the TensorFlow graph

Run (a subset of) the nodes

Fetch the results into Haskell data
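The bookkeeping described above (a current graph plus the next unused name identifier) is essentially a state monad. The following is a minimal sketch under that assumption, not the library's Session: GraphState, addNode and runToySession are invented for illustration, and the toy runs purely instead of in IO.

```haskell
module Main where

-- State carried by the toy session: the next unused id and the nodes added
-- so far (newest first), each as (unique name, op type).
data GraphState = GraphState Int [(String, String)]

newtype Session a = Session { unSession :: GraphState -> (a, GraphState) }

instance Functor Session where
  fmap f (Session g) = Session $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Session where
  pure a = Session $ \s -> (a, s)
  Session f <*> Session g = Session $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Session where
  Session g >>= k = Session $ \s -> let (a, s') = g s in unSession (k a) s'

-- Add a node of the given op type to the graph and return its unique name.
addNode :: String -> Session String
addNode op = Session $ \(GraphState n ns) ->
  let name = op ++ "_" ++ show n
  in (name, GraphState (n + 1) ((name, op) : ns))

-- Run a toy session from an empty graph; returns the result and node names.
runToySession :: Session a -> (a, [String])
runToySession (Session g) =
  let (a, GraphState _ ns) = g (GraphState 1 [])
  in (a, reverse (map fst ns))

example :: Session String
example = do
  _ <- addNode "Const"
  _ <- addNode "Const"
  addNode "Add"

main :: IO ()
main = print (runToySession example)
-- prints ("Add_3",["Const_1","Const_2","Add_3"])
```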

Haskell API: Tensor

data Tensor v a -- A handle into the graph; an output of a node, representing

-- a multidimensional array of data.

Tensor Build a -- An “unnamed” node (3, vector [1,2], a + b)

Tensor Value a -- A node which has been assigned a unique name

Our API is a hybrid: pure “unnamed” expressions whenever possible; monadic (Session) actions returning “named” values when necessary.

Monads let us supply nodes with unique “identity”, but are more awkward than pure expressions. Compare:

do {a <- add x y; b <- mul 3 a; sub b 1}

(3 `mul` (x `add` y)) `sub` 1

3 * (x + y) - 1

Our API is a mix of monadic and pure operations. Since it will be used to write mathematical formulas, pure expressions lead to more readable code. However, assignment of unique names has to happen in a monad (Session).

Why monadic: Some operations need “identity”

Imagine a hypothetical “pure” API for random values:

truncatedNormal :: ... => Tensor v Int64 -> Tensor Build a

let x = truncatedNormal shape
    y = truncatedNormal shape
    z = sub x y

let x = truncatedNormal shape
    z = sub x x

Compare with the monadic version:

truncatedNormal :: ... => Tensor v Int64 -> Session (Tensor Value a)

x <- truncatedNormal shape
y <- truncatedNormal shape
let z = sub x y

x <- truncatedNormal shape
let z = sub x x

This isn’t just a Haskell problem; TensorFlow needs the programmer who writes such kernels to mark them as “stateful”, so that its own graph optimizations (CSE) don't break user code. However, our API appears to be unique in making the user code care about those details.
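The difference between the two versions can be made concrete with a toy model. In the sketch below, PureT, NamedT and normalM are hypothetical names, and an IORef counter stands in for whatever hands out unique node identities; nothing here samples real random numbers.

```haskell
module Main where

import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Pure model: a random op is an ordinary value with no identity.
data PureT = PShape [Int] | PNormal PureT | PSub PureT PureT
  deriving (Eq, Show)

-- Monadic model: every random node carries the unique id it was created with.
data NamedT = NShape [Int] | NNormal Int NamedT | NSub NamedT NamedT
  deriving (Eq, Show)

-- Creating a random node consumes a fresh id from the counter.
normalM :: IORef Int -> NamedT -> IO NamedT
normalM counter shape = do
  n <- readIORef counter
  writeIORef counter (n + 1)
  return (NNormal n shape)

pureTwoDraws, pureOneDraw :: PureT
pureTwoDraws = let x = PNormal (PShape [2]); y = PNormal (PShape [2]) in PSub x y
pureOneDraw  = let x = PNormal (PShape [2]) in PSub x x

main :: IO ()
main = do
  -- Pure API: "two draws" and "one draw used twice" are the same expression.
  print (pureTwoDraws == pureOneDraw)      -- True
  counter <- newIORef 0
  x <- normalM counter (NShape [2])
  y <- normalM counter (NShape [2])
  -- Monadic API: the two draws are distinct nodes.
  print (NSub x y == NSub x x)             -- False
```

Because the pure expressions are indistinguishable, any sharing or common-subexpression pass is free to merge them, which is exactly what must not happen for a random op.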

Stateful computations also need identity

TensorFlow’s graph nodes can modify persistent state:

The “variable” and “assign” nodes also need “identity”, so they have to take place in the Session monad.

let v = variable
    a = assign v 3
in run (a, a)

let v = variable
    w = variable
    a = assign v 3
    b = assign w 3
in run (a, b)

[Diagrams: each snippet is shown next to a graph containing a Variable, an Assign and a Const node]

Stateful Computations in TensorFlow

Tensor Ref a -- Stateful graph node (named)

variable :: Session (Tensor Ref a)

newtype ControlNode -- Named op with no outputs

assign :: Variable a -> Tensor v a -> Session ControlNode

v <- variable
a <- assign v 3
w <- variable
b <- assign w 3
run (a,b)

v <- variable
a <- assign v 3
run (a,a)

The render function

The run function takes care of assigning names (recursively) to Tensor Builds.

run (21 + 42 :: Tensor Build a) :: Session (Vector a)

Alternately, the render function exposes this behavior manually:

render :: Tensor Build a -> Session (Tensor Value a)

instance ... => Num (Tensor Build a)

add :: ... => Tensor v a -> Tensor v' a -> Tensor Build a

x <- render 21
y <- render 42
render (add x y)

render (add 21 42)

x <- render 21
render (add x 42)

[Diagram: Const_1 and Const_2 feed Add_3]

Rendering via Common Subexpression Elimination

Rendering a Tensor Build first checks whether an equivalent one has already been rendered; if so, it reuses that node instead of creating a new one.

mul :: ... => Tensor v a -> Tensor v' a -> Tensor Build a

let x = 42 + 17
render (mul x x)

x <- render (42 + 17)
y <- render (42 + 17)
render (mul x y)

let x = 42 + 17
    y = 42 + 17
render (mul x y)

[Diagram: Const_1 and Const_2 feed Add_3, which supplies both inputs of Mul_4]
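One way to picture rendering with common subexpression elimination is a table from "op type plus input names" to the name already assigned. The sketch below is a guess at the general technique, not the library's code: Expr, Key, intern and graphOf are hypothetical, and state is threaded explicitly rather than through Session.

```haskell
module Main where

import Data.List (sort)
import qualified Data.Map as Map

-- Unrendered expression, playing the role of a Tensor Build.
data Expr
  = Lit Integer        -- constant
  | Op String [Expr]   -- op type applied to unrendered inputs
  deriving (Eq, Show)

-- A node is identified by its op type (plus constant value) and input names.
type Key = (String, [String])

-- Rendering state: next unused id, and names of nodes rendered so far.
data St = St Int (Map.Map Key String)

-- Reuse the name of an equivalent node, or allocate a new one.
intern :: Key -> String -> St -> (String, St)
intern key op st@(St n m) =
  case Map.lookup key m of
    Just name -> (name, st)
    Nothing ->
      let name = op ++ "_" ++ show n
      in (name, St (n + 1) (Map.insert key name m))

-- Render inputs first (left to right), then the node itself.
render :: Expr -> St -> (String, St)
render (Lit k) st = intern ("Const:" ++ show k, []) "Const" st
render (Op op args) st =
  let (names, st') = renderAll args st
  in intern (op, names) op st'

renderAll :: [Expr] -> St -> ([String], St)
renderAll [] st = ([], st)
renderAll (e : es) st =
  let (n, st1) = render e st
      (ns, st2) = renderAll es st1
  in (n : ns, st2)

-- Name of the rendered root and the sorted names of all nodes in the graph.
graphOf :: Expr -> (String, [String])
graphOf e =
  let (root, St _ m) = render e (St 1 Map.empty)
  in (root, sort (Map.elems m))

main :: IO ()
main = do
  let x = Op "Add" [Lit 42, Lit 17]
      y = Op "Add" [Lit 42, Lit 17]
  print (graphOf (Op "Mul" [x, x]))
  print (graphOf (Op "Mul" [x, y]))
  -- both print ("Mul_4",["Add_3","Const_1","Const_2","Mul_4"])
```

Whether x is shared or written out twice, the toy renderer ends up with the same four nodes, matching the graph on the slide.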

Tensor Value Types

class TensorType a where ...

Instances: Float, Double, Int{8,16,32,64}, Word{8,16}, Bool, ByteString, Complex {Float,Double}, ...

For example:

fill :: TensorType a
     => Tensor v Int64   -- ^ shape of the output
     -> Tensor v' a      -- ^ Scalar value to fill the result
     -> Tensor Build a

Type-restricted ops

How can we write ops that only work for certain types?

add :: ... => Tensor v a -> Tensor v' a -> Tensor Build a

mul :: ... => Tensor v a -> Tensor v' a -> Tensor Build a

sum :: ... => Tensor v a -> Tensor v' ix -> Tensor Build a

The TensorFlow runtime links in a separate implementation for each type:

- a can be Int*, Word*, Float, Double, Complex*, but not ByteString
- ix can be an index type (Int32, Int64)

TensorFlow gives us this information at the time of code generation; can we use it?

Type-restricted ops

We could generate a separate class for every op, but the constraints would grow with the number of ops in an expression:

add :: AddType a => Tensor v a -> Tensor v' a -> Tensor Build a

foo a ix = sum (add a (mul a a)) ix

foo :: (AddType a, MulType a, SumType a, SumIxType ix)
    => Tensor v a -> Tensor v' ix -> Tensor Build a

class AddType a where
instance AddType Int32
instance AddType Float
...

class SumIxType a where
instance SumIxType Int32
instance SumIxType Int64

The OneOf constraint

We define a “constraint type function”: OneOf :: * -> [*] -> Constraint

LANGUAGE DataKinds, ConstraintKinds, TypeFamilies, ...

sum :: (OneOf a '[Int32, Int64, Float, ...],
        OneOf ix '[Int32, Int64])
    => Tensor v a -> Tensor v' ix -> Tensor Build a

This is tricky to implement because Haskell doesn’t have such “untagged union” types in general.

For example, Tensor v (Either Int32 Int64) cannot be unified with Tensor v Int32 or Tensor v Int64.

Type-level constraints

Our trick: use the fixed set of possible value types in TensorFlow:

class TensorType a

instance TensorType Int32 -- also Int64, Float, Double, ...

OneOf ix '[Int32, Int64]
    === (TensorType ix, ix /= Float, ix /= Double,
         ix /= ByteString, ix /= Bool, ...)

type family a /= b :: Constraint where
    a /= a = TypeError (Text "Unexpected type " :<>: ShowType a)
    a /= b = ()

See TensorFlow.Types for details of the implementation.

Type-level constraints: usage

Assume five tensor types: Int32, Int64, Double, Float, Bool.

foo :: OneOf a '[Double, Float, Int32] => a -> a

bar :: OneOf a '[Float, Int32, Int64] => a -> a

Then their composition has type:

foo . bar :: OneOf a '[Float, Int32] => a -> a

The above types are aliases for:

foo :: (TensorType a, a /= Int64, a /= Bool) => a -> a

bar :: (TensorType a, a /= Double, a /= Bool) => a -> a

foo . bar :: (TensorType a,
              a /= Int64, a /= Double, a /= Bool) => a -> a
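The trick can be tried in isolation. The sketch below is a self-contained approximation for the five-type universe assumed above, keeping the slides' argument order for OneOf; the helper families Delete, Without and NoneOf and the alias AllTypes are assumptions made for this example, so TensorFlow.Types remains the reference for the real definitions.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Int (Int32, Int64)
import Data.Kind (Constraint, Type)
import GHC.TypeLits (ErrorMessage (..), TypeError)

-- The closed universe of value types (five, as assumed on the slide).
class TensorType a
instance TensorType Int32
instance TensorType Int64
instance TensorType Float
instance TensorType Double
instance TensorType Bool

type AllTypes = '[Int32, Int64, Float, Double, Bool]

-- Satisfied when the two types differ; a custom compile error when equal.
type family (a :: Type) /= (b :: Type) :: Constraint where
  a /= a = TypeError ('Text "Unexpected type " ':<>: 'ShowType a)
  a /= b = ()

-- Remove one type from a type-level list.
type family Delete (a :: Type) (as :: [Type]) :: [Type] where
  Delete a '[] = '[]
  Delete a (a ': as) = Delete a as
  Delete a (b ': as) = b ': Delete a as

-- Remove every type in the second list from the first.
type family Without (as :: [Type]) (bs :: [Type]) :: [Type] where
  Without as '[] = as
  Without as (b ': bs) = Without (Delete b as) bs

-- The type a differs from every member of ts.
type family NoneOf (ts :: [Type]) (a :: Type) :: Constraint where
  NoneOf '[] a = ()
  NoneOf (t ': ts) a = (a /= t, NoneOf ts a)

-- "a is one of ts" becomes "a is a tensor type and none of the others".
type OneOf a ts = (TensorType a, NoneOf (Without AllTypes ts) a)

foo :: OneOf a '[Double, Float, Int32] => a -> a
foo = id

bar :: OneOf a '[Float, Int32, Int64] => a -> a
bar = id

-- The declared list is the intersection of foo's and bar's lists.
fooBar :: OneOf a '[Float, Int32] => a -> a
fooBar = foo . bar

main :: IO ()
main = do
  print (fooBar (1 :: Int32))
  print (foo (2.5 :: Double))
  -- print (fooBar (1 :: Int64))  -- rejected at compile time by the TypeError
```

Expressing membership as a set of exclusions is what lets constraints from foo and bar combine: the exclusions are simply collected, and duplicates such as a /= Bool coincide.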

Demo: MNIST

- The “hello world” dataset of computer vision
- 28x28 grayscale images of digits
- Digit classification: Learn a function f :: [Pixel] -> Digit

Simple Neural Network

Walk through code + live demo

Future Work

- Documentation and examples
- Higher-level APIs for optimization and Neural Networks
- Hook into new Gradients API to get better coverage
- Support sparse Tensors
- Real-world applications

Suggestions, issues, pull requests welcome!

github.com/tensorflow/haskell

Extra Slides

Executing Graphs

run :: Fetchable t a => t -> Session a

An overloaded function to execute given graph nodes and fetch the resulting data into Haskell types.

- The input can be rendered (Tensor Value a) or unrendered (Tensor Build a). run will automatically render its input if necessary.

instance Fetchable (Tensor v a) (Storable.Vector a)

instance Fetchable t a => Fetchable [t] [a]

instance (Fetchable t1 a1, Fetchable t2 a2)

=> Fetchable (t1, t2) (a1, a2)

...

For example: Fetchable [Tensor Build a] [Vector a]

Other multidimensional arrays are also possible as fetchable outputs.
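How the instances compose can be seen in a toy version of the class. In this sketch the Tensor type already holds its data and fetch is a pure function, both assumptions made to keep the example self-contained; the real class fetches from a running session.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Toy tensor that already holds its data (hypothetical stand-in).
newtype Tensor a = Tensor [a]

-- Toy counterpart of Fetchable: t is what is fetched, a is the result type.
class Fetchable t a where
  fetch :: t -> a

instance Fetchable (Tensor a) [a] where
  fetch (Tensor xs) = xs

-- A list of fetchable things yields a list of results.
instance Fetchable t a => Fetchable [t] [a] where
  fetch = map fetch

-- A pair of fetchable things yields a pair of results.
instance (Fetchable t1 a1, Fetchable t2 a2) => Fetchable (t1, t2) (a1, a2) where
  fetch (x, y) = (fetch x, fetch y)

main :: IO ()
main = do
  let t1 = Tensor [1, 2 :: Float]
      t2 = Tensor [3 :: Float]
  -- The result type is chosen by annotation, since fetch is overloaded.
  print (fetch t1 :: [Float])
  print (fetch (t1, [t2, t1]) :: ([Float], [[Float]]))
```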

Executing Graphs with Placeholders

Graph execution can depend on values that are “fed” in at run time:

placeholder :: Session (Tensor Value a)

runWithFeeds :: Fetchable t e => [Feed] -> t -> Session e

feed :: Tensor Value a -> TensorData -> Feed

do x <- placeholder
   y <- placeholder
   result <- runWithFeeds [feed x (simpleEncode [] [3, 4]),
                           feed y (simpleEncode [] [5, 6])]
                          (add x y)
   liftIO $ print $ Vector.toList result

>> [8.0,10.0]
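Feeding can be thought of as evaluating the graph against an environment of supplied values. The following toy evaluator illustrates that idea only; Node, Feed and runWithFeedsToy are hypothetical names, placeholders are identified by a string, and a missing feed is reported as a Left value rather than a runtime error.

```haskell
module Main where

import qualified Data.Map as Map

-- Toy graph node (hypothetical, for illustration only).
data Node
  = Placeholder String   -- value supplied at run time, looked up by name
  | Const [Float]
  | Add Node Node

-- A feed pairs a placeholder name with the data to substitute for it.
type Feed = (String, [Float])

-- Evaluate the graph, substituting fed values for placeholders.
runWithFeedsToy :: [Feed] -> Node -> Either String [Float]
runWithFeedsToy feeds = go
  where
    env = Map.fromList feeds
    go (Const xs) = Right xs
    go (Placeholder name) =
      maybe (Left ("no value fed for " ++ name)) Right (Map.lookup name env)
    go (Add a b) = zipWith (+) <$> go a <*> go b

main :: IO ()
main = do
  let x = Placeholder "x"
      y = Placeholder "y"
  print (runWithFeedsToy [("x", [3, 4]), ("y", [5, 6])] (Add x y))
  -- Right [8.0,10.0]
  print (runWithFeedsToy [("x", [3, 4])] (Add x y))
  -- Left "no value fed for y"
```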