1 of 20

MLIR

2 of 20

I am not an expert!

  • MLIR: Multi-Level Intermediate Representation for Compiler Infrastructure https://youtu.be/qzljG6DKgic
  • Slides: https://bit.ly/2kJHErx

3 of 20

Today

4 of 20

The gap leading to domain-specific IRs elsewhere

  • Domain-specific optimizations, progressive lowering

[Diagram: language frontends with their own domain-specific IRs, progressively lowering toward machine code]

  • C, C++, ObjC, CUDA, OpenCL, ... → Clang AST → (CIL IR) → LLVM IR
  • Java & JVM Languages → Java BC
  • Swift → Swift AST → SIL IR → LLVM IR
  • Rust → Rust AST → MIR IR → LLVM IR
  • Julia → Julia AST → Julia IR → LLVM IR
  • TensorFlow Ecosystem → TF Graph → XLA HLO → LLVM IR
  • LLVM IR → Machine IR → Asm

5 of 20

The TensorFlow compiler ecosystem

Many “Graph” IRs, each with challenges:

  • Similar-but-different proprietary technologies: not going away anytime soon
  • Fragile, poor UI when failures happen: e.g. poor/no location info, or even crashes
  • Duplication of infrastructure at all levels

[Diagram: many "Graph" IRs fanning out from the TensorFlow Graph]

  • TensorFlow Graph → Grappler
  • TensorFlow Graph → XLA HLO → TPU IR, LLVM IR, several others
  • TensorFlow Graph → TensorFlow Lite → NNAPI, Core ML, many others
  • TensorFlow Graph → TensorRT, nGraph, ...

6 of 20

Domain Specific IRs

Great!

  • High-level domain-specific optimizations
  • Progressive lowering encourages reuse between levels

Not great!

  • Huge expense to build this infrastructure
  • Reimplementation of all the same stuff:
    • pass managers, location tracking, use-def chains, inlining, constant folding, CSE, testing tools, …
  • Innovations in one community don’t benefit the others

7 of 20

What is MLIR?

8 of 20

What is MLIR?

  • TensorFlow
    • "An open source machine learning framework for everyone"
  • Multi-Level Intermediate Representation
    • "An open source program optimization framework for everyone"
  • Abstraction Building Toolkit
  • Reusable set of compiler passes for higher abstractions
    • Targeting analysis/program optimization/code generation
  • Open governance and part of LLVM

9 of 20

10 of 20

Extensible Operations Allow Multi-Level IR

TensorFlow

XLA HLO

LLVM IR

Also: TF-Lite, Core ML, other frontends, etc ...

%x = "tf.Conv2d"(%input, %filter)
     {strides: [1,1,2,1], padding: "SAME", dilations: [2,1,1,1]}
     : (tensor<*xf32>, tensor<*xf32>) -> tensor<*xf32>

%m = "xla.AllToAll"(%z)
     {split_dimension: 1, concat_dimension: 0, split_count: 2}
     : (memref<300x200x32xf32>) -> memref<600x100x32xf32>

%f = llvm.add %a, %b : !llvm.float

(Progressive lowering between the levels.)

Don't we end up with the JSON of compiler IRs?

11 of 20

MLIR “Dialects”: Families of defined operations

Example Dialects:

  • TensorFlow, LLVM IR, XLA HLO, TF Lite, Swift SIL…

Dialects can define:

  • Sets of defined operations
  • Entirely custom type system
  • Customization hooks
  • Constant folding, decoding

Operation can define:

  • Invariants on # operands, results, attributes, etc
  • Custom parser, printer, verifier, …
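The dialect/operation split above can be sketched in plain Python. This is a toy model of the concepts only, not MLIR's actual C++ or Python API; all names here are illustrative:

```python
# Toy model of MLIR-style dialects: a namespace owning a set of ops,
# each with a verifier enforcing invariants on operands and attributes.

class Op:
    def __init__(self, name, operands, attrs, verifier):
        self.name, self.operands, self.attrs = name, operands, attrs
        self.verifier = verifier

    def verify(self):
        return self.verifier(self)

class Dialect:
    def __init__(self, namespace):
        self.namespace = namespace
        self.ops = {}  # op name -> verifier callback

    def define_op(self, name, verifier):
        self.ops[name] = verifier

    def create(self, name, operands=(), attrs=None):
        return Op(f"{self.namespace}.{name}", list(operands),
                  attrs or {}, self.ops[name])

# A hypothetical "tf" dialect whose Conv2d requires exactly two operands
# and a 'padding' attribute -- the kind of invariant an op verifier checks.
tf = Dialect("tf")
tf.define_op("Conv2d",
             lambda op: len(op.operands) == 2 and "padding" in op.attrs)

good = tf.create("Conv2d", ["%input", "%filter"], {"padding": "SAME"})
bad = tf.create("Conv2d", ["%input"])
print(good.verify(), bad.verify())  # True False
```

In real MLIR, ops are typically declared in TableGen (ODS) and verifiers run automatically after each pass; the sketch only shows the shape of the idea.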

12 of 20

MLIR Type System - some examples

Scalars:

  • f16, bf16, f32, … i1, i8, i16, i32, … i3, i4, i7, i57, …

Vectors:

  • vector<4 x f32> vector<4x4 x f16> etc.

Tensors, including dynamic shape and rank:

  • tensor<4x4 x f32> tensor<4x?x?x17x? x f32> tensor<* x f32>

Others: functions, memory buffers, quantized integers, other TensorFlow stuff, ...

Extensible!!
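As a rough illustration of the shape notation (dynamic `?` dims, unranked `*`), here is a toy parser for these type strings. It is not MLIR's actual implementation, just a sketch of what the syntax encodes:

```python
# Toy parser for MLIR-style tensor types, e.g. tensor<4x?x?x17x? x f32>
# or the unranked form tensor<* x f32>. Illustrative only.

def parse_tensor_type(s):
    s = s.replace(" ", "")  # the slide notation allows spaces around 'x'
    assert s.startswith("tensor<") and s.endswith(">")
    body = s[len("tensor<"):-1]
    dims_part, _, elem = body.rpartition("x")
    if dims_part == "*":
        return {"shape": None, "elem": elem}  # unranked: rank unknown too
    # '?' marks a dynamic dimension: size unknown until runtime
    dims = [None if d == "?" else int(d) for d in dims_part.split("x")]
    return {"shape": dims, "elem": elem}

print(parse_tensor_type("tensor<4x4 x f32>"))        # {'shape': [4, 4], 'elem': 'f32'}
print(parse_tensor_type("tensor<4x?x?x17x? x f32>")) # dynamic dims become None
print(parse_tensor_type("tensor<* x f32>"))          # unranked: shape is None
```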

13 of 20

Applications

14 of 20

TensorFlow Lite Converter

  • TensorFlow to TensorFlow Lite converter
    • Two different graph representations
      • Different set of ops & types
    • Different constraints/targets
  • Overlapping goals with regular compilation
    • Edge devices also can have accelerators (or a multitude of them!)
  • MLIR's pluggable type & rewrite system simplifies transforms & improves expressibility
    • Quantized types are a first-class citizen in the dialect

[Pipeline: TF Graph → translate → tf.* → legalize → tfl.* → optimize → tfl.* → translate → TFLite flatbuffer]
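The legalize step in the pipeline above can be sketched as a table of rewrite patterns. The real converter uses MLIR's pattern-rewrite infrastructure; the op-name mappings below are illustrative only:

```python
# Toy "legalize" pass: rewrite tf.* ops into tfl.* ops via a pattern table,
# collecting failures instead of crashing. Real TFLite conversion uses
# MLIR's DialectConversion framework; these mappings are illustrative.

PATTERNS = {
    "tf.Conv2d": "tfl.conv_2d",
    "tf.AvgPool": "tfl.average_pool_2d",
}

def legalize(ops):
    out, failures = [], []
    for op in ops:
        if op in PATTERNS:
            out.append(PATTERNS[op])
        else:
            failures.append(op)  # report *which* op blocked conversion
    return out, failures

legal, failed = legalize(["tf.Conv2d", "tf.SomeCustomOp", "tf.AvgPool"])
print(legal)   # ['tfl.conv_2d', 'tfl.average_pool_2d']
print(failed)  # ['tf.SomeCustomOp'] -- surfaced to the user, not a crash
```

Note how unsupported ops are collected and reported rather than aborting, which is exactly the usability point on the next slide.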

15 of 20

One of the focuses: Usability

  • Usability of TOCO is the top complaint among TFLite users
    • CHECK'ing on errors, unsupported cases, confusing error messages
  • Debugging
    • Location back to your source!
      • Build on and extend TF debug info work
    • Tracking origin of instructions (e.g., origin of FMA from multiply and add is FusedLoc)
  • Report why a model failed to convert
    • Point to the unsupported ops/types/shapes
      • "Why does this model have unsupported X?"
  • Dialect types enable more checking & better reporting
    • Dialects can define types, higher-order types (such as quantized types) enable better invariant checks

16 of 20

Old “TOCO” User Experience

F0122 11:20:14.691357 27738 import_tensorflow.cc:2549] Check failed: status.ok() Unexpected value for attribute 'data_format'. Expected 'NHWC'
*** Check failure stack trace: ***
    @ 0x5557b0ac3e78 base_logging::LogMessage::SendToLog()
    @ 0x5557b0ac46c2 base_logging::LogMessage::Flush()
    @ 0x5557b0ac6665 base_logging::LogMessageFatal::~LogMessageFatal()
    @ 0x5557af51e22b toco::ImportTensorFlowGraphDef()
    @ 0x5557af51f60c toco::ImportTensorFlowGraphDef()
    (...)
    @ 0x5557af4ac679 main
    @ 0x7f7fa2057bbd __libc_start_main
    @ 0x5557af4ac369 _start
*** SIGABRT received by PID 27738 (TID 27738) from PID 27738; ***
F0122 11:20:14.691357 27738 import_tensorflow.cc:2549] Check failed: status.ok() Unexpected value for attribute 'data_format'. Expected 'NHWC'
E0122 11:20:14.881460 27738 process_state.cc:689] RAW: Raising signal 6 with default behavior
Aborted

17 of 20

Improved User Experience

node "MobilenetV1/MobilenetV1/Conv2d_0/Conv2D" defined
  at 'convolution2d'(tensorflow/contrib/layers/python/layers/layers.py:1156):
      conv_dims=2)
  at 'func_with_args'(tensorflow/contrib/framework/python/ops/arg_scope.py:182):
      return func(*args, **current_args)
  at 'mobilenet_base'(tensorflow_models/slim/nets/mobilenet/mobilenet.py:278):
      net = opdef.op(net, **params)
  ...
  at 'network_fn'(resnet/nets_factory.py:93):
      return func(images, num_classes, is_training=is_training, **kwargs)
  at 'build_model'(resnet/train_experiment.py:165):
      inputs, depth_multiplier=FLAGS.depth_multiplier)
  ...

error: 'tf.Conv2D' op requires data_format attribute to be either 'NHWC' or 'NCHW'

This output is also evolving (a caret pointing at the error, à la Clang).

18 of 20

For the Web?

19 of 20

Some facts from MLIR investigations

  • Operator expansion is about 25% YoY for TensorFlow
  • Hardware vendors will implement dialects
  • Open governance

20 of 20

MLIR dialect on the web

  • No backwards-compatibility guarantees today from MLIR
    • A backwards-compatible dialect could be invented
    • What would maintaining this look like?
  • Web source maps => Python code
  • Immediately tells you whether Python code will execute in the browser