1 of 20

MLIR

2 of 20

I am not an expert!

  • MLIR: Multi-Level Intermediate Representation for Compiler Infrastructure https://youtu.be/qzljG6DKgic
  • Slides: https://bit.ly/2kJHErx

3 of 20

Today

4 of 20

The gap leading to domain-specific IRs elsewhere

  • Domain-specific optimizations, progressive lowering

[Diagram: language frontends with their own domain-specific IRs, progressively lowering toward machine code]

  • C, C++, ObjC, CUDA, OpenCL, ... → Clang AST → (CIL IR) → LLVM IR
  • Java & JVM Languages → Java BC
  • Swift → Swift AST → SIL IR → LLVM IR
  • Rust → Rust AST → MIR IR → LLVM IR
  • Julia → Julia AST → Julia IR → LLVM IR
  • TensorFlow Ecosystem → TF Graph → XLA HLO → LLVM IR
  • LLVM IR → Machine IR → Asm

5 of 20

The TensorFlow compiler ecosystem

Many “Graph” IRs, each with challenges:

  • Similar-but-different proprietary technologies: not going away anytime soon
  • Fragile, poor UI when failures happen: e.g. poor/no location info, or even crashes
  • Duplication of infrastructure at all levels

[Diagram: many "Graph" IRs fanning out from the TensorFlow Graph]

  • TensorFlow Graph → Grappler
  • TensorFlow Graph → XLA HLO → TPU IR, LLVM IR, several others
  • TensorFlow Graph → TensorFlow Lite → NNAPI, Core ML, many others
  • TensorFlow Graph → TensorRT, nGraph, ...

6 of 20

Domain Specific IRs

Great!

  • High-level domain-specific optimizations
  • Progressive lowering encourages reuse between levels

Not great!

  • Huge expense to build this infrastructure
  • Reimplementation of all the same stuff:
    • pass managers, location tracking, use-def chains, inlining, constant folding, CSE, testing tools, …
  • Innovations in one community don’t benefit the others

7 of 20

What is MLIR?

8 of 20

What is MLIR?

  • TensorFlow
    • "An open source machine learning framework for everyone"
  • Multi-Level Intermediate Representation
    • "An open source program optimization framework for everyone"
  • Abstraction Building Toolkit
  • Reusable set of compiler passes for higher abstractions
    • Targeting analysis/program optimization/code generation
  • Open governance and part of LLVM

9 of 20

10 of 20

Extensible Operations Allow Multi-Level IR

TensorFlow

XLA HLO

LLVM IR

Also: TF-Lite, Core ML, other frontends, etc ...

%x = "tf.Conv2d"(%input, %filter)
     {strides: [1,1,2,1], padding: "SAME", dilations: [2,1,1,1]}
     : (tensor<*xf32>, tensor<*xf32>) -> tensor<*xf32>

%m = "xla.AllToAll"(%z)
     {split_dimension: 1, concat_dimension: 0, split_count: 2}
     : (memref<300x200x32xf32>) -> memref<600x100x32xf32>

%f = llvm.add %a, %b : !llvm.float

(Progressive lowering between the levels.)

Don't we end up with the JSON of compiler IRs?

11 of 20

MLIR “Dialects”: Families of defined operations

Example Dialects:

  • TensorFlow, LLVM IR, XLA HLO, TF Lite, Swift SIL…

Dialects can define:

  • Sets of defined operations
  • Entirely custom type system
  • Customization hooks
  • Constant folding, decoding

Operation can define:

  • Invariants on # operands, results, attributes, etc
  • Custom parser, printer, verifier, …
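The dialect/operation split above can be sketched in plain Python. This is a toy model of the concepts only, not MLIR's actual C++ or Python API; all names here are illustrative:

```python
# Toy model of MLIR-style dialects: a namespace owning a set of ops,
# each with a verifier enforcing invariants on operands and attributes.

class Op:
    def __init__(self, name, operands, attrs, verifier):
        self.name, self.operands, self.attrs = name, operands, attrs
        self.verifier = verifier

    def verify(self):
        return self.verifier(self)

class Dialect:
    def __init__(self, namespace):
        self.namespace = namespace
        self.ops = {}  # op name -> verifier callback

    def define_op(self, name, verifier):
        self.ops[name] = verifier

    def create(self, name, operands=(), attrs=None):
        return Op(f"{self.namespace}.{name}", list(operands),
                  attrs or {}, self.ops[name])

# A hypothetical "tf" dialect whose Conv2d requires exactly two operands
# and a 'padding' attribute -- the kind of invariant an op verifier checks.
tf = Dialect("tf")
tf.define_op("Conv2d",
             lambda op: len(op.operands) == 2 and "padding" in op.attrs)

good = tf.create("Conv2d", ["%input", "%filter"], {"padding": "SAME"})
bad = tf.create("Conv2d", ["%input"])
print(good.verify(), bad.verify())  # True False
```

In real MLIR, ops are typically declared in TableGen (ODS) and verifiers run automatically after each pass; the sketch only shows the shape of the idea.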

12 of 20

MLIR Type System - some examples

Scalars:

  • f16, bf16, f32, … i1, i8, i16, i32, … i3, i4, i7, i57, …

Vectors:

  • vector<4 x f32> vector<4x4 x f16> etc.

Tensors, including dynamic shape and rank:

  • tensor<4x4 x f32> tensor<4x?x?x17x? x f32> tensor<* x f32>

Others: functions, memory buffers, quantized integers, other TensorFlow stuff, ...

Extensible!!
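As a rough illustration of the shape notation (dynamic `?` dims, unranked `*`), here is a toy parser for these type strings. It is not MLIR's actual implementation, just a sketch of what the syntax encodes:

```python
# Toy parser for MLIR-style tensor types, e.g. tensor<4x?x?x17x? x f32>
# or the unranked form tensor<* x f32>. Illustrative only.

def parse_tensor_type(s):
    s = s.replace(" ", "")  # the slide notation allows spaces around 'x'
    assert s.startswith("tensor<") and s.endswith(">")
    body = s[len("tensor<"):-1]
    dims_part, _, elem = body.rpartition("x")
    if dims_part == "*":
        return {"shape": None, "elem": elem}  # unranked: rank unknown too
    # '?' marks a dynamic dimension: size unknown until runtime
    dims = [None if d == "?" else int(d) for d in dims_part.split("x")]
    return {"shape": dims, "elem": elem}

print(parse_tensor_type("tensor<4x4 x f32>"))        # {'shape': [4, 4], 'elem': 'f32'}
print(parse_tensor_type("tensor<4x?x?x17x? x f32>")) # dynamic dims become None
print(parse_tensor_type("tensor<* x f32>"))          # unranked: shape is None
```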

13 of 20

Applications

14 of 20

TensorFlow Lite Converter

  • TensorFlow to TensorFlow Lite converter
    • Two different graph representations
      • Different set of ops & types
    • Different constraints/targets
  • Overlapping goals with regular compilation
    • Edge devices also can have accelerators (or a multitude of them!)
  • MLIR's pluggable type & rewrite system simplifies transforms & improves expressibility
    • Quantized types are a first-class citizen in the dialect

[Pipeline: TF Graph → translate → tf.* → legalize → tfl.* → optimize → tfl.* → translate → TFLite flatbuffer]
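The legalize step in the pipeline above can be sketched as a table of rewrite patterns. The real converter uses MLIR's pattern-rewrite infrastructure; the op-name mappings below are illustrative only:

```python
# Toy "legalize" pass: rewrite tf.* ops into tfl.* ops via a pattern table,
# collecting failures instead of crashing. Real TFLite conversion uses
# MLIR's DialectConversion framework; these mappings are illustrative.

PATTERNS = {
    "tf.Conv2d": "tfl.conv_2d",
    "tf.AvgPool": "tfl.average_pool_2d",
}

def legalize(ops):
    out, failures = [], []
    for op in ops:
        if op in PATTERNS:
            out.append(PATTERNS[op])
        else:
            failures.append(op)  # report *which* op blocked conversion
    return out, failures

legal, failed = legalize(["tf.Conv2d", "tf.SomeCustomOp", "tf.AvgPool"])
print(legal)   # ['tfl.conv_2d', 'tfl.average_pool_2d']
print(failed)  # ['tf.SomeCustomOp'] -- surfaced to the user, not a crash
```

Note how unsupported ops are collected and reported rather than aborting, which is exactly the usability point on the next slide.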

15 of 20

One of the focuses: Usability

  • Usability of TOCO is the top complaint among TFLite users
    • CHECK'ing on errors, unsupported cases, confusing error messages
  • Debugging
    • Location back to your source!
      • Build on and extend TF debug info work
    • Tracking origin of instructions (e.g., origin of FMA from multiply and add is FusedLoc)
  • Report why a model failed to convert
    • Point to the unsupported ops/types/shapes
      • "Why does this model have unsupported X?"
  • Dialect types enable more checking & better reporting
    • Dialects can define types, higher-order types (such as quantized types) enable better invariant checks

16 of 20

Old “TOCO” User Experience

F0122 11:20:14.691357 27738 import_tensorflow.cc:2549] Check failed: status.ok() Unexpected value for attribute 'data_format'. Expected 'NHWC'
*** Check failure stack trace: ***
    @ 0x5557b0ac3e78 base_logging::LogMessage::SendToLog()
    @ 0x5557b0ac46c2 base_logging::LogMessage::Flush()
    @ 0x5557b0ac6665 base_logging::LogMessageFatal::~LogMessageFatal()
    @ 0x5557af51e22b toco::ImportTensorFlowGraphDef()
    @ 0x5557af51f60c toco::ImportTensorFlowGraphDef()
    (...)
    @ 0x5557af4ac679 main
    @ 0x7f7fa2057bbd __libc_start_main
    @ 0x5557af4ac369 _start
*** SIGABRT received by PID 27738 (TID 27738) from PID 27738; ***
F0122 11:20:14.691357 27738 import_tensorflow.cc:2549] Check failed: status.ok() Unexpected value for attribute 'data_format'. Expected 'NHWC'
E0122 11:20:14.881460 27738 process_state.cc:689] RAW: Raising signal 6 with default behavior
Aborted

17 of 20

Improved User Experience

node "MobilenetV1/MobilenetV1/Conv2d_0/Conv2D" defined
  at 'convolution2d'(tensorflow/contrib/layers/python/layers/layers.py:1156):
      conv_dims=2)
  at 'func_with_args'(tensorflow/contrib/framework/python/ops/arg_scope.py:182):
      return func(*args, **current_args)
  at 'mobilenet_base'(tensorflow_models/slim/nets/mobilenet/mobilenet.py:278):
      net = opdef.op(net, **params)
  ...
  at 'network_fn'(resnet/nets_factory.py:93):
      return func(images, num_classes, is_training=is_training, **kwargs)
  at 'build_model'(resnet/train_experiment.py:165):
      inputs, depth_multiplier=FLAGS.depth_multiplier)
  ...

error: 'tf.Conv2D' op requires data_format attribute to be either 'NHWC' or 'NCHW'

This output is also evolving (a caret pointing at the error, à la Clang).

18 of 20

For the Web?

19 of 20

Some facts from MLIR investigations

  • Operator expansion is about 25% YoY for TensorFlow
  • Hardware vendors will implement dialects
  • Open governance

20 of 20

MLIR dialect on the web

  • No backwards-compatibility guarantees today from MLIR
    • A backwards-compatible dialect could be invented
    • What would maintaining this look like?
  • Web source maps => Python code
  • Immediately tells you whether Python code will execute in the browser