1 of 26

API Wrangling with ANGLE

Shannon Woods, Google

shannonwoods@chromium.org

2 of 26

What’s in this talk

  • What ANGLE is, why it came to be, who it serves, and where it’s going in the future
  • How it all gets wrangled: technical issues arising from API differences, and how we solved them
  • What this tells us about API design in general

3 of 26

ANGLE

(the overview)

4 of 26

ANGLE: What is it?

ANGLE translates OpenGL ES to system-native APIs

    • Direct3D 9 and Direct3D 11
    • Desktop OpenGL
    • Native OpenGL ES

Certified conformant OpenGL ES 2.0 implementation

OpenGL ES 3.0 implementation nearing completion

Shader validation & translation

5 of 26

ANGLE: Why is it?

WebGL: write once, run everywhere

    • Single API target for all devices
    • Mobile vs. desktop
    • Platform vs. platform

OpenGL ES - a subset of desktop OpenGL

    • Present on (most) mobile phones
    • But not on desktop...

6 of 26

Supporting OpenGL ES on the desktop

OpenGL ES - a subset of desktop OpenGL?

    • Not quite, but close
    • Framebuffer objects shared in desktop, but not ES
    • Binary/precompiled shader support in ES 3.0
    • ETC2/EAC texture compression
    • GL_MAX_ELEMENT_INDEX not in desktop until GL 4.3
    • EGL...

Great! We’ll translate ES to OpenGL…

    • Linux: No problem!
    • Mac OS X: Compatibility vs. Core - differences, but it works
    • Windows...

7 of 26

The OpenGL ES on Windows situation

In the year 2010...

OpenGL drivers exist for Windows!

But…

    • Not all are conformant
    • Not all are stable
    • Users may need to install drivers themselves

Ok, so we need to translate OpenGL ES to D3D...

8 of 26

What ANGLE looks like today

Stable: OpenGL ES 2.0 to Direct3D 9 and Direct3D 11

Complete but new: OpenGL ES 2.0 and 3.0 to OpenGL for Windows

Nearing completion: OpenGL ES 2.0 and 3.0 to OpenGL for Linux and Mac OS X

Future: OpenGL ES 2.0 & 3.0 shim & validation on native OpenGL ES

9 of 26

Where ANGLE is used

To support browser compositing

    • Chrome on Windows

As a WebGL rendering back-end

    • Chrome, Opera, and Firefox on Windows

Shader validation & translation

    • Chrome, Opera, Firefox, and Safari on Windows, Linux, Mac OS, and mobile

Portability to Windows mobile

    • Qt user interface/app development framework
    • Windows Store app development templates from Microsoft

Portability to Xbox One

    • Games, other apps

10 of 26

Wrangling

(the gritty details)

11 of 26

GL and D3D - What’s the difference anyway?

Coordinate Systems

    • Handedness doesn’t actually differ!
    • But… window origin and viewport transform do
        - Y axis is inverted on viewport transform, and window origin is upper left
        - This spells trouble for render-to-texture!
    • D3D 9 pixel centers differ from GL… and D3D 11

ANGLE’s solution

    • Option 1: Flip Y in vertex shader and before present, reverse winding order
    • Option 2: Invert texture sampling, textures on upload, cube map Y-axis
    • Pixel centers: Simple offset to frag coord

[Figure: x, y, z coordinate axes]
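The effect of Option 1 and of the pixel-center offset can be sketched on the CPU. This is only an illustration, not ANGLE's shader code: the `Vec2` type and all function names are hypothetical, and the fragment-coordinate helper assumes the D3D 9 side reports integer pixel positions with an upper-left origin while GL expects half-integer centers with a lower-left origin.

```cpp
struct Vec2 { float x, y; };

// Mirror a point across the X axis, as a Y flip in the vertex shader would.
inline Vec2 flipY(Vec2 p) { return Vec2{p.x, -p.y}; }

// Signed area of a 2D triangle: positive for counter-clockwise order,
// negative for clockwise order.
inline float signedArea(Vec2 a, Vec2 b, Vec2 c) {
    return 0.5f * ((b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y));
}

// Assumption: input is an integer pixel position with an upper-left origin;
// output is a GL-style fragment coordinate (lower-left origin, centers at .5).
inline Vec2 toGLFragCoord(Vec2 pixel, float targetHeight) {
    return Vec2{pixel.x + 0.5f, targetHeight - pixel.y - 0.5f};
}
```

Flipping Y changes the sign of `signedArea`, which is why the winding order has to be reversed whenever the flip is applied.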

12 of 26

GL and D3D - Vertex data differences

Vertex & Index buffers different resource types in D3D 9, but unified in D3D 11 & GL

    • In D3D 9, this required delaying creation of resources until draw time

D3D in general provides support for fewer data types than GL

    • Type skew leads to deinterleaving data, punishing well-behaved apps
    • Again, a bigger problem for D3D 9 than 11, but unorms are still problematic

Primitive types - GL supports a greater set of primitive types

    • Makes developer lives easier, but driver code more complex
    • GL_LINE_LOOP not so bad
    • GL_TRIANGLE_FAN requires index buffer rewriting
    • GL_POINTS are expanded to quads in a geometry shader
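The index buffer rewriting for GL_TRIANGLE_FAN and GL_LINE_LOOP can be sketched as below. This is a minimal illustration under the assumption that the target API only accepts triangle lists and line strips; the function names are hypothetical and not taken from ANGLE.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a triangle fan (hub vertex first) into an equivalent triangle list.
std::vector<uint32_t> fanToList(const std::vector<uint32_t>& fan) {
    std::vector<uint32_t> list;
    if (fan.size() < 3) return list;  // not enough vertices for one triangle
    list.reserve((fan.size() - 2) * 3);
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        list.push_back(fan[0]);      // shared hub vertex
        list.push_back(fan[i]);
        list.push_back(fan[i + 1]);
    }
    return list;
}

// Turn a line loop into a line strip by repeating the first index at the end.
std::vector<uint32_t> loopToStrip(const std::vector<uint32_t>& loop) {
    std::vector<uint32_t> strip(loop);
    if (loop.size() >= 2) strip.push_back(loop.front());
    return strip;
}
```

A fan of n vertices becomes 3(n - 2) indices, which is why fans cost more to translate than loops, where only one index is appended.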

13 of 26

GL and D3D - Flat shading & provoking vertex

ES 3.0 and D3D 11 both support flat shading…

    • But the provoking vertex convention differs
    • Desktop GL has glProvokingVertex() to control

ANGLE uses geometry shaders to adjust vertex order

    • Must now generate geometry shaders dynamically at draw time
    • Has interactions with other features
        - Transform feedback
        - Primitive restart

[Figure panels: GL, D3D]
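To illustrate what "adjusting vertex order" means, here is a CPU-side sketch for a plain triangle list. It is not ANGLE's geometry-shader approach: it only shows the reordering idea, assuming the source convention takes the flat-shaded attribute from the last vertex of each triangle and the target takes it from the first. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rotate each triangle (a, b, c) to (c, a, b). The rotation is cyclic, so the
// winding order is preserved while the last vertex moves to the first slot.
std::vector<uint32_t> lastToFirstProvoking(const std::vector<uint32_t>& tris) {
    std::vector<uint32_t> out;
    out.reserve(tris.size());
    for (std::size_t i = 0; i + 2 < tris.size(); i += 3) {
        out.push_back(tris[i + 2]);
        out.push_back(tris[i]);
        out.push_back(tris[i + 1]);
    }
    return out;
}
```

Strips and fans share vertices between primitives, so a simple rotation like this does not carry over to them, which hints at why a per-primitive stage is needed in the general case.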

14 of 26

GL and D3D - Primitive restart

ES 3.0 allows primitive restart to be toggled on/off; D3D considers it always-on

    • D3D uses (2^N - 1) index value as strip cut index for N-bit index type
    • This is a valid index in GL if primitive restart is off, and real-world data uses it

Originally handled in ES 2.0 via type promotion

    • UNSIGNED_BYTE→UNSIGNED_SHORT (happens anyway)
    • UNSIGNED_SHORT→UNSIGNED_INT
    • UNSIGNED_INT→ …uh-oh

Now we restrict GL_MAX_ELEMENT_INDEX to MAX_UINT - 1

    • Rewrite smaller types when restart is enabled
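The strip cut value and the promotion of 16-bit indices can be sketched as follows. This is an illustrative sketch with hypothetical function names, not ANGLE's code; it assumes the target always treats the all-ones value of the index type as the strip cut.

```cpp
#include <cstdint>
#include <vector>

// All-ones value for an N-bit index type: 2^N - 1.
constexpr uint32_t stripCutIndex(unsigned bits) {
    return bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

// Promote 16-bit indices to 32-bit. With restart disabled, 0xFFFF stays an
// ordinary vertex index. With restart enabled, it is rewritten to the 32-bit
// strip cut value so it still restarts the primitive.
std::vector<uint32_t> promoteIndices(const std::vector<uint16_t>& in,
                                     bool restartEnabled) {
    std::vector<uint32_t> out;
    out.reserve(in.size());
    for (uint16_t index : in) {
        if (restartEnabled && index == stripCutIndex(16)) {
            out.push_back(stripCutIndex(32));
        } else {
            out.push_back(index);
        }
    }
    return out;
}
```

Nothing wider than 32 bits exists to promote into, so the only way to keep 0xFFFFFFFF from being used as a real vertex index is to cap the maximum index one below it.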

15 of 26

GL and D3D - Textures

GL has a larger number of supported formats here, too

    • Unsupported formats handled by conversion in ANGLE, at a performance cost
    • e.g.: GL_LUMINANCE → GL_RGBA (L, L, L, 1)

D3D requires the application to specify dimensions and format at texture creation

    • GL’s immutable textures behave similarly
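A format conversion of this kind can be sketched as a simple expansion on upload. This is a minimal sketch with a hypothetical function name, not ANGLE's code. The alpha value is a parameter here; it defaults to 255 because GL defines a sampled luminance texel as (L, L, L, 1).

```cpp
#include <cstdint>
#include <vector>

// Expand 8-bit luminance texels into tightly packed 8-bit RGBA texels.
std::vector<uint8_t> luminanceToRGBA(const std::vector<uint8_t>& luminance,
                                     uint8_t alpha = 255) {
    std::vector<uint8_t> rgba;
    rgba.reserve(luminance.size() * 4);
    for (uint8_t l : luminance) {
        rgba.push_back(l);      // R
        rgba.push_back(l);      // G
        rgba.push_back(l);      // B
        rgba.push_back(alpha);  // A
    }
    return rgba;
}
```

The output is four times the size of the input, which is where the performance and memory cost of the conversion comes from.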

16 of 26

GL and D3D - Texture Swizzle

ES 3.0 introduces blanket swizzle parameters for textures

    • App can, e.g., swap R and G channels every sample, without altering shader
    • Useful for using GL_RED in place of luminance or alpha-only textures

This doesn’t exist at all in D3D!

    • D3D developers instead do channel swizzling in the shader
    • Feature removed from WebGL 2 because of this

ANGLE’s approach

    • Option 1: Pre-swizzle data when uploaded or swizzle changed
    • Option 2: Swizzle in the shader… more complicated than it sounds!
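Option 1 can be sketched as a per-texel remap applied when data is uploaded or the swizzle state changes. The `Swizzle` enum, `Pixel` struct, and function names below are hypothetical; the six selector values mirror the choices GL allows for each channel (red, green, blue, alpha, zero, one).

```cpp
#include <array>
#include <cstdint>

enum class Swizzle { Red, Green, Blue, Alpha, Zero, One };

struct Pixel { uint8_t r, g, b, a; };

// Pick the source value that one output channel should read.
inline uint8_t selectChannel(const Pixel& p, Swizzle s) {
    switch (s) {
        case Swizzle::Red:   return p.r;
        case Swizzle::Green: return p.g;
        case Swizzle::Blue:  return p.b;
        case Swizzle::Alpha: return p.a;
        case Swizzle::Zero:  return 0;
        case Swizzle::One:   return 255;
    }
    return 0;  // unreachable for valid enum values
}

// Apply swizzle selectors for the (R, G, B, A) outputs to one texel.
inline Pixel applySwizzle(const Pixel& p, const std::array<Swizzle, 4>& s) {
    return Pixel{selectChannel(p, s[0]), selectChannel(p, s[1]),
                 selectChannel(p, s[2]), selectChannel(p, s[3])};
}
```

For example, selectors (Red, Red, Red, One) make a red-only texel read back like a luminance texel. The cost of this approach is that the whole texture has to be rewritten every time the swizzle state changes.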

17 of 26

GL and D3D - Framebuffers

Interaction of write masks and clear commands is not the same between APIs

    • D3D’s clears ignore masks; ANGLE must use draws instead
    • Multiple draw buffers/render targets compound performance impact

Blit in WebGL constrained by D3D 9’s StretchRect() limits

    • No resizing, flipping, filtering, or format conversion
    • D3D 11 doesn’t have these restrictions, but WebGL still does

18 of 26

GL and D3D - Shading language

Row-major vs. column-major: well-known difference, but little impact for ANGLE

Compilation

    • D3D designed for ahead-of-time shader compilation
    • GL more friendly toward runtime compilation; in ES 2.0, glProgramBinary() is by extension only
    • D3D 9’s shader compiler slow, does not handle complex flow control well

Subtle behavior differences

    • GLSL short-circuits conditionals, D3D always evaluates both

    if ((r = 1.0) == 1.0 || (r = 0.0) == 0.0)
        out = vec4(r, 0.0, 0.0, 1.0);
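The difference can be reproduced in C++, whose `||` also short-circuits. The sketch below uses hypothetical function names and compares the short-circuit result, an emulation of evaluating both operands, and one possible rewrite into an explicit branch that preserves the short-circuit behavior. It is an illustration of the idea, not necessarily what ANGLE emits.

```cpp
// Short-circuit evaluation: the right operand never runs, so r stays 1.0.
inline float shortCircuitResult() {
    float r = -1.0f;
    bool cond = (r = 1.0f) == 1.0f || (r = 0.0f) == 0.0f;
    return cond ? r : -1.0f;
}

// Emulates a target that evaluates both operands: the second assignment
// always runs and r ends up as 0.0.
inline float evaluateBothResult() {
    float r = -1.0f;
    bool left = (r = 1.0f) == 1.0f;
    bool right = (r = 0.0f) == 0.0f;
    return (left || right) ? r : -1.0f;
}

// Rewrite with an explicit branch: the right operand, including its side
// effect, is only evaluated when the left operand is false.
inline float rewrittenResult() {
    float r = -1.0f;
    bool cond = (r = 1.0f) == 1.0f;
    if (!cond) {
        cond = (r = 0.0f) == 0.0f;
    }
    return cond ? r : -1.0f;
}
```

The first and third functions agree, while the second differs, so any operand with side effects has to be hoisted into control flow when translating.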

19 of 26

API Design

(in which we get philosophical)

20 of 26

API design philosophy

OpenGL, OpenGL ES

    • Maintained by standards body
    • Extension mechanism - often how features reach the core API
    • Backwards compatibility

Direct3D

    • Maintained by single entity
    • Features arrive via new API revision
    • Sometimes backwards compatible, mechanism may change

Different API design approaches produce different API surfaces

21 of 26

So the API isn’t just window dressing?

D3D and GL are solving mostly the same problems, in mostly the same way

...but as ANGLE demonstrates, subtle differences can have surprising impact

    • When parameters must be specified may impact whether objects are created ahead of time or at draw time
    • Information specified API-side may alter shader behavior, and trigger expensive recompilation
    • Data may need to be shadowed, copied, and retrieved, impacting storage and performance

That’s a problem unique to ANGLE, right?

Unfortunately not.

22 of 26

From WebGL to the hardware

Chrome’s GPU process validates WebGL calls, tracks state needed for validation, and stores copies of some objects.

ANGLE validates OpenGL ES calls, tracks all GL state, stores copies of most objects and shaders at least temporarily, and performs conversions

The D3D driver validates D3D calls, tracks D3D state, stores copies of objects & shaders, and reorganizes data for consumption by hardware

The hardware receives & stores data, and finally renders

[Diagram: Chrome GPU Process → ANGLE → D3D Driver → GPU]

23 of 26

API to hardware translation

Different vendors make different hardware, different architectures within vendors

    • No portable API will be a perfect match for the underlying hardware

Some drivers change behavior based on what an app is doing

Driver may need to...

    • Store copies of data
    • Delay upload to GPU
    • Delay command buffer dispatches & flushes
    • Patch in-flight command buffers
    • Recompile shaders

24 of 26

Next-generation APIs

Several APIs trying to solve a similar problem: Metal, D3D 12, Vulkan

    • All three aim to reduce overhead by lowering abstraction in the API, complexity in the driver
    • All three place emphasis on application control of synchronization, object creation
    • All three describe themselves as low-level, explicit, or close to the hardware

Eliminate hidden work

    • Validation: eliminate or make explicit
    • Separate GPU work from CPU work
    • Some hardware has special needs, e.g. low memory bandwidth for tilers

25 of 26

Less work in the driver, more work in the app

Things that used to be the driver’s responsibility need to be done in the application

    • Understanding underlying hardware becomes more important for app devs

Important concepts to understand

    • Synchronization control/multithreading
    • Memory allocation
    • CPU-GPU work submission & data transfer

26 of 26

Advice & Questions