lecture 11

ARKit and CoreML

cs198-001 : Spring 2019

  • Past halfway mark for Final Projects!
    • Your app should be partially working now!
    • If you need help, please please please ask your TA for support! That’s what we’re here for :)

  • Sampler Lab this week will likely focus on CoreML, but may be delayed by a couple days


  • Introduced by Apple at WWDC 2017
    • Revolutionary shift in traditional developer model

  • ARKit: Accessible augmented reality in your app through camera and motion features

  • CoreML: Comparatively easy machine learning model integration

ARKit and CoreML


  • ARKit: Accessible augmented reality in your app through camera and motion features
    • Visual Inertial Odometry (motion sensors + cameras)
    • Plane detection
    • Lighting estimation
    • Rudimentary facial detection and tracking
    • [new!] Object scanning and detection

What is ARKit?

  • SceneKit: Apple’s way of managing 3D objects and spaces! Similar to SpriteKit (2D), except with one more dimension
    • Also handles interfacing with ARKit
    • Can be used with SpriteKit for 2D objects
    • Building blocks: Coordinates, Position, Scale, and Rotation

ARKit ❤ SceneKit

  • Coordinates: X is left-right, Y is up-down, Z is forward-backward (negative Z points away from the camera)
    • The world origin (0,0,0) is where the camera was when the session started - don’t place things there!
    • Measurement is in meters


  • Position: SCNVector3(x,y,z)
    • z=-0.5 means “place object ½ meter in front of where the user started” (the camera initially looks down the negative Z axis)

  • Scale: SCNVector3(x,y,z)
    • z=0.5 means “halve the object’s size along the Z axis”

  • Rotation: SCNVector4(x,y,z,w)
    • x, y, and z correspond to the axis the object should be rotated around
    • w is the angle in radians
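
The three building blocks above can be sketched on a single node as follows. This is a minimal sketch, assuming an iOS target (where SCNVector3 and SCNVector4 components are Float); the helper name makeTransformedBox is made up for illustration.

```swift
import SceneKit

// Hypothetical helper: builds a small box and sets position, scale, and rotation.
func makeTransformedBox() -> SCNNode {
    // A 10 cm cube (SceneKit units are meters in ARKit).
    let box = SCNBox(width: 0.1, height: 0.1, length: 0.1, chamferRadius: 0)
    let node = SCNNode(geometry: box)

    // Position: half a meter along negative z, which is "in front" of the
    // camera's starting pose under ARKit's default world alignment.
    node.position = SCNVector3(0, 0, -0.5)

    // Scale: keep x and y, halve the size along z.
    node.scale = SCNVector3(1, 1, 0.5)

    // Rotation: axis (0, 1, 0) with w = pi/4 radians, i.e. 45 degrees around y.
    node.rotation = SCNVector4(0, 1, 0, Float.pi / 4)

    return node
}
```

The returned node still has to be added to a scene's root node before it shows up.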


  • Most of the time, you should start with the Augmented Reality template in Xcode.
  • However, if you want to do it yourself, follow these steps:
    • Add an ARSCNView, and give the app camera permission (an NSCameraUsageDescription entry in Info.plist)
    • Initialize an ARWorldTrackingConfiguration to track planes, feature points, and rotation
    • Add objects by creating SCNNodes, attaching geometry or models to that node, and adding the node as a child of the current scene’s root node


  • Add an ARSCNView, and give the app camera permission (an NSCameraUsageDescription entry in Info.plist)

override func viewDidLoad() {
    super.viewDidLoad()

    sceneView.delegate = self // Set the view's delegate

    let scene = SCNScene() // Create a new scene

    sceneView.scene = scene // Set the scene to the view
}


ARKit - Managing Sessions

  • Initialize an ARWorldTrackingConfiguration to track planes, feature points, and rotation

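override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)

    let configuration = ARWorldTrackingConfiguration()

    sceneView.session.run(configuration) // Start (or resume) the AR session
}

override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)

    sceneView.session.pause() // Pause the AR session while the view is hidden
}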




ARKit - Managing Sessions
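
Plane detection is off by default in a world tracking configuration, so it has to be requested explicitly. Here is a minimal sketch, assuming a view controller that owns its ARSCNView; the class name PlaneViewController is hypothetical, and the delegate callback only prints what it finds.

```swift
import ARKit
import UIKit

// Hypothetical view controller that turns on horizontal plane detection.
final class PlaneViewController: UIViewController, ARSCNViewDelegate {
    let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.frame = view.bounds
        sceneView.delegate = self
        view.addSubview(sceneView)
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = .horizontal // Ask ARKit to look for floors and tables
        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        sceneView.session.pause()
    }

    // Called when ARKit adds a node for a newly detected anchor.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let planeAnchor = anchor as? ARPlaneAnchor else { return }
        print("Detected a plane centered at \(planeAnchor.center)")
    }
}
```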

  • Use SCNNodes to add objects
    • Create SCNNode
    • Create geometry (SCN___) to add to node
    • Set node position (and/or rotation/scale if necessary)
    • Add created node as child node of sceneView’s root node

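let sphere = SCNSphere(radius: 0.05) // Geometry: a sphere with a 5 cm radius

let node = SCNNode() // Empty node to hold the geometry

node.geometry = sphere

node.position = SCNVector3(x, y, z) // x, y, z are placeholders, in meters

sceneView.scene.rootNode.addChildNode(node) // Add as child of the root node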


ARKit - Working With Objects

  • For complex objects:
    • Safely load scene from file
    • Create a new node
    • Add each of the children nodes of the loaded scene as children nodes of the new node we created
    • Set position (scale/rotation if applicable) of new node
    • Add created node as child node of sceneView’s root node

  • To delete objects:
    • Remove the child node from the root node of sceneView
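
The complex-object steps above can be sketched as one helper. This is a minimal sketch, not the only way to do it; the function name loadModel and the asset path in the usage comment are assumptions for illustration.

```swift
import SceneKit

// Hypothetical helper: wraps the contents of a bundled scene file in one node.
func loadModel(named sceneName: String, at position: SCNVector3) -> SCNNode? {
    // "Safely load": SCNScene(named:) returns nil when the file cannot be found.
    guard let loadedScene = SCNScene(named: sceneName) else { return nil }

    // Move every top-level node of the loaded scene under a new wrapper node.
    let wrapper = SCNNode()
    for child in loadedScene.rootNode.childNodes {
        wrapper.addChildNode(child)
    }

    wrapper.position = position
    return wrapper
}

// Usage inside a view controller that owns `sceneView` (asset path is assumed):
// if let ship = loadModel(named: "art.scnassets/ship.scn", at: SCNVector3(0, 0, -1)) {
//     sceneView.scene.rootNode.addChildNode(ship) // Add
//     ship.removeFromParentNode()                 // Delete
// }
```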

See documentation for more details!

ARKit - Working With Objects

  • Add a gesture recognizer to sceneView

  • Add this extension to transformation matrix class
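
The code for this slide did not survive, so here is a minimal sketch of both pieces under stated assumptions: the extension adds a hypothetical translation property to simd_float4x4 (the type ARKit uses for transforms), and the hypothetical TapViewController wires a tap recognizer to its ARSCNView.

```swift
import ARKit
import UIKit

extension simd_float4x4 {
    /// Translation (x, y, z) stored in the fourth column of a transform matrix.
    var translation: SIMD3<Float> {
        let t = columns.3
        return SIMD3<Float>(t.x, t.y, t.z)
    }
}

// Hypothetical view controller; only the gesture wiring matters here.
final class TapViewController: UIViewController {
    let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.frame = view.bounds
        view.addSubview(sceneView)

        // Route taps on the AR view to didTap(_:).
        let tap = UITapGestureRecognizer(target: self, action: #selector(didTap(_:)))
        sceneView.addGestureRecognizer(tap)
    }

    @objc func didTap(_ recognizer: UITapGestureRecognizer) {
        // Location of the tap in the view's 2D coordinate space.
        let point = recognizer.location(in: sceneView)
        print("Tapped at \(point)")
    }
}
```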

ARKit - Interaction

  • Now we can check if we tapped on a feature point, and add a node if we did

  • Or we can check if we tapped a node that already exists and remove it.
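
Both checks can be sketched in one hypothetical helper, handleTap(at:in:), called with the tap location in the view's coordinates. It assumes the hitTest(_:types:) API that was current when this lecture was given; newer SDKs favor raycasting instead.

```swift
import ARKit

// Hypothetical helper: removes a tapped node, or adds a sphere at a tapped feature point.
func handleTap(at point: CGPoint, in sceneView: ARSCNView) {
    // 1. SceneKit hit test: did the tap land on a node that already exists?
    if let existing = sceneView.hitTest(point, options: nil).first {
        existing.node.removeFromParentNode()
        return
    }

    // 2. ARKit hit test: did the tap land on a detected feature point?
    guard let featureHit = sceneView.hitTest(point, types: .featurePoint).first else {
        return
    }

    // The fourth column of the world transform holds the hit position in meters.
    let t = featureHit.worldTransform.columns.3
    let node = SCNNode(geometry: SCNSphere(radius: 0.05))
    node.position = SCNVector3(t.x, t.y, t.z)
    sceneView.scene.rootNode.addChildNode(node)
}
```

Checking for existing nodes first means a tap on a sphere removes it rather than stacking a second sphere behind it.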

ARKit - Interaction


  • CoreML: Comparatively easy machine learning model integration
    • Lets you convert a premade ML model (trained elsewhere, typically with a Python framework) into a .mlmodel that can be used by your app!
    • Super optimized code for iOS devices with some helpful preprocessing included
    • Currently doesn’t support on-device training (although apparently that’s coming soon?)

What is CoreML?

CoreML Architecture




(GPU Stuff)

  • Vision: Allows you to do a lot of Computer Vision related things (detect facial landmarks, read text, scan barcodes…)

    • Define a VNCoreMLModel (if necessary)
    • Define a VNRequest, usually with a completion handler
    • Create a VNImageRequestHandler and have it perform the request
    • Process the returned VNObservations for whatever you need

  • Be sure to give camera permissions!

CoreML ❤ Vision

Here’s an example showcasing VNDetectTextRectanglesRequest - in this case, we don’t need a custom model.

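import UIKit
import Vision

func detectText(image: UIImage) {
    guard let cgImage = image.cgImage else { return } // Vision needs a CGImage

    // Built-in request; results are delivered to detectTextCompletionHandler
    let textRequest = VNDetectTextRectanglesRequest(completionHandler: detectTextCompletionHandler)

    let textRequestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])

    do {
        try textRequestHandler.perform([textRequest])
    } catch { print(error) }
}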


CoreML ❤ Vision

func detectTextCompletionHandler(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNTextObservation] else { return }

    // ...

    for result in results {
        // Do some stuff here
    }

    // Do more stuff here, if necessary
}


CoreML ❤ Vision

  • Apple has a bunch of popular models already converted on its Core ML developer page
    • You can also find models and code on Github or arXiv
    • Or, you can use Python with coremltools or Turi Create to convert code to .mlmodel format

  • Then, either call the generated model class directly with model.prediction(from: _), or wrap it in a VNCoreMLModel and run a VNCoreMLRequest(model: _, completionHandler: _)

CoreML - Custom Models

  • Define a VNCoreMLModel
    • Import your .mlmodel by dragging and dropping it into your project; Xcode will automatically generate a Swift class for it.

guard let model = try? VNCoreMLModel(for: VGG19().model) else { fatalError("Couldn't load model") }
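
Once the model is wrapped, running it through Vision can be sketched as below. This is a minimal sketch, assuming an image-classification model; the helper name topLabel is made up, and it takes a plain MLModel (for example VGG19().model) so that it does not depend on any particular generated class.

```swift
import CoreML
import Vision

// Hypothetical helper: returns the most confident label for an image, if any.
func topLabel(for image: CGImage, using mlModel: MLModel) throws -> String? {
    // Wrap the Core ML model so Vision can drive it.
    let visionModel = try VNCoreMLModel(for: mlModel)

    // Vision scales and crops the image to the model's input size.
    let request = VNCoreMLRequest(model: visionModel)
    request.imageCropAndScaleOption = .centerCrop

    // perform(_:) runs synchronously, so results are available right after it returns.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Classifiers produce VNClassificationObservations, sorted by confidence.
    let observations = request.results as? [VNClassificationObservation]
    return observations?.first?.identifier
}
```

Because perform(_:) blocks, a real app would likely call this off the main thread.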

CoreML - Custom Models

  • Turi Create is great for simple ML tasks - like classification, object detection, recommendations, etc.
  • coremltools is better if you’re converting a model from TensorFlow, Caffe, Keras, etc.
    • If you’re using coremltools, you need to be on Python 2.7, and install numpy, scipy, sklearn, and coremltools using pip on top of whatever ML library you’re already using (probably TensorFlow).
    • If you’re using Turi Create, you can just install turicreate (but still use Python 2.7)

CoreML - Creating Models

  • If you want to learn more about the backbone behind today’s machine learning stuff, check these resources out:
    • ML@B Blog (and their Decals!)
    • Berkeley’s CS189
    • Stanford’s CS231N (specifically, the semesters taught by Karpathy or Li)
    • Goodfellow et al.’s Deep Learning book

Further Learning
