1 of 25

ML Systems Fail, Part I:

How to Manage Mistakes at Inference

April 23rd, 2023

2 of 25

Me

  • Habeeb Shopeju
  • Research Engineer, Machine Learning
  • Thomson Reuters Labs
  • Interested in building Information Retrieval and Machine Learning Systems

3 of 25

You

  • Software Engineer
  • Product Owner/Manager
  • Data Scientist
  • AI Enthusiast

4 of 25

Disclaimer!!!

5 of 25

A Blessing & A Curse

  • Explicit logic not required
  • Tackles a wide range of inputs
  • A black box

6 of 25

A Blessing & A Curse

  • Explicit logic not required
    • It just works, but that makes it hard to make specific changes.
  • Tackles a wide range of inputs
    • Handles variety well, but that makes it hard to narrow down causes of failure.
  • A black box
    • Great at finding patterns, but that makes it hard to explain the patterns.

7 of 25

Sample Use Cases

  • Self-driving Cars
  • Document Editors
  • System Software

8 of 25

Interacting With ML Systems

  • Automate
  • Prompt
  • Annotate
  • Organize

9 of 25

Automate

  • Taking an action for the user without giving the user the ability to approve or decline.
  • Intelligence needs to be really good and the cost of mistakes needs to be small for this to pay off.
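That trade-off can be written down. As an illustrative sketch (the function, thresholds, and numbers below are assumptions for this talk, not a prescribed formula), automation pays off only when the expected cost of a mistake is smaller than the value of acting automatically:

```python
# A minimal sketch of when automation pays off: auto-apply a prediction
# only if the expected cost of a mistake stays below the value of acting.
# All numbers here are illustrative assumptions.

def should_automate(confidence: float, mistake_cost: float, action_value: float) -> bool:
    """Expected cost of being wrong must be smaller than the value gained."""
    expected_mistake_cost = (1.0 - confidence) * mistake_cost
    return expected_mistake_cost < action_value

# A confident model with cheap mistakes: automate.
print(should_automate(confidence=0.99, mistake_cost=1.0, action_value=0.5))     # True
# The same confidence with costly mistakes (think self-driving): do not.
print(should_automate(confidence=0.99, mistake_cost=1000.0, action_value=0.5))  # False
```

Note how confidence alone is not enough; the same 99% model clears the bar in one domain and fails it in the other.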

10 of 25

Prompt

  • Initiating an interaction between user and system about an action to be taken.
  • Intelligence should be fairly reliable, i.e. not necessarily as reliable as automation demands, but not utterly unreliable either.

11 of 25

Prompt

12 of 25

Annotate

  • Adding subtle information to the experience that can help the user make better decisions.
  • It is passive; the user must take action manually, and the information can even be missed entirely.

13 of 25

Annotate

14 of 25

Organize

  • Determining what information should be shown to the user and in what order.
  • This kind of interaction works best when there are a lot of options.
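Because organizing only reorders options rather than taking an action, a bad prediction degrades ranking instead of causing harm. A minimal sketch (the documents and scores are made up for illustration):

```python
# Illustrative sketch of the "organize" interaction: the model supplies a
# relevance score, and the system only decides display order with it.

def organize(items, scorer):
    """Sort items best-first by a model-provided relevance score."""
    return sorted(items, key=scorer, reverse=True)

docs = ["tax ruling", "cooking blog", "case law summary"]
relevance = {"tax ruling": 0.9, "cooking blog": 0.1, "case law summary": 0.7}
print(organize(docs, relevance.get))
# ['tax ruling', 'case law summary', 'cooking blog']
```

Even if the scores are somewhat off, every option is still reachable; the user just scrolls a little further.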

15 of 25

Organize

16 of 25

Interacting with ML Systems

  • Automate
  • Prompt
  • Annotate
  • Organize

17 of 25

Accepting the Perfect Imperfections

  • Guardrails
  • The Undo Button
  • Human in the Loop

18 of 25

Guardrails

  • Mechanisms to check that the results of ML models make sense
  • Can be Machine Learning driven or plain logic based on proxies
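A plain-logic guardrail can be as simple as checking the model's output against cheap proxies. The sketch below assumes a hypothetical invoice-extraction use case (not one named in the talk):

```python
# Sketch of a plain-logic guardrail: before trusting a model's extracted
# invoice total (a hypothetical use case), check it against cheap proxies.

def passes_guardrails(predicted_total: float, line_item_sum: float) -> bool:
    # Proxy 1: totals are never negative.
    if predicted_total < 0:
        return False
    # Proxy 2: the total should roughly match the sum of the line items
    # (5% tolerance here is an arbitrary, illustrative choice).
    return abs(predicted_total - line_item_sum) <= 0.05 * line_item_sum

print(passes_guardrails(104.0, 100.0))  # True
print(passes_guardrails(-3.0, 100.0))   # False
```

Predictions that fail the check can be dropped, flagged, or routed to a fallback instead of reaching the user.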

19 of 25

The Undo Button

  • Classic Ctrl+Z move
  • Ensures a wrong prediction doesn’t leave lasting negative consequences
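One way to sketch this (an illustrative snapshot-based design, not the only way to build undo): every ML-driven edit records the previous state, so a wrong prediction is one Ctrl+Z away.

```python
# Minimal sketch of an undo stack: every model-driven edit snapshots the
# previous state, so a wrong prediction is one Ctrl+Z away.

class UndoableDocument:
    def __init__(self, text: str):
        self.text = text
        self._history = []

    def apply_suggestion(self, new_text: str) -> None:
        self._history.append(self.text)  # snapshot before the model edit
        self.text = new_text

    def undo(self) -> None:
        if self._history:
            self.text = self._history.pop()

doc = UndoableDocument("teh cat")
doc.apply_suggestion("the cat")  # model autocorrect fires
doc.undo()                       # user rejects it
print(doc.text)  # teh cat
```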

20 of 25

Human in the Loop

  • The use of human intervention to cover for model failures
  • Stands in between the model and its use case
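A common way to put the human in between is confidence-based routing; the threshold and claim labels below are illustrative assumptions:

```python
# Sketch of human-in-the-loop routing: the model sits behind a gate, and
# low-confidence predictions go to a review queue instead of the user.
from typing import Optional

REVIEW_THRESHOLD = 0.8  # illustrative cutoff, tuned per use case in practice

def route(prediction: str, confidence: float, review_queue: list) -> Optional[str]:
    if confidence >= REVIEW_THRESHOLD:
        return prediction  # confident enough to use directly
    review_queue.append((prediction, confidence))  # a human checks it first
    return None

queue = []
print(route("approve claim", 0.95, queue))  # approve claim
print(route("deny claim", 0.55, queue))     # None
print(len(queue))                           # 1
```

The human reviewers then double as a source of fresh labeled data, which feeds back into training.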

21 of 25

Feedback

  • Explicit
  • Implicit

22 of 25

Explicit

  • The user states directly whether a prediction is correct or wrong
  • Works well with prompting and annotation
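In its simplest form, explicit feedback is just a thumbs-up/thumbs-down click stored against the prediction. A sketch (field names are illustrative):

```python
# Sketch: explicit feedback stores the user's direct verdict on a
# prediction, ready to be reused as a labeled example later.

feedback_log = []

def record_explicit(prediction_id: str, is_correct: bool) -> None:
    feedback_log.append({"id": prediction_id, "correct": is_correct})

record_explicit("suggestion-42", False)  # user clicked "wrong"
print(feedback_log[0]["correct"])  # False
```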

23 of 25

Implicit

  • The system infers what the user thinks of the prediction from their behavior
  • Works well with organization and automation
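For example (an illustrative heuristic, not a method from the talk): if an automated edit survives in the user's final text, treat that as an implicit accept; if the user reverts it, treat that as an implicit reject.

```python
# Sketch: implicit feedback infers the verdict from behavior, e.g. whether
# the user kept or reverted an automated text suggestion.

def infer_feedback(suggested: str, final_text: str) -> bool:
    """Assume the prediction was good if the suggestion survives in the final text."""
    return suggested in final_text

print(infer_feedback("the cat", "the cat sat"))  # True  (kept -> implicit accept)
print(infer_feedback("the cat", "teh cat sat"))  # False (reverted -> implicit reject)
```

Implicit signals are noisier than explicit ones, but they arrive for free on every interaction.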

24 of 25

Questions??

25 of 25

Thank You🎈🎈🎈

Up Next

ML Systems Fail, Part II: How to Manage Mistakes from Model Training