1 of 60

2 of 60

Artificial intelligence: machines replacing people?

AlphaGo

Watson

Deep Blue

3 of 60

Augmented intelligence: machines collaborating with people

Business Exec

Clinician

Designer

AI Researcher

Supervisor

Assistant

Guidelines,

questions, and hints

Partial solutions,

with uncertainty

4 of 60

5 of 60

The need for augmented intelligence

Opinions from:

  • Epidemiologists
  • Economists
  • Field workers
  • Policy advocates
  • Stakeholders

Data

6 of 60

The need for augmented intelligence

Policy advocate

"What are the comparable countries to Kenya in terms of everything we know about the malnutrition rate of infants?"

Domain expert

"Recent work in development economics suggests sanitation standards influence growth stunting in India but not in Africa."

Field researcher

"Here is new data on ~10,000 children in Bangladesh. Please update all relevant models and inform stakeholders."

Statistician

"Despite what the economists think, the p-value for this hypothesis test indicates that my mixed-effects model's finding of two different country clusters with respect to longitudinal variation in sanitation standards is not actually significant."

7 of 60

What is probabilistic programming?

Probabilistic

Programming

Probabilistic modeling & inference

System

Software

Programming

Languages

Probabilistic languages

that are read and written by both humans and machines

Probabilistic

Programming

System

Modeling

Assumptions

Data & constraints

Probable

answers

Inference

hints

Questions

8 of 60

Probabilistic programming for augmented intelligence

BayesDB: an AI assistant for

empirical inference

VentureScript:

an AI assistant

for AI research

Picture: an AI

assistant for

visual scene

understanding & design

Probabilistic

Programming

System

Modeling

Assumptions

Data & constraints

Probable

answers

Inference

hints

Questions

Probabilistic languages

that are read and written by both humans and machines

Venture

9 of 60

What is BayesDB?

Relational Databases:

Query the data

What products did customer X buy last year?

BayesDB:

Query its probable implications

What products will customer X probably buy this year?

Necessarily uncertain answer

Model and inference independence

Tables

Indexes

DDL

SQL

Populations

GPMs

MML

BQL

Mansinghka et al. (arXiv 2016; in review)

Answer is certain (assuming error-free data)

Physical data independence: what, not how

10 of 60

Small Data: 100 records, 1 variable

Medium Data: 10^2-10^6 records, 10-10^3 fields

Big Data: 10^9+ records

11 of 60

Sky surveys

12 of 60

Scholastic outcomes

13 of 60

Bond trades &

Public offerings

14 of 60

UCS Satellites Database

http://www.ucsusa.org/nuclear-weapons/space-weapons/satellite-database

15 of 60

SQL Programmer

Machine Learning

Consultant

We have received an intelligence report indicating the existence of a previously unknown satellite intended for geosynchronous orbit with a dry mass of 500 kilograms. Who probably operates the satellite, and why? Is it plausibly a threat?

Domain

Expert

Augmented Intelligence

16 of 60

Machine learning requires many decisions

recoding categorical inputs?

no

coding

error

ignore

binary

coding

unseen categoricals?

fill in ("impute")

drop

mean

which

strategy?

median

most frequent

handle rows with missing data?

which classifier?

SVM

random forest

polynomial

which

kernel?

radial basis

1

2

3

...

degree?

sigmoid

predict country and purpose jointly?

concatenate the labels

purpose--country

separately classify purpose and country
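To make one of these decisions concrete, the sketch below fills in a missing dry mass three different ways and also shows the "drop" option. The values are invented for illustration and are not taken from the UCS data; the point is only that each strategy yields a different training set.

```python
from statistics import mean, median, mode

# Hypothetical dry masses in kg; None marks a missing value.
dry_mass_kg = [500, 500, 520, 2200, None, 540, None]

observed = [m for m in dry_mass_kg if m is not None]

# Three common fill-in ("impute") strategies, each giving a different value.
strategies = {
    "mean": mean(observed),
    "median": median(observed),
    "most frequent": mode(observed),
}

def impute(values, fill):
    """Replace every missing entry with the chosen fill value."""
    return [fill if v is None else v for v in values]

for name, fill in strategies.items():
    print(f"{name:>13}: fill={fill}, data={impute(dry_mass_kg, fill)}")

# "drop" is the remaining option: keep only the fully observed rows.
print(f"{'drop':>13}: data={observed}")
```

Every downstream classifier sees a different dataset depending on which branch is taken, which is one source of the instability discussed on the next slide.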

17 of 60

Machine learning results are unstable

drop missing, no coding, random forest, separate classifiers

Probably Egypt, definitely science

impute missing, no coding, random forest, separate classifiers

Probably India, probably science

impute missing, binary coding, svm, rbf kernel, joint classification

No idea

Approach 1

Approach 2

Approach 3

18 of 60

The data is too sparse to use directly

SELECT Country_of_Operator, Purpose

FROM satellites

WHERE Class_of_Orbit = 'GEO'

AND Dry_Mass_kg = 500

Mansinghka et al. (arXiv 2016; in review)

SELECT Country_of_Operator, Purpose

FROM satellites

WHERE Class_of_Orbit = 'GEO'

AND Dry_Mass_kg BETWEEN 400 AND 600
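A minimal sketch of the sparsity problem, using Python's built-in `sqlite3` module and a handful of invented rows (not the real UCS table): both the exact-match query and the widened range query come back empty, so plain SQL has nothing to say about the new satellite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE satellites ("
    "Country_of_Operator TEXT, Purpose TEXT, "
    "Class_of_Orbit TEXT, Dry_Mass_kg REAL)"
)

# Hypothetical rows, invented for illustration only.
rows = [
    ("USA", "Communications", "GEO", 1450.0),
    ("India", "Communications", "GEO", 980.0),
    ("Russia", "Navigation", "MEO", 1415.0),
    ("Japan", "Technology Development", "LEO", 50.0),
    ("USA", "Communications", "GEO", None),  # dry mass not recorded
]
conn.executemany("INSERT INTO satellites VALUES (?, ?, ?, ?)", rows)

# Exact match on the reported dry mass.
exact = conn.execute(
    "SELECT Country_of_Operator, Purpose FROM satellites "
    "WHERE Class_of_Orbit = 'GEO' AND Dry_Mass_kg = 500"
).fetchall()

# Widened range around the reported dry mass.
widened = conn.execute(
    "SELECT Country_of_Operator, Purpose FROM satellites "
    "WHERE Class_of_Orbit = 'GEO' AND Dry_Mass_kg BETWEEN 400 AND 600"
).fetchall()

print("exact match:", exact)
print("widened range:", widened)
```

Note that the row with a missing dry mass can never match either predicate, because comparisons against NULL are not true in SQL.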

19 of 60

Mansinghka et al. (arXiv 2016; in review)

Expert Dialog

  • What do you think?
  • I don’t know. It’s probably not dangerous, because it’s expensive to put something in GEO, and it’s only 500 kilo, which is light. Could be a cheap communications satellite, could be a science satellite.
  • Are you sure? Can you quantify that?
  • Quantify what? No.

Domain experts are costly, qualitative, and may be biased

20 of 60

BayesDB can quickly answer many questions

SIMULATE Country_of_Operator, Purpose

FROM satellites

GIVEN Class_of_Orbit = 'GEO' AND Dry_mass_kg = 500

1000 TIMES

Mansinghka et al. (arXiv 2016; in review)
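One way to read SIMULATE ... GIVEN ... N TIMES is "draw N synthetic rows from the model, conditioned on the constraints". The sketch below imitates that with rejection sampling over a tiny hand-written generative model. Everything in it (the probabilities, the two-variable model, the `simulate` helper) is an assumption made for illustration; it is not how BayesDB performs inference, and it only handles discrete constraints.

```python
import random
from collections import Counter

rng = random.Random(0)

# Hypothetical toy model; the probabilities are invented for illustration.
ORBIT_PROBS = {"GEO": 0.4, "LEO": 0.6}
PURPOSE_GIVEN_ORBIT = {
    "GEO": {"Communications": 0.8, "Science": 0.2},
    "LEO": {"Earth Observation": 0.7, "Science": 0.3},
}

def draw(probs):
    """Draw one key from a {value: probability} dictionary."""
    values = list(probs)
    return rng.choices(values, weights=[probs[v] for v in values])[0]

def sample_row():
    """Generate one complete synthetic satellite record."""
    orbit = draw(ORBIT_PROBS)
    return {"Class_of_Orbit": orbit, "Purpose": draw(PURPOSE_GIVEN_ORBIT[orbit])}

def simulate(target, given, times):
    """Keep drawing rows, retaining only those that satisfy the constraints."""
    kept = []
    while len(kept) < times:
        row = sample_row()
        if all(row[k] == v for k, v in given.items()):
            kept.append(row[target])
    return kept

samples = simulate("Purpose", {"Class_of_Orbit": "GEO"}, times=1000)
print(Counter(samples).most_common())
```

The tally of the 1000 draws is the "probable answer": a distribution over purposes rather than a single label.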

21 of 60

BayesDB can quickly answer many questions

Where do Chinese surveillance satellites usually fly?

How much does a communications satellite probably weigh?

What launch site will a French company probably use?

Are satellites generally getting lighter?

How much does mass affect expected lifetime?

What kinds of satellites are in the most surprising orbits?

How many satellites will probably launch from Baikonur this year?

What is probably the maximum launch mass for a Proton M?

22 of 60

BayesDB can quickly answer many questions

ESTIMATE name, class_of_orbit, period_minutes, PROBABILITY OF period_minutes

FROM satellites

WHERE class_of_orbit = 'GEO'

ORDER BY PROBABILITY OF period_minutes ASCENDING LIMIT 10

Mansinghka et al. (arXiv 2016; in review)

Off by 10x; not an outlier

Meant 24 hours, not 24 minutes

"What are the GEO satellites with the 10 least likely orbital periods?"

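The idea behind ranking rows by PROBABILITY OF can be sketched with a much simpler stand-in model: fit a density to the column, then sort rows by how unlikely their value is. The example below swaps in a single robust Gaussian (median and scaled median absolute deviation) in place of BayesDB's models, and the satellite names and periods are invented.

```python
from statistics import NormalDist, median

# Hypothetical GEO orbital periods in minutes; one entry was typed as 24.
periods = {
    "sat_a": 1436.1, "sat_b": 1436.0, "sat_c": 1435.9,
    "sat_d": 1436.2, "sat_e": 24.0, "sat_f": 1436.1,
}

# Robust centre and spread, so the bad entry does not distort the model.
centre = median(periods.values())
mad = median(abs(p - centre) for p in periods.values())
model = NormalDist(mu=centre, sigma=1.4826 * mad)  # 1.4826 rescales MAD to a std. dev.

# Least likely values first, mirroring ORDER BY PROBABILITY OF ... ASCENDING.
ranked = sorted(periods, key=lambda name: model.pdf(periods[name]))

for name in ranked[:3]:
    print(name, periods[name], model.pdf(periods[name]))
```

The data-entry error surfaces at the top of the list because its density under the fitted model is effectively zero.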
23 of 60

How the models were built: the BayesDB Meta-Modeling Language (MML)

Mansinghka et al. (arXiv 2016; in review)

CREATE POPULATION satellites
    FROM ucs_satellites.csv;

CREATE METAMODEL ON satellites
    USING default_metamodel( GUESS(*) );

INITIALIZE 16 GENERATIVE POPULATION MODELS FOR satellites;
IMPROVE satellites FOR 4 MINUTES;

"Choose whatever data types you think are reasonable --- I don't have any knowledge about that."

"Use the data from this .CSV file."

"Build me a quick-and-dirty ensemble of models that gives me some ability to quantify uncertainty."

24 of 60

Default meta-model suggests possible biomarkers

Improved meta-model finds significant evidence

Clinical data +

Microbiome

measurements

25 of 60

Technical Challenges

  1. Language coverage for BQL
  2. Language coverage for MML
  3. Model independence via Generative Population Models (GPMs):
       simulate( [q0, q1, …], {g0 = vg0, …} )
       logpdf( {q0 = vq0, …}, {g0 = vg0, …} )
  4. Default meta-model that works reliably
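The two-method GPM interface in item 3 can be sketched as a small Python class. This is a toy under stated assumptions: the class name `TableGPM`, its constructor, and the explicit joint probability table are all hypothetical, chosen only so that `simulate` and `logpdf` can be computed exactly. Any model that can answer these two calls could stand behind the same queries, which is the sense of "model independence".

```python
import math
import random

class TableGPM:
    """Toy GPM backed by an explicit joint probability table (hypothetical)."""

    def __init__(self, variables, joint, seed=0):
        self.variables = variables      # ordered variable names
        self.joint = joint              # {tuple of values: probability}
        self.rng = random.Random(seed)

    def _matching(self, assignment):
        """Rows of the table consistent with a {variable: value} assignment."""
        idx = {v: i for i, v in enumerate(self.variables)}
        return {
            row: p for row, p in self.joint.items()
            if all(row[idx[v]] == val for v, val in assignment.items())
        }

    def simulate(self, targets, givens):
        """Draw values for `targets` from p(targets | givens)."""
        rows = self._matching(givens)
        row = self.rng.choices(list(rows), weights=list(rows.values()))[0]
        return [row[self.variables.index(t)] for t in targets]

    def logpdf(self, targets, givens):
        """Return log p(targets | givens)."""
        joint_mass = sum(self._matching({**givens, **targets}).values())
        given_mass = sum(self._matching(givens).values())
        return math.log(joint_mass / given_mass)

# Invented probabilities over two categorical variables.
gpm = TableGPM(
    ["orbit", "purpose"],
    {("GEO", "Communications"): 0.32, ("GEO", "Science"): 0.08,
     ("LEO", "Earth Observation"): 0.42, ("LEO", "Science"): 0.18},
)

draw = gpm.simulate(["purpose"], {"orbit": "GEO"})
prob = math.exp(gpm.logpdf({"purpose": "Science"}, {"orbit": "GEO"}))
print(draw, prob)
```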

BayesDB:

Query its probable implications

What products will customer X probably buy this year?

Necessarily uncertain answer

Model and inference independence

Populations

GPMs

MML

BQL

Mansinghka et al. (arXiv 2016; in review)

26 of 60

Challenges in building a 1-dimensional metamodel

"A genuine Bayesian solution seems difficult here, since it requires a prior distribution on the space of all distributions on the real line."

  • B. Efron, Why Isn't Everyone a Bayesian? (1986)

27 of 60

Challenges in building a high-D metamodel

PCA

k-means

Bayes Net Structure Learning

28 of 60

Challenges in building a high-D metamodel

PCA

k-means

Bayes net structure learning

High-D predictive accuracy

Low-D predictive accuracy

Robust to missing values

Heterogeneous data

Ignores junk columns, noise

Autoselects model capacity

Qualitative constraints

Scalable

29 of 60

Challenges in building a 2-dimensional metamodel

30 of 60

Human intelligence finds patterns that correlation cannot

Correlation matrix for UCS Satellites data

ρ(perigee, apogee) = 0.3

Mansinghka et al. (arXiv 2016; in review)
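A small worked example of why a low correlation coefficient does not rule out dependence: below, y is a deterministic function of x, yet the Pearson correlation is zero. The data are synthetic and unrelated to the satellites table; the `pearson` helper is written out by hand to keep the sketch self-contained.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# y is completely determined by x, but the relationship is not linear.
xs = list(range(-10, 11))
ys = [x * x for x in xs]

rho = pearson(xs, ys)
print(f"rho = {rho:.6f}")
```

Correlation measures only linear association, so a scatter plot that a person reads instantly as "these two variables are related" can still score near zero.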

31 of 60

Human intelligence finds patterns that correlation cannot

Mansinghka et al. (arXiv 2016; in review)

32 of 60

CrossCat: models high-D via divide-and-conquer

M. et al. (JMLR 2016; in press)

Shafto, Kemp, M., and Tenenbaum. (COGSCI 2006)

  1. Divides variables into views
  2. Divides views into categories
  3. Parametrically models each variable in each category, based on data type

taxonomy

junk

ecology

CrossCat
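The first two steps above describe a nested partition: columns are grouped into views, and within each view the rows are grouped into categories. The sketch below samples such a structure using a Chinese restaurant process for both levels, which is the kind of prior the CrossCat papers describe. It covers only the structural prior, not the per-variable component models of step 3 or any inference; the `crp_partition` helper and the row identifiers are illustrative assumptions.

```python
import random

def crp_partition(items, alpha, rng):
    """Partition `items` with a Chinese restaurant process of concentration alpha."""
    blocks = []
    for item in items:
        # Join an existing block in proportion to its size, or open a new one.
        weights = [len(b) for b in blocks] + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(blocks):
            blocks.append([item])
        else:
            blocks[choice].append(item)
    return blocks

rng = random.Random(1)
columns = ["Perigee_km", "Apogee_km", "Period_minutes",
           "Country_of_Operator", "Purpose", "Launch_Mass_kg"]
row_ids = list(range(8))  # hypothetical row identifiers

# Step 1: divide variables into views.
views = crp_partition(columns, alpha=1.0, rng=rng)

# Step 2: within each view, divide the rows into categories.
structure = [
    {"columns": view, "categories": crp_partition(row_ids, alpha=1.0, rng=rng)}
    for view in views
]

for view in structure:
    print(view["columns"], view["categories"])
```

Because each view gets its own row partition, a column of pure noise can sit in a view of its own without disturbing the clustering that the informative columns support.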

33 of 60

CrossCat: models high-D via divide-and-conquer

Traditional

Mixture

models

Problem: many variables appear to be noise

34 of 60

Mansinghka et al. (JMLR 2016; in press)

35 of 60

CrossCat: models high-D via divide-and-conquer

Mansinghka et al. (JMLR 2016; in press)

36 of 60

CrossCat overcomes limitations of correlation

ESTIMATE PROBABILITY OF DEPENDENCE

FROM PAIRWISE COLUMNS OF satellites_cc
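A sketch of how a dependence probability can be read off an ensemble of CrossCat-style models, assuming the convention that two columns count as dependent in a model exactly when that model places them in the same view. The four-model ensemble below is invented for illustration.

```python
from itertools import combinations

# Hypothetical ensemble: each model is a list of views (sets of column names).
ensemble = [
    [{"Perigee_km", "Apogee_km", "Period_minutes"}, {"Country_of_Operator", "Purpose"}],
    [{"Perigee_km", "Apogee_km"}, {"Period_minutes", "Purpose"}, {"Country_of_Operator"}],
    [{"Perigee_km", "Apogee_km", "Period_minutes"}, {"Country_of_Operator"}, {"Purpose"}],
    [{"Perigee_km", "Apogee_km", "Period_minutes", "Purpose"}, {"Country_of_Operator"}],
]

def same_view(model, a, b):
    """True if columns a and b share a view in this model."""
    return any(a in view and b in view for view in model)

def dependence_probability(a, b):
    """Fraction of ensemble members in which a and b share a view."""
    return sum(same_view(m, a, b) for m in ensemble) / len(ensemble)

columns = sorted(set().union(*ensemble[0]))
for a, b in combinations(columns, 2):
    print(f"{a:>20} ~ {b:<20} {dependence_probability(a, b):.2f}")
```

Unlike a correlation coefficient, this score does not depend on the relationship being linear; it only asks whether the models found it useful to explain the two columns together.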

37 of 60

Small Data: 100 records, 1 variable

Medium Data: 10^2-10^6 records, 10-10^3 fields

Big Data: 10^9+ records

38 of 60

Detecting predictors of insulin resistance in 30 minutes

RISC diabetes study (Chalmers University & European consortium): ~250 patients, ~400 variables

ESTIMATE COLUMNS DEPENDENCE PROBABILITY WITH M

FROM risc_cc

ORDER BY DEPENDENCE PROBABILITY WITH M

LIMIT 10;

39 of 60

Detecting predictors of insulin resistance in 30 minutes

Mardinoglu, M., et al. (in preparation)

40 of 60

Probabilistic programming for augmented intelligence

BayesDB: an AI assistant for

empirical inference

VentureScript:

an AI assistant

for AI research

Picture: an AI

assistant for

visual scene

understanding & design

Probabilistic

Programming

System

Modeling

Assumptions

Data & constraints

Probable

answers

Inference

hints

Questions

Probabilistic languages

that are read and written by both humans and machines

Venture

41 of 60

42 of 60

43 of 60

Computer vision as the inverse problem of computer graphics

?

44 of 60

Inverse graphics

Bottom-up computer vision

M.*, Kulkarni*, et al. (2013)

45 of 60

Input Image

Reconstruction

R(S) = I_R

Kulkarni, Kohli, Tenenbaum, and M. (2015)

"Find a face shape and texture that matches this input image."

46 of 60

Kulkarni, Kohli, Tenenbaum, and M. (2015)

"What does this face look like from the side? Or when lit differently?"

47 of 60

Texture

Mesh

Stochastic Scene Generator

Differentiable

Approx. Renderer

Differentiable

Stochastic Comparator

Scene

Approximate Reconstruction

Image Data

Camera

& Lighting

?

?

Texture

Mesh

Camera

& Lighting

?
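The generator, renderer, and comparator pipeline can be illustrated on a deliberately tiny problem: a one-dimensional "image" containing a single bar. The sketch below is not Picture's API, and it replaces sampling-based inference with exhaustive enumeration, which is feasible only because the scene space is so small. The image width, noise level, and scene parameterization are all assumptions.

```python
import math

WIDTH = 16        # pixels in the toy one-dimensional image
FLIP_PROB = 0.05  # assumed chance that any pixel differs from the rendering

def render(scene):
    """Renderer: turn a bar given as (start, length) into a list of 0/1 pixels."""
    start, length = scene
    return [1 if start <= i < start + length else 0 for i in range(WIDTH)]

def log_likelihood(image, rendering):
    """Comparator: score a rendering against the data under independent pixel flips."""
    mismatches = sum(a != b for a, b in zip(image, rendering))
    return (mismatches * math.log(FLIP_PROB)
            + (WIDTH - mismatches) * math.log(1 - FLIP_PROB))

# Scene generator: uniform prior over every bar that fits in the image.
scenes = [(s, n) for s in range(WIDTH) for n in range(1, WIDTH - s + 1)]

# Image data: a bar at (4, 5) plus one corrupted pixel.
image = render((4, 5))
image[12] = 1

# Posterior over scenes by enumeration (uniform prior, so likelihood only).
scores = {scene: log_likelihood(image, render(scene)) for scene in scenes}
log_norm = math.log(sum(math.exp(v) for v in scores.values()))
posterior = {scene: math.exp(v - log_norm) for scene, v in scores.items()}

best = max(posterior, key=posterior.get)
print("most probable scene:", best, "posterior:", round(posterior[best], 3))
print("approximate reconstruction:", render(best))
```

The stray pixel is explained as noise instead of being forced into the scene, because no single bar can account for it without paying for more mismatches elsewhere.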

48 of 60

From computer graphics to probabilistic interfaces

Kulkarni, Kohli, Tenenbaum, and M. (2015)

"Find a pose for this blender figure model that matches this image, ignoring everything but shape."

Blender

model

49 of 60

Kulkarni, Kohli, Tenenbaum, and M. (2015)

"Find a simple, radially symmetric object that matches the silhouette of this wine bottle."

Probabilistic graphics for modeling generic objects

50 of 60

Ritchie, Mildenhall, Goodman, Hanrahan (SIGGRAPH 2014)

"Find a collection of cylinders that casts a shadow that looks like this face in profile."

Probabilistic graphics for machine-assisted art

51 of 60

The MIT Probabilistic Computing Project (I)

Processor

Synchronous digital logic

Boolean circuits

Databases

Graphics

Scripting

???

???

Discrete-time, discrete-state Markov chains

Stochastic digital circuits

BayesDB

Picture

Venture

M. Natively Probabilistic Computation. (2009)

M. & Jonas. Building fast, Bayesian computing machines out of intentionally stochastic, digital parts. (2014)

Operating System

Systems

Programming

???

Theory & Algorithms

How to program

???

???

M., Freer, and Roy. When are probabilistic programs probably computationally tractable? (2010)

Cusumano-Towner and M. How fast does a probabilistic program probably converge? (NIPS 2017)

Roy, Ackerman, Freer. On the computability of conditioning. (2010)

Data Visualization

GUI Interactions

???

???

Software

Hardware

Theory &

culture

52 of 60

The MIT Probabilistic Computing Project (II)

Processor

Synchronous digital logic

Boolean circuits

Databases

Graphics

Scripting

???

???

Discrete-time, discrete-state Markov chains

Stochastic digital circuits

BayesDB

Picture

Venture

M. Natively Probabilistic Computation. (2009)

M. & Jonas. Building fast, Bayesian computing machines out of intentionally stochastic, digital parts. (2014)

Operating System

Systems

Programming

???

Theory & Algorithms

How to program

???

???

M., Freer, and Roy. When are probabilistic programs probably computationally tractable? (2010)

Cusumano-Towner and M. How fast does a probabilistic program probably converge? (NIPS 2017)

Roy, Ackerman, Freer. On the computability of conditioning. (2010)

Data Visualization

GUI Interactions

???

???

Software

Hardware

Theory &

culture

53 of 60

What is the future of intelligent computation?

Supervisor

Assistant

Guidelines,

questions, and hints

Partial solutions,

with uncertainty

Licklider

Minsky

Engelbart

54 of 60

Feras Saad (MEng Student)

Tejas Kulkarni (PhD Alum)

Alexey Radul (Engineering Lead)

Richard Tibbetts (Visiting Scientist)

Marco Cusumano-Towner (PhD Student)

Rax Dillon (PM)

Gregory Martin (Engineer)

Taylor Campbell (Engineer)

Anthony Lu (MEng Student)

Ronny Diaz (Admin)

http://probcomp.csail.mit.edu

Acknowledgements

55 of 60

56 of 60

Satellites: asking the data

SELECT Country_of_Operator, Purpose

FROM satellites

WHERE Class_of_Orbit = 'GEO'

AND Dry_Mass_kg BETWEEN 300 AND 700

Mansinghka et al. (arXiv 2016; in review)

57 of 60

Build credibility --- step 1: compare probable dependencies with common sense

Contractor, Country, and Longitude

Orbital Dynamics

Users, Purpose, and Type of Orbit

Mass, Power, and Lifetime

ESTIMATE PROBABILITY OF DEPENDENCE

FROM PAIRWISE COLUMNS OF satellites_cc

Non-Dependence of Mass and Country

58 of 60

Build credibility --- step 3: ask simpler questions with known answers

SIMULATE Country_of_Operator, Purpose

FROM satellites_cc

GIVEN Class_of_Orbit = 'GEO'

1000 TIMES

59 of 60

How the models were built: the BayesDB Meta-Modeling Language (MML)

CREATE METAMODEL satellites_custom ON satellites
USING composer(
    random_forest (
        MODEL Type_of_Orbit (CATEGORICAL)
        GIVEN Apogee_km, Perigee_km, Eccentricity,
              Period_minutes, Launch_Mass_kg, Power_watts,
              Anticipated_Lifetime, Class_of_orbit
    ),
    external_model (
        source = 'kepler.py',
        MODEL Period_Minutes (NUMERICAL)
        GIVEN Perigee_km, Apogee_km
    ),
    default_metamodel (
        Country_of_Operator CATEGORICAL,
        Date_of_Launch NUMERICAL,
        Inclination_radians CYCLIC,
        ENSURE perigee_km AND apogee_km
            ARE MARGINALLY DEPENDENT
    )
);

INITIALIZE 16 GENERATIVE POPULATION MODELS FOR satellites_custom;
ANALYZE satellites_custom FOR 4 MINUTES;

SIMULATE perigee_km, period_minutes
FROM satellites
GIVEN purpose = <...>

100 TIMES

Revised Metamodel

Default Metamodel
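The contents of `kepler.py` are not shown in the deck. As an assumption about what such an external model would encode, the sketch below computes the orbital period from perigee and apogee altitudes using Kepler's third law for an Earth orbit. The function name and constants are illustrative; a real external model would also need a noise term so that it defines a probability distribution and not only a point prediction.

```python
import math

EARTH_MU_KM3_S2 = 398600.4418   # Earth's standard gravitational parameter
EARTH_RADIUS_KM = 6378.137      # equatorial radius; altitudes are measured above it

def period_minutes(perigee_km, apogee_km):
    """Orbital period in minutes from perigee and apogee altitudes (Kepler's third law)."""
    semi_major_axis = EARTH_RADIUS_KM + (perigee_km + apogee_km) / 2.0
    seconds = 2.0 * math.pi * math.sqrt(semi_major_axis ** 3 / EARTH_MU_KM3_S2)
    return seconds / 60.0

# A circular orbit at geostationary altitude should take about one sidereal day.
geo = period_minutes(35786.0, 35786.0)
print(f"GEO period: {geo:.1f} minutes")
```

Encoding the physics directly is what lets the revised metamodel reproduce the tight perigee/period relationship that the default metamodel can only approximate from data.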

60 of 60

Detecting predictors of insulin resistance in 30 minutes

.csv risc risc1.utf8.csv

.sql pragma table_info(risc)

SELECT waist FROM risc WHERE "sid1a of RISC 1_3" = 1023;

.nullify risc .

SELECT waist FROM risc WHERE "sid1a of RISC 1_3" = 1023;

CREATE GENERATOR risc_cc FOR risc
    USING crosscat( GUESS(*), "sid1a of RISC 1_3" IGNORE );

.describe columns risc_cc

INITIALIZE 64 MODELS FOR risc_cc;

ANALYZE risc_cc FOR 8 MINUTES CHECKPOINT 2 ITERATION WAIT;

.heatmap ESTIMATE PAIRWISE DEPENDENCE PROBABILITY FROM risc_cc;

CREATE TEMP TABLE risc_M_deps AS
    ESTIMATE COLUMNS DEPENDENCE PROBABILITY WITH M
    AS "P(dep with M)" FROM risc_cc;

SELECT * from risc_M_deps ORDER BY "P(dep with M)" DESC LIMIT 30;

SQL: Load the CSV

SQL: Treat "." as NULL

MML: Ask BayesDB to guess datatypes and ignore id column

MML: Build 64 models using 8 minutes

BQL: Look at global dependencies and report probable predictors of insulin resistance
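The 'Treat "." as NULL' step can be approximated in plain SQLite, which is useful for checking what the cleaned table looks like before any modeling. The sketch below is an assumption about the effect of that step, not the shell command's implementation; the `nullify` helper, the shortened `sid` column name, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE risc (sid INTEGER, waist TEXT, bmi TEXT)")
# Hypothetical rows in which "." stands for a missing measurement.
conn.executemany(
    "INSERT INTO risc VALUES (?, ?, ?)",
    [(1023, ".", "24.1"), (1024, "88", "."), (1025, "91", "27.3")],
)

def nullify(conn, table, marker):
    """Set every cell equal to `marker` to NULL, one column at a time."""
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    for column in columns:
        conn.execute(
            f'UPDATE "{table}" SET "{column}" = NULL WHERE "{column}" = ?',
            (marker,),
        )

before = conn.execute("SELECT waist FROM risc WHERE sid = 1023").fetchone()[0]
nullify(conn, "risc", ".")
after = conn.execute("SELECT waist FROM risc WHERE sid = 1023").fetchone()[0]
print(repr(before), "->", repr(after))
```

Running the same SELECT before and after, as the slide does, confirms that the placeholder has become a true NULL that the models will treat as missing.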