Earth Engine and Google Cloud Platform
Earth Engine User Summit 2018
Matt Hancher, Co-Founder and Engineering Manager, Google Earth Engine
Ground Rules
A whirlwind tour with lots of material
Use these slides as a quick-reference later: g.co/earth/eeus2018-cloud
Take breaks with cute animals
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Savannah the Fennec Fox. Image: Tom Thai
For the past 19 years, Google has been building out the fastest, most powerful, highest-quality cloud infrastructure on the planet.
Carbon Neutral since 2007.
100% Renewable Energy since 2017.
Measure Power Usage Effectiveness (PUE)
Adjust the Thermostat
Use Free Cooling
Manage Airflow
Optimize Power Distribution
Google Cloud Platform
Over 15 Years of Tackling Big Data Problems
A timeline of Google research papers, 2002–2016: GFS, MapReduce, BigTable, Flume Java, Dremel, Millwheel, Spanner, Dataflow, and TensorFlow.
“Google is living a few years in the future and sending the rest of us messages.”
— Doug Cutting, Hadoop co-creator
Use the best of Google’s innovation to solve the problems that matter most to you.
These papers seeded open-source projects and Google Cloud products: BigQuery, Cloud Pub/Sub, Cloud Dataflow, Cloud Bigtable, and Cloud ML.
Solid Foundation
Serverless Data Platform
Let Developers Just Code
Unique Hardware Infrastructure
Purpose-built chips | Purpose-built servers | Purpose-built storage | Purpose-built network | Purpose-built data centers
GCP Regions
A world map of Google’s global network: more than 100 edge points of presence, leased and owned fiber (including submarine cables such as SJC, connecting JP, HK, and SG, in 2013), and a trailing 3-year CAPEX investment of $29.4 billion.
Current and future regions, each with 2–4 zones: Oregon, Iowa, S Carolina, N Virginia, Montreal, California, São Paulo, London, Belgium, Netherlands, Frankfurt, Finland, Mumbai, Singapore, Taiwan, Tokyo, and Sydney.
Jupiter
100K servers communicate at 10Gb/s
Like reading the entire Library of Congress in 1/10th of a second
Comparable to 40 million home high speed internet connections
Layered Security: Defense in Depth
Deployment
Usage
Operations
Application
Network
Storage
OS+IPC
Boot
Hardware
Titan
Google’s purpose-built chip to establish hardware root of trust for both machines and peripherals on cloud infrastructure.
The Journey to a Web-Scale Cloud
Phase 1: Physical/Colo — you manage storage, processing, memory, and network yourself.
Phase 2: Virtualized — storage, processing, memory, and network become virtual resources.
Phase 3: Serverless/No-ops — focus on efficiency and productivity.
Typical Big Data Processing
Analytics, plus all the operational overhead: resource provisioning, performance tuning, monitoring, reliability, deployment & configuration, handling growing scale, and utilization improvements.
Big Data with Google
Focus on insight, not infrastructure: just analytics.
Serverless Data Platform
Just send events
Just run queries
Just write pipelines
Our models, built on the results of validation with BigQuery customers, showed that organizations can expect to save between $881K and $2.7M over a three-year period by leveraging BigQuery instead of planning, deploying, testing, managing, and maintaining an on-premises Hadoop cluster.
– Enterprise Strategy Group (ESG) White Paper
1 Billion Users
Compliance
ISO 27001
ISO 27017
ISO 27018
HIPAA
ISAE 3402 Type II
AICPA SOC 2
AICPA SOC 3
PCI DSS v3.1
FedRAMP ATO
New in March 2018!
SSAE 16 Type II
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Posing Sand Kitten. Image: Charles Barilleaux
Cloud Projects
All Google Cloud Platform resources live within a project.
Projects manage settings, permissions, billing info, etc.
Cloud Console
Log in at console.cloud.google.com
Cloud Shell
Learn more at cloud.google.com/shell
Cloud Pricing Calculator
Try it at cloud.google.com/products/calculator
Geo for Good Cloud Credits Program
Available to nonprofit, research or public benefit partners in countries where Cloud Platform is available.
Credits will be applied to your Cloud account for use on any of the Google Cloud Platform products.
Fill out the application form: g.co/earth/cloud-credits
Share your use cases with us!
Cloud Storage
Durable and highly available object storage (i.e. file storage) in the cloud, as well as static content serving on the web.
Several storage classes, all using the same APIs and access methods.
Regions and Zones
Each data center is in a global region, such as Central US, Western Europe, or East Asia.
Each region is a collection of zones, which are isolated from each other within the region.
For example, zone a in the East Asia region is named asia-east1-a.
Note: If you use higher-level Cloud Platform services then you do not need to care!
Objects and Buckets
Files in Google Cloud Storage are called objects.
You store your objects in one or more buckets.
Buckets live in a single global namespace.
Cloud Storage URL: gs://my-bucket/path/to/my-object
Cloud Storage Permissions
Objects can be either public or private.
You control object permissions using Access Control Lists (ACLs).
ACLs grant READER, WRITER, and OWNER permissions to one or more grantees.
You can set the default ACL for newly-created objects in a bucket.
Cloud Storage Console
A simple user interface to browse buckets, upload and download objects, and manage permissions.
Serving Static Content
Public objects are served directly over HTTPS:
https://storage.googleapis.com/my-bucket/path/to/my-object
Private objects can be accessed from a browser by logged-in users too,
but it is slower and involves a URL redirection:
https://console.cloud.google.com/m/cloudstorage/b/my-bucket/o/path/to/my-object
Cloud Storage Pricing
Three storage classes, each priced per GB-month.
API queries are billed per operation (Class A and Class B).
Network egress bandwidth varies by region and volume, $0.08–$0.23 per GB.
Free Usage Limits:
5 GB-months of Regional Storage
5,000 Class A operations
50,000 Class B operations
1 GB Egress to most destinations
A Cloud Storage Pricing Case Study
Scenario: Serving map tiles for a global Landsat-derived layer.
Multi-Regional Storage: 100GB @ $2.60/month
Reads: 1M queries (≈30K page views) @ $1.00/month
Bandwidth: 10GB (distributed globally) @ $1.27/month
Total Cost: $4.87/month ($58.44/year)
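The same arithmetic as a quick Python sanity check; the ~$0.026/GB-month Multi-Regional rate is an assumption inferred from the totals above, so check current pricing.
# Hypothetical unit rates, for illustration only.
storage = 100 * 0.026   # 100 GB Multi-Regional ≈ $2.60/month
reads   = 1.00          # 1M tile reads ≈ $1.00/month
egress  = 1.27          # 10 GB of globally distributed egress ≈ $1.27/month
monthly = storage + reads + egress
print(monthly, monthly * 12)   # ≈ 4.87 / ≈ 58.44 per year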
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Cheetah Cub. Image: Frontierofficial
The gsutil and gcloud Command Line Tools
gsutil
gcloud
Both come with the Google Cloud SDK: cloud.google.com/sdk
The earthengine Command Line Tool
earthengine
Comes with the Earth Engine Python SDK: pip install earthengine-api
Manage Assets and Files
List assets and files with ls:
earthengine ls users/username/folder
gsutil ls gs://my-bucket/folder
Copy and move assets and files with cp and mv:
earthengine cp users/username/source users/username/destination
gsutil mv gs://my-bucket/source gs://my-bucket/destination
Remove assets and files with rm:
earthengine rm users/username/asset_id
gsutil rm gs://my-bucket/filename
Create Buckets, Folders, and Collections
Create a Cloud Storage Bucket:
gsutil mb gs://my-new-bucket
Create an Earth Engine folder:
earthengine create folder users/username/my-new-folder
Create an Earth Engine image collection:
earthengine create collection users/username/my-new-collection
Upload images from Cloud Storage to Earth Engine
Simple image upload:
earthengine upload image --asset_id my_asset gs://my-bucket/my_file.tif
Control the pyramid of reduced-resolution data:
--pyramiding_policy sample
(Options are mean, sample, mode, min, and max. The default is mean.)
Control the image’s mask:
--nodata_value=255
--last_band_alpha
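Putting the flags together, a hypothetical upload command (asset ID and filenames are placeholders):
earthengine upload image --asset_id users/username/my_asset \
    --pyramiding_policy sample --nodata_value=255 gs://my-bucket/my_file.tif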
Upload Tables from Cloud Storage to Earth Engine
A simple table upload:
earthengine upload table --asset_id my_asset gs://my-bucket/my_file.shp
Shapefiles consist of multiple files. Specify the URL to the main .shp file.
Earth Engine will automatically use sidecar files that have the same base filename but different extensions.
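For example, copy the shapefile and its sidecars to Cloud Storage first, then point the upload at the .shp (filenames are placeholders):
gsutil cp my_file.shp my_file.shx my_file.dbf my_file.prj gs://my-bucket/
earthengine upload table --asset_id my_asset gs://my-bucket/my_file.shp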
Manage Image Metadata in Earth Engine
Set a metadata property on an image asset:
earthengine asset set -p name=value users/username/asset_id
Set the special start time property on an image asset:
earthengine asset set --time_start 1978-10-15T12:34:56 users/username/asset_id
(You can use the same flags to set properties when uploading an image!)
Dump information about an asset:
earthengine asset info users/username/asset_id
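Per the note above, a hypothetical upload that sets a custom property and the start time in one step (the property name is a placeholder):
earthengine upload image --asset_id users/username/my_asset \
    -p sensor=landsat8 --time_start 2017-02-06T00:00:00 gs://my-bucket/my_file.tif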
Manage Access Permissions
Access Control Lists (ACLs) are how you manage access permissions for private data.
Get an asset’s or object’s ACL with “acl get”:
earthengine acl get users/username/asset_id
gsutil acl get gs://my-bucket/path/to/my/file
Set a “public” (world-readable) or “private” ACL with “acl set”:
earthengine acl set public users/username/asset_id
gsutil acl set private gs://my-bucket/path/to/my/file
Manage Access Permissions (Part 2)
Copy an ACL from one asset to others with “acl get” and “acl set”:
gsutil acl get gs://my-bucket/source > my_acl
gsutil acl set my_acl gs://my-bucket/destination/*
Change an individual user’s access with “acl ch”:
gsutil acl ch -u user@domain.com:R gs://my-bucket/source
Use :W to grant write access, or -d to delete the user’s permissions.
Use the special AllUsers user to control whether all users can see your object.
(These all work the same way in earthengine, too.)
Manage Earth Engine Batch Tasks
List your recent batch tasks:
earthengine task list
Print more detailed info about a specific task:
earthengine task info TASK_ID
Cancel a task:
earthengine task cancel TASK_ID
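A minimal shell sketch for waiting on a task to finish, assuming task info reports a state of READY or RUNNING while the task is active:
while earthengine task info TASK_ID | grep -qE 'READY|RUNNING'; do
  sleep 30
done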
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Hedgehog. Image: Andrew (Doctor_Q)
Exporting Images to Cloud Storage
You can export images directly to Cloud Storage.
// Export an image to Cloud Storage.
Export.image.toCloudStorage({
image: image,
description: 'MyImageExport',
bucket: 'my-bucket',
fileNamePrefix: 'my_filename',
scale: 30,
region: geometry,
});
This will produce a file named gs://my-bucket/my_filename.tif.
If the image is too large it will be automatically split across multiple files.
Exporting Images to Cloud Storage
Or do it in Python.
from ee.batch import Export
# Export an image to Cloud Storage.
task = Export.image.toCloudStorage(
image=image,
description='MyImageExport',
bucket='my-bucket',
fileNamePrefix='my_filename',
scale=30,
region=geometry,
)
task.start()
Exporting Images to Cloud Storage
Export Cloud Optimized GeoTIFFs.
from ee.batch import Export
# Export an image to Cloud Storage.
task = Export.image.toCloudStorage(
image=image,
description='MyImageExport',
bucket='my-bucket',
fileNamePrefix='my_filename',
scale=30,
region=geometry,
formatOptions={'cloudOptimized': True},
)
task.start()
Learn more at
Exporting Tables to Cloud Storage
Export tables (i.e. FeatureCollections) directly to Cloud Storage.
# Export a table to Cloud Storage.
task = Export.table.toCloudStorage(
collection=features,
description='MyTableExport',
bucket='my-bucket',
fileNamePrefix='my_filename',
)
This will produce a file named gs://my-bucket/my_filename.csv.
In addition to CSV, you can also export Shapefiles, GeoJSON, KML, or KMZ.
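For example, to get GeoJSON instead, set fileFormat (a sketch reusing the parameters above):
task = Export.table.toCloudStorage(
    collection=features,
    description='MyTableExport',
    bucket='my-bucket',
    fileNamePrefix='my_filename',
    fileFormat='GeoJSON',  # or 'SHP', 'KML', 'KMZ'
)
task.start()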
Exporting Videos to Cloud Storage
Export videos directly to Cloud Storage.
# Export a video to Cloud Storage.
task = Export.video.toCloudStorage(
collection=images,
description='MyVideoExport',
bucket='my-bucket',
dimensions=720,
framesPerSecond=12,
region=geometry,
)
This will produce a file named gs://my-bucket/MyVideoExport.mp4. (With no fileNamePrefix, the description is used as the filename.)
Exporting Maps and Map Tiles
Export map tiles directly to Cloud Storage.
# Export a map to Cloud Storage.
task = Export.map.toCloudStorage(
image=image,
description='MyMapExport',
bucket='my-bucket',
path='my_folder',
region=geometry,
maxZoom=5,
)
This will produce a folder named gs://my-bucket/my_folder/ containing your tiles.
Exporting Maps and Map Tiles
The map tile path is: folder/Z/X/Y
Z: The zoom level. Level 0 is global, and each higher level is twice the resolution.
X, Y: The x and y positions of the tile within the zoom level. 0/0 is the upper left.
Map tiles are in the Google Maps Mercator projection, which is used by most web mapping applications.
If you specifically request PNG or JPG tiles then they will have a .png or .jpg extension.
By default they are a mix of PNG and JPG (a.k.a. “AUTO”) and have no file extension.
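As a sketch of how the Z/X/Y scheme works, here is the standard Web Mercator tile math in Python (this just reimplements the indexing described above; it is not an Earth Engine API):
import math

def lonlat_to_tile(lon, lat, zoom):
    # 0/0 is the upper-left tile; there are 2^zoom tiles per axis.
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

print(lonlat_to_tile(-122.08, 37.42, 5))  # (5, 12), i.e. tile path 5/5/12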
Exporting Maps and Map Tiles
Browse the output in the Cloud Storage Browser: console.cloud.google.com/storage/browser
index.html: A Simple HTML Map Viewer
Quickly view your map tiles.
Share a link.
Embed in an <iframe>.
index.html: A Simple HTML Map Viewer
Or, use the code as a starting point for a custom app.
If you expect a lot of traffic, be sure to sign up for a Maps API key.
earth.html: View in Google Earth!
View and share your imagery in the new Google Earth on the web, iOS, and Android!
Map Tile Permissions
By default, the map tiles and index.html file are world readable.
You must be an OWNER of the bucket in order to use this mode.
You can write tiles using the bucket's default ACL instead.
Specify writePublicTiles=false. (You need only be a WRITER to use this mode.)
You can change the default ACL that will be applied to newly-written objects.
For example, make all objects world-readable by default, e.g. for web serving:
gsutil defacl ch -u AllUsers:R gs://my-bucket
The defacl command works just like the acl command, but changes the default ACL.
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Squirrel! Image: Rachel Kramer
Google Compute Engine
Compute Engine with Earth Engine
Two common reasons to use Compute Engine and Earth Engine together:
Run third-party binaries or legacy tools, or run computations that can't be expressed in the Earth Engine API, using data from the Earth Engine data catalog.
Run applications built with the EE Python API, such as custom-built web applications. (But also consider App Engine for this use case; it's often simpler.)
Data never has to leave the cloud. Use Cloud Storage as a staging area.
Compute Engine Console
Find it at console.cloud.google.com/compute
Getting Started with Compute Engine
Compute Engine Quick Start:
cloud.google.com/compute/docs/quickstart-linux
Install the Earth Engine SDK:
sudo apt-get update
sudo apt-get install libffi-dev libssl-dev python-dev python-pip
sudo pip install cryptography google-api-python-client earthengine-api
Two Authentication Options
Use your ordinary Google account
Use a Service Account
Using your Ordinary Google Account
Create your virtual machine instance:
gcloud compute instances create my-instance --machine-type f1-micro --zone us-central1-a
Log into your instance via ssh:
gcloud compute ssh my-instance
Now authenticate your instance to EE:
earthengine authenticate
It will print a URL to log in via your browser. Copy and paste the code back into the shell.
Using your Ordinary Google Account
Now you can easily authenticate to Earth Engine from Python:
import ee
ee.Initialize()
That's it! Once you've logged in and authenticated, your credentials are stored locally on the VM and are used by default.
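A quick way to confirm the credentials work is to read a public asset (SRTM is in the EE catalog):
import ee
ee.Initialize()
# Print the first band ID of a public image to verify access.
print(ee.Image('USGS/SRTMGL1_003').getInfo()['bands'][0]['id'])  # 'elevation'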
Using your Compute Engine Service Account
Configure the appropriate scopes when you create your Compute Engine instance:
GCP_SCOPE=https://www.googleapis.com/auth/cloud-platform
EE_SCOPE=https://www.googleapis.com/auth/earthengine
gcloud compute instances create my-instance \
--machine-type f1-micro --scopes ${GCP_SCOPE},${EE_SCOPE}
Note: Today you can only create Compute Engine instances with custom scopes using the gcloud command line tool, not via the Compute Engine web UI.
Using your Compute Engine Service Account
Now you can easily authenticate to Earth Engine from Python:
import ee
from oauth2client.client import GoogleCredentials
ee.Initialize(GoogleCredentials.get_application_default())
That's it! In a properly-configured VM you never have to worry about managing service account credentials.
Authorizing your Compute Engine Service Account
Email earthengine@google.com to authorize your service account for EE.
(You only need to do this once: all your Compute Engine instances can share the same Service Account. If you prefer, you can configure additional service accounts to isolate apps from each other.)
To find your Service Account id:
gcloud compute instances describe my-instance
...
serviceAccounts:
- email: 622754926664-compute@developer.gserviceaccount.com
...
Remember to share any private assets you need with that account!
Compute Engine Pricing
You can choose standard machine sizes,
or you can configure a custom machine size.
Automatic discounts for sustained use.
Preemptible VMs are discounted to around 21% of the base rate!
Typical US prices for a few machine types:
Free Usage Tier:
One f1-micro VM instance
Type | CPUs | Memory | Typical Price / Hour | Sustained Price / Month
f1-micro | 1 (shared) | 0.60GB | $0.007 | $3.88
n1-standard-1 | 1 | 3.75GB | $0.0475 | $24.27
n1-standard-16 | 16 | 60GB | $0.76 | $388.36
Google Container Engine
A powerful automated cluster manager for running clusters on Compute Engine.
Lets you set up a cluster in minutes, based on requirements you define (such as CPU and memory).
Built on Docker and the open-source Kubernetes system.
Containers at Google
We launch over 2 billion containers per week.
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Sea Otter. Image: Linda Tanner
Cloud Dataflow
A unified programming model and a managed service for batch and stream data processing.
Frees you from tasks like resource management and performance optimization.
Based on the Google technologies Flume and MillWheel, now open source as Apache Beam.
Under the hood, Earth Engine batch jobs are built on the same technology as Cloud Dataflow.
The Dataflow Programming Model
A Java and Python environment for data transformation pipelines.
The Dataflow Programming Model
// Batch processing pipeline
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.named("ReadLines")
        .from(options.getInputFile()))
    .apply(new CountWords())
    .apply(MapElements.via(new FormatAsTextFn()))
    .apply(TextIO.Write.named("WriteCounts")
        .to(options.getOutput()));
p.run();
The Dataflow Programming Model
// Batch processing pipeline
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://..."))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://..."));
p.run();

// Stream processing pipeline
Pipeline p = Pipeline.create();
p.begin()
    .apply(PubsubIO.Read.from("input_topic"))
    .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(PubsubIO.Write.to("output_topic"));
p.run();
Cloud Dataproc
A managed service offering Apache Spark and Hadoop clusters.
Great for migrating existing open source computation pipelines into Google Cloud Platform with ease.
Dataflow and Spark
Thinking about writing a totally custom processing pipeline?
Read the article, “Dataflow/Beam & Spark: A Programming Model Comparison”
cloud.google.com/dataflow/blog/dataflow-beam-and-spark-comparison
BigQuery
Google’s fully managed, petabyte scale, low cost data warehouse for tabular data analysis.
BigQuery is serverless: just upload your data and immediately begin issuing familiar SQL queries, with nothing to manage.
BigQuery is ridiculously fast at ripping through huge tables of data in parallel.
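A minimal sketch of querying from Python, assuming the google-cloud-bigquery client library and a configured project (the public sample table is illustrative):
from google.cloud import bigquery

client = bigquery.Client()
query = '''
  SELECT name, SUM(number) AS total
  FROM `bigquery-public-data.usa_names.usa_1910_2013`
  GROUP BY name
  ORDER BY total DESC
  LIMIT 5
'''
# Run the query and print the five most common names.
for row in client.query(query).result():
    print(row.name, row.total)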
Cloud SQL
A fully managed PostgreSQL and MySQL service.
Let Google manage your database so you can focus on your applications.
PostgreSQL support includes PostGIS, the best-in-class open-source spatial extension for SQL relational databases.
Colaboratory: Easy Jupyter Notebooks in the Cloud
A Jupyter notebook environment that runs entirely in the cloud and requires no setup to use.
Stored in Google Drive, just like Google Docs or Sheets.
Free to use.
Easily manage all your Cloud Platform and�Earth Engine resources in one place!
Enabling Colaboratory in Google Drive
New → More → Connect more apps
Configuring Earth Engine in Colaboratory
Install the Earth Engine API:
!pip install earthengine-api
Get a link to authenticate to Earth Engine:
!earthengine authenticate --quiet
Save your credentials:
!earthengine authenticate --authorization-code=PASTE_YOUR_CODE_HERE
Initialize the Earth Engine API in the usual way:
import ee
ee.Initialize()
Configuring Access to Cloud Platform in Colaboratory
Kick off the authentication flow:
from google.colab import auth
auth.authenticate_user()
Paste the code into the box. You're done!
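For example, you can then use gsutil directly from a notebook cell (the bucket name is a placeholder):
!gsutil ls gs://my-bucket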
Earth Engine as a GCP Service
The Earth Engine Cloud API is an HTTP/REST interface to Earth Engine.
Available for early access (pre-Alpha) now, with additional functionality coming soon and more later.
Example: Get Info About an Asset
GET /v1/assets/CIESIN/GPWv4/population-density/2000
Returns:
{
"type": "IMAGE",
"path": "CIESIN/GPWv4/population-density/2000",
"updateTime": "2016-12-16T19:51:16.107Z",
"time": "2000-01-01T00:00:00Z",
"bands": [
{
"name": "population-density",
"dataType": {
"precision": "FLOAT32"
},
...
"sizeBytes": "198246654"
}
(Note: Output is truncated to fit on one slide!)
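A hedged sketch of the same call with curl; since the Cloud API is pre-Alpha, treat the hostname as illustrative:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://earthengine.googleapis.com/v1/assets/CIESIN/GPWv4/population-density/2000"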
Example: Fetch Pixels from an Asset
POST /v1/assets:getPixels
{
"path": "LANDSAT/LC8/LC80440342017037LGN00",
"bandIds": ["B5", "B4", "B3"],
"visualizationParams": {
"ranges": [{"min": 0, "max": 25000}]
},
"encoding": "JPEG",
"pixelGrid": {
"affine_transform": {
"scaleX": 30,
"scaleY": -30,
"translateX": 580635,
"translateY": 4147365
},
"dimensions": {
"width": 256,
"height": 256
}
}
}
Returns: the requested pixels, encoded here as a 256×256 JPEG image.
Agenda
Introduction to Google Cloud Platform
Getting Started with Cloud Platform
Command Line Tools: gcloud, gsutil, and earthengine
Exporting Data, Maps, and Map Tiles
Compute Engine and Container Engine
Dataflow, BigQuery, and other GCP Services
TensorFlow and Cloud ML Engine
Northern Pearly Eye. Image: USGS Bee Inventory and Monitoring Lab
Sharing our tools with people around the world
TensorFlow, released in Nov. 2015, is the #1 repository for machine learning on GitHub.
Artificial Intelligence
The science to make things smart
Machine Learning
Building machines that can learn
Neural Network
A type of algorithm in machine learning
It all started with cats… lots and lots of cats
A Neural Network is a Function that can Learn
Growth of Machine Learning at Google
A chart of the number of directories containing neural net model description files: from near zero in 2012 to roughly 4,000 by 2016–2017.
Keys to Successful Machine Learning
Large Datasets
Good Models
Lots of Computation
Machine Learning
is made for Cloud
Introduction to TensorFlow
Google's open source library for machine intelligence.
Operates over tensors: n-dimensional arrays
Using a flow graph: data flow computation framework
Introduction to TensorFlow
import tensorflow as tf
# define the network
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
# define a training step
y_ = tf.placeholder(tf.float32, [None, 10])
xent = -tf.reduce_sum(y_*tf.log(y))
step = tf.train.GradientDescentOptimizer(0.01).minimize(xent)
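The slide defines the graph but doesn't run it; a minimal TF 1.x training-loop sketch, assuming the classic MNIST tutorial helper:
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        # Feed mini-batches of images and one-hot labels into the graph.
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(step, feed_dict={x: batch_xs, y_: batch_ys})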
Visualization with TensorBoard
Introduction to Cloud Machine Learning Engine (Cloud ML)
Fully managed distributed training and prediction
High-throughput batch training and prediction
Low-latency online prediction
HyperTune for hyper-parameter tuning automation
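A hedged example of submitting a training job with the 2018-era CLI (job name, package, and bucket are placeholders):
gcloud ml-engine jobs submit training my_job_001 \
  --module-name trainer.task \
  --package-path trainer/ \
  --region us-central1 \
  --job-dir gs://my-bucket/my_job_001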
Tensor Processing Unit (TPU)
Created by Google to train and execute deep neural networks.
15–30X faster: Like fast-forwarding 7 years into the future!
Fully managed in the cloud
Architected for TensorFlow
Best price for performance
180 TFLOPS (peak) per device
Cloud TPU Offerings
Cloud TPU (4 TPU v2 chips)
Cloud TPU Pod (256 TPU v2 chips)
Cloud TPU Roadmap
Cloud TPU: Beta in Q1 2018, with GA planned later in 2018.
Cloud TPU Pod: Alpha/early access (EAP) in 2018, with Beta and GA to follow.
The Right Tool for the Job
Data Scientist: build custom models with Cloud ML Engine.
ML Researcher: use & extend the OSS SDK.
App Developer: use pre-trained models via the Perception Services.
Earth Engine and TensorFlow Today
Similar graph-based programming model with Python client libraries.
Preprocess data in EE → training & inference in TF → post-process & visualize in EE.
Export train/test data (Export.table) and image data (Export.image) from Earth Engine to Cloud Storage as TFRecord files.
Train (.train()) and predict (.predict()) in TensorFlow.
Upload the predictions (TFRecord) back into Earth Engine for post-processing and visualization. A sketch of the export step follows.
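A sketch of the export step, assuming TFRecord support in Export.image (options for patch dimensions and the like vary, so check the current docs):
task = Export.image.toCloudStorage(
    image=image,
    description='TrainingPatches',
    bucket='my-bucket',
    fileNamePrefix='training_patches',
    scale=30,
    region=geometry,
    fileFormat='TFRecord',
)
task.start()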
EE+TF+Jupyter
Drive everything from a unified
Python environment in Jupyter,
e.g. in Colaboratory.
The Future: Drive Cloud ML Models from Earth Engine
Early prototype: Landsat cloud detection, built with TensorFlow, hosted in Cloud ML, running in the Earth Engine Code Editor.
A common configuration
Capture input data: Cloud Pub/Sub (reliable, many-to-many, asynchronous messaging) ingests streams of events, metrics, and so on; Cloud Storage (powerful, simple, and cost-effective object storage) holds batches of raw logs, files, assets, Google Analytics data, and so on.
Process and transform: Cloud Dataflow (data processing engine for batch and stream processing) reads streams from Pub/Sub and batches from Cloud Storage; Cloud Dataproc (managed Spark and Hadoop) handles batch workloads.
Analyze and store: BigQuery (extremely fast and cheap on-demand analytics engine) and Bigtable (high-performance NoSQL database for large workloads).
Learn and recommend: Cloud Machine Learning (large scale; train your own models).
Thank you!
Madagascar Lemur