Distributed Databases
an exploration of approaches and best practices
Julia Ferraioli
Developer Advocate
Brian Dorsey
Developer Programs Engineer
Your Hosts
Julia Ferraioli
Developer Advocate
@juliaferraioli
Brian Dorsey
Developer Programs Engineer
@briandorsey
Why Distributed Databases?
Image courtesy of Allie Brosh of Hyperbole and a Half
Images by Connie Zhou
Your Panelists
Google Cloud Datastore
Tyler Hannan
@tylerhannan
Chris Ramsdale
@cramsdale
Will Shulman
@willshulman
Mike Miller
@mlmilleratmit
Riak: An Open Source, Distributed Key/Value Database
Basho Technologies
Tyler Hannan
About Basho Technologies
Who we are, what we do
What Is Riak?
The Benefits of Riak
Riak is an Ops-friendly database that is:
How Does That Work?
The Properties of a Distributed Database
Riak is a key/value store that is:
Riak is a Key/Value Store
Simple Operations, Opaque Values, Layered with Extras
Bucket
Key
Value
Key
Value
Key
Value
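The bucket/key/value layout above can be pictured with a toy in-memory model. This is only an illustrative sketch and not Riak's client API: the `ToyStore` class and its `put`, `get`, and `delete` methods are hypothetical names, and values are kept as opaque bytes that the store never inspects.

```python
from typing import Dict, Optional, Tuple


class ToyStore:
    """Hypothetical in-memory stand-in for a bucket/key/value store."""

    def __init__(self) -> None:
        # Each value is addressed by a (bucket, key) pair.
        self._data: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, value: bytes) -> None:
        # The value is opaque: it is stored as bytes and never parsed.
        self._data[(bucket, key)] = value

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get((bucket, key))

    def delete(self, bucket: str, key: str) -> None:
        # Deleting a missing key is a no-op in this sketch.
        self._data.pop((bucket, key), None)


store = ToyStore()
store.put("users", "alice", b'{"plan": "free"}')
store.put("users", "bob", b"\x00\x01binary-blob")
alice = store.get("users", "alice")
store.delete("users", "bob")
missing = store.get("users", "bob")
```

The point of the sketch is the small surface area: three operations, with the store indifferent to whether a value is JSON, an image, or any other byte string.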
Riak is Masterless
Deployed as a Cluster of Nodes
Node
Node
Node
Node
Node
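One way to see how a cluster can work without a master is consistent hashing: every node can compute, on its own, which nodes are responsible for a key. The sketch below is a simplification under stated assumptions, not Riak's actual claim algorithm. The partition count, the round-robin assignment of partitions to nodes, the `bucket/key` string that gets hashed, and the function names `partition_for` and `owners` are all illustrative choices.

```python
import hashlib
from typing import List

RING_SIZE = 2 ** 160      # size of the SHA-1 output space
NUM_PARTITIONS = 64       # assumed partition count for this sketch


def partition_for(bucket: str, key: str) -> int:
    """Hash a bucket/key pair onto the ring and return its partition index."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def owners(bucket: str, key: str, nodes: List[str], replicas: int = 3) -> List[str]:
    """Return the nodes holding the replicas of a key.

    Assumes partition p is owned by nodes[p % len(nodes)] and that replicas
    are placed on the partitions that follow the first one around the ring.
    """
    first = partition_for(bucket, key)
    return [
        nodes[((first + i) % NUM_PARTITIONS) % len(nodes)]
        for i in range(replicas)
    ]


nodes = ["riak1", "riak2", "riak3", "riak4", "riak5"]
replica_nodes = owners("users", "alice", nodes)
```

Because the result depends only on the key and the shared view of the ring, any node that receives a request can work out where the data lives without consulting a coordinator.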
"Big Data", "Web Scale", "Other Terms"
When Your Data Is Critical, Scalability Is Critical
$ gcutil --project=RiakCluster addinstance \
    riak5 --machine_type=n1-standard-4
$ gcutil --project=RiakCluster ssh riak5
# Install Riak programmatically or via startup script
$ riak-admin cluster join riak1@192.168.2.2
$ riak-admin cluster plan
$ riak-admin cluster commit
Shell
When Would I Use Riak on Google Compute?
The situations & the circumstances
Operationally-friendly database
- combined with -
Operationally-scalable compute platform
for gaming, social, mobile, retail, advertising, etc.
Getting to Know Cloudant
Your Friendly Neighborhood NoSQL Database Service
Mike Miller
Co-Founder, Chief Scientist
Ships with a mobile strategy
Google Cloud Datastore
Scale with your users, not your servers
Chris Ramsdale
Product Manager, Google Cloud Platform
Google Cloud Platform Storage
Family of Managed Storage Services
Announcing the Google Cloud Datastore
App Engine High Replication Datastore (HRD)
Fully Managed Schemaless Storage
Google Cloud Datastore
HRD
Memcache
Managed Runtimes
Task Queues
Google Cloud Datastore
HTTP Interface
Google Cloud Datastore
Bringing Google Infrastructure to Developers
API Frontend
Cloud Datastore Service
Megastore
BigTable
Colossus
Networking
Server Hardware
Google Cloud Datastore
High Availability
API Frontend
Cloud Datastore Service
Megastore
BigTable
Colossus
Networking
Server Hardware
Google Cloud Datastore
High Scalability
API Frontend
Cloud Datastore Service
Megastore
BigTable
Colossus
Networking
Server Hardware
Google Cloud Datastore
Access from Anywhere
API Frontend
Cloud Datastore Service
Megastore
BigTable
Colossus
Networking
Server Hardware
Managed Frontend
App Engine SDK
Unmanaged Backend
Cloud Datastore API
Google Cloud Datastore
Fully Managed
API Frontend
Cloud Datastore Service
Megastore
BigTable
Colossus
Networking
Server Hardware
An intro to MongoDB and MongoLab
in < 5 minutes
Will Shulman
CEO, MongoLab
What is MongoDB?
MongoDB is an open source, high-performance, distributed, and document-oriented database.
MongoDB is document-oriented
a.k.a. object-oriented
{
_id: 1234,
author: { name: "Bob Davis", email: "bob@davis.com" },
post: "In these troubled times I like to …",
date: { $date: "2010-07-12 13:23UTC" },
location: [ -121.2322, 42.1223222 ],
rating: 2.2,
comments: [
{ user: "jgs32@gmail.com", upVotes: 22, downVotes: 14, text: "Great point" },
{ user: "holly.lu@gmail.com", upVotes: 421, downVotes: 22, text: "You're a moron" }
],
tags: [ "Politics", "Virginia" ]
}
MongoDB is great as an operational data store
. . . with a rich query language
db.posts.find({ "author.name": "mike" })
db.posts.find({ rating: { $gt: 2 }})
db.posts.find({ tags: "Software" })
db.posts.find().sort({date: -1}).limit(10)
db.places.find({loc: {$within : {$center : [[40,40],10]}}})
db.places.aggregate({$group: { _id: "$state", pop: { $sum: "$pop" }}})
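To make the semantics of the simpler queries above concrete, here is a small pure-Python sketch of how a filter document could be evaluated against a stored document. It is not MongoDB's implementation and does not use a MongoDB driver; the helper names `lookup` and `matches` are hypothetical, and only dotted paths, equality, array membership, and `$gt` are covered.

```python
from typing import Any, Dict


def lookup(doc: Dict[str, Any], dotted_path: str) -> Any:
    """Follow a dotted path such as "author.name" into nested dicts."""
    current: Any = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Return True when every condition in the query holds for the document."""
    for path, condition in query.items():
        value = lookup(doc, path)
        if isinstance(condition, dict) and "$gt" in condition:
            # Comparison operator: the field must exist and be greater.
            if value is None or not value > condition["$gt"]:
                return False
        elif isinstance(value, list):
            # An array field matches if it equals or contains the condition.
            if condition != value and condition not in value:
                return False
        elif value != condition:
            return False
    return True


posts = [
    {"author": {"name": "mike"}, "rating": 2.2, "tags": ["Software"]},
    {"author": {"name": "bob"}, "rating": 1.5, "tags": ["Politics", "Virginia"]},
]

by_author = [p for p in posts if matches(p, {"author.name": "mike"})]
highly_rated = [p for p in posts if matches(p, {"rating": {"$gt": 2}})]
tagged = [p for p in posts if matches(p, {"tags": "Politics"})]
```

The dotted path is a plain string key in the filter, which is why it has to be quoted in the shell examples.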
MongoDB is great as an operational data store
. . . with support for indexes on any field
db.posts.ensureIndex({ "author.name": 1 })
db.posts.find({ "author.name": "mike" })
MongoDB is a distributed database
. . . with high availability via Replica Set clusters
primary
secondary_0
secondary_n
client
. . .
replication
MongoDB is a distributed database
. . . with horizontal scalability via Sharded Clusters
client
. . .
shard_0
shard_1
shard_n
mongos
config_0
config_1
config_2
mongos
. . .
Replica Set
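The routing role of mongos in the diagram above can be sketched as a lookup from shard key to shard. This is a toy model under stated assumptions rather than MongoDB's actual routing code: the chunk boundaries, the string shard key, and the `route` function are made up for illustration, and the table stands in for the chunk metadata that the config servers hold.

```python
import bisect

# Hypothetical chunk table: (exclusive upper bound of the key range, shard).
CHUNKS = [("h", "shard_0"), ("q", "shard_1"), ("\uffff", "shard_n")]
UPPER_BOUNDS = [upper for upper, _ in CHUNKS]


def route(shard_key: str) -> str:
    """Return the shard whose key range contains the given shard key."""
    index = bisect.bisect_right(UPPER_BOUNDS, shard_key)
    # Clamp so keys beyond the last bound still land on the last shard.
    index = min(index, len(CHUNKS) - 1)
    return CHUNKS[index][1]


first = route("alice")   # sorts before "h"
middle = route("mike")   # sorts between "h" and "q"
last = route("zoe")      # sorts after "q"
```

The client only talks to the router; adding a shard means splitting ranges and updating the table, not changing application code.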
What is MongoLab?
MongoLab is MongoDB-as-a-Service
MongoLab is MongoDB-as-a-Service
Features/benefits
We automate the operational aspects of running MongoDB (so you don't have to)
Product offering
MongoLab is MongoDB-as-a-Service
We support all the major cloud providers
New as of today!
SELECT questions FROM audience
Tyler Hannan
@tylerhannan
Chris Ramsdale
@cramsdale
Will Shulman
@willshulman
Mike Miller
@mlmilleratmit
Google Cloud Datastore
<Thank You!>
jrf@google.com
google.com/+JuliaFerraioli
@juliaferraioli
briandorsey@google.com
google.com/+BrianDorsey
@briandorsey