Google Next, 25 June 2015

Google Next is a one-day conference where Google presents their Cloud Platform products.

Keynote

The keynote gave an overview of Google’s philosophy and some of the products.  Cloud is the fastest growing product at Google and the biggest change facing companies in general.  Enterprises are realising that they need to move to the cloud in order to be more agile, and that the cloud is more secure than their own data centers.

Google’s philosophy is to promote the cloud in general and put themselves in a good position among all the providers.  They see themselves as being better for the community - in terms of open standards, as well as green data centers - and the most innovative: several times they mentioned that they’ve been developing and using the technology for 15 years and have learnt what works and what doesn’t.

It’s already well known that Google makes their own optimized hardware for their data centers, but they’ve now reached the point where the network is the bottleneck.  They’ve designed their own internal data-center network that’s capable of 1 petabit/second, and built up a worldwide network that brings the connection from the data centers closer to the clients.

They mostly avoided mentioning Amazon and Azure, but said that different cloud platforms have different pros and cons.  Google’s cloud is more stable and reliable, but not necessarily as performant as the others.

The products that were highlighted in the keynote:

AppEngine

The original Google Cloud product.  Platform-as-a-Service: you just write the code, in Python, Java, PHP or Go.  Auto-scales, all the way down to zero instances when idle.
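App Engine’s Python runtime is built around the standard WSGI interface, so the shape of an app can be sketched as just a request handler.  This hello-world is illustrative only - a real deployment also needs an app.yaml describing the runtime and routes:

```python
def app(environ, start_response):
    # App Engine routes each HTTP request to a WSGI callable like this one
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from App Engine"]

# To try it locally with only the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("", 8080, app).serve_forever()
```

The platform handles serving, scaling and routing; the developer only supplies the callable.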

Compute Engine

Infrastructure-as-a-Service.  Cleverer than plain virtual machines, with features such as live migration, which allows hardware and software updates without downtime, and preemptible VMs, which trade the risk of being reclaimed at short notice for a much lower price - good for batch work that needs to scale out cheaply.
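Because a preemptible VM can be taken away at short notice, work running on one has to be resumable.  A toy sketch of the pattern (all names and the squaring "work" are invented for illustration; a real worker would persist its progress to durable storage such as Cloud Storage):

```python
def process(items, state, budget):
    """Work through `items`, recording progress in `state` so that a
    replacement VM can resume where a preempted one left off.
    `budget` simulates how much work happens before preemption."""
    steps = 0
    while state["next"] < len(items) and steps < budget:
        state["results"].append(items[state["next"]] ** 2)  # the "work"
        state["next"] += 1  # in real life, persist this after each step
        steps += 1
    return state["next"] == len(items)  # True once everything is done

state = {"next": 0, "results": []}
finished = process([1, 2, 3, 4], state, budget=2)   # preempted part-way
finished = process([1, 2, 3, 4], state, budget=10)  # a new VM resumes
```

If the job tolerates interruptions like this, the discounted instances become essentially free extra capacity.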

Container Engine (beta) and Container Registry (GA)

Managed Docker containers.  Google has developed Kubernetes, an open source container manager.  Containers package an application together with all its dependencies, so it can be deployed quickly, both locally and on the cloud.  The format is open, so containers can be deployed to any cloud provider.

Storage

This covers many products, including Bigtable (beta), the NoSQL Datastore, and regular SQL (managed MySQL).

Google invented MapReduce - the model that Hadoop later implemented in open source - but discovered that it doesn’t work well at huge scale.  They’ve now moved on to BigQuery and Dataflow (beta), which perform near-real-time queries on big data.
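For reference, the MapReduce model itself fits in a few lines - here as a toy word count (single-process and purely illustrative; the point of the real thing is that each phase runs across many machines):

```python
from collections import defaultdict

def map_phase(docs):
    # map: emit (word, 1) pairs from each document independently
    for doc in docs:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    # shuffle: group values by key (done over the network at scale)
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # reduce: combine each key's values into a final result
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(["the cat sat", "on the mat"])))
```

Every phase is embarrassingly parallel except the shuffle, which is one of the reasons the model gets awkward at very large scale.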

They also mentioned Nearline storage, which provides cheap archive storage with quick access.  Data can be accessed in a few seconds, as opposed to Amazon Glacier, where retrieval takes a few hours.  The same API is used for all storage options.

After the keynote the presentations split into two tracks; I stuck with the developer track, which had the following four talks.

From Zero to Hero

Described the architecture they used to make a 3D photo product out of off-the-shelf components.  They set up a load of Nexus 5s in a circle and used the cloud to process the pictures and make a 3D animated GIF.  This included blob storage, Pub/Sub messaging, and Compute Engine.
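The Pub/Sub part of a pipeline like that is just topics decoupling producers (the phones uploading photos) from consumers (the processing workers).  A toy in-memory stand-in (class, topic and message names are all invented; the real Cloud Pub/Sub is a hosted, durable service with acknowledgements):

```python
from collections import defaultdict

class FakePubSub:
    """Toy in-memory stand-in for a Pub/Sub service (illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # the real service queues messages durably; this fans out synchronously
        for callback in self.subscribers[topic]:
            callback(message)

bus = FakePubSub()
processed = []
# a "worker" that picks up each uploaded photo for processing
bus.subscribe("new-photos", lambda name: processed.append(name.upper()))
bus.publish("new-photos", "photo-01.jpg")
```

The decoupling is what lets the capture devices and the processing fleet scale independently.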


Real-time Mobile Games

Similar to the previous talk, they used an example application to explain a product - in this case Firebase, a cross-platform real-time database.  The game was Asteroids, where every update on every device was sent to the cloud database and immediately reflected on all the other devices.
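The core Firebase idea - every write fans out to every connected listener - can be sketched in a few lines (class, key and device names are invented for illustration; the real Firebase syncs over the network and handles offline clients):

```python
class SharedState:
    """Sketch of Firebase-style sync: every write notifies all listeners."""

    def __init__(self):
        self.data = {}
        self.listeners = []

    def on_change(self, callback):
        self.listeners.append(callback)

    def set(self, key, value):
        self.data[key] = value
        for callback in self.listeners:  # fan out to every "device"
            callback(key, value)

# two "devices" each mirror the shared state locally
device_a, device_b = {}, {}
state = SharedState()
state.on_change(lambda k, v: device_a.__setitem__(k, v))
state.on_change(lambda k, v: device_b.__setitem__(k, v))
state.set("player1/x", 42)  # one device moves; both see the update
```

Game state simply becomes database state, which is why the demo needed no bespoke networking code.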

Desired State with Kubernetes

A demonstration of Kubernetes running locally to manage the desired state of a cluster of Docker containers, with a nice visualization of that state.
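"Desired state" means you declare how many of each container you want, and a control loop works out the difference from what is actually running.  A sketch of one reconciliation pass (the function and container names are invented; real Kubernetes runs loops like this per resource, via controllers):

```python
def reconcile(desired, running):
    """One pass of a desired-state loop: given desired and actual replica
    counts per container, return what to start and what to stop."""
    to_start = {name: want - running.get(name, 0)
                for name, want in desired.items()
                if running.get(name, 0) < want}
    to_stop = {name: have - desired.get(name, 0)
               for name, have in running.items()
               if have > desired.get(name, 0)}
    return to_start, to_stop

start, stop = reconcile(desired={"web": 3, "db": 1},
                        running={"web": 1, "cache": 2})
```

Running the loop continuously is what makes the system self-healing: a crashed container just becomes another difference to reconcile.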


The World Beyond MapReduce

Nice demonstration of how quickly and easily you can use BigQuery to run SQL-like queries on big data, with Dataflow used to pull the data in from external sources.  He ran complex queries on a few hundred GB of JSON data and each took around 30 seconds - much less time than it would take just to read that much data from disk on a single machine.  The trick is that the read and the map/shuffle/reduce run in parallel on hundreds of machines, and you’re only billed for the time the query is actually running.
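Some back-of-envelope arithmetic shows why the parallel read is the whole trick.  The figures below are my own assumptions to match the talk’s rough numbers ("a few hundred GB", "hundreds of machines"), not measurements from the demo:

```python
# Assumed, illustrative figures - not from the talk
data_gb = 300         # "a few hundred GB" of JSON
disk_mb_per_s = 100   # sequential read speed of one ordinary disk
workers = 500         # "hundreds of machines"

single_machine_s = data_gb * 1024 / disk_mb_per_s  # time just to read on one disk
parallel_s = single_machine_s / workers            # ideal parallel read time

# single_machine_s comes out around 51 minutes; parallel_s is a few
# seconds, which is how a complex query can finish in ~30 s overall.
```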

One of the partners of the event was Tableau, whose software can make some nice visualizations of the data.