1 of 30

DC/OS Flink Demo

@ApacheFlink @dcos @joerg_schad

© 2017 Mesosphere, Inc. All Rights Reserved.

2 of 30

Fast Data

2

Batch

Event Processing

Micro-Batch

Days

Hours

Minutes

Seconds

Microseconds

Solves problems using predictive and prescriptive analytics

Reports what has happened using descriptive analytics

Predictive User Interface

Real-time Pricing and Routing

Real-time Advertising

Billing, Chargeback

Product recommendations

© 2016 Mesosphere, Inc. All Rights Reserved.

3 of 30

The SMACK Stack

3

EVENTS

Ubiquitous data streams from connected devices

INGEST

Apache Kafka

STORE

Apache Spark

ANALYZE

Apache Cassandra

ACT

Akka

Ingest millions of events per second

Distributed & highly scalable database

Real-time and batch process data

Visualize data and build data driven applications

Mesos/ DC/OS

Sensors

Devices

Clients

© 2016 Mesosphere, Inc. All Rights Reserved.

4 of 30

DATA PROCESSING AT HYPERSCALE

4

EVENTS

Ubiquitous data streams from connected devices

INGEST

Apache Kafka

STORE

Apache Spark

ANALYZE

Apache Cassandra

ACT

Akka

Ingest millions of events per second

Distributed & highly scalable database

Real-time and batch process data

Visualize data and build data driven applications

DC/OS

Sensors

Devices

Clients

© 2016 Mesosphere, Inc. All Rights Reserved.

5 of 30

STREAM PROCESSING

  • Apache Storm
  • Apache Spark
  • Apache Samza
  • Apache Flink
  • Apache Apex
  • Concord
  • cloud-only: AWS Kinesis,�Google Cloud Dataflow

5

© 2016 Mesosphere, Inc. All Rights Reserved.

6 of 30

EXECUTION MODEL

Micro-Batching

Native Streaming

6

© 2016 Mesosphere, Inc. All Rights Reserved.

7 of 30

Apache Flink In a Nutshell

7

Event-driven applications�(event sourcing, CQRS)

Stateful, event-driven,�event-time-aware processing

Batch Processing (data sets)

Stream Processing / Analytics (data streams, windows, …)

8 of 30

WHY APACHE FLINK ON Apache MESOS and DC/OS?

  • Broadly, the Flink community has been working to support a range of resource managers
    • The evolution from analytics -> event-driven applications

  • DC/OS and Mesos were in high-demand from the Flink community
    • 30% of Flink user survey respondents were running on DC/OS or Mesos even before an official integration

  • Flink 1.2.0 (Feb 2017) included DC/OS and Mesos integration

© 2017 Mesosphere, Inc. All Rights Reserved.

9 of 30

9

© 2017 Mesosphere, Inc. All Rights Reserved.

10 of 30

Flink’s Mesos Integration

  • Kudos to Eron Wright ( EronWright) for this work

10

Apache Flink Framework

Mesos Master

Mesos App Master

Flink Mesos�ResourceManager

JobManager

Mesos Task

TaskManager

Mesos Task

TaskManager

Allocate Resources

Launch Mesos tasks

Register

Execute Job

11 of 30

Resource Manager Components

  • Monitors connection to Mesos

11

Connection Monitor

Launch Coordinator

  • Resource offer processing and task scheduling
  • Gathers offers and matches them to tasks using Fenzo

Task Monitor

Reconciliation Coordinator

  • Monitors Mesos tasks
  • Triggers reconciliation
  • Makes sure tasks are properly killed
  • Reconciles tasks view between ResourceManager and Mesos Master

12 of 30

13 of 30

DC/OS

Datacenter Operating System (DC/OS)

Distributed Systems Kernel (Mesos)

Big Data + Analytics Engines

Microservices (in containers)

Streaming

Batch

Machine Learning

Analytics

Functions & Logic

Search

Time Series

SQL / NoSQL

Databases

Modern App Components

Any Infrastructure (Physical, Virtual, Cloud)

14 of 30

DC/OS Flink Package

dcos package install * flink

*--package-version=1.2.0-1.4

https://github.com/mesosphere/dcos-flink-service

© 2017 Mesosphere, Inc. All Rights Reserved.

15 of 30

Finally...

© 2017 Mesosphere, Inc. All Rights Reserved.

16 of 30

ANY QUESTIONS?

THANK YOU!

16

@dcos

users@dcos.io

/groups/8295652

/dcos

/dcos/examples

/dcos/demos

chat.dcos.io

© 2016 Mesosphere, Inc. All Rights Reserved.

17 of 30

AGENDA

Mesosphere & data Artisans

Overview of Mesosphere DC/OS

Overview of Apache Flink & data Artisans

Why we’re better together

Demo

Q&A

© 2017 Mesosphere, Inc. All Rights Reserved.

18 of 30

IN THE ALWAYS CONNECTED ECONOMY SOFTWARE IS YOUR BUSINESS.

Build more apps.

Personal and meaningful customer experiences

Innovate faster.

Capture value from new value streams and actionable data insights

Modernize IT.

Always-available services that are secure and cost effective

19 of 30

MODERN ENTERPRISE APPS REQUIRE NEW CAPABILITIES.

Distributed computing expertise.

Engineering and operations of secure & highly reliable services at scale

DevOps process & culture.

Frequent & reliable releases supported by an automated CI/CD toolchain

Cloud-native technologies.

Containers, microservices, data services (e.g., Spark, Kafka, Cassandra)

20 of 30

BUSINESS APPS

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

CLOUD LIMITATIONS

- LOCK-IN

- HIGH COST

- NO CONTROL

AWS

PLATFORM SERVICES

OPERATIONS & TOOLS

ADMINISTRATORS

OPERATIONAL PROCESSES

PROPRIETARY TECHNOLOGIES

CONTAINER�ORCHESTRATION

CI/CD

BIG DATA ANALYTICS

MESSAGE�QUEUE

DISTRIBUTED DATABASE

SEARCH

AWS INFRASTRUCTURE (EC2)

21 of 30

WHERE IS YOUR CUSTOMER DATA?

WHO CONTROLS YOUR IT STRATEGY?

WHAT IS THE COST OF NO CONTROL?

BUSINESS APPS

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

$

AWS

PLATFORM SERVICES

OPERATIONS & TOOLS

ADMINISTRATORS

OPERATIONAL PROCESSES

PROPRIETARY TECHNOLOGIES

CONTAINER�ORCHESTRATION

CI/CD

BIG DATA ANALYTICS

MESSAGE�QUEUE

DISTRIBUTED DATABASE

SEARCH

AWS INFRASTRUCTURE (EC2)

22 of 30

We are building for the coming IoT market.. �If I build everything in a silo, I have no chance...

With Mesosphere DC/OS, I have one single contiguous cluster ...

I can ingest data, store it and run all my apps as well, �and that's a huge advantage

Larry Rau

Director Architecture & Infrastructure, Verizon Labs

23 of 30

Programming Model

23

Computation

Computation

Computation

Computation

Source

Source

Sink

Sink

Transformation

state

state

state

state

24 of 30

API & Execution

24

7

Source

DataStream<String> lines = env.addSource(new FlinkKafkaConsumer010(…));

DataStream<Event> events = lines.map(line -> parse(line));

DataStream<Statistic> stats = stream

.keyBy("id")

.timeWindow(Time.seconds(5))

.sum(new MyAggregationFunction());

stats.addSink(new BucketingSink(path));

keyBy()/

window()/

apply()

Transformation

Transformation

Sink

Streaming

Dataflow

map()

Source

Sink

25 of 30

Distributed Runtime

25

26 of 30

Levels of Abstraction

26

Process Function (events, state, time)

DataStream API (streams, windows)

Table API (dynamic tables)

Stream SQL

low-level (stateful stream processing)

stream processing &

analytics

declarative DSL

high-level language

27 of 30

Data Stream API

27

val lines: DataStream[String] = env.addSource(� new FlinkKafkaConsumer09<>(…))

val events: DataStream[Event] = lines.map((line) => parse(line))

val stats: DataStream[Statistic] = stream

.keyBy("sensor")

.timeWindow(Time.seconds(5))

.sum(new MyAggregationFunction())

stats.addSink(new BucketingSink(path))

28 of 30

Table API & Stream SQL

28

29 of 30

Fenzo

  • Developed by Netflix
  • Generic task scheduler for frameworks
  • Matching between tasks and resource offers
    • Pluggable fitness evaluator

29

Fenzo

Mesos

Launch Coordinator

Periodic resource offers

Tell Fenzo offered resources & tasks

Fenzo returns resource task matchings

Tasks to launch

30 of 30

New Distributed Architecture

  • Flip-6: Redesign of Flink’s distributed architecture
  • Adds dynamic resource de/allocation

30

Mesos Master

Mesos Cluster Client

(2) HTTP POST JobGraph/Jars

Flink Master Process

Flink Mesos�ResourceManager

JobManager

(4) Start Process (and supervise)

(8) Deploy�Tasks

(7) Register

(5) Request slots

Flink Mesos

Dispatcher

(3) Allocate container�for Flink master

(6) Allocate containers�for TaskManagers

Marathon

(1) Start and monitor dispatcher

Mesos Task

TaskManager

Mesos Task

TaskManager

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077