DC/OS Flink Demo
@ApacheFlink @dcos @joerg_schad
© 2017 Mesosphere, Inc. All Rights Reserved.
Fast Data
2
Batch
Event Processing
Micro-Batch
Days
Hours
Minutes
Seconds
Microseconds
Solves problems using predictive and prescriptive analytics
Reports what has happened using descriptive analytics
Predictive User Interface
Real-time Pricing and Routing
Real-time Advertising
Billing, Chargeback
Product recommendations
© 2016 Mesosphere, Inc. All Rights Reserved.
The SMACK Stack
3
EVENTS
Ubiquitous data streams from connected devices
INGEST
Apache Kafka
STORE
Apache Spark
ANALYZE
Apache Cassandra
ACT
Akka
Ingest millions of events per second
Distributed & highly scalable database
Real-time and batch process data
Visualize data and build data driven applications
Mesos/ DC/OS
Sensors
Devices
Clients
© 2016 Mesosphere, Inc. All Rights Reserved.
DATA PROCESSING AT HYPERSCALE
4
EVENTS
Ubiquitous data streams from connected devices
INGEST
Apache Kafka
STORE
Apache Spark
ANALYZE
Apache Cassandra
ACT
Akka
Ingest millions of events per second
Distributed & highly scalable database
Real-time and batch process data
Visualize data and build data driven applications
DC/OS
Sensors
Devices
Clients
© 2016 Mesosphere, Inc. All Rights Reserved.
STREAM PROCESSING
5
© 2016 Mesosphere, Inc. All Rights Reserved.
EXECUTION MODEL
Micro-Batching
Native Streaming
6
© 2016 Mesosphere, Inc. All Rights Reserved.
Apache Flink In a Nutshell
7
Event-driven applications�(event sourcing, CQRS)
Stateful, event-driven,�event-time-aware processing
Batch Processing (data sets)
Stream Processing / Analytics (data streams, windows, …)
WHY APACHE FLINK ON Apache MESOS and DC/OS?
© 2017 Mesosphere, Inc. All Rights Reserved.
9
© 2017 Mesosphere, Inc. All Rights Reserved.
Flink’s Mesos Integration
10
Apache Flink Framework
Mesos Master
Mesos App Master
Flink Mesos�ResourceManager
JobManager
Mesos Task
TaskManager
Mesos Task
TaskManager
Allocate Resources
Launch Mesos tasks
Register
Execute Job
Resource Manager Components
11
Connection Monitor
Launch Coordinator
Task Monitor
Reconciliation Coordinator
DC/OS
Datacenter Operating System (DC/OS)
Distributed Systems Kernel (Mesos)
Big Data + Analytics Engines
Microservices (in containers)
Streaming
Batch
Machine Learning
Analytics
Functions & Logic
Search
Time Series
SQL / NoSQL
Databases
Modern App Components
Any Infrastructure (Physical, Virtual, Cloud)
DC/OS Flink Package
dcos package install * flink
*--package-version=1.2.0-1.4
https://github.com/mesosphere/dcos-flink-service
© 2017 Mesosphere, Inc. All Rights Reserved.
Finally...
© 2017 Mesosphere, Inc. All Rights Reserved.
ANY QUESTIONS?
THANK YOU!
16
@dcos
users@dcos.io
/groups/8295652
/dcos
/dcos/examples
/dcos/demos
chat.dcos.io
© 2016 Mesosphere, Inc. All Rights Reserved.
AGENDA
Mesosphere & data Artisans
Overview of Mesosphere DC/OS
Overview of Apache Flink & data Artisans
Why we’re better together
Demo
Q&A
© 2017 Mesosphere, Inc. All Rights Reserved.
IN THE ALWAYS CONNECTED ECONOMY SOFTWARE IS YOUR BUSINESS.
Build more apps.
Personal and meaningful customer experiences
Innovate faster.
Capture value from new value streams and actionable data insights
Modernize IT.
Always-available services that are secure and cost effective
MODERN ENTERPRISE APPS REQUIRE NEW CAPABILITIES.
Distributed computing expertise.
Engineering and operations of secure & highly reliable services at scale
DevOps process & culture.
Frequent & reliable releases supported by an automated CI/CD toolchain
Cloud-native technologies.
Containers, microservices, data services (e.g., Spark, Kafka, Cassandra)
BUSINESS APPS
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
CLOUD LIMITATIONS
- LOCK-IN
- HIGH COST
- NO CONTROL
AWS
PLATFORM SERVICES
OPERATIONS & TOOLS
ADMINISTRATORS
OPERATIONAL PROCESSES
PROPRIETARY TECHNOLOGIES
CONTAINER�ORCHESTRATION
CI/CD
BIG DATA ANALYTICS
MESSAGE�QUEUE
DISTRIBUTED DATABASE
SEARCH
AWS INFRASTRUCTURE (EC2)
WHERE IS YOUR CUSTOMER DATA?
WHO CONTROLS YOUR IT STRATEGY?
WHAT IS THE COST OF NO CONTROL?
BUSINESS APPS
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
AWS
PLATFORM SERVICES
OPERATIONS & TOOLS
ADMINISTRATORS
OPERATIONAL PROCESSES
PROPRIETARY TECHNOLOGIES
CONTAINER�ORCHESTRATION
CI/CD
BIG DATA ANALYTICS
MESSAGE�QUEUE
DISTRIBUTED DATABASE
SEARCH
AWS INFRASTRUCTURE (EC2)
We are building for the coming IoT market.. �If I build everything in a silo, I have no chance...
With Mesosphere DC/OS, I have one single contiguous cluster ...
I can ingest data, store it and run all my apps as well, �and that's a huge advantage
Larry Rau
Director Architecture & Infrastructure, Verizon Labs
Programming Model
23
Computation
Computation
Computation
Computation
Source
Source
Sink
Sink
Transformation
state
state
state
state
API & Execution
24
7
Source
DataStream<String> lines = env.addSource(new FlinkKafkaConsumer010(…));
DataStream<Event> events = lines.map(line -> parse(line));
DataStream<Statistic> stats = stream
.keyBy("id")
.timeWindow(Time.seconds(5))
.sum(new MyAggregationFunction());
stats.addSink(new BucketingSink(path));
keyBy()/
window()/
apply()
Transformation
Transformation
Sink
Streaming
Dataflow
map()
Source
Sink
Distributed Runtime
25
Levels of Abstraction
26
Process Function (events, state, time)
DataStream API (streams, windows)
Table API (dynamic tables)
Stream SQL
low-level (stateful stream processing)
stream processing &
analytics
declarative DSL
high-level language
Data Stream API
27
val lines: DataStream[String] = env.addSource(� new FlinkKafkaConsumer09<>(…))
val events: DataStream[Event] = lines.map((line) => parse(line))
val stats: DataStream[Statistic] = stream
.keyBy("sensor")
.timeWindow(Time.seconds(5))
.sum(new MyAggregationFunction())
stats.addSink(new BucketingSink(path))
Table API & Stream SQL
28
Fenzo
29
Fenzo
Mesos
Launch Coordinator
Periodic resource offers
Tell Fenzo offered resources & tasks
Fenzo returns resource task matchings
Tasks to launch
New Distributed Architecture
30
Mesos Master
Mesos Cluster Client
(2) HTTP POST JobGraph/Jars
Flink Master Process
Flink Mesos�ResourceManager
JobManager
(4) Start Process (and supervise)
(8) Deploy�Tasks
(7) Register
(5) Request slots
Flink Mesos
Dispatcher
(3) Allocate container�for Flink master
(6) Allocate containers�for TaskManagers
Marathon
(1) Start and monitor dispatcher
Mesos Task
TaskManager
Mesos Task
TaskManager
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077