Build your own social media analytics with Apache Kafka

Jakub Scholz, 29th January 2022

https://strimzi.io

About me

  • Senior Principal Software Engineer @ Red Hat
  • Maintainer of Strimzi project (https://strimzi.io)
  • Apache Kafka contributor

@scholzj | https://github.com/scholzj | https://www.linkedin.com/in/scholzj/

Apache Kafka? More than just a messaging broker!

Apache Kafka

  • More than just a messaging broker
  • Ecosystem of components and tools
    • Part of Apache Kafka project but also 3rd party
  • Event Streaming Platform
    • What does it mean?

[Diagram: Apache Kafka components and tools. Labels: Brokers, Streams API, Connect API, Mirror Maker, Java Clients, Connectors (Camel Connectors, Debezium, ...), Other (UIs, AI / ML, Operators, Other clients, Schema Registries)]

[Diagram: the steps Import, Store, Distribute, Process and Export, mapped onto the Connect API, Broker and Streams API]

[Diagram: Kafka Connect + Camel Connectors, Kafka Brokers, Streams API]

Twitter API

  • Allows you to get information from Twitter
    • Your timeline, tweets, retweets, direct messages or searches for keywords
  • Free version is available
    • Has rate limits
    • Used in the demos

Kafka Connect

  • Standalone component of the Apache Kafka project
    • Used to get data from other systems into Kafka, or from Kafka into other systems
    • Works with connector plugins, which provide the integration with other systems
    • Source connectors (into Kafka) and Sink connectors (out of Kafka)
    • Most connector plugins are maintained outside of the Apache Kafka project
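
The source / sink split can be illustrated with a minimal sketch. It is a plain-Python model, not the Kafka Connect API: `Topic`, `run_source`, `run_sink` and the sample record are hypothetical names and values chosen for illustration only.

```python
# Plain-Python model of the source/sink roles; this is NOT the Kafka Connect API.
from typing import Callable, Iterable

Topic = list  # hypothetical stand-in for a Kafka topic


def run_source(poll: Callable[[], Iterable[dict]], topic: Topic) -> int:
    """Source connector role: copy records from an external system into a topic."""
    records = list(poll())
    topic.extend(records)
    return len(records)


def run_sink(topic: Topic, write: Callable[[dict], None], offset: int = 0) -> int:
    """Sink connector role: copy records from a topic out to an external system."""
    for record in topic[offset:]:
        write(record)
    return len(topic)  # next offset to read from


tweets_topic: Topic = []
external_store: list = []

# The lambda plays the external system being polled (made-up record).
run_source(lambda: [{"id": 1, "text": "Hello #kafka"}], tweets_topic)
next_offset = run_sink(tweets_topic, external_store.append)
```

In a real deployment the connector plugin supplies the `poll` and `write` halves, and Kafka Connect takes care of running them.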

Camel Connectors

  • Part of the Apache Camel project
    • Several hundred different integrations
    • Can be used on its own or with Apache Kafka
  • Also offers three connectors for the Twitter API
    • Timeline, Search and Direct Messages

Kafka Brokers

  • Central part of the Kafka architecture
    • Distribute messages between producers and consumers
      • Even Kafka Connect and Kafka Streams act as consumers / producers
    • Store the messages
    • Provide high availability and scalability
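
The "store and distribute" role can be sketched as an append-only log with one read offset per consumer group. This is a simplified in-memory model under stated assumptions: `TinyBroker` and the group names are hypothetical, and partitions, replication and the network protocol are left out entirely.

```python
from collections import defaultdict


class TinyBroker:
    """Hypothetical in-memory stand-in for a broker; not a Kafka client."""

    def __init__(self):
        self.log = defaultdict(list)     # topic -> messages, kept after reads
        self.offsets = defaultdict(int)  # (group, topic) -> next offset to read

    def produce(self, topic, message):
        """Append a message to the topic log and return its offset."""
        self.log[topic].append(message)
        return len(self.log[topic]) - 1

    def consume(self, group, topic):
        """Return messages this group has not seen yet and advance its offset."""
        start = self.offsets[(group, topic)]
        messages = self.log[topic][start:]
        self.offsets[(group, topic)] = len(self.log[topic])
        return messages


broker = TinyBroker()
broker.produce("tweets", "first tweet")
broker.produce("tweets", "second tweet")

streams_app = broker.consume("streams-app", "tweets")    # gets both messages
connect_sink = broker.consume("connect-sink", "tweets")  # also gets both
nothing_new = broker.consume("streams-app", "tweets")    # already caught up
```

Because reading does not delete anything, every consumer group sees the full stream at its own pace.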

Streams API

  • Just a library which can be included in your application, not a framework
  • A lot of functionality
    • Stateless and Stateful operations
    • Joins
    • Windowing
    • Scalable
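
To make "stateless", "stateful" and "windowing" concrete, here is a minimal sketch of a tumbling-window count. It is written in plain Python rather than the Kafka Streams Java API; the one-minute window size, the function names and the sample events are assumptions made for illustration.

```python
from collections import Counter

WINDOW_MS = 60_000  # assumed window size: one minute


def window_start(timestamp_ms, size_ms=WINDOW_MS):
    """Start of the tumbling window that contains the given timestamp."""
    return timestamp_ms - (timestamp_ms % size_ms)


def count_per_window(events, size_ms=WINDOW_MS):
    """Count events per (window start, key)."""
    counts = Counter()
    for timestamp_ms, key in events:
        key = key.lower()  # stateless step: each record is handled on its own
        counts[(window_start(timestamp_ms, size_ms), key)] += 1  # stateful step
    return counts


# Made-up (timestamp in ms, key) events.
events = [(1_000, "Kafka"), (59_999, "kafka"), (60_000, "kafka"), (61_000, "strimzi")]
counts = count_per_window(events)
```

The first two events fall into the window starting at 0 and the last two into the window starting at 60000, so the same key is counted separately per window.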

Quarkus

  • Java stack designed for cloud-native deployments
  • Fast start-up, small memory footprint, native compilation
  • Integrates the Kafka Consumer, Producer and Streams APIs

Deep Java Library

  • Deep Learning library
  • Builds on top of PyTorch, Apache MXNet and TensorFlow
  • Can be used for tasks such as Image Classification, Object Detection and Sentiment Analysis

Strimzi

  • Provides operators for running Apache Kafka on Kubernetes
    • Makes it easy to deploy and run Apache Kafka clusters
  • Supports all Apache Kafka components
    • Including Kafka Connect and its connectors

Demo

  • Deploying Kafka

Timeline Word Cloud

  • Connect / Connector reads the Twitter timeline
    • Tweets from accounts you are following
  • Kafka Streams API analyzes the tweets
    • What topics are we most interested in?
    • What hashtags are most common in our timeline?
    • REST API is used to get the results and show them on a website
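
The hashtag question boils down to a running count per hashtag. A minimal sketch in plain Python follows; it is not the demo's Kafka Streams code, and the regular expression, function name and sample tweets are assumptions for illustration.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")  # assumed, simplified hashtag pattern


def hashtag_counts(tweets):
    """Count hashtags across tweet texts, ignoring case."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


# Made-up tweet texts standing in for the timeline topic.
tweets = ["Running #Kafka on #Kubernetes", "More #kafka streams", "No tags here"]
top = hashtag_counts(tweets).most_common(2)
```

The counts are exactly what a word cloud needs: the more often a hashtag appears, the larger it is drawn.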

Demo

  • Word Cloud

Sentiment Analysis

  • Connect / Connector searches for tweets containing specific keywords
  • Kafka Streams API analyzes the tweets
    • Machine Learning is used to detect the sentiment of the tweets
    • Decides which tweets are positive and which are negative
    • Tweets identified as positive or negative are sent to an alert topic
  • Connect / Connector sends a Twitter DM alert about them
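
The classify-then-route step can be sketched as follows. This is a plain-Python illustration only: the tiny word lists are a toy stand-in for the Machine Learning model, and `toy_sentiment`, `route_alerts` and the sample tweets are hypothetical.

```python
# Toy word lists; an assumption standing in for a trained sentiment model.
POSITIVE = {"great", "love", "awesome"}
NEGATIVE = {"bad", "hate", "broken"}


def toy_sentiment(text):
    """Label a text as positive, negative or neutral using the toy word lists."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def route_alerts(tweets):
    """Keep only positive or negative tweets, as records for an alert topic."""
    alerts = []
    for text in tweets:
        label = toy_sentiment(text)
        if label != "neutral":
            alerts.append({"sentiment": label, "text": text})
    return alerts


# Made-up tweet texts.
alerts = route_alerts(["I love #BYOSMA", "This demo is broken", "Talk starts at 10"])
```

Neutral tweets are dropped here, so only the tweets worth an alert reach the list that plays the role of the alert topic.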

Demo

Send your tweets with #BYOSMA

  • Sentiment Analysis

Ad-hoc analysis

  • Just play with the stream of tweets
  • Experiment with different ideas / hypotheses and confirm them
  • Example:
    • Do tweets with some attached media get more retweets?
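
One way to look at the example question is to compare the average retweet count of tweets with and without media. The sketch below is plain Python; the field names `has_media` and `retweets` and the sample values are made up for illustration and are not results from the demo.

```python
from statistics import mean


def mean_retweets_by_media(tweets):
    """Average retweet count for tweets with media and for tweets without."""
    with_media = [t["retweets"] for t in tweets if t["has_media"]]
    without_media = [t["retweets"] for t in tweets if not t["has_media"]]
    return {
        "with_media": mean(with_media) if with_media else None,
        "without_media": mean(without_media) if without_media else None,
    }


# Made-up input records, only to show the shape of the calculation.
sample = [
    {"has_media": True, "retweets": 4},
    {"has_media": True, "retweets": 6},
    {"has_media": False, "retweets": 1},
    {"has_media": False, "retweets": 3},
]
averages = mean_retweets_by_media(sample)
```

A difference in averages is only a starting point; with a real stream the sample size and the spread of the values matter before drawing any conclusion.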

Demo

  • Ad-hoc analysis

Other Ideas

  • What is the right time to publish tweets?
  • Where do people tweeting about you / your project live? What apps do they use?
  • Write a bot which will react to messages
  • Try it with other social networks as well

Links

Thank you

Demos & Slides: http://jsch.cz/devconfcz2022
