1 of 26

Remoting Kafka Plugin

Google Summer of Code 2018 Project

© 2018 All Rights Reserved.

2 of 26

Introduction

3 of 26

About me

  • Pham Vu Tuan - @pvtuan10
  • Before GSoC
    • Final year Computer Science student from Singapore

  • After GSoC
    • Software Engineer for an Internet platform company in Singapore
    • Focus on backend development

4 of 26

GSoC Mentors

  • Oleg Nenashev - @oleg-nenashev

  • Supun Wanniarachchi - @Supun94

  • Devin Nusbaum - @dwnusbaum

  • Jeff Thompson - @jeffret-b

5 of 26

Overview

6 of 26

Overview

  • Jenkins uses TCP for master-agent communication (the JNLP protocols in particular)
  • Problems with the existing JNLP protocols:
    • If the connection or the agent fails, the build fails and has to be restarted
    • Issues with traffic prioritization and multi-agent communication, which affect Jenkins stability and scalability
  • Project goal: develop a plugin that uses Apache Kafka to provide fault-tolerant master-agent communication in Jenkins

7 of 26

What is Apache Kafka?

  • 3 key capabilities:
    • Publish and subscribe to streams of records, similar to message queue systems such as RabbitMQ or ActiveMQ
    • Store streams of records in a fault-tolerant, durable way
    • Process streams of records as they occur
  • Basic concepts:
    • Apache Kafka uses a replicated commit log instead of queues
    • Streams of records are stored in categories called topics
    • Each record consists of a key, a value, and a timestamp
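The difference between a commit log and a queue can be illustrated with a small in-memory sketch. This is not Kafka code: `TopicLogSketch`, `Record`, `append`, and `readFrom` are hypothetical names chosen for this example, and replication, partitioning, and persistence are left out. It only shows that records carry a key, a value, and a timestamp, that each record gets an offset when appended, and that reading does not remove records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of one topic as an append-only log.
// Not the Kafka API: all names here are assumptions made for illustration.
final class TopicLogSketch {

    /** One record: key, value, and timestamp, as listed on the slide. */
    static final class Record {
        final String key;
        final String value;
        final long timestamp;

        Record(String key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final List<Record> log = new ArrayList<>();

    /** Appends a record to the end of the log and returns its offset. */
    long append(String key, String value, long timestamp) {
        log.add(new Record(key, value, timestamp));
        return log.size() - 1;
    }

    /** Returns a copy of all records from the given offset onward; nothing is removed. */
    List<Record> readFrom(int offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        int start = Math.min(offset, log.size());
        return new ArrayList<>(log.subList(start, log.size()));
    }

    public static void main(String[] args) {
        TopicLogSketch topic = new TopicLogSketch();
        topic.append("build-1", "started", 1000L);
        topic.append("build-1", "finished", 2000L);

        // Two independent readers can both replay the log from offset 0.
        System.out.println(topic.readFrom(0).size());
        System.out.println(topic.readFrom(0).size());
    }
}
```

Because readers track their own offsets, a consumer that reconnects can resume from the last offset it processed, which is the property the project relies on for fault tolerance.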

8 of 26

Why Apache Kafka?

  • Features that make Apache Kafka suitable:
    • Distributed, replicated commit log, which helps to remove message delivery complexity
    • Supports data streaming, which RabbitMQ lacks
    • Apache Kafka is reported to scale better
    • Good support from the community

9 of 26

Technical Design

10 of 26

High-level Architecture

11 of 26

Plugin Communication Model

  • The plugin uses manual partition assignment for topic management in Kafka. Each master-agent connection creates a topic on the fly with 4 partitions:
    • Partition 0: commands from master to agent
    • Partition 1: commands from agent to master
    • Partition 2: secret messages from master to agent
    • Partition 3: secret messages from agent to master
  • Advantages over dynamic partition assignment:
    • Fewer topics to manage per master-agent connection
    • Fewer consumer groups per connection
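The partition layout above can be sketched as a small lookup. This is a minimal illustration, not the plugin's actual code: `ChannelPartitions`, `Side`, `producePartition`, and `consumePartition` are hypothetical names, and only the four partition numbers and their roles are taken from the slide. The sketch has no Kafka dependency so that it runs on its own.

```java
// Hypothetical sketch: class, enum, and method names are illustrative only.
// Only the four partition numbers and their roles come from the slide.
final class ChannelPartitions {

    enum Side { MASTER, AGENT }

    static final int PARTITION_COUNT = 4;

    /** Partition that the given side writes to (commands or secret messages). */
    static int producePartition(Side sender, boolean secret) {
        int base = secret ? 2 : 0;                       // 0-1 carry commands, 2-3 carry secrets
        int offset = (sender == Side.MASTER) ? 0 : 1;    // master first, agent second
        return base + offset;
    }

    /** Partition that the given side reads from: the one the other side writes to. */
    static int consumePartition(Side receiver, boolean secret) {
        Side sender = (receiver == Side.MASTER) ? Side.AGENT : Side.MASTER;
        return producePartition(sender, secret);
    }

    public static void main(String[] args) {
        for (Side side : Side.values()) {
            System.out.println(side
                    + " writes commands to partition " + producePartition(side, false)
                    + " and reads commands from partition " + consumePartition(side, false));
        }
    }
}
```

With the Kafka Java client, manual assignment of this kind is typically done by calling `KafkaConsumer.assign(...)` with explicit `TopicPartition` objects instead of `subscribe(...)`, and by passing the partition number to the `ProducerRecord` constructor. How the plugin wires this up internally is not shown on the slide.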

12 of 26

GSoC Summary

13 of 26

GSoC Summary

  • 6 months: a preparation phase followed by 3 coding phases
    • Feb 2018 - May 2018: Preparation phase
    • Phase 1 (May - Jun): Command transport to support Kafka, first skeleton of the plugin
    • Phase 2 (Jun - Jul): Security improvements + first alpha release
    • Phase 3 (Jul - Aug): 1.0 release + community engagement work
  • More than 40 GitHub PRs resolved and merged
  • Current version is 1.1, with demo instructions and technical documentation updated on GitHub
  • 3 blog posts published on jenkins.io to introduce the plugin to the community

14 of 26

Features

15 of 26

Kafka Global Configuration

16 of 26

Launch agent with Kafka launcher

17 of 26

Launch agent as a JAR

18 of 26

Launch agent as a Docker Image

19 of 26

Run jobs and pipelines with a Kafka agent

20 of 26

Command transport over Kafka

21 of 26

Live Demo

22 of 26

Future Work

23 of 26

Future Work (JENKINS-53417)

  • Agent recovery to continue running jobs after disconnection from Kafka (JENKINS-52954)
  • Cloud API implementation (JENKINS-51474)
  • Chunking capabilities for the Kafka channel (JENKINS-51709)
  • Consumer pooling, NIO options (JENKINS-52199)
  • Support for multiple Kafka hosts to achieve fault-tolerant communication (JENKINS-52542)
  • Stop bundling remoting.jar in the Remoting Kafka Agent (JENKINS-51944)
  • Make ZooKeeper configuration optional (JENKINS-52870)

24 of 26

Links

25 of 26

Links

26 of 26

Q & A
