1 of 18

Oslo PTG / 02-27 2018 / Dublin / Bus evaluation

Who am I?

Matthieu Simonin

Permanent Research Engineer at Inria (French computer science research center)

Part time in the Discovery project (funded by Inria)

  • https://beyondtheclouds.github.io/
  • Research activities on Fog/Edge computing

2 of 18

Who am I?

Involved in OpenStack

  • FEMDC (Fog Edge Massively Distributed Clouds) Working group

  • Performance Working Group

3 of 18

Why am I here?

lightweight / resilient / locality

4 of 18

Why am I here?

Massively Distributed RPC test plan:

https://docs.openstack.org/performance-docs/latest/test_plans/massively_distribute_rpc/plan.html

Goals:

  • Evaluate oslo.messaging patterns in a distributed context
    • RPC servers/clients pushed at the edge of the network
    • Distribute the bus
      • Clustered or federated RabbitMQ, or something else?
      • AMQP 1.0 routers (qpid-dispatch-router)
  • Evaluate resiliency of bus + oslo.messaging in a distributed context
    • e.g. network failures, high latency
  • Identify gaps

5 of 18

Why am I here?

Patterns under study:

  • Anycast (single target)
    • How far we can go in terms of #clients and #servers on a (distributed) bus
  • Multiple targets
    • How many different targets can be handled by a (distributed) bus
  • Broadcast (single target fanout)
    • How far we can go in terms of #servers on a (distributed) bus
  • Multiple targets broadcast
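
To make the difference between these patterns concrete, here is a minimal in-memory sketch in Python. It does not use oslo.messaging or a real bus: `ToyBus`, `subscribe`, `anycast` and `broadcast` are hypothetical names chosen for illustration, and round-robin is only an assumed delivery policy for the single-target (anycast) case.

```python
from collections import defaultdict
from itertools import count


class ToyBus:
    """In-memory stand-in for a message bus (illustration only)."""

    def __init__(self):
        # target name -> list of server inboxes listening on that target
        self._servers = defaultdict(list)
        # target name -> counter used for round-robin selection
        self._next = defaultdict(count)

    def subscribe(self, target):
        """Register a new server on a target and return its inbox."""
        inbox = []
        self._servers[target].append(inbox)
        return inbox

    def anycast(self, target, message):
        """Deliver the message to exactly one server of the target."""
        servers = self._servers[target]
        if not servers:
            raise LookupError(f"no server listening on {target!r}")
        index = next(self._next[target]) % len(servers)
        servers[index].append(message)

    def broadcast(self, target, message):
        """Deliver the message to every server of the target (fanout)."""
        for inbox in self._servers[target]:
            inbox.append(message)


bus = ToyBus()
inboxes = [bus.subscribe("target-0") for _ in range(3)]

# Six anycast messages are spread over the three servers...
for i in range(6):
    bus.anycast("target-0", f"call-{i}")
# ...while one broadcast message reaches all of them.
bus.broadcast("target-0", "fanout-0")

anycast_per_server = [
    sum(1 for m in box if m.startswith("call-")) for box in inboxes
]
fanout_per_server = [box.count("fanout-0") for box in inboxes]
print(anycast_per_server, fanout_per_server)
```

The "multiple targets" variants correspond to repeating the same calls over several target names, each with its own set of servers.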

6 of 18

Why am I here?

Target infrastructure: mainly Grid’5000

  • reconfigurable platform for reproducible research

  • think of a distributed Ironic with 1000 physical machines

7 of 18

What has been done so far

Framework: https://github.com/msimonin/ombt-orchestrator

- Assumption: we first consider oslo.messaging outside the OpenStack use case
- Get machines on a target platform
- Perform initial configurations
- Orchestrate the deployment of ombt2 agents
- Collect metrics
  - system metrics (CPU, memory, network traffic, ...)
  - application metrics (latency, message rate, failures, ...)
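
As an illustration of the application metrics listed above, the sketch below derives latency, message rate and failure count from per-message timestamps. The record layout (`sent_at`, `received_at`, with `None` for a lost message) and the `summarize` helper are assumptions made for this example; they are not the format ombt2 or the orchestrator actually uses.

```python
from statistics import mean


def summarize(records):
    """Summarize (sent_at, received_at) pairs, times in seconds.

    received_at is None when the message was never received (a failure).
    """
    delivered = [(s, r) for s, r in records if r is not None]
    latencies = [r - s for s, r in delivered]
    failures = len(records) - len(delivered)
    # Observation window: first send to last successful reception.
    duration = max(r for _, r in delivered) - min(s for s, _ in records)
    return {
        "mean_latency": mean(latencies),
        "max_latency": max(latencies),
        "message_rate": len(delivered) / duration,  # messages per second
        "failures": failures,
    }


# Hypothetical sample: four messages, one of them lost.
sample = [(0.0, 0.5), (1.0, 1.5), (2.0, None), (3.0, 4.0)]
summary = summarize(sample)
print(summary)
```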

8 of 18

What has been done so far

Baseline study: a single target (test case 1 of 5)

- Standalone RabbitMQ instance: https://tinyurl.com/ya5tw5pd

- Standalone Qpid Dispatch Router instance (AMQP 1.0): https://tinyurl.com/yalk3j95

- 4x Qpid Dispatch Router instances: https://tinyurl.com/y7nxf2v9

9 of 18

What has been done so far

Baseline study:

    campaign:
      test_case_1:
        bus: ["rabbitmq", "qpid-dispatch-router", "qpid-dispatch-router-4x"]
        nbr_servers: [1, 250, 500, 750, 1000]
        nbr_clients: [1, 250, 500, 750, 1000, 1250, 1500, 1750, 2000]
        call_type: ["rpc-call", "rpc-cast"]
        nbr_calls: [10000]
        pause: [0.1]
        length: [1024]
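
To get a feel for the size of this campaign, the parameter lists can be expanded into individual runs. The sketch below assumes a plain Cartesian product over all parameters; whether the orchestrator runs every combination or filters some of them out is not stated here, and `expand` is a hypothetical helper, not part of ombt-orchestrator.

```python
from itertools import product

# Parameter grid copied from the campaign definition above.
test_case_1 = {
    "bus": ["rabbitmq", "qpid-dispatch-router", "qpid-dispatch-router-4x"],
    "nbr_servers": [1, 250, 500, 750, 1000],
    "nbr_clients": [1, 250, 500, 750, 1000, 1250, 1500, 1750, 2000],
    "call_type": ["rpc-call", "rpc-cast"],
    "nbr_calls": [10000],
    "pause": [0.1],
    "length": [1024],
}


def expand(grid):
    """Yield one dict per combination of parameter values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


runs = list(expand(test_case_1))
print(len(runs))  # 3 buses * 5 * 9 * 2 call types = 270 combinations
print(runs[-1])   # the largest run: 1000 servers, 2000 clients
```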

10 of 18

Looking at the bus

RabbitMQ consumes a lot of resources on the largest run

  • 20 cores
  • 25 GB (call tests) / 3.5 GB (cast tests)

QDR is much more moderate

  • 4 CPUs max
  • 25 GB (call tests) / 450 MB (cast tests)

11 of 18

[Figure panels: CALLs, CASTs]

12 of 18

Looking at RPC server/client metrics

CPU/memory for the RabbitMQ driver and the AMQP 1.0 driver

  • For cast tests: no difference observed between the two drivers

  • For call tests: slight difference in memory consumption (increased memory usage)
    • needs more investigation in the AMQP 1.0 driver implementation

Application metrics:

  • So far QDR is scaling pretty well (low latency / high throughput)

13 of 18

[Figure panels: RabbitMQ, QDR]

14 of 18

[Figure panels: RabbitMQ, QDR]

15 of 18

The RabbitMQ driver consumes a number of TCP connections

Approx. 2x (#servers + #clients)

Connection objects aren’t shared between clients and servers in oslo.messaging

(thread-safety issue?)

16 of 18

The AMQP 1.0 driver reuses connections

1x (#clients + #servers) connections used
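
A quick back-of-the-envelope comparison of the two drivers, using the approximate factors reported on these slides (about 2 connections per agent for the RabbitMQ driver, 1 for the AMQP 1.0 driver) and the largest run of the baseline campaign (1000 servers, 2000 clients). `estimated_connections` is a hypothetical helper written for this example, and the results are estimates, not measurements.

```python
# Approximate connections opened per RPC agent, as reported on the slides.
RABBITMQ_DRIVER_FACTOR = 2
AMQP1_DRIVER_FACTOR = 1


def estimated_connections(nbr_servers, nbr_clients, per_agent):
    """Estimate the TCP connections opened towards the bus."""
    return per_agent * (nbr_servers + nbr_clients)


# Largest run of the baseline campaign.
largest = {"nbr_servers": 1000, "nbr_clients": 2000}

rabbitmq_estimate = estimated_connections(
    per_agent=RABBITMQ_DRIVER_FACTOR, **largest
)
amqp1_estimate = estimated_connections(
    per_agent=AMQP1_DRIVER_FACTOR, **largest
)
print(rabbitmq_estimate, amqp1_estimate)  # 6000 3000
```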

17 of 18

What is next?

1)

  • Scalability: 10K+ RPC clients/servers (using a distributed bus)
  • Distribution: multi-site emulation (latency / bandwidth emulation)
  • Completeness: study all messaging patterns

2)

  • Go back to the OpenStack use case

3) Vancouver Summit

18 of 18

Matthieu Simonin

matthieu.simonin@inria.fr
