1 of 23

Adding Observability to Nomad Applications

Using the Grafana Open Observability Platform

Copyright © 2020 HashiCorp

2 of 23

Not long ago...

3 of 23

Well that escalated quickly

Amazon architecture shift

Back in 2001, the Amazon retail website was one big monolith. Seven years later, it had become the "death star": a dense web of interconnected services.

*Source: Werner Vogels on Twitter.

4 of 23

Operational Complexity

You need a mature operations team to manage lots of services, which are being redeployed regularly.

Microservice Trade-Offs by Martin Fowler

5 of 23

Cyril Tovena (He/Him) 🇫🇷

Software Engineer at Grafana Labs

6 of 23

1. What is Observability?

2. The Grafana Stack

3. Deployment with Nomad

7 of 23

What is Observability?

8 of 23

Observability

The definition

Observability (o11y) is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.

In practice, observability is achieved by using various tools and techniques to make sense of complex distributed systems.

9 of 23

Observability

The Three Pillars

10 of 23

Today’s reality

Disparate systems. Disparate data.

11 of 23

The Grafana Stack

12 of 23

Grafana

The Open Source Single Pane of Glass

13 of 23

Grafana

Our Opinionated Open Source Stack

Prometheus

Metrics

Loki

Logs

Tempo

Traces

14 of 23

Why

Prometheus

A pull-based monitoring system with dynamic service discovery, built for the cloud.

A powerful query language and multidimensional data model for rich, ad hoc analysis.

Open source, incredibly resource efficient and simple to operate.

github.com/prometheus/prometheus
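
To make the pull model and the multidimensional data model concrete, here is a minimal Python sketch of what a scrape target hands back when Prometheus pulls from it. The metric name `http_requests_total`, its labels and values, the port, and the helper names (`escape_label_value`, `format_sample`, `render_metrics`, `serve`) are all assumptions made up for illustration. A real application would normally use an official client library rather than hand-rolling this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def escape_label_value(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name, labels, value):
    """Render one sample as a line of the Prometheus text exposition format."""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"


# Hypothetical in-memory counters: one time series per unique label set.
SAMPLES = [
    ("http_requests_total", {"method": "GET", "status": "200"}, 1027),
    ("http_requests_total", {"method": "POST", "status": "500"}, 3),
]


def render_metrics(samples):
    """Build the body a scraper would receive from the /metrics endpoint."""
    return "\n".join(format_sample(*sample) for sample in samples) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrapes: the monitoring system pulls, the app never pushes."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Blocks forever; the port is an arbitrary choice for this sketch."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Each distinct combination of label values is its own time series, which is what makes ad hoc slicing by `method` or `status` possible at query time.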

15 of 23

Why

Loki

A multi-tenant log aggregation system inspired by Prometheus

Incredibly small index and highly compressed log data which makes it easy and affordable to operate.

Open source, with support for various cloud storage backends.

github.com/grafana/loki
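
Why is the index so small? Loki indexes only the labels of each log stream, not the content of the log lines. The toy Python sketch below illustrates that idea and nothing more: `TinyLogStore` and its methods are hypothetical names invented for this example, and `zlib` stands in for whatever compression a real deployment uses. It is not how Loki itself is implemented.

```python
import zlib
from collections import defaultdict


class TinyLogStore:
    """Toy store: the index is keyed by label set, log content is never indexed."""

    def __init__(self):
        # One entry per unique label set (a "stream"), never one per line.
        self._streams = defaultdict(list)

    def push(self, labels, line):
        """Append a log line to the stream identified by its labels."""
        self._streams[frozenset(labels.items())].append(line)

    def index_entries(self):
        """Size of the index: the number of distinct streams."""
        return len(self._streams)

    def query(self, matchers, contains=""):
        """Select streams by label equality, then filter their lines by substring."""
        matches = []
        for key, lines in self._streams.items():
            stream_labels = dict(key)
            if all(stream_labels.get(k) == v for k, v in matchers.items()):
                matches.extend(line for line in lines if contains in line)
        return matches

    def compressed_chunk(self, labels):
        """Compress all lines of one stream into a single blob."""
        lines = self._streams.get(frozenset(labels.items()), [])
        return zlib.compress("\n".join(lines).encode("utf-8"))


store = TinyLogStore()
store.push({"job": "app", "level": "error"}, "timeout calling db")
store.push({"job": "app", "level": "error"}, "connection refused")
store.push({"job": "app", "level": "info"}, "started")
```

The `query` call has the same shape as a LogQL query such as `{job="app"} |= "timeout"`: labels select the streams first, then the lines are filtered.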

16 of 23

Why

Tempo

An open source, easy-to-use and high-scale distributed tracing backend.

Tempo is cost-efficient, and is deeply integrated with Grafana, Prometheus, and Loki.

Supports any of the open source tracing protocols, including Jaeger, Zipkin, and OpenTelemetry.

github.com/grafana/tempo

17 of 23

Grafana

A better incident response workflow

18 of 23

Deployment with Nomad

19 of 23

The New Stack

A simple three-tier demo application, fully instrumented with Prometheus, Jaeger and Loki logging

github.com/grafana/tns

20 of 23

Deployment with Nomad

~/go/src/github.com/cyriltovena/observability-nomad > tree
├── Makefile
├── Vagrantfile
├── jobs
│   ├── grafana.nomad.hcl
│   ├── loki.nomad.hcl
│   ├── prometheus.nomad.hcl
│   ├── promtail.nomad.hcl
│   ├── tempo.nomad.hcl
│   └── tns.nomad.hcl
└── provisioning
    ├── dashboard.json
    └── dashboard.yaml

21 of 23

Demo time!

22 of 23

What’s Next?

Monitoring more than just your applications.

  • Activate Nomad telemetry
  • Prometheus Node Exporter
  • Consul Exporter
  • Journald logs with Promtail
  • cAdvisor
  • Windows Exporter
  • …….
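
As a starting point for the first item, here is a hedged Python sketch for checking that Nomad telemetry is actually being exposed. It assumes an agent reachable on the default HTTP address `http://127.0.0.1:4646` with `prometheus_metrics = true` set in its `telemetry` configuration block. The function names are hypothetical, and the parser is deliberately naive.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def nomad_metrics_url(agent_addr="http://127.0.0.1:4646"):
    """Build the URL of the agent's metrics endpoint in Prometheus format."""
    base = agent_addr.rstrip("/")
    query = urlencode({"format": "prometheus"})
    return f"{base}/v1/metrics?{query}"


def parse_metric_names(text):
    """Extract the distinct metric names from Prometheus text exposition output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like "<name>{labels} value" or "<name> value".
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return sorted(names)


def fetch_metric_names(agent_addr="http://127.0.0.1:4646", timeout=5):
    """Query a running agent; raises an error if the agent is unreachable."""
    with urlopen(nomad_metrics_url(agent_addr), timeout=timeout) as response:
        return parse_metric_names(response.read().decode("utf-8"))
```

Once this returns a non-empty list, the same path and `format=prometheus` query parameter can be used for a Prometheus scrape job.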

23 of 23

Thank You

github.com/cyriltovena/observability-nomad

cyril.tovena@grafana.com @kuqd github.com/cyriltovena slack.grafana.com #loki