1 of 22

Enterprise-Scale DevOps

with a Startup-Sized Team

using GitLab, AWS, and Friends

MICHAEL IRWIN and CARL HARRIS

NOVEMBER 7, 2019

2 of 22

Who are we?

  • Michael Irwin (@mikesir87)
    • Disruptor of Things at VT since 2011
    • AWS Certified SA (Associate) and Developer; Docker Captain/Community Leader
  • Carl Harris
    • Boss of ^
    • Executive Scapegoat
    • CTO
  • Both involved with many prototype/sandbox efforts working on the bleeding edge

3 of 22

Summit

  • Virginia Tech's research administration platform
    • pre-award+submission and contract negotiation workflows
    • post-award… maybe someday
  • Our first enterprise scale dev-to-cloud effort (circa 2016)
  • Architecture
    • Java EE backend running on OpenJRE 11 and Wildfly
    • PostgreSQL database plus various platform services (AWS RDS, S3, Secrets Manager, Lambda, etc)
    • Single-page front end web applications based on AngularJS (then), React (now)

4 of 22

Summit Team Composition

  • Developers
  • QA/Acceptance Testing
  • Product Team

5 of 22

Supporting Agile Development

[Figure: branch timeline across a sprint. Labels: master; CREST-7373; other feature branches; 2.21.0; post-merge fixes; start of sprint; end of sprint + time for pprd testing; acceptance testing and final code reviews performed; subsequent features will get next build version.]

6 of 22

GitLab is the Nexus for Summit Development

  • Every Summit component has its own source repository
  • Each repository has its own GitLab CI configuration
    • allows different kinds of builds for components using vastly different technologies
  • A manifest repository ties it all together for per-branch packaging and deployment
    • component builds use a webhook to trigger manifest builds
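
The component-to-manifest trigger can be sketched as follows. This is a minimal illustration that assumes the webhook targets GitLab's pipeline trigger endpoint (`POST /projects/:id/trigger/pipeline`); the slides do not say which mechanism Summit actually uses. The host name, project path, token, and the `COMPONENT` and `IMAGE_TAG` variable names are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_trigger_request(gitlab_url, project_path, trigger_token, ref, variables):
    """Return (url, body) for a GitLab pipeline trigger call.

    Nothing is sent here; the caller decides how to POST the request.
    All argument values used below are illustrative, not Summit's real ones.
    """
    # GitLab accepts a URL-encoded project path in place of a numeric ID.
    project_id = quote(project_path, safe="")
    url = f"{gitlab_url.rstrip('/')}/api/v4/projects/{project_id}/trigger/pipeline"

    form = {"token": trigger_token, "ref": ref}
    # Pipeline variables are passed as variables[KEY]=value form fields.
    for key, value in variables.items():
        form[f"variables[{key}]"] = value
    return url, urlencode(form).encode("ascii")


url, body = build_trigger_request(
    "https://gitlab.example.com",   # hypothetical GitLab host
    "summit/manifest",              # hypothetical manifest project path
    "TRIGGER_TOKEN",                # placeholder, never hard-code a real token
    "CREST-7373",
    {"COMPONENT": "api", "IMAGE_TAG": "234567"},
)
```

The resulting pair can be sent with `urllib.request.Request(url, data=body, method="POST")` from the last job of a component build.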

7 of 22

Container Images are the Lingua Franca of CI

Every CI build produces at least one container image and pushes to at least one registry.

8 of 22

Summit in a Box™

  • Uses docker app to run the entire stack.
  • At runtime, the app is configured to use local alternatives for cloud platform services such as AWS S3, Secrets Manager, etc.
  • Traefik does Host-header-based routing using container labels.
  • With development happening across multiple components, how do we get a consistent set of component images that work together?
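
One way to swap in local alternatives for cloud services at runtime is to make the service endpoint configurable. The sketch below is an assumption about how that could look, not Summit's implementation (the Summit backend is Java EE). The `<SERVICE>_ENDPOINT_URL` environment variable convention is hypothetical; `endpoint_url` is the keyword that boto3's `client()` accepts for pointing a client at a non-AWS endpoint.

```python
import os


def client_kwargs(service, env=None):
    """Return extra keyword arguments for creating a cloud service client.

    If a hypothetical <SERVICE>_ENDPOINT_URL variable is set (for example by
    the local stack), the client is pointed at that local alternative.
    Otherwise no override is returned and the SDK default endpoint applies.
    """
    env = os.environ if env is None else env
    override = env.get(f"{service.upper()}_ENDPOINT_URL")
    return {"endpoint_url": override} if override else {}


# Local stack: S3 is served by a local stand-in (URL is illustrative).
local_env = {"S3_ENDPOINT_URL": "http://s3.localhost:9000"}
local = client_kwargs("s3", local_env)

# Cloud deployment: nothing is set, so the real service is used.
cloud = client_kwargs("s3", {})
```

With boto3 installed, the result would be used as `boto3.client("s3", **client_kwargs("s3"))`, so the same application code runs unchanged in both environments.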

9 of 22

Summit Build Pipeline

[Figure: build pipeline flow. The Developer pushes code to a branch of a Component Repository, which starts a build pipeline (build component image, push component image to the Local Registry). A webhook notification reaches the Manifest Repository, whose config now uses the image from the component build; this starts a build pipeline (build app image, push app image).]

10 of 22

Manifest Updates

Pre-push to component repository:

    version: "3.7"
    services:
      api:
        image: summit/api:abcdef
        ...
      desktop:
        image: summit/desktop:123456
        ...
      mobile:
        image: summit/mobile:abc123
        ...
      docs:
        image: summit/docs:123abc
        ...

Post-push to component repository:

    version: "3.7"
    services:
      api:
        image: summit/api:234567
        ...
      desktop:
        image: summit/desktop:123456
        ...
      mobile:
        image: summit/mobile:abc123
        ...
      docs:
        image: summit/docs:123abc
        ...
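
The manifest change shown on this slide amounts to re-pinning one service's image tag while leaving the others alone. A minimal sketch, assuming the manifest has already been parsed into nested dictionaries (the function name `pin_image` is hypothetical, and the real pipeline may edit the file differently):

```python
import copy


def pin_image(manifest, service, tag):
    """Return a copy of the manifest with one service pinned to a new tag.

    Assumes every image reference already carries a tag, as in the slide.
    """
    updated = copy.deepcopy(manifest)
    image = updated["services"][service]["image"]
    # Split off the existing tag and attach the new one.
    repository = image.rsplit(":", 1)[0]
    updated["services"][service]["image"] = f"{repository}:{tag}"
    return updated


before = {
    "version": "3.7",
    "services": {
        "api": {"image": "summit/api:abcdef"},
        "desktop": {"image": "summit/desktop:123456"},
        "mobile": {"image": "summit/mobile:abc123"},
        "docs": {"image": "summit/docs:123abc"},
    },
}

# A push to the api component produced image tag 234567.
after = pin_image(before, "api", "234567")
```

Returning a copy keeps the pre-push manifest intact, which makes it easy to compare the two versions.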

11 of 22

Developer Experience

  • SiaB command line interface provides a mechanism to disable components in the stack
  • Developer runs a component under development from the IDE using plain old docker-compose.
  • Traefik uses the same container labels to route traffic to the component under development.
  • Developer just needs her preferred IDE and Docker to be productive.
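
The label-based routing described above can be modeled in a few lines. This is a simplified simulation, not Traefik itself: the label key `summit.host` is hypothetical (real Traefik label syntax depends on the Traefik version), and the container names are illustrative.

```python
ROUTE_LABEL = "summit.host"  # hypothetical label key, not real Traefik syntax


def route(host_header, running):
    """Return the name of the running container whose label matches the Host header."""
    # Ignore any port suffix and compare case-insensitively.
    host = host_header.split(":", 1)[0].lower()
    for name, labels in running.items():
        if labels.get(ROUTE_LABEL, "").lower() == host:
            return name
    return None


# Full stack as launched from the manifest.
stack = {
    "api": {ROUTE_LABEL: "api.localhost"},
    "desktop": {ROUTE_LABEL: "desktop.localhost"},
}

# The developer disables the stack's api component and runs her own copy
# from the IDE, carrying the same label.
dev = {name: labels for name, labels in stack.items() if name != "api"}
dev["api-from-ide"] = {ROUTE_LABEL: "api.localhost"}
```

Because routing depends only on the label, requests for `api.localhost` reach the IDE-run container without any change to the other components.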

12 of 22

The Naive/Simple QA

Simply follow these simple steps:

  1. Get a beefy machine
  2. Launch each application stack (using the manifest)
  3. Hope for the best!
  4. Tear down when the branch is deleted

15 of 22

Goals for QA 2.0

  • Feature Branch Isolation
  • Scale to Many Deployments
  • Dynamic, yet Persistent
  • Keep Costs Down

16 of 22

QA State Machine - per Feature Branch

17 of 22

The QA Infrastructure

18 of 22

Summit QA Pipeline

[Figure: QA pipeline. Labels: Developer; pushes code; Component Repository; Manifest Repository; ECR on AWS; Serverless QA Manager; launch notification; QA Tester.]

19 of 22

What’s it cost?

  • The QA environment has cost us (over nine months)…
    • SQS - $0.51-0.63/month
    • Lambda - $0.09-0.20/month
    • ECS - $23.44-62.66/month
    • RDS - $50-55/month
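
Adding up the per-service ranges gives rough bounds on the total. The sketch below assumes the four services listed are the whole bill and that every service could hit its low (or high) figure in the same month, so the results are outer bounds rather than actual invoices.

```python
from decimal import Decimal

# Monthly cost ranges in USD, copied from the slide as (low, high).
MONTHLY_RANGES = {
    "SQS": ("0.51", "0.63"),
    "Lambda": ("0.09", "0.20"),
    "ECS": ("23.44", "62.66"),
    "RDS": ("50", "55"),
}


def monthly_bounds(ranges):
    """Sum the low ends and the high ends of each service's monthly range."""
    low = sum(Decimal(lo) for lo, _ in ranges.values())
    high = sum(Decimal(hi) for _, hi in ranges.values())
    return low, high


low, high = monthly_bounds(MONTHLY_RANGES)
print(f"per month:   ${low} to ${high}")          # $74.04 to $118.49
print(f"nine months: ${low * 9} to ${high * 9}")  # $666.36 to $1066.41
```

On these figures RDS and ECS account for nearly all of the cost; SQS and Lambda together stay under a dollar a month.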

20 of 22

Lessons Learned

  • DevOps is great, but even better when driven by needs and culture
  • GitLab for code and CI = awesome
  • Cloud services are cheap; use them
  • Writing your code correctly can reduce vendor lock-in
  • There is a point of diminishing returns… don’t go overboard to save every penny

21 of 22

Sharing what we learned

  • We’ve open-sourced the state machine and task execution (github.com/cloudseam)
    • Could be easily extended to work on Azure, GCP, OpenFaaS, etc.
  • State machines are defined using YAML
  • Working to make shareable deployment models (QA in a Box)
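
To give a feel for a per-branch state machine, here is a minimal table-driven sketch. The state and event names are hypothetical and chosen only for illustration; they are not taken from the cloudseam project, whose actual YAML schema and states are not shown in these slides.

```python
# Hypothetical (state, event) -> next state table for one feature branch.
TRANSITIONS = {
    ("absent", "manifest_pushed"): "launching",
    ("launching", "launch_succeeded"): "running",
    ("running", "manifest_pushed"): "launching",   # redeploy on a new build
    ("launching", "branch_deleted"): "torn_down",
    ("running", "branch_deleted"): "torn_down",
}


def advance(state, event):
    """Return the next state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def replay(events, state="absent"):
    """Apply a sequence of events to a branch, starting from `state`."""
    for event in events:
        state = advance(state, event)
    return state


final = replay(["manifest_pushed", "launch_succeeded", "branch_deleted"])
```

Keeping the transitions in a plain table is what makes a declarative format such as YAML a natural fit: the table can be loaded from a file instead of being written in code.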

22 of 22

Thanks! Questions?