1 of 35

Hello, Terraform 👋

Building up to a real-world use case

Matt Christie – Data Engineer II: Technical Lead

Gibbs Land Use and Environment Lab (GLUE)

2 of 35

Background

Our use case

4 of 35

Background

  • Our lab uses EC2 instances* to run data pipelines

*We call these instances "build servers"

6 of 35

Background

  • The data pipelines have a definite beginning and end

(i.e. build servers don't need to be up indefinitely)

8 of 35

Background

  • However, build servers are often up long after pipelines finish

This is because server setup is a tedious manual process, so developers keep servers up to batch multiple pipeline runs and avoid repeating the setup overhead

9 of 35

Background

  • With this usage pattern, it's easy to rack up accidental uptime

10 of 35

Solution: Terraform

Automate build server setup + teardown

12 of 35

Bill of Materials

  • At a high level:

[Diagram: git push, SSH, EC2, DB, Runner, submit job, remote files; components numbered 1 to 5]

Itemize these and develop Terraform configuration incrementally

14 of 35

Sidebar: Development Philosophy

  • Separation of concerns: The data pipeline environment can be improved in many ways, but improvements are implemented more efficiently and deployed more gracefully when targeted one at a time

Do we want Terraform or not?? There's always future work 😀

15 of 35

Implementation: A step-by-step process

  • Let's take a look at the commits in build-server-setup, our git repository for this Terraform project

This part will be mostly interactive, but see notes below for a summary

16 of 35

Step 1: EC2 instance

  • Start simple: Manage an EC2 instance

Separation of concerns: Demonstrate that Terraform can bring resources up and down before taking on complex configuration that needs troubleshooting
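
A minimal sketch of what this first step could look like, not the actual contents of build-server-setup. The region, instance type, tag, and the build_server_ami variable are all assumptions chosen for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: use whichever region the lab works in
}

# Hypothetical variable: the AMI that build servers are launched from.
variable "build_server_ami" {
  description = "AMI ID for the build server image"
  type        = string
}

# The only managed resource at this stage: one EC2 instance.
resource "aws_instance" "build_server" {
  ami           = var.build_server_ami
  instance_type = "t3.micro" # placeholder size

  tags = {
    Name = "build-server"
  }
}
```

With only this much in place, terraform apply brings the instance up and terraform destroy takes it down, which is the whole point of the step.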

17 of 35

Step 2: SSH

  • Basic needs: Access provisioned resources

Managing security groups with Terraform enables tighter access controls
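
One way tighter access control might be expressed is sketched below: a security group that admits SSH from a single address only. The variable names (developer_ip, vpc_id) and the group name are hypothetical; the source does not show the real rules.

```hcl
# Hypothetical variable: the public IPv4 address of the developer's machine.
variable "developer_ip" {
  description = "Address allowed to SSH into the build server"
  type        = string
}

# Hypothetical variable: the VPC the build server lives in.
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "build_server_ssh" {
  name        = "build-server-ssh"
  description = "SSH access to the build server"
  vpc_id      = var.vpc_id

  # Allow SSH from exactly one address rather than from anywhere.
  ingress {
    description = "SSH from one developer machine"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["${var.developer_ip}/32"]
  }

  # Allow all outbound traffic so the server can fetch packages and code.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

The group would be attached to the instance through its vpc_security_group_ids argument. Because the group is created and destroyed together with the instance, the rule never outlives the server it protects.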

18 of 35

Step 3: Data volume

  • Make database state available to builds

EBS volumes can be mounted by AWS volume ID to avoid device name mapping issues
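
The sketch below shows one possible reading of this note, under stated assumptions: the data volume already exists, the instance is Nitro-based, and the Linux image exposes EBS volumes under /dev/disk/by-id. The variable names and the /dev/sdf device name are placeholders, and the source does not say which mechanism the project actually uses.

```hcl
# Hypothetical variables: an existing EBS volume that holds database state,
# and the instance it should be attached to.
variable "data_volume_id" {
  type = string
}

variable "build_server_instance_id" {
  type = string
}

# Attach the existing volume. The device name requested here is not
# guaranteed to be the name the operating system ends up using.
resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = var.data_volume_id
  instance_id = var.build_server_instance_id
}

locals {
  # Assumption: on Nitro-based instances the volume typically appears under
  # a stable by-id path built from the volume ID with its hyphen removed.
  data_device = "/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_${replace(var.data_volume_id, "-", "")}"
}

output "data_device" {
  description = "Stable device path to use when mounting the data volume"
  value       = local.data_device
}
```

Mounting by this path sidesteps the question of whether the volume shows up as /dev/sdf, /dev/xvdf, or an NVMe device.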

19 of 35

Step 4: Remote files

  • Make input files available to builds

I decided to keep the existing stateful solution for now, though its corresponding security group rule is being managed with Terraform
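
A hedged sketch of managing only the security group rule while leaving the file server itself alone. The direction of the rule (build server to file server), the protocol, and every variable name here are assumptions; the source does not describe the remote file mechanism.

```hcl
# Hypothetical variables: the security groups on either side of the
# connection, and the port the remote file service listens on.
variable "file_server_security_group_id" {
  type = string
}

variable "build_server_security_group_id" {
  type = string
}

variable "file_share_port" {
  type = number
}

# Only the rule is managed by Terraform. The file server and its own
# security group are pre-existing and stay outside this configuration.
resource "aws_security_group_rule" "build_server_to_files" {
  type                     = "ingress"
  description              = "Remote file access from the build server"
  security_group_id        = var.file_server_security_group_id
  from_port                = var.file_share_port
  to_port                  = var.file_share_port
  protocol                 = "tcp"
  source_security_group_id = var.build_server_security_group_id
}
```

Since the rule is part of the Terraform state, terraform destroy removes it along with the build server, so no stale rule is left behind.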

20 of 35

Step 5: GitLab Runner

  • Make build server listen for jobs

This and other processes can be initialized in user_data
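
A small sketch of initializing the runner from user_data, assuming an AMI that already has gitlab-runner installed and registered. The variable name and instance type are placeholders, and this is a standalone illustration rather than the project's actual script.

```hcl
# Hypothetical variable: an AMI with gitlab-runner preinstalled and registered.
variable "build_server_ami" {
  type = string
}

resource "aws_instance" "build_server" {
  ami           = var.build_server_ami
  instance_type = "t3.micro" # placeholder size

  # By default, cloud-init runs this script once, at first boot.
  user_data = <<-EOT
    #!/bin/bash
    # Start the runner service so the server begins listening for jobs.
    systemctl enable --now gitlab-runner
  EOT
}
```

Other boot-time tasks, such as mounting the data volume, could be appended to the same script.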

21 of 35

Retrospective

Lessons and future work

23 of 35

Value Added

  • Significantly reduces overhead for build server setup + teardown

Setup:

  • Previous process: 15 min of tedious manual tasks
  • New process: 1 min (IP lookup + variable change + terraform apply)
  • Over 10x reduction in setup time 🚀

Teardown:

  • Previous process: Point-and-click termination, which sometimes left a dangling data volume
  • New process: Better resource cleanup (no dangling volumes, no stale security group rules)

25 of 35

Takeaways

  • There are levels of adoption, both in philosophy and features

Ex. How stateful do we want our AMI to be?

27 of 35

Takeaways

  • Terraform models infrastructure very precisely, to the point where new design decisions can be considered

Ex. Managing a security group scoped exclusively to a short-lived EC2 instance. Shout-out to cloud office hours for suggesting this!

29 of 35

Takeaways

  • When in doubt … user_data! 🙃

This feels counter to Terraform's design philosophy in that it kicks the resource management can down the road. Any providers that explicitly model system resources? (ex. mounts, processes, etc.)

31 of 35

Takeaways

  • Recommendation: Stick to modeling existing functionality when starting out. Concurrent improvements are tempting, but project scope can spiral!

I spent a half-day unsuccessfully troubleshooting minor improvements to the remote file mount, but the more conservative solution was much simpler and made a complete project possible

33 of 35

Learning 📈

  • My mental model of Terraform improved a lot while implementing the project
  • Most helpful distinction: data vs. resources
  • I also came to appreciate how granular the AWS provider is: it models nearly every AWS object you can think of
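
The data vs. resources distinction can be illustrated with a short sketch. The names below are invented for the example and are not taken from the project.

```hcl
# A data source only reads something that already exists.
# Terraform never creates, changes, or destroys the default VPC here.
data "aws_vpc" "default" {
  default = true
}

# A resource is owned by this configuration: it is created on
# terraform apply and removed on terraform destroy.
resource "aws_security_group" "example" {
  name   = "build-server-example" # hypothetical name
  vpc_id = data.aws_vpc.default.id
}
```

The resource refers to the data source by attribute, so long-lived infrastructure can be looked up while only the short-lived pieces are managed.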

Something to think about: Providers are very powerful. This project used the AWS provider almost exclusively, but Terraform isn't restricted to managing cloud infrastructure. What other problem spaces could benefit from managing declared state?

34 of 35

Roadmap (potential work items)

  • Use a Terraform backend to prevent inconsistent management of infrastructure across machines?
  • Handle multi-user and short-lived configurations with modules, templates and workspaces?
  • Improve remote file mount: Migrate files to S3, then plug into existing architecture with s3fs
  • Add infrastructure setup + teardown to the CI pipeline
  • Providers for more explicitly modeling background processes?
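
For the first roadmap item, a remote backend might look roughly like the sketch below. The bucket name, key, and region are placeholders, and S3 is only one of several backends that could serve here.

```hcl
terraform {
  # Store state remotely so every machine works from the same record
  # of what has been provisioned.
  backend "s3" {
    bucket  = "example-terraform-state"              # hypothetical bucket
    key     = "build-server-setup/terraform.tfstate" # hypothetical key
    region  = "us-east-1"                            # assumption
    encrypt = true
  }
}
```

Backend blocks cannot reference variables, so the values are written literally, and terraform init must be rerun after the backend changes. State locking is configured separately and its options depend on the Terraform version.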

35 of 35

Thank you!