1 of 38

Tools for Taming Tangled Code

Seth Tisue

Scala team

Lightbend, Inc.

@SethTisue (Twitter)

@SethTisue (GitHub)

Chicago Area Scala Enthusiasts

March 29, 2018

2 of 38

me

Scala since 2008

Lightbend since 2015

San Francisco since 2016

3 of 38

who am I to talk?

two experiences:

  • lead developer on a large Scala + Java codebase
  • built Sculpt

4 of 38

you

  • consumer of tooling?
  • builder of tooling?

5 of 38

feedback welcome

6 of 38

clean code

of course clean code is well-formatted.

clean code is also well-structured.

7 of 38

internal dependency structure

“internal” — your code.

“dependency” — compile-time dependency

8 of 38

dependencies

9 of 38

structure: goals

modularity

readability, comprehensibility, maintainability

fast incremental compiles

10 of 38

dependencies matter

  • for maintaining velocity and retaining developers as the years pass
  • for achieving specific product goals

11 of 38

NetLogo case study: goals achieved

2000: monolithic codebase, dependencies everywhere

2002: separate authoring environment from runtime engine -> deliver lightweight, embeddable runtime

2004: separate GUI code from headless code -> support batch operation from command line

2008: open-source independently buildable key components (whole codebase became open-source later)

2013: separate compiler from runtime engine -> run compiler in browser on Scala.js

12 of 38

NetLogo case study

  • no cycles
  • clear overall macro-structure (4 layers)
  • key subgraphs with no incoming arrows
  • CI-enforced

[figures: dependency graph, before and after]
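
A "no cycles" check that CI can enforce does not need much code. The sketch below is not what NetLogo used (the slides don't say). It assumes the dependency graph has already been extracted into a plain map from each node to the nodes it depends on. `CycleCheck` and `findCycle` are made-up names.

```scala
object CycleCheck {
  // node -> the nodes it depends on (hypothetical representation)
  type Graph = Map[String, Set[String]]

  /** Returns one dependency cycle as a path whose first and last
    * elements are the same node, or None if the graph is acyclic. */
  def findCycle(graph: Graph): Option[List[String]] = {
    // nodes whose whole subtree has been searched already
    var done = Set.empty[String]

    // depth-first search; `path` holds the ancestors, most recent first
    def visit(node: String, path: List[String]): Option[List[String]] =
      if (path.contains(node))
        // we walked back into an ancestor: cut the path there and close the loop
        Some(node :: (node :: path.takeWhile(_ != node)).reverse)
      else if (done(node)) None
      else {
        val deps = graph.getOrElse(node, Set.empty[String]).toList.sorted
        val found = deps.iterator
          .map(visit(_, node :: path))
          .collectFirst { case Some(cycle) => cycle }
        done += node
        found
      }

    graph.keys.toList.sorted.iterator
      .map(visit(_, Nil))
      .collectFirst { case Some(cycle) => cycle }
  }
}
```

A CI job could then fail the build whenever `CycleCheck.findCycle(graph)` is non-empty, and print the returned path so the offending edge is easy to find.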

13 of 38

tooling can help

two basic approaches:

  • bytecode analysis
  • source code analysis

14 of 38

bytecode analysis

  • Classycle
  • ProGuard

👍 language-independent

👍 mature

15 of 38

source code analysis

  • Acyclic
  • Sculpt

👍 high fidelity

👎 bleeding edge

16 of 38

Classycle GUI

17 of 38

Classycle CLI

18 of 38

missing: sbt plugin

19 of 38

ProGuard

...is primarily a JAR shrinker

but in order to shrink JARs…

...it does bytecode-level dependency analysis

if your bytecode has dangling references, it will tell you.

20 of 38

done: bytecode based tools

next: source code based tools

21 of 38

Acyclic

Scala compiler plugin

import acyclic.pkg

import acyclic.file
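
For anyone who hasn't used Acyclic, a rough sketch of the setup, based on my reading of the project's README. The version number and the `myapp` file are assumptions; check the README for current coordinates before copying this.

```scala
// build.sbt (sketch; the version number is an assumption)
libraryDependencies += "com.lihaoyi" %% "acyclic" % "0.1.7" % "provided"
autoCompilerPlugins := true
addCompilerPlugin("com.lihaoyi" %% "acyclic" % "0.1.7")
```

```scala
// src/main/scala/myapp/core/Model.scala (hypothetical file)
package myapp.core

// asks the plugin to report a compile error if this file
// takes part in a dependency cycle with other files
import acyclic.file

class Model
```

As I understand it, `import acyclic.pkg` works the same way at package granularity and goes in the package object.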

22 of 38

Sculpt

developed at Lightbend for a client

credits: Seth Tisue, Adriaan Moors, Mark Harrah, Stefan Zeiger

open-sourced January 2018

informational — no enforcement

also a compiler plugin, like Acyclic

23 of 38

Sculpt: core

code from Zinc listens to dependency information compiler emits

the exact same information guides sbt/Zinc incremental compilation

https://github.com/lightbend/scala-sculpt/blob/master/src/main/scala/com/lightbend/tools/sculpt/plugin/ExtractDependencies.scala

24 of 38

Sculpt: core

https://github.com/lightbend/scala-sculpt/.../ExtractDependencies.scala

...

...

25 of 38

Sculpt: user-level features

  • stores raw dependency data in memory as case class instances
  • outputs raw dependency data as JSON
  • analyzes at method level, or aggregates dependencies at class level
  • generates dependency layer report
  • generates dependency cycle report
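
To make "aggregates at class level" and "layer report" concrete, here is a small sketch. It does not use Sculpt's real case classes or report format; `Method`, `Edge`, `classLevel` and `layers` are invented for illustration, and the layering assumes the graph has no cycles.

```scala
object DependencyReport {
  // Hypothetical model, not Sculpt's actual data classes.
  final case class Method(owner: String, name: String)
  final case class Edge(from: Method, to: Method)

  /** Collapse method-level edges into class-level ones,
    * dropping dependencies of a class on itself. */
  def classLevel(edges: Seq[Edge]): Map[String, Set[String]] = {
    val classes = edges.flatMap(e => Seq(e.from.owner, e.to.owner)).distinct
    val empty = classes.map(_ -> Set.empty[String]).toMap
    edges.foldLeft(empty) { (acc, e) =>
      val (a, b) = (e.from.owner, e.to.owner)
      if (a == b) acc else acc.updated(a, acc(a) + b)
    }
  }

  /** Layer 0 depends on nothing; layer n depends only on lower layers.
    * Assumes the graph is acyclic (a cycle would recurse forever). */
  def layers(graph: Map[String, Set[String]]): Map[String, Int] = {
    val memo = scala.collection.mutable.Map.empty[String, Int]
    def layerOf(node: String): Int = memo.get(node) match {
      case Some(layer) => layer
      case None =>
        val deps = graph.getOrElse(node, Set.empty[String])
        val layer = if (deps.isEmpty) 0 else deps.map(layerOf).max + 1
        memo(node) = layer
        layer
    }
    graph.keys.map(n => n -> layerOf(n)).toMap
  }
}
```

Feeding the result of `classLevel` into `layers` gives each class a layer number, which is roughly the shape of information a layer report conveys.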

26 of 38

Sculpt: sample output

(show samples in README.md)

27 of 38

Sculpt: missing features

  • package-level aggregation
  • source-file level aggregation

28 of 38

Sculpt: maintenance status

the Scala team will help keep the ExtractDependencies component correct and up-to-date, if users step forward

the rest of Sculpt is… there.

I can review modest fixes and improvements.

29 of 38

Sculpt: status

you might:

  • find Sculpt useful as-is?
    • strength: complete, accurate data
    • weakness: reporting
  • maintain and expand it?
  • be inspired to build your own thing around ExtractDependencies?

30 of 38

(aside on compiler plugins)

my other talk: “Compiler Plugins 101”

  • what are they, what can they do
  • what existing plugins are popular
  • how to get started writing your own

slides

video coming soon

31 of 38

IntelliJ Dependency Structure Matrix

32 of 38

Extractor

• https://github.com/CANVE/extractor
  looks very Sculpt-like? no work in two years

33 of 38

keep it once you got it

  • separate repos
  • separate subprojects
  • enforcement via CI
  • custom tooling

34 of 38

questions for audience

  • share your experience with tools I covered…?
  • are there tools I missed?

35 of 38

questions for me?

related to talk?

and if we run out of those,

  • AMA about future of Scala, Lightbend, …?

36 of 38

Ideas & TO DOs

Andrew Hill told me Visual Studio has a “dependency map” tool: https://docs.microsoft.com/en-us/visualstudio/modeling/map-dependencies-across-your-solutions
This is very relevant. I could show a screenshot and say: man, I wish we had this for Scala.

He also pointed out IntelliJ Ultimate Edition has this: https://www.jetbrains.com/help/idea/working-with-diagrams.html, also very relevant and worth showing at least briefly.

Somebody in Chicago (was it Steven Hicks?) wanted to hear more about remediation, in addition to the material on diagnosis. The talk is called “tools for taming tangled code”, but the material is all about diagnosis: we don’t actually tame anything, we just get insight into its tangledness. The talk is already too long, so I hesitate to add a lot of material on this, but maybe at least one slide on techniques I typically use to cut dependencies. By far the most common technique is separating API from implementation. Another is using events instead of direct method calls: instead of A calling into B, A shouts something into the air, and B listens for it.
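
A possible slide for the events technique, as a minimal sketch. All names here (`EventBus`, `Engine`, `Display`, `ModelChanged`) are invented; the point is only that `Engine` and `Display` never mention each other, so neither has a compile-time dependency on the other.

```scala
object EventsSketch {
  // These would live in a small shared module that both sides may depend on.
  sealed trait Event
  final case class ModelChanged(ticks: Int) extends Event

  final class EventBus {
    private var listeners = Vector.empty[Event => Unit]
    def subscribe(listener: Event => Unit): Unit = listeners :+= listener
    def publish(event: Event): Unit = listeners.foreach(_(event))
  }

  // "A": shouts into the air. Knows the bus, not Display.
  final class Engine(bus: EventBus) {
    private var ticks = 0
    def step(): Unit = {
      ticks += 1
      bus.publish(ModelChanged(ticks))
    }
  }

  // "B": listens. Knows the bus and the event type, not Engine.
  final class Display(bus: EventBus) {
    var lastSeen: Option[Int] = None
    bus.subscribe {
      case ModelChanged(t) => lastSeen = Some(t)
    }
  }
}
```

The trade-off is that the connection between the two is now only visible at runtime, which is exactly why it no longer shows up as an arrow in the dependency graph.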

37 of 38

Ideas & TO DOs

some feedback I’ve received on this talk that I should consider incorporating if I give it again:

  • Q: does Classycle do method-level? can it show where A’s dependency on B is?
  • A: The command line does it. When a forbidden dependency is found, the output is pretty specific about where exactly it is. I could show an example of that in the talk. As for whether the web UI shows it, not sure, I didn’t see it right away, but maybe I’d find it if I tried again.
  • Q: What about Java 9 modules (aka Project Jigsaw) and jdeps?
  • A: Yes, that’s very relevant. Creating modules, and making them comply with the constraints the module system has, is one of the motivations one might have for using the tools in this talk. I don’t currently know enough about the module system to answer in more detail.
  • Q: What if you do an incremental compile and it takes too long and you don’t know why, will this tell you?
  • A: Great question — I’d have to try it, see what the output looks like when you run it on an incremental compile. But also, perhaps this is more of a Zinc and/or sbt question than a Sculpt thing, maybe Sculpt is overkill here, perhaps Zinc has logging you can enable?

38 of 38

Ideas & TO DOs

Overall this talk is a bit problematic because it isn’t a super happy story. This isn’t a talk about how tons of amazing perfect tools exist, it's about how there are some imperfect things out there that may be useful if you need them enough. Am I framing this right in the talk abstract and in the opening minutes of the talk…? Also, is the talk too long? If the goal is to call attention to some tools that people might not know about, maybe it’s not necessary to go into so much detail about them. Maybe I could condense the whole thing into 15 to 20 minutes, still help the people I want to help, and not use up so much of other people’s time. (One person told me I went on too much about NetLogo; perhaps much of that material could be cut.)

On the “keep it once you got it” slide, it’s maybe worth expanding on the “separate subprojects” bullet and emphasizing that this is my most recommended solution for most people. Once you’ve used Sculpt et al. to carve out a subproject, the compiler will keep you honest, and subprojects are a lot easier to deal with than separate repos. If you don’t know how to make subprojects in your build tool, learn! It’s a hugely valuable skill to have. In my early sbt days I shouldn’t have put off learning it for so long.
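
If I expand that bullet, a build sketch could go on the slide. This one assumes sbt; the project names and directories are hypothetical. It also shows the API/implementation split: `gui` and `engine` only meet through `api`.

```scala
// build.sbt (sketch; project names and directories are hypothetical)

// shared interfaces only, depends on nothing internal
lazy val api = project.in(file("api"))

// implementation: may see api, cannot see gui
lazy val engine = project.in(file("engine"))
  .dependsOn(api)

// may see api only; any reference to engine code is a compile error
lazy val gui = project.in(file("gui"))
  .dependsOn(api)

// convenience project so `compile` and `test` run everywhere
lazy val root = project.in(file("."))
  .aggregate(api, engine, gui)
```

Because each subproject is compiled only against what it `dependsOn`, a forbidden dependency fails the ordinary build, with no extra tool needed in CI.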

Do I have examples showing why bytecode analysis is insufficient? Constant inlining, type-level dependencies that disappear in bytecode? I should include them, it should be in the slide deck, even if I don’t dwell on it when speaking. And how insufficient is it, anyway? Maybe it’s actually fine for most people. I guess that’s somewhat covered already by how much time I spend on Classycle in this talk.
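
One candidate example for constant inlining, as a sketch with invented names. In Scala, a `final val` initialized with a literal and declared without a type annotation is a constant value definition, and the compiler is allowed to substitute the literal at each use site.

```scala
object Limits {
  // constant value definition: final, literal right-hand side, no type annotation
  final val TimeoutSeconds = 30
}

object Client {
  // The source clearly depends on Limits. After compilation, though, this
  // body is typically just the constant 30, so a bytecode-level tool may
  // see no reference from Client to Limits at all.
  def timeout: Int = Limits.TimeoutSeconds
}
```

A source-level tool sees the reference to `Limits.TimeoutSeconds`; whether the missing edge matters in practice is the open question in the note above.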