Taking out the trash…

All about the GC

Overview

  • Metaspace
  • Brief history of GC in Java
  • Parallel GC
  • Concurrent Mark and Sweep
  • G1
  • What to use and where
  • Questions

Metaspace

The new Permgen

  • Classes are loaded and unloaded automatically
  • No size limit by default (cap it with -XX:MaxMetaspaceSize)
  • Lives in native memory, not in “the heap” (less moving)
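
To see that Metaspace sits outside the heap, a running JVM can list its non-heap memory pools through the standard `java.lang.management` API. This is a minimal sketch: the class name `MetaspaceReport` is made up for illustration, and the pool names printed depend on the JVM vendor and version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library
class MetaspaceReport {
    /** Returns the names of all non-heap memory pools the running JVM exposes. */
    static List<String> nonHeapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) continue;
            // getMax() returns -1 when the pool has no configured limit
            long max = pool.getUsage().getMax();
            System.out.printf("%-30s used=%d KB max=%s%n",
                    pool.getName(),
                    pool.getUsage().getUsed() / 1024,
                    max < 0 ? "unlimited" : (max / 1024) + " KB");
        }
    }
}
```

On a HotSpot JVM started without -XX:MaxMetaspaceSize, the Metaspace pool would typically be reported as unlimited.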

Brief History of GC

  • Reference Counting
  • Mark and Sweep
  • Tri-color
  • Generational Collection
  • Compaction

Reference Counting

  • Each object keeps a count of the references pointing at it
  • Hitting 0 deallocates the object
  • Leaks memory if a cycle exists
  • Slow: every new or dropped reference updates a counter
  • shared_ptr in C++
  • Rc/Arc in Rust
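
The cycle leak is easy to reproduce in a toy model. The sketch below is hand-rolled reference counting written in Java purely for illustration. Java itself does not manage memory this way, and `RefCountDemo`, `Counted`, `retain`, `release`, and `link` are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; every name here is hypothetical
class RefCountDemo {
    static final class Counted {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Counted> fields = new ArrayList<>();
        Counted(String name) { this.name = name; }
    }

    /** A new reference to c now exists. */
    static void retain(Counted c) { c.refCount++; }

    /** A reference to c went away; free c once nothing points at it. */
    static void release(Counted c) {
        if (--c.refCount == 0) {
            c.freed = true;
            // Freeing an object also drops every reference it was holding
            for (Counted child : c.fields) release(child);
            c.fields.clear();
        }
    }

    /** Stores a reference to target inside owner. */
    static void link(Counted owner, Counted target) {
        owner.fields.add(target);
        retain(target);
    }

    public static void main(String[] args) {
        Counted x = new Counted("x");
        Counted y = new Counted("y");
        retain(x);   // a root (say, a local variable) points at x
        link(x, y);  // x -> y
        link(y, x);  // y -> x closes the cycle
        release(x);  // the root goes away
        // Both counts are stuck at 1, so neither object is ever freed
        System.out.println(x.freed + " " + y.freed); // prints "false false"
    }
}
```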

Mark and Sweep

* Courtesy of Wikimedia
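
For readers without the diagram, mark and sweep can be sketched over a toy object graph: mark everything reachable from the roots, then sweep away whatever was not marked. This is a simplified illustration, not how HotSpot implements it. `MarkSweepDemo` and `Node` are hypothetical names, and the "heap" is just a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy mark-and-sweep over an in-memory graph; names are hypothetical
class MarkSweepDemo {
    static final class Node {
        final String name;
        boolean marked;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark phase: flag every node reachable from the roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue;   // already visited, also guards against cycles
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    /** Sweep phase: drop unmarked nodes, clear marks on survivors, return how many were freed. */
    static int sweep(List<Node> heap) {
        int before = heap.size();
        heap.removeIf(n -> !n.marked);
        for (Node n : heap) n.marked = false;
        return before - heap.size();
    }
}
```

Unlike reference counting, an unreachable cycle is never marked, so the sweep reclaims it.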

Tri-color marking

* Courtesy of Wikimedia
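
Tri-color marking can be sketched the same way. White nodes have not been seen, grey nodes have been seen but their children have not been scanned, and black nodes are fully scanned. When no grey nodes remain, everything still white is garbage. The names `TriColorDemo`, `Node`, and `shade` below are hypothetical, and the sketch leaves out the write barriers a real concurrent collector needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy tri-color marking; names are hypothetical
class TriColorDemo {
    enum Color { WHITE, GREY, BLACK }

    static final class Node {
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
    }

    /** Marks from the roots until no grey nodes remain. */
    static void mark(List<Node> roots) {
        Deque<Node> grey = new ArrayDeque<>();
        for (Node root : roots) shade(root, grey);
        while (!grey.isEmpty()) {
            Node n = grey.poll();
            for (Node child : n.refs) shade(child, grey);
            n.color = Color.BLACK;   // every child is now grey or black
        }
    }

    /** Turns a white node grey and queues it for scanning. */
    private static void shade(Node n, Deque<Node> grey) {
        if (n.color == Color.WHITE) {
            n.color = Color.GREY;
            grey.add(n);
        }
    }
}
```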

Generational GC

  • Heuristic: most young allocations die quickly
  • Improves performance: fewer objects to check per collection
  • All the Java collectors covered here are generational
  • Divides the heap into
    • Eden
    • Survivor
    • Old generation
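
A toy simulation shows why this pays off: a minor GC only scans the young objects, and anything that survives enough rounds is promoted so it stops being rescanned. This sketch folds Eden and Survivor into a single `young` list. `GenerationalDemo`, `Obj`, and the threshold of 2 are illustrative assumptions, not HotSpot defaults.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy generational heap; names and threshold are hypothetical
class GenerationalDemo {
    static final int TENURE_THRESHOLD = 2;   // minor GCs survived before promotion

    static final class Obj {
        boolean live = true;   // stands in for "still reachable"
        int age = 0;           // number of minor GCs survived
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** New objects always start in the young generation. */
    Obj allocate() {
        Obj o = new Obj();
        young.add(o);
        return o;
    }

    /** Minor GC: scans only the young generation and returns how many objects were freed. */
    int minorGc() {
        int freed = 0;
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();          // dead: reclaim it
                freed++;
            } else if (++o.age >= TENURE_THRESHOLD) {
                it.remove();          // old enough: promote to the old generation
                old.add(o);
            }
        }
        return freed;
    }
}
```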

Compaction

  • Makes live memory contiguous
  • Faster allocation: new objects go at the end of the used space
  • Stop and Copy
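
The link between compaction and fast allocation can be sketched with an array standing in for memory. Note that this is sliding compaction in place, a simpler relative of stop and copy, and that a real collector would also have to update every reference to a moved object. `CompactionDemo` and its members are hypothetical names.

```java
import java.util.Arrays;

// Toy heap with bump-pointer allocation and sliding compaction; names are hypothetical
class CompactionDemo {
    final String[] heap;   // null marks a free slot
    int top = 0;           // next free slot when memory is contiguous

    CompactionDemo(int size) { heap = new String[size]; }

    /** Bump-pointer allocation: one bounds check and one increment. */
    int allocate(String value) {
        if (top == heap.length) throw new IllegalStateException("toy heap is full");
        heap[top] = value;
        return top++;
    }

    /** Marks a slot as dead, leaving a hole. */
    void free(int index) { heap[index] = null; }

    /** Slides live entries to the front so the free space is contiguous again. */
    void compact() {
        int next = 0;
        for (int i = 0; i < top; i++) {
            if (heap[i] != null) heap[next++] = heap[i];
        }
        Arrays.fill(heap, next, top, null);   // clear the vacated tail
        top = next;
    }
}
```

Without the `compact()` step, the holes left by `free` could only be reused by searching for a slot that fits, which is the free-list approach and is slower than bumping a pointer.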

Parallel GC Notes

  • 3 regions
    • Eden: new allocations go here
    • Survivor: objects that have survived at least one minor GC
    • Tenured/Old Gen: objects that survive several minor GCs, or are too big for the young generation
  • All GC is stop-the-world
  • Has fast allocation due to compaction
  • Very low overhead

Parallel GC

Pros

  • High throughput
  • Low CPU usage
  • Compaction

Cons

  • Unpredictable pause times

After a Major GC

CMS Notes

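  • Minor GCs work the same as in Parallel GC
  • Promotion into old gen is slower: without compaction, space has to be found in free lists
  • Falls back to a single-threaded, stop-the-world full GC on concurrent mode failure
  • Old gen GC is a quick stop-the-world initial mark, a concurrent mark, a short stop-the-world remark, and then a concurrent sweep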

After a Major GC

Concurrent Mark and Sweep (CMS)

Pros

  • Shorter pause times

Cons

  • No compaction
  • Higher CPU usage
  • Unpredictable pause times
  • Deprecated (as of JDK 9)

G1 Memory layout

G1 Notes

  • Heap is broken up into many equal-sized regions, each used as young, survivor, or old gen
  • Minor GCs collect the young regions, mixed GCs add some old gen regions, and both gather stats for major GCs
  • Keeps track of “liveness” to determine which old gen regions to collect

G1 Notes Cont

  • Has much better pause time targeting
  • Copies live objects out of regions, which compacts memory (during minor and major GCs)
  • Large (“humongous”) objects get their own regions
    • Prior to 1.8.0_40, only collected during full GC
    • Now collected every few minor GCs

Tuning G1

  • Set min and max memory
    • -Xmx, -Xms
  • Set desired pause time
    • -XX:MaxGCPauseMillis=n
  • Setting other options explicitly (eden size, region size, etc.) can cause the GC to ignore the max pause time setting.
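
To confirm that the flags above actually reached the JVM, the process can print its own startup arguments and maximum heap size using standard APIs. This is a small sketch. The class name `GcSettingsReport` is made up, and the 4g and 200 in the usage line below are illustrative values, not recommendations.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of any library
class GcSettingsReport {
    /** Maximum heap the JVM will try to use, in megabytes (driven by -Xmx). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Flags passed on the command line, e.g. -Xmx4g or -XX:MaxGCPauseMillis=200
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM flags: " + jvmArgs);
        System.out.println("Max heap:  " + maxHeapMb() + " MB");
    }
}
```

Usage, assuming the class has been compiled: `java -Xms4g -Xmx4g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 GcSettingsReport`.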

Are you gonna make another pro and con list? I'm gonna kill myself.

-- Michael Scott

Garbage First (G1)

Pros

  • Low pause times
  • Collects old gen incrementally, alongside minor GCs
  • Compaction during all GCs
  • Fewer full GCs

Cons

  • Higher CPU usage
  • Works best with lots of memory
  • Really hates huge objects
  • Not as familiar

What should I use?

  • Parallel GC
    • ETLs
    • Batch processes
    • Small heaps
  • G1
    • Web services
    • Client facing apps
    • Anywhere pause time is important
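
Before switching collectors, it helps to check which one a JVM is already running. The standard `GarbageCollectorMXBean` API reports the active collectors along with their collection counts and times. `ActiveCollectors` is a hypothetical class name, and the bean names vary by JVM. On HotSpot they are typically names such as "PS Scavenge" for Parallel GC or "G1 Young Generation" for G1.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library
class ActiveCollectors {
    /** Names of the collectors the running JVM has registered. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available"
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```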

What should I avoid?

  • CMS
    • Worse than G1 in almost every way
    • Slightly lower overhead
    • Handles large objects slightly better
    • Is officially deprecated (JDK 9) and removed in JDK 14
  • Serial GC
    • Only good for single-core machines

Things GC hates

  • Large objects
  • Memory pools
  • Finalizers
  • Weak References
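
Weak references are one example of the extra bookkeeping on this list: the collector has to track each one and clear it once the referent is no longer strongly reachable. The sketch below uses the standard `java.lang.ref.WeakReference` class, and `WeakRefDemo` is a hypothetical name. Since `System.gc()` is only a hint, the second printed line is not guaranteed to read `true`.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of any library
class WeakRefDemo {
    /** A weakly referenced object cannot be cleared while a strong reference is still in use. */
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        // 'strong' is used in this comparison, so the referent is still strongly reachable
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println("alive while strongly held: " + survivesWhileStronglyHeld());

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();   // only a hint; the JVM may or may not clear the reference here
        System.out.println("cleared after losing the strong ref: " + (weak.get() == null));
    }
}
```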

Useful links

Questions?