Why and how the external STG interpreter is useful
Csaba Hruska
GRIN Compiler
Optimizer for lazy and strict functional languages with optional whole program analysis support
Two sub-projects:
GRIN IR: Urban Boquist’s PhD thesis (1999); Thomas Johnsson (advisor, inventor of compiled graph reduction, G-Machine)
Haskell frontend: GHC whole program compiler project
https://github.com/grin-compiler/ghc-whole-program-compiler-project
GHC Haskell frontend: GHC-WPC, exports IR of the whole program
new output format: .modpak, .ghc_stgapp (module IR + program dependencies)
new output binary: _cbits.a, _stubs.a (package’s C + generated FFI glue code)
components (cabal projects):
Goal: Observability
GHC compiler pipeline is a black box
Haskell runtime evaluation is a black box
I’d like to know these:
.modpak & .fullpak are zip files
Zip file format = standard container
Zstd compression = speed and space efficiency
One .modpak for each Haskell module, containing:

The .fullpak contains the full program: the content of all .modpak files combined.
Compiler pipeline design
External STG
Small package, independent of GHC
External STG is a self-contained copy of GHC STG IR data type
STG defines the evaluation of Haskell programs (operational semantics)
STG is a simple functional language
Project-wide unique names
Why not GHC Core? Core changes too often; STG is stable.
STG = Core without type lambdas, in A-Normal Form, to make codegen simpler
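To illustrate what "A-Normal Form" means here, the following is a minimal sketch on a toy expression language. It is not GHC's actual STG data type: `Src`, `Expr`, `Atom`, `toAnf`, and the generated names `t0`, `t1`, ... are all hypothetical and exist only for this illustration. The point is that function arguments must be atoms (variables or literals), so every nested call is first bound to a name.

```haskell
-- Toy source language: applications may be arbitrarily nested.
data Src = SVar String | SLit Int | SApp String [Src]

-- Toy ANF target (NOT GHC's StgExpr): arguments are atoms only.
data Atom = Var String | Lit Int
  deriving (Eq, Show)

data Expr
  = AtomE Atom
  | App String [Atom]        -- function applied to atoms only
  | Let String Expr Expr     -- names an intermediate result
  deriving (Eq, Show)

-- Convert a nested expression to ANF by let-binding every
-- non-atomic argument to a fresh name (t0, t1, ...).
toAnf :: Src -> Expr
toAnf src = fst (go 0 src)
  where
    go :: Int -> Src -> (Expr, Int)
    go n (SVar v) = (AtomE (Var v), n)
    go n (SLit i) = (AtomE (Lit i), n)
    go n (SApp f args) =
      let (binds, atoms, n') = foldl step ([], [], n) args
          call = App f (reverse atoms)
          wrap (name, rhs) body = Let name rhs body
      in (foldr wrap call (reverse binds), n')

    -- Accumulates let-bindings and argument atoms (both in reverse order)
    -- together with the next unused name counter.
    step :: ([(String, Expr)], [Atom], Int) -> Src
         -> ([(String, Expr)], [Atom], Int)
    step (bs, as, k) (SVar v) = (bs, Var v : as, k)
    step (bs, as, k) (SLit i) = (bs, Lit i : as, k)
    step (bs, as, k) arg@(SApp _ _) =
      let (rhs, k') = go k arg
          name = "t" ++ show k'
      in ((name, rhs) : bs, Var name : as, k' + 1)
```

With this sketch, the source expression `f (g x)` is rewritten to the equivalent of `let t0 = g x in f t0`.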
External STG Interpreter
Goal: Run all Haskell applications that GHC can compile
Be independent of GHC
Design: Simple Haskell, purely functional design (StateT StgState IO)
Heap model: IntMap with monotonic address space, no address reuse

High-level model of GHC primops and Runtime System
(manually extracted from the native backend implementation)

Validation: GHC testsuite + QuickCheck tests for simple primops
When things go wrong, we get a nice pattern-match error instead of a segfault
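The heap model described on this slide (an IntMap with a monotonic address space and no address reuse, inside StateT StgState IO) could look roughly like the sketch below. Only `StgState`, `StateT`, `IO`, and `IntMap` come from the slide; the field names, `HeapObject`, `allocate`, `free`, and `readHeap` are assumptions made for illustration, and the real interpreter state is far richer.

```haskell
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, gets, modify', put)
import Data.IntMap.Strict (IntMap)
import qualified Data.IntMap.Strict as IntMap

-- Hypothetical, heavily simplified heap object.
data HeapObject
  = Con String [Int]   -- constructor name and argument addresses
  | Blackhole
  deriving (Eq, Show)

-- Hypothetical slice of the interpreter state.
data StgState = StgState
  { ssHeap     :: IntMap HeapObject  -- address -> object
  , ssNextAddr :: !Int               -- only ever grows
  }

type M = StateT StgState IO

emptyState :: StgState
emptyState = StgState IntMap.empty 0

-- Allocation always hands out a fresh address from the counter.
allocate :: HeapObject -> M Int
allocate obj = do
  s <- get
  let addr = ssNextAddr s
  put s { ssHeap = IntMap.insert addr obj (ssHeap s), ssNextAddr = addr + 1 }
  pure addr

-- Removing an object does not touch the counter,
-- so the freed address is never handed out again.
free :: Int -> M ()
free addr = modify' $ \s -> s { ssHeap = IntMap.delete addr (ssHeap s) }

-- Looking up a dead address yields Nothing rather than stale data.
readHeap :: Int -> M (Maybe HeapObject)
readHeap addr = gets (IntMap.lookup addr . ssHeap)
```

One consequence of this design is that an address uniquely identifies an object for the whole run, which makes dangling references detectable and traces easier to correlate.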
Experience & Insights
The interpreter is surprisingly useful for various things:
Interpreter = Learning tool
The STG interpreter has its own implementation of the primops and the RTS in Haskell (independent of GHC).
If you can read Haskell, then you will understand:
Interpreter = Debugger & Profiler
It’s easy to collect and observe runtime data in the pure State monad.
New property to track = new field in StgState data type
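As a minimal sketch of "new property to track = new field in StgState", the example below counts how often each closure is entered. The field names and the `recordClosureEntry` hook are hypothetical and not taken from the interpreter's source; only the idea of extending the state record comes from the slide.

```haskell
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Hypothetical slice of the interpreter state.
data StgState = StgState
  { ssStepCount    :: !Int
  , ssClosureCalls :: Map String Int  -- the newly added field: entries per closure
  }

type M = StateT StgState IO

initialState :: StgState
initialState = StgState 0 Map.empty

-- Hypothetical hook, assumed to be called whenever a closure is entered.
recordClosureEntry :: String -> M ()
recordClosureEntry name = modify' $ \s -> s
  { ssStepCount    = ssStepCount s + 1
  , ssClosureCalls = Map.insertWith (+) name 1 (ssClosureCalls s)
  }
```

The interpreted program needs no changes for this: the counter lives entirely in the interpreter's state and can be read out after the run, for example with `execStateT`.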
Profiler features:
Debug features:
Tech stack for debugging
Gephi - visualization, exploratory data analysis
Souffle datalog - analysis of the exported program state and traces
Haskell - collect data during the interpretation
.tsv file format for traces and output data (tab separated values)
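Exporting trace data as tab-separated values could be done along the lines of the sketch below. The `TraceEvent` record and its columns are assumptions for illustration; the slides do not specify the actual trace schema.

```haskell
import Data.List (intercalate)

-- Hypothetical trace record; the real columns are not given in the slides.
data TraceEvent = TraceEvent
  { teStep    :: Int
  , teClosure :: String
  , teAddress :: Int
  }

-- A tab or line break inside a field would break the row structure,
-- so replace those characters with spaces.
sanitize :: String -> String
sanitize = map (\c -> if c `elem` "\t\r\n" then ' ' else c)

-- One event becomes one line with tab-separated columns.
toTsvRow :: TraceEvent -> String
toTsvRow (TraceEvent step closure addr) =
  intercalate "\t" [show step, sanitize closure, show addr]

-- Header-less output: one row per event, each terminated by a newline.
renderTrace :: [TraceEvent] -> String
renderTrace = unlines . map toTsvRow

writeTrace :: FilePath -> [TraceEvent] -> IO ()
writeTrace path = writeFile path . renderTrace
```

A plain row-per-event text format like this is easy to load into other tools for analysis and visualization.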
Ext-STG debugger defects
Perfect program-point precision at the STG level, but not at the Haskell level
Problem: GHC does not provide enough source location information ➞ STG program points cannot be mapped to Haskell source accurately
Solution: connect STG code to Haskell manually, check the STG/Core IRs

Precise module-level program-point granularity
Manual mapping is needed for module internals
Debug and profile experience
GHC
STG Interpreter
Haskell design defects
Requiring invasive code changes by design is wrong!
cost centres = bad developer experience, blocks debugging
profile mode & cost centre recompilation = bad UX/DX

GHC is a black box
The Runtime System is a black box

It’s bad when people make their code less readable or lower level just to allow debugging or better performance.
Make the optimizer and the runtime evaluation observable instead.
It is bad when the GHC optimizer’s capabilities shape the Haskell programming style.
Accidental complexity = community damage
How much effort would you invest to learn the runtime evaluation of Haskell programs?
For GHC Haskell you need to understand:
Core, STG, Cmm, C, primops, GHC’s codegen, the RTS, and design decisions
What if this exceeds your resource budget?
Would you give up and switch languages?
Would you wait for others to fix your problem or explore your research idea?
It is a myth that Haskell evaluation is complex and tricky.
It’s an engineering and implementation issue.
My experience with GHC dev community
no feedback
no brainstorming
You think differently ➞ You are alone
DEMO
Simple program
OpenGL minigame
GHC compiling hello.hs
Conclusion
Developer experience matters
Persistent IR
Simplicity
Observability
Data visualization
Support on Patreon
patreon.com/csaba_hruska