Hierarchical Compilation
aka Regional compilation
Motivation
Two groups - Full model and Hierarchical compilation
Hierarchical Compilation
Dynamo - What happens with inbuilt torch nn modules?
[Diagram: a UDF nn module class is instantiated into a UDF nn module object (which has a linear module). torch.compile on that object produces a compiled graph that holds a pointer to the linear layer.]
Repercussion - Recompilation
[Diagram: the same UDF nn module class is instantiated twice, and each UDF nn module object has its own linear module. torch.compile on the first object produces a compiled graph with a pointer to that object's linear layer. torch.compile on the second object produces another compiled graph: Recompile!!!]
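The recompilation effect can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not Dynamo's implementation: `ToyLinear`, `ToyBlock` and `IdentityGuardedCompiler` are hypothetical names invented here, and the "compiled graph" is just a closure that keeps a pointer to one specific linear layer and is guarded on that layer's identity.

```python
class ToyLinear:
    """Hypothetical stand-in for an inbuilt nn module (a linear layer)."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def __call__(self, x):
        return self.weight * x + self.bias


class ToyBlock:
    """Hypothetical stand-in for a UDF nn module that has a linear module."""

    def __init__(self, linear):
        self.linear = linear


class IdentityGuardedCompiler:
    """Toy compiler (assumption, not Dynamo): each compiled graph holds a
    pointer to the linear layer it was traced with and is reused only for
    that exact object."""

    def __init__(self):
        self.cache = []          # list of (guarded linear layer, compiled fn)
        self.compile_count = 0

    def _compile(self, linear):
        self.compile_count += 1

        def compiled(x):
            # The graph calls the captured layer; it is not an input.
            return linear(x)

        return compiled

    def run(self, block, x):
        for guarded_linear, compiled in self.cache:
            if guarded_linear is block.linear:   # identity guard
                return compiled(x)
        compiled = self._compile(block.linear)   # guard failed: (re)compile
        self.cache.append((block.linear, compiled))
        return compiled(x)


compiler = IdentityGuardedCompiler()
block_a = ToyBlock(ToyLinear(2.0, 1.0))   # first instance of the class
block_b = ToyBlock(ToyLinear(3.0, 0.0))   # second instance, same class

out_a1 = compiler.run(block_a, 1.0)       # first compile
out_a2 = compiler.run(block_a, 2.0)       # cache hit, same object
out_b = compiler.run(block_b, 1.0)        # different object: recompile
print(compiler.compile_count)
```

In this toy, two instances of the same class cost two compilations even though their structure is identical, which is the situation the diagram describes.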
Inline through inbuilt nn modules
No inlining
Inlining
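A toy contrast for the inlining case, again in plain Python and under stated assumptions: `ToyLinear`, `ToyBlock` and `InliningCompiler` are hypothetical names, and the guard shown (module types only) is a simplification chosen for illustration. The point of the sketch is that once the linear layer's body is inlined, its weight and bias become inputs to the compiled function, so a second instance of the same class can reuse it.

```python
class ToyLinear:
    """Hypothetical stand-in for an inbuilt nn module (a linear layer)."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias


class ToyBlock:
    """Hypothetical stand-in for a UDF nn module that has a linear module."""

    def __init__(self, linear):
        self.linear = linear


class InliningCompiler:
    """Toy compiler (assumption, not Dynamo): the linear layer is traced
    through, and its parameters are passed in as arguments instead of being
    reached through a pointer to one particular module object."""

    def __init__(self):
        self.cache = {}          # guard key -> compiled fn
        self.compile_count = 0

    def _compile(self):
        self.compile_count += 1

        def compiled(weight, bias, x):
            # Inlined body of the linear layer; no module object captured.
            return weight * x + bias

        return compiled

    def run(self, block, x):
        # Guard on structure (types), not on object identity.
        key = (type(block), type(block.linear))
        if key not in self.cache:
            self.cache[key] = self._compile()
        compiled = self.cache[key]
        return compiled(block.linear.weight, block.linear.bias, x)


inlining_compiler = InliningCompiler()
first = ToyBlock(ToyLinear(2.0, 1.0))
second = ToyBlock(ToyLinear(3.0, 0.5))

out_first = inlining_compiler.run(first, 1.0)    # compiles once
out_second = inlining_compiler.run(second, 1.0)  # reuses the same graph
print(inlining_compiler.compile_count)
```

The trade-off hinted at by the workstreams list is that a real system would need more guards per call in this mode (for example on parameter shapes and dtypes), which the toy omits.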
Workstreams for Phase 1
Guard overhead
CUDA graphs problem
Fix Dynamo latent issues and graph breaks
Phase 1 is useful for non-hierarchical compilation use cases
Even if we decide to not pursue hierarchical compilation, Phase 1 is still useful
Before Phase 2 - Focus on full model compile times