1 of 36

ESM Loaders

Bradley Farias

2 of 36

What is an ESM "Loader"

  • Way of using ESM loading spec hooks

3 of 36

What is an ESM "Implementation"

  • Way of providing the spec hooks a Loader uses
    • Generally enforces some specification invariants
      • Some specification invariants are left to be enforced by Loaders
    • Generally not the exact same API as spec text
    • Will not be discussed in this deck

4 of 36

A note on conformance

  • Hosts may perform optimizations, in either Loaders or Implementations, that do not follow the spec text exactly, as long as semantics are preserved and the optimization is not observable. See: conformance criteria and algorithm conventions.
  • Spec text is not implementation text. See: an example of expected optimization in contradiction with spec text implementation.

5 of 36

What HostResolveImportedModule does

  • Intercepts incoming `import` requests
  • Returns a Module Record for the request
    • May return different types of Modules, must extend Abstract Module Record
  • Always returns the same Module Record for a given pair of {referringModule, specifier}.
    • Note: referringModule is not a string, but often is represented as a string. It is a Module Record.
  • For the rest of this deck, it will be called an entry point of the "resolve" hook

6 of 36

What HostImportModuleDynamically does

  • Intercepts incoming `import()` requests
  • Returns a Module Record for the request asynchronously
    • May return different types of Modules, must extend Abstract Module Record
  • Always returns the same Module Record for a given pair of {referringModule, specifier}.
    • Note: referringModule is not a string, but often is represented as a string. It is a Module Record.
  • For the rest of this deck, it will be called an entry point of the "resolve" hook

7 of 36

What is the "resolve" hook

  • Way of intercepting all forms of `import/import()`
  • Follows requirements of all entry points to the resolve hook.

8 of 36

Resolve hook API - Requirements

  • Gets 2 parameters
    • referringModule
    • specifier
  • Returns Abstract Module Record subclass instance
    • For V8 this means converting to a Source Text Module (or a WASM Module soon)
      • E.g. CJS `module.exports = 123` could be transformed to: `export default way_of_getting_module_exports();`
      • The use of a Module Record to represent a different Value (Source Text Module to represent a CommonJS Module) will be referred to as using a Module Facade
      • The alteration of source text to become ESM will be referred to as Compiling to ESM
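
A minimal sketch of a Module Facade produced by Compiling to ESM, assuming a host-side registry of CommonJS exports. `cjsExportsByKey`, `evaluateCommonJS`, and `compileToFacade` are hypothetical names for illustration; `way_of_getting_module_exports` is the placeholder from the bullet above, not a real Node.js API.

```javascript
// Hypothetical host-side registry: module key -> evaluated CommonJS exports.
const cjsExportsByKey = new Map();

// Placeholder from the slide: hands the facade the real CommonJS value.
function way_of_getting_module_exports(key) {
  return cjsExportsByKey.get(key);
}

// Evaluate CommonJS source text and remember its `module.exports`.
function evaluateCommonJS(key, sourceText) {
  const module = { exports: {} };
  new Function("module", "exports", sourceText)(module, module.exports);
  cjsExportsByKey.set(key, module.exports);
}

// "Compiling to ESM": emit ESM source text that stands in for the CJS module.
function compileToFacade(key) {
  return `export default way_of_getting_module_exports(${JSON.stringify(key)});`;
}

evaluateCommonJS("file:///app/answer.js", "module.exports = 123");
const facadeSource = compileToFacade("file:///app/answer.js");
console.log(facadeSource);
```

The facade is valid ESM source text, but the value it exposes is still produced by CommonJS evaluation.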

9 of 36

A Note on Facades vs Spec

  • In order to achieve compliance we need to generate valid ESM Facades for all our source texts
  • If we transform valid and recognized ESM source text to an ESM Facade, it is no longer ESM. We can call it whatever we want (Node Modules sounds bad?), but we have changed the semantics of the grammar productions and are compiling to ESM.

10 of 36

Resolve hook API - Implemented

  • Gets 3 parameters
    • specifier
    • parentModuleURL
    • defaultResolver
      • Support for multiple loaders has not landed on master yet; this needs some minor changes

11 of 36

Resolve hook API - Implemented

  • Returns
    • url : URL string
      • The absolute URL (or builtin name) that will be loaded
    • format : enum { "cjs", "esm", "dynamic", "builtin", "json", "addon" }
      • "dynamic" invokes a specialized facade factory to ease generation of ESM facades
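
A rough sketch of a hook with the shape described on these two slides: three parameters in, `{ url, format }` out. The `.json` rule and `fakeDefaultResolver` are assumptions made up for this demo; in a real loader file the function would be exported and the host would supply the default resolver.

```javascript
// Sketch of a resolve hook: (specifier, parentModuleURL, defaultResolver).
function resolve(specifier, parentModuleURL, defaultResolver) {
  // Illustrative rule only: claim ".json" specifiers, defer everything else.
  if (specifier.endsWith(".json")) {
    return { url: new URL(specifier, parentModuleURL).href, format: "json" };
  }
  return defaultResolver(specifier, parentModuleURL);
}

// Stand-in for the host's default resolver (hypothetical, demo only).
const fakeDefaultResolver = (specifier, parentModuleURL) => ({
  url: new URL(specifier, parentModuleURL).href,
  format: "esm",
});

const jsonResult = resolve("./config.json", "file:///app/main.mjs", fakeDefaultResolver);
const esmResult = resolve("./dep.mjs", "file:///app/main.mjs", fakeDefaultResolver);
console.log(jsonResult, esmResult);
```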

12 of 36

Resolve hook API - Personal Review

  • Parameters
    • When we add multiple loaders we should use a parent dispatch instead of calling it `default`
      • See PR that needs rebase (stuck in moratorium)
  • Returns
    • url : URL string
      • This is not always a URL, it should not be called `url`
    • format : enum { "cjs", "esm", "dynamic", "builtin", "json", "addon" }
      • `dynamic` is limited in facade capabilities; allowing a loader to return source text + format would be better for generating complex ESM facades
      • `dynamic` is problematic for composition: if you receive `dynamic` from another loader, what do you do with it?
      • Usage of a special enum seems limiting and coordination might get complex
        • Moving to shared content negotiation with web would allow easier mingling

13 of 36

Resolve hook API - Personal Proposed Path

  • Gets 2 parameters
    • referringModule : URL? string
      • The key under which the referringModule Module Record is stored in the "global Module Map"
    • specifier : string
      • Non-controversial, import specifiers are strings
  • Use a global to defer to the parent loader
    • More on this later, relates to isolation / threads
    • await parent.resolve(parameters) -> returnValue;

14 of 36

Resolve hook API - Personal Proposed Path

  • Returns:
    • key : URL? string
      • Cache key to load from the global module map
      • Needed for things that manipulate where things are stored for differing cache mechanisms
    • body : Blob?
      • If present, and if the global module map does not have `key`, create a new Module Record corresponding to this Blob at `key`
        • Used to create "synthetic modules"
        • Cannot replace existing entries in the global module map.
      • type : string
        • Similar API to how Web uses `Content-Type` for specifying type of resource / module body
      • #blobData : Opaque
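
One possible reading of the `key` / `body` rules above, as a sketch. `moduleMap` and `commitResolveResult` are hypothetical stand-ins for the global module map and the host step that consumes a resolve result; the global `Blob` assumes a recent Node.js release.

```javascript
// Hypothetical stand-in for the "global module map": key -> module record.
const moduleMap = new Map();

// Apply a resolve result of the proposed shape { key, body? }.
function commitResolveResult(result) {
  // Existing entries always win; `body` cannot replace them.
  if (moduleMap.has(result.key)) return moduleMap.get(result.key);
  // No body: the host would load `key` itself (not modeled here).
  if (!result.body) return undefined;
  // Body present and key unused: create a "synthetic module" entry.
  const record = { key: result.key, type: result.body.type, body: result.body };
  moduleMap.set(result.key, record);
  return record;
}

const firstRecord = commitResolveResult({
  key: "synthetic:config",
  body: new Blob(["export default 1;"], { type: "text/javascript" }),
});
const secondRecord = commitResolveResult({
  key: "synthetic:config",
  body: new Blob(["export default 2;"], { type: "text/javascript" }),
});
console.log(firstRecord === secondRecord);
```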

15 of 36

Resolve hook API - Personal Proposed Path

  • Loader Coordination
    • Use transferable-compatible data for all the input/output
      • Ensures Worker compatibility, and keeps a subset of functionality usable across processes / serialization
    • Allow inter-loader communication by adding "HostDefined" structures to input/output
      • params.data : any
      • returnValue.data : any
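
A sketch of how loaders might coordinate through `data` while every value crossing a loader boundary is structured-cloned. The `link` helper and the loader bodies are hypothetical, and `parent` is passed as an argument here for brevity where the deck proposes a global. `structuredClone` assumes a recent Node.js release.

```javascript
// Wrap `loader` so that everything entering or leaving it is cloned,
// as it would be if the loader lived in its own Realm or thread.
function link(loader, parent) {
  return {
    async resolve(params) {
      const result = await loader(structuredClone(params), parent);
      return structuredClone(result);
    },
  };
}

// Last loader in the chain: resolves relative to the referrer.
const top = link(async ({ referringModule, specifier, data }) => ({
  key: new URL(specifier, referringModule).href,
  data: { ...data, resolvedBy: "top" },
}), null);

// First loader: attaches a hint in `data`, then defers to its parent.
const first = link(async (params, parent) => {
  const hinted = { ...params, data: { hint: "prefer-cache" } };
  return parent.resolve(hinted);
}, top);

const pending = first.resolve({
  referringModule: "file:///app/main.mjs",
  specifier: "./dep.mjs",
  data: null,
});
pending.then((result) => console.log(result));
```

Because of the cloning step, a value that cannot be serialized (a function, for example) is rejected at the boundary instead of leaking between loaders.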

16 of 36

Resolve hook API - Personal Proposed Path

  • Loader Isolation
    • Always spin up loaders in their own Realm (Context)
    • Document the safe subset of transferables for serializing request/response to disk
      • No SharedArrayBuffer, MessagePort, ???

17 of 36

Resolve hook API - Personal Proposed Path

  • Loader Isolation
    • Always use threads to run userland loader code
      • Allows loaders to be scaled up/down
        • Only of real benefit for compute heavy loaders

18 of 36

Resolve hook API - Personal Proposed Path

  • Loader Isolation
    • Allow per package isolation of loaders
      • Allows packages to use loaders without conflict, similar to local dependencies
        • E.g. Useful for packages wanting web cache / web resolution to be loaded without requiring entire process to use web cache / web resolution
      • Outstanding question of coalescing loaders
        • E.g. if package `a` uses web caching and package `b` uses web caching, can we use same loader instance for both?
        • Need to document what is "unsafe" to do in loader, like storing data in globals
          • Can mitigate if we document that Node.js may kill your loader at times, similar to service workers

19 of 36

Resolve hook API - Integration

  • CLI - implemented for single + proposed for multiple loaders
    • Useful for application specific options
    • Not suitable for hashbang style execution due to some shells not parsing the full hashbang
    • node --loader babel-loader --loader web-style-caching --loader web-style-resolution
  • ENV - implemented for single + proposed for multiple loaders
    • Useful for generic tooling options
      • loads before CLI
      • APMs / code coverage / etc. should use this
    • Not suitable for hashbang style execution
    • NODE_OPTIONS='--loader babel-loader ...' node
  • Package.json? - proposed
    • Useful for package specific options
    • Suitable for hashbang style execution
    • Allows per package isolation of loaders

20 of 36

Resolve hook API - Impl significant diff with Proposed

  • Parameters / return value
    • Remove term URL since it might be a platform provided specifier
    • More tightly couple facades and resolution together
      • Allows easier creation of synthetic modules
      • More closely encapsulates the request/response structure of the spec
  • Loader coordination
    • Must be done over a message passing system, not in same Realm due to isolation
      • Can have a "meta" loader that recreates the same Realm loader coordination
  • Loader Isolation
    • Allow threading / easier serialization enforcement
    • Can't easily integrate with vm.Module
      • Can be done with some postMessage tricks if we expose something like clients… needs more research

21 of 36

Resolve hook - My research branch

  • Ongoing research and proof-of-concept attempts for the Proposed resolve hook API, using userland tests in my fork.
    • Important proofs
      • DONE: Babel loader + scaling meta-loader
        • Babel AST loading is compute heavy and has per file isolation, an ideal candidate for a generic scaling meta-loader
      • WIP: Logging / Cache
        • Shows how a persisted cache could get necessary data in/out
      • WIP: Blacklisting
        • Shows privilege granting mechanism / per module specialized behavior

22 of 36

Resolve hook - My research branch

  • Ongoing research and proof-of-concept attempts for the Proposed resolve hook API, using userland tests in my fork.
    • Important proofs
      • TODO: package.json integration
        • Allows hashbang style executables to use loaders reliably
      • TODO: Web Resolution + Web Cache
        • Shows custom resolution & cache system that differ from Node implementations
      • DONE: Atomics.wait lock on main thread
        • Shows using async loaders that act as if they were synchronous and blocking from main thread perspective

23 of 36

Loader Threading Investigations

Bradley Farias

24 of 36

Composition Layout

  • Diagram: the Main Thread sends {referrer, specifier, data} to Loader Thread "first", which replies with {key, body, data}.

25 of 36

Composition Layout

  • Diagram: Main Thread, Loader Thread "first", Loader Thread "parent". Each hop forwards {referrer, specifier, data} and returns {key, body, data}.

26 of 36

Composition Layout

  • Diagram: Main Thread, Loader Thread "first", Loader Thread "parent", Loader Thread "top". Each hop forwards {referrer, specifier, data} and returns {key, body, data}.

28 of 36

Composition Layout

  • The chain should be allowed to return eagerly (not delegate) if delegation is undesirable
  • Re-serialization costs are visible; they shrink if we can make `body` transfer less costly
    • Cache performance benchmark for replay has not been researched yet
  • Isolation of values between loaders allows a generic IPC implementation; loaders can call out to other processes or even the network with well-defined payloads.

29 of 36

Scaling Layout

  • Diagram: the Main Thread dispatches {referrer, specifier, data} to one of Loader Threads 1, 2, 3 (each running the "first" loader) and receives {key, body, data}; when no thread is free, the request waits for an available one.

32 of 36

Scaling Layout

  • Does not greatly affect IO-heavy loaders; Node already handles single-threaded IO well.
  • The non-singleton guarantee could apply beyond threads if we adopt a cross-process or even network-based protocol. See: what Cloudflare did with service workers
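
A toy model of the "Wait for available" step in the scaling layout, assuming interchangeable (non-singleton) loader threads. `LoaderPool` is a hypothetical name; the threads are represented by plain ids so the scheduling logic can be read on its own.

```javascript
// Hypothetical dispatcher: hands each request to an idle loader thread,
// or queues it until one becomes available.
class LoaderPool {
  constructor(workerIds) {
    this.idle = [...workerIds];
    this.waiting = [];
  }

  // `run(workerId, done)` starts a request; it must call `done()` when finished.
  dispatch(run) {
    const workerId = this.idle.shift();
    if (workerId === undefined) {
      this.waiting.push(run); // "Wait for available"
      return;
    }
    run(workerId, () => this.release(workerId));
  }

  // A finished thread takes the oldest queued request, or goes idle.
  release(workerId) {
    const next = this.waiting.shift();
    if (next) next(workerId, () => this.release(workerId));
    else this.idle.push(workerId);
  }
}

const pool = new LoaderPool([1, 2, 3]);
const assignments = [];
const finishers = [];
for (const specifier of ["a", "b", "c", "d"]) {
  pool.dispatch((workerId, done) => {
    assignments.push([specifier, workerId]);
    finishers.push(done);
  });
}
finishers[1](); // thread 2 finishes "b" and picks up the queued "d"
console.log(assignments);
```

Since any thread may serve any request, a loader cannot rely on state kept in its own globals, which is the non-singleton guarantee mentioned above.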

33 of 36

Head of Line Mitigations

  • Diagram labels: Main Thread, User Loader Chain, Default Loader, off-thread parsing.

34 of 36

Head of Line Mitigations

  • Running main resolver on main thread causes head of line issue for "top" resolve operation
    • Move main resolver to own thread, or run at top userland thread
  • JS/WASM parser can be run off main thread
    • v8::Locker means that main thread is still locked from V8 compute during parsing
      • Still have a head of line issue :-(
    • V8 does not support streaming parser for Modules (yet)

35 of 36

Acting Synchronously

  • Diagram: the Main Thread blocks on Atomics.wait while the User Loader Chain resolves the request.

36 of 36

Acting Synchronously

  • Main thread can request a resolution using a SharedArrayBuffer for blocking behavior
    • Use SharedArrayBuffer as message bus and spin lock
    • Requires messages between Loaders be serializable to binary format
      • Bus size matters here and we might want it to be configurable? Forcing IO Block size seems sane though.
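
A minimal sketch of the blocking pattern, assuming Node.js `worker_threads`. The buffer layout (two `Int32` header slots followed by UTF-8 bytes), the `resolveSync` name, and the 4096-byte default bus size are choices made for this example only, not the research branch's actual protocol.

```javascript
const { Worker } = require("node:worker_threads");

// Loader thread body (hypothetical): resolve one request, write the answer
// into the shared buffer, then wake the main thread.
const loaderSource = `
const { workerData } = require("node:worker_threads");
const { shared, referrer, specifier } = workerData;
const header = new Int32Array(shared, 0, 2); // [0] = done flag, [1] = byte length
const bytes = new TextEncoder().encode(new URL(specifier, referrer).href);
new Uint8Array(shared, 8).set(bytes);
Atomics.store(header, 1, bytes.length);
Atomics.store(header, 0, 1);
Atomics.notify(header, 0);
`;

function resolveSync(referrer, specifier, busSize = 4096) {
  const shared = new SharedArrayBuffer(8 + busSize);
  const header = new Int32Array(shared, 0, 2);
  const worker = new Worker(loaderSource, {
    eval: true,
    workerData: { shared, referrer, specifier },
  });
  // Block the main thread until the loader flips the flag (or 5 s pass).
  if (Atomics.wait(header, 0, 0, 5000) === "timed-out") {
    worker.terminate();
    throw new Error("loader thread did not answer in time");
  }
  const length = Atomics.load(header, 1);
  // Copy out of shared memory before decoding.
  const key = new TextDecoder().decode(new Uint8Array(shared, 8, length).slice());
  worker.terminate();
  return key;
}

const resolvedKey = resolveSync("file:///app/main.mjs", "./dep.mjs");
console.log(resolvedKey);
```

From the caller's perspective `resolveSync` is an ordinary synchronous function, even though the loader runs on another thread. A response larger than the bus would not fit in this sketch, which is where the bus size question above comes in.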