A Strong Mode for JavaScript (Strawman proposal)
Author: Andreas Rossberg <rossberg@google.com>
If it is too strong, then you are too weak.
With the advent of ECMAScript 6 (ES6), JavaScript sees many significant improvements. Classes, modules, block scoping, iterators, default and rest parameters, and the like can replace older, less well-behaved alternatives. For newly-written JavaScript code, it is finally possible to retire some of the “bad parts” of the language. … But by fully endorsing ES6, we can go even further!
Other aspects and properties of the language still make JavaScript development difficult at scale. JavaScript’s strength is its flexibility, but for larger applications, the cost of this flexibility can sometimes outweigh its benefits: real errors can easily be masked by “sloppy” semantics, complex implicit behaviour can cause various surprises, and producing decent performance predictably is hard. Unfortunately, none of these can be changed or removed without “breaking the web”.
We hence propose an experimental clean-up of the language in the form of a new strong mode. It essentially “subsets” the language, removing behaviours that are common correctness or performance pitfalls, or have been superseded by more structured alternatives in ES6.
Strong mode defines a subset of JavaScript with a stronger semantics. The goal is twofold: to make development easier, and to codify a reliable “contract” for predictable performance.
The focus is on removing features, not adding any. Any program running “correctly” in strong mode should run unchanged as ordinary JavaScript. Thus, strong mode programs can be executed on all VMs that can handle ES6, even ones that don’t recognise the mode.
Moreover, strong mode is fully interoperable with good old JavaScript, in the same manner that strict mode is. Strong functions can call "weak" functions or use "weak" objects, and vice versa.
Strict Mode: ES5 introduced strict mode as a remedy for some of the more outrageous design mistakes in JavaScript. However, strict mode is very limited in scope, as its design was primarily driven by the desire to still allow all existing JavaScript idioms. As a consequence, strict mode offers far less value than it could have. The strong mode we propose here implies strict mode, but is a somewhat more radical renovation of JavaScript semantics.
Types: We also intend to propose a (sound) gradual type system for JavaScript. This addresses similar goals, and can benefit significantly from some of the restrictions we propose here. However, it is a much more complex feature, and potentially interesting independent of strong mode. We hence defer such a type system to a separate proposal. Strong mode can be viewed as a transition path to such a typed JavaScript.
Implementation: Throughout this document, we have marked features that are already implemented in V8 or Traceur with [V] and [T], respectively.
To maintain backwards compatibility, programmers have to opt into strong mode by means of an explicit directive:
Note: This mechanism is analogous to ES5 strict mode.
Discussion: A viable alternative might be a module-level mode through a magic import declaration, such as import "strong". However, a mode directive has the significant advantage that any program not hitting any of the strong mode restrictions should run unchanged in a VM not recognising the directive, and no translation step should be required.
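For concreteness, a sketch of the opt-in. The string "use strong" is the directive used by V8's experiment; since a directive is an ordinary expression statement, a VM that does not recognise it simply ignores it:

```javascript
// Opting a whole script into strong mode; on a VM without strong mode
// this string is an inert expression statement and a harmless no-op.
"use strong";

// Alternatively, per function, analogous to "use strict":
function add(a, b) {
  "use strong";
  return a + b;
}
```

Because the directive is just a string literal, this code runs unchanged on any ES6 VM.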
The primary purpose of strong mode is not adding features to the language, but removing features from the language. It subsets the language in the sense that various constructs or behaviours that are legal in classic JavaScript will be flagged as errors. To that end, strong mode imposes restrictions that can broadly be classified into the following categories:
Details are described in the following section (see Language Restrictions).
There will also be changes to the ECMAScript library: when applied to strong objects, some methods will have more rigorous behaviour, and moreover, will create strong objects in return (see Library Built-ins).
Strong mode implies strict mode. Like strict code, strong code can freely interact with conventional code, whether strict or sloppy. In particular, strong code can call out into “weak” code and vice versa, and objects created in strong code can be passed to and accessed from weak code and vice versa.
Strong code and weak code will also share the same global object and the same set of built-in objects. However, new objects created in strong code will have the aforementioned restrictions in behaviour (“per-object restrictions”), and those apply both inside and outside strong code.
Strong mode implies strict mode.
Motivation: Strict mode already rules out some particularly bad behaviours, and there is no reason to regress to sloppy.
Strong mode embraces lexical scoping and completely disallows use-before-definition, avoiding any need for runtime checks when accessing variables.
Note: For compatibility with JS, the use-before-def restriction does not affect visibility. That is, variables are still visible before their declaration, including potential shadowing of same-named variables from outer scopes. However, the "temporal dead zone" from ES6 is turned into a static dead zone.
Motivation: Undeclared variables are a common source of errors in JavaScript, as are the bizarre rules for 'var' declarations. Disallowing use-before-def eliminates another source of runtime errors, and completely eliminates the cost of runtime checks on variable accesses. The reason to extend this rule to functions is that function “hoisting” can indirectly violate the use-before-def restriction for variables they close over.
Implications: Programmers have to be slightly more careful about order of declaration, especially with respect to functions. Moreover, feature detection in the global scope has to be done via explicit property access on the global object, or using typeof. “Polyfilling” global bindings has to be performed via a staged script.
Discussion: Instead of ruling out 'var' altogether, we could merely disallow hoisted 'var' declarations, or multiple 'var' declarations in the same scope. However, it seems cleaner to just move to 'let' consistently.
Although known from other languages, these rules for function declarations are relatively restrictive. In particular, no auxiliary declaration can be interspersed into a block of mutually recursive functions. An alternative would be to allow other declarations in between, as long as they don’t refer to any of the functions. However, that would require free-variable and dependency analysis.
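The difference can be illustrated in plain ES6, which already has the runtime dead zone that strong mode would turn into a static error:

```javascript
// ES6 'let' already has a runtime "temporal dead zone"; strong mode
// would reject the access below statically instead of at runtime.
function deadZone() {
  try {
    return x;        // 'x' is visible here, but not yet initialised
  } catch (e) {
    return e.name;   // "ReferenceError" under ES6 TDZ rules
  }
  let x = 1;         // unreachable, but the declaration still hoists
}

// By contrast, hoisted 'var' masks the error with 'undefined':
function hoistedVar() {
  const seen = y;    // no error: 'y' is hoisted and undefined
  var y = 1;
  return seen;
}
```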
In strong code, accessing objects (strong or not) throws on missing properties. New object properties have to be defined explicitly and cannot be removed from strong objects.
Note: New properties can still be introduced on extensible objects, but only through the explicit use of ‘Object.defineProperty’.
Duplicate properties in object literals were an early error in ES5, but ES6 removed this restriction because of computed property names.
Note: Special rules apply to the 'length' property on strong arrays to make it compatible with the restriction on writability reconfiguration (see Arrays).
Motivation: Silent property failures and the resulting proliferation of 'undefined' are the most prominent mistake in JavaScript, and can be rather tedious to debug (very much like null pointer exceptions in other languages, but much more common). The ability for every read to deliver the value ‘undefined’ also makes compile-time type analysis more difficult and thus some code more expensive than necessary.
Implicitly adding properties via assignment is brittle, because it will (silently) fail or do the wrong thing unexpectedly when there are setters or read-only properties on the prototype chain. Using ‘defineProperty’ is verbose but robust. Also, ES6 weak maps provide an alternative to expando properties.
Property deletion introduces a drastic performance penalty on most VMs; it is almost always preferable to null out unneeded properties instead. Banning it also immediately eliminates the ability to create holes in arrays after the fact (see Arrays). Use cases that abuse objects as maps can instead migrate to the more adequate ES6 map and set classes.
Prototype mutation is widely considered bad style and was only introduced into the ES6 standard for compatibility with the existing mobile web. It has very unfortunate implications for implementations and their performance, as well as for robustness.
JavaScript’s existing reconfiguration loophole regarding writability (a non-configurable property can still be reconfigured from writable to non-writable) allows arbitrary client code to break objects and their assumed contracts even when they were explicitly sealed. Closely related, it induces the need for extra runtime checks, and makes it impossible for a type system to track mutability properly.
Implications: Due to property access throwing, the common feature detection pattern for properties, “var x = o.a || v”, will no longer work. It has to be replaced by either an explicit test (“let x = ‘a’ in o ? o.a : v”, or “let x = ‘a’ in o && o.a ? o.a : v”, depending on the use case) or by destructuring with a default (“let {a: x = v} = o”). This is actually preferable because the current pattern potentially misbehaves on properties that are present but happen to have a falsy value (which may or may not be desired).
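The replacement patterns are already valid ES6; a small sketch (using illustrative names o, v, and a hypothetical property 'b'):

```javascript
const v = 42;
const o = { a: 0 };                // note: 'a' is present, but falsy

// Old pattern: silently falls back even though 'a' is present.
const bad = o.a || v;              // 42 — probably not intended

// Explicit test:
const x1 = 'a' in o ? o.a : v;     // 0

// Destructuring with a default (triggers only on undefined):
const { a: x2 = v } = o;           // 0

// In strong mode, adding a new property requires an explicit definition:
Object.defineProperty(o, 'b', {
  value: 1, writable: true, enumerable: true, configurable: true
});
```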
Implications for Implementations: Maps and sets need to be optimised to a competitive degree.
Array objects behave more like conventional arrays. They are not allowed to contain holes, or have their ‘length’ property get out of sync with their actual length. Properties cannot be accessors or inherited. Array access is thereby guaranteed to be fast and reliable.
All restrictions for objects in general apply as well (see Objects).
Note: This does not affect the ability to write a trailing comma in an array literal, since that is not an elision syntactically.
Note: “Reject” is terminology from the ES specification. Its effect depends on the mode in which the respective operation is performed: in strict mode (and thus strong mode) it throws a TypeError, in sloppy mode it is silently ignored. Both can happen, because strong objects can be accessed in sloppy mode.
Motivation: Holes in arrays have very unfortunate implications on performance, as they require extra checks, representation changes, and enable inheritance of index properties. Thus, holes are completely ruled out for strong arrays; they can neither be created initially nor introduced later. New index properties can only be created one past the end (using ‘.push’ or ‘Object.defineProperty’, see Objects), and the only possible way to remove index properties is by shrinking the array’s length. ES6 maps are an adequate replacement for sparse arrays.
Similarly, the ability to reconfigure individual index properties makes access to them slow. In particular, removing this ability eliminates accessor properties on arrays, which is a very complicated but almost entirely useless feature of JavaScript.
The restriction on setting ‘length’ is necessary because growing an array via setting 'length' would necessarily introduce holes at the end. The other changes regarding reconfiguration of ‘length’ are required to maintain the invariant that only configurable properties can change from writable to non-writable (see Objects), while keeping the length in sync.
Essentially, array properties (indices and ‘length’) can only be reconfigured indirectly, e.g. by sealing or freezing the whole array.
Ruling out non-canonical number properties has two effects: it eliminates wrap-around semantics for array indexing (as is the case in current JavaScript semantics with its use of ToUint32 conversion), thus ruling out a class of subtle bugs; and it eliminates the corner case of out-of-bounds numbers silently becoming expando properties, preventing yet more subtle bugs, and also allowing more efficient read access. The same semantics already applies to ES6 typed arrays.
Open Questions: In strong mode, a different array constructor or factory would be needed to create arrays of a given length; it would take an initialiser argument along with the size. Alternatively, one could rely on instance methods to create larger arrays from small literals, e.g., a hypothetical “[0].repeat(100)”.
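Since strong mode is not available in ordinary ES6, its array rules can only be illustrated; the commented-out lines below are ones strong mode would reject:

```javascript
// Growing a strong array: new indices only one past the end.
const a = [1, 2, 3];
a.push(4);            // ok: index 3 is exactly one past the end
// a[10] = 0;         // would throw in strong mode: it creates holes
// a.length = 10;     // likewise: growing via 'length' creates holes
a.length = 2;         // shrinking is fine, and removes index properties

// Sparse use cases migrate to ES6 Map (no holes, no ToUint32 wrap-around):
const sparse = new Map();
sparse.set(1000000, "far away");
```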
Strong functions require passing sufficient arguments. ES6 default and rest parameters are a better substitute for under-application. The ‘arguments’ object is removed. Functions are not constructors.
Note: In order to be a restriction and not a change in semantics (and to avoid mistakes), referring to ‘arguments’ is an error even if it is bound in an outer scope.
Note: Needless to say, functions also won't have any of the non-standard 'arguments', 'caller', or 'callee' properties (see Non-standard Extensions).
Motivation: The 'arguments' object is superseded by ES6 default and rest arguments. It could not be handled by a type system in a useful way either.
Allowing too few arguments to be passed is a source of errors. Parameter count mismatch also creates significant extra complexity and cost in the implementation. ES6 default and rest parameters usually make this pattern unnecessary.
From the point of view of implementation, it would be beneficial if passing too many arguments could also be disallowed (and it might catch more errors, too). However, argument list width subtyping is sometimes beneficial.
Classes should be used consistently to define constructors, since they are more declarative and can provide better performance.
Implications for Implementations: Default and rest parameters need to be implemented and sufficiently optimised.
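A sketch of the intended style, using a hypothetical function greet; in strong mode, calling it with no arguments would throw, since 'name' has no default:

```javascript
// ES6 default and rest parameters replace 'arguments'-based patterns:
function greet(name, greeting = "Hello", ...rest) {
  return `${greeting}, ${name}!` +
         (rest.length ? ` (+${rest.length} more)` : "");
}
```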
Classes become the main object creation facility. At the same time, classes and their instances are locked down in several ways. This allows better performance, as well as more accurate typing (in presence of a type system).
Motivation: Freezing classes and sealing instances allows more accurate type checking and significantly more efficient handling of property lookup and method calls. In particular, a fixed prototype chain can be flattened into a "vtable"-like indexed method table, turning method calls into simple indirect jumps, even under subclassing (instead of the runtime checks and polymorphism currently necessary).
Likewise, sealed instances and type-fixed methods allow a more compact object representation and cheap field access on instances, because the layout of instance objects is fixed, no additional backing stores are needed, and no additional runtime checks are required.
The restricted use of ‘this’ inside the constructor prevents partially initialised (and not yet sealed) instance objects from leaking or being otherwise observable. This is useful both to prevent errors and to allow more aggressive optimisations. It is a prerequisite for sound typing.
Discussion: Restricting ‘this’ in constructors might be a rather severe restriction, because it rules out abstracting any parts of the initialisation work into other functions. Alternatively, the partially constructed object could be made available, but would not yet be regarded an instance (i.e., any attempt to invoke methods of the class on it would fail). This would induce (bearable?) extra checks in cases where it actually is used in a first-class manner. However, it is not entirely clear how it would interact with custom-allocating built-in classes. Yet a weaker option would be to allow any use of such partially constructed objects, except defining new properties on them, other than by the assignment syntax in the constructors (and access to not yet constructed properties would throw as usual, see Objects).
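The intended semantics can be approximated in ordinary ES6 with explicit reflection calls; strong mode would apply the freezing and sealing implicitly:

```javascript
// Approximation of strong-mode class semantics (explicit here, implicit
// in strong mode): frozen class, instances sealed after construction.
class Point {
  constructor(x, y) {
    this.x = x;          // all instance properties defined up front,
    this.y = y;          // before 'this' escapes anywhere
    Object.seal(this);   // layout is fixed from here on
  }
  norm() { return Math.sqrt(this.x * this.x + this.y * this.y); }
}
Object.freeze(Point);
Object.freeze(Point.prototype);

const p = new Point(3, 4);
// p.z = 5;  // rejected: sealed instance (throws in strict mode)
```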
The identifier ‘undefined’ cannot be rebound to a value other than ‘undefined’.
Motivation: Rebinding 'undefined' just leads to obfuscated code. It also can make uses of the constant slightly more costly. The restrictions effectively turn 'undefined' into a keyword. It can be used like a pattern, but will only match the value 'undefined'.
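The kind of obfuscation being ruled out is legal in today's JavaScript, as a sketch shows:

```javascript
// Legal in current JavaScript, a static error in strong mode:
function confusing() {
  let undefined = 7;   // 'undefined' is an ordinary identifier here
  return undefined;    // returns 7, not the undefined value
}
```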
Explicit use of direct ‘eval’ is disallowed.
Note: Indirect calls to ‘eval’ passed as a function from outside strong mode can still occur, naturally. In particular, this allows using (indirect) eval in strong code by merely renaming it.
Motivation: Direct calls to ‘eval’ prevent many useful optimisations on the caller. Eval is generally considered a dangerous feature (esp. with respect to security) and abused more often than not. Its most important use case is displaced by ES6 modules. Realms also provide explicit ‘eval’ methods.
Discussion: Alternatively, it would be sufficient to just ban direct calls to ‘eval’, but few programmers are aware of the distinction. (Or one could treat all calls as indirect, but that would no longer strictly be a restriction; rather, it would be a silent change in (scoping) semantics.)
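The direct/indirect distinction being discussed can be demonstrated in ordinary JavaScript:

```javascript
// Direct eval (banned in strong mode) sees the local scope:
function direct() {
  const secret = 1;
  return eval("secret");         // 1: direct call, local scoping
}

// Indirect eval, obtained by renaming, evaluates in global scope:
const geval = eval;
function indirect() {
  const secret = 1;
  try { return geval("secret"); }
  catch (e) { return e.name; }   // ReferenceError: no global 'secret'
}
```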
Switch is restricted to constant-time branching with no fall-through.
Motivation: Non-structured control flow in 'switch' statements cannot always be optimised, and it can easily be replaced by 'if' conditionals. Fall-through cases are a common pitfall as well; disallowing them fixes one of the most serious design bugs inherited from C. In ES6, it is always possible to avoid code duplication with a local function.
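A sketch of the local-function pattern, with a hypothetical classifier describe:

```javascript
// Instead of fall-through, share code via a local function:
function describe(n) {
  const small = () => "small";
  switch (n) {
    case 0: return small();
    case 1: return small();   // no fall-through, no duplication
    default: return "big";
  }
}
```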
Error-prone, costly or redundant constructs are disallowed.
Motivation: The conversion semantics of sloppy equality is hard to predict. It has long been best practice in JavaScript to use '===' and '!==' instead. It would make sense to deprecate ‘==’ and ‘!=’ altogether, but that might be too harsh.
For-in loops are underspecified by the language and inflexible. Moreover, like ‘in’, they conflate language domain and problem domain (i.e., cannot reliably distinguish between properties that are part of the program, and properties that are part of the data). This is a frequent source of subtle bugs in JavaScript code. ES6 offers 'for-of' loops as a more reliable and more powerful substitute for 'for-in'.
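The conflation is easy to see on an array:

```javascript
const arr = [10, 20, 30];

// 'for-in' yields property *names* as strings, and would also include
// inherited or expando properties:
const keys = [];
for (const k in arr) keys.push(k);      // "0", "1", "2"

// 'for-of' yields the values of the data itself:
const vals = [];
for (const v of arr) vals.push(v);      // 10, 20, 30
```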
Property deletion is disallowed in strong mode (see Objects).
The empty statement is both unnecessary and difficult to make out. In places where a no-op statement is needed, '{}' is more explicit.
Discussion: It would be possible to allow ‘==’ and ‘!=’ but make them throw if the operands have different types. However, that would still preclude popular patterns like “x == null”. Introducing an exception for that seems like a slippery slope.
Implications for Implementations: Iteration via 'for-of' needs to be optimised to a competitive degree. Likewise maps and sets.
Open Questions: Should there be a restriction on automatic semicolon insertion? For example, only allow it before a closing brace (to enable “function() { return 5 }”), or not allow it at all?
The number of implicit conversions is significantly reduced.
Note: In keeping with the C tradition, implicit ToBoolean conversions are still allowed.
Motivation: The various implicit conversions in JavaScript are error-prone and potentially require more costly code or runtime profiling, and unexpected deoptimisation. Many should be removed for good.
Open Questions: Figure out which conversions to disable exactly, and which ones to keep. What about ‘+’ in particular?
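As input to that question, a few of the conversions under discussion, all legal today:

```javascript
// Implicit conversions that strong mode might rule out:
const r1 = [] + [];      // ""                — arrays coerced to strings
const r2 = [] + {};      // "[object Object]"
const r3 = "1" + 1;      // "11"              — '+' concatenates
const r4 = "2" * 1;      // 2                 — but '*' converts to number

// Explicit conversions keep the intent visible:
const n = Number("2") + 1;   // 3
const s = String(1) + "1";   // "11"
```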
Open Questions: How can strong mode provide a way to create strong arrays or regexps in cases where a literal does not suffice? Sanitising built-ins could be realised by a number of means. Unfortunately, they all suck:
A slightly separate problem are non-constructor methods that return fresh objects. In some cases, they could dynamically inspect the receiver, and if it is strong, return strong results. But this does not work for “static” methods.
TODO