Concise Instance Creation Expressions: Closures without Complexity
Bob Lee, Doug Lea, and Josh Bloch
I. Introduction
Since release 1.1, Java has had closures in the form of anonymous class instance creation expressions (JLS 15.9.5) that extend a class or interface with a single abstract method. We call such classes and interfaces single-abstract-method types, or SAM types for short. SAM types include Runnable, Callable, Comparator, and TimerTask. The resulting closures are often used to parameterize the behavior of collections (e.g., a Comparator specifies the order of a SortedSet), and to submit code for concurrent execution (e.g., a Runnable may be executed by a Thread or an Executor). There are many other uses, including callbacks, factories, predicates, and strategies.
Unfortunately, the verbose and ungainly syntax of anonymous class instance creation expressions frustrates programmers. The problem has become more acute due to the ongoing concurrency revolution. We therefore propose a more concise syntax to instantiate anonymous classes. The basic idea is to omit the keyword new, the class declaration, and the method name from class instance creation expressions. This may not sound like much, but it results in significantly less boilerplate and enhanced readability.
For example, here is the Java 5 code to start a thread whose run method invokes a method named foo:
new Thread(new Runnable() {
    public void run() {
        foo();
    }
}).start();
If we adopt this proposal, the following code would be equivalent:
new Thread(Runnable(){ foo(); }).start();
Here is the Java 5 code to sort a list of strings by length (from shortest to longest):
List<String> ls = ... ;
Collections.sort(ls, new Comparator<String>() {
    public int compare(String s1, String s2) {
        return s1.length() - s2.length();
    }
});
Here is the same code rewritten using the proposed syntax:
List<String> ls = ... ;
Collections.sort(ls,
    Comparator<String>(String s1, String s2){ return s1.length() - s2.length(); });
From the programmer's perspective, that's pretty much all there is to it: no new concepts to learn, just a more concise syntax for something they already do.
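To try the sorting example today, the following minimal sketch wraps it in a complete program. It uses the existing anonymous-class form, because the concise syntax is only a proposal; the class name SortByLengthDemo and the helper sortByLength are illustrative names and not part of the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class SortByLengthDemo {
    // Returns a copy of the input sorted from shortest to longest string.
    static List<String> sortByLength(List<String> input) {
        List<String> ls = new ArrayList<String>(input);
        // Under the proposal, the second argument could be written as:
        //   Comparator<String>(String s1, String s2){ return s1.length() - s2.length(); }
        Collections.sort(ls, new Comparator<String>() {
            public int compare(String s1, String s2) {
                // Lengths are non-negative, so the subtraction cannot overflow.
                return s1.length() - s2.length();
            }
        });
        return ls;
    }

    public static void main(String[] args) {
        System.out.println(sortByLength(Arrays.asList("ccc", "a", "bb"))); // prints [a, bb, ccc]
    }
}
```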
II. Syntax and Semantics
We introduce a new kind of expression, called a concise instance creation expression (CICE):
ConciseInstanceCreationExpression:
ClassOrInterfaceType ( FormalParameterListopt ) MethodBody
You can use a concise instance creation expression everywhere that an instance creation expression is currently legal. The construct is legal only if ClassOrInterfaceType represents a class or interface type with a single abstract method (a SAM type). The construct behaves as if replaced by this Java 5 code:
new ClassOrInterfaceType () {
    AccessModifier ResultType MethodName ( FormalParameterListopt ) Throwsopt
        MethodBody
}
The compiler copies the AccessModifier, ResultType, MethodName, and Throwsopt clause (if it exists) from the declaration of the sole abstract method in the class or interface type. The expression results in a compile-time error if the formal parameter list or the method body is inconsistent with the sole abstract method in the type, or if the type is a class type without an accessible parameterless constructor.
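The SAM-type condition can be approximated at run time with reflection. The sketch below is an illustration under stated assumptions, not the check a compiler would perform: SamCheck and its methods are hypothetical names, only public methods are inspected (so a class whose sole abstract method is protected is not recognized), and the constructor and formal-parameter checks described above are omitted. Abstract redeclarations of public Object methods, such as equals in Comparator, are not counted.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper that approximates the "single abstract method" test.
class SamCheck {
    // True if Object declares a public method with the same name and parameters,
    // e.g. the equals(Object) that Comparator redeclares as abstract.
    static boolean isPublicObjectMethod(Method m) {
        try {
            Object.class.getMethod(m.getName(), m.getParameterTypes());
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Collects the distinct signatures of the public abstract methods of a type.
    static Set<String> abstractSignatures(Class<?> type) {
        Set<String> signatures = new HashSet<String>();
        for (Method m : type.getMethods()) {
            if (Modifier.isAbstract(m.getModifiers()) && !isPublicObjectMethod(m)) {
                signatures.add(m.getName() + Arrays.toString(m.getParameterTypes()));
            }
        }
        return signatures;
    }

    // Assumption: a type qualifies when exactly one such signature remains.
    static boolean isSamType(Class<?> type) {
        return abstractSignatures(type).size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(isSamType(Runnable.class));               // prints true
        System.out.println(isSamType(java.util.Comparator.class));   // prints true
        System.out.println(isSamType(java.util.List.class));         // prints false
        System.out.println(isSamType(ThreadLocal.class));            // prints false
    }
}
```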
III. Local Variables From the Enclosing Scope
Programmers often complain about having to declare local variables final so that an instance creation expression's method body can access them. We propose making such variables final by default. Programmers also complain that instance creation expression method bodies are not permitted to share mutable local variables with the enclosing scope. Therefore, we further propose allowing such access if the local variables in question are explicitly declared public. (If the instance executes in a different thread, it's up to the programmer to manage concurrent access.)
More specifically:
- Any visible local variable that is initialized or assigned exactly once in the enclosing scope, as well as any visible parameter to an enclosing method that is never otherwise assigned, is accessible but not assignable within the body of the CICE, whether or not it is explicitly qualified as final.
- Any local variable that is explicitly qualified as public is accessible, and also assignable (unless also qualified as final), within the body of the CICE. Formal parameters and for-loop variables may not be qualified as public.
- Access to any other local variable or parameter from an enclosing scope is illegal within the body of a CICE.
Here's an example of the "annoying final" in Java 5. This method takes a comparator and returns a comparator that induces the reverse ordering:
static <T> Comparator<T> reverseOrder(final Comparator<T> cmp) {
    return new Comparator<T>() {
        public int compare(T t1, T t2) {
            return cmp.compare(t2, t1);
        }
    };
}
Here's how the method would look if this proposal were adopted:
static <T> Comparator<T> reverseOrder(Comparator<T> cmp) {
    return Comparator<T>(T t1, T t2){ return cmp.compare(t2, t1); };
}
Here's an example of the contortions required to get around the Java 5 requirement that local variables accessed by inner classes must be final. This snippet sorts an array and prints out how many element comparisons were performed in the process:
final int[] numCompares = new int[1];
Arrays.sort(a, new Comparator<Integer>() {
    public int compare(Integer i1, Integer i2) {
        numCompares[0]++;
        return i1.compareTo(i2);
    }
});
System.out.println(numCompares[0]);
Note the use of the single-element array (numCompares) to pass a value back from the closure to the surrounding scope. This is necessary because all local variables accessed by the anonymous class must be final. If this proposal were adopted, the above example could be replaced by this:
public int numCompares = 0;
Arrays.sort(a, Comparator<Integer>(Integer i1, Integer i2) {
    numCompares++;
    return i1.compareTo(i2);
});
System.out.println(numCompares);
Why the restriction that for-loop indices may not be labeled public? Because any code that does so is almost certainly broken. For example, consider this loop:
for (public int taskId = 0; taskId < NUM_TASKS; taskId++) {
    executor.execute(Runnable(){ newTask(taskId); });
}
It is almost certainly the author's intent that each runnable get its own taskId from 0 to NUM_TASKS - 1. If the above code were legal, all of the tasks would share a single taskId, whose value would be NUM_TASKS by the time the loop finished. Here is a fixed version of the loop:
for (int i = 0; i < NUM_TASKS; i++) {
    int taskId = i;
    executor.execute(Runnable(){ newTask(taskId); });
}
Note that the loop index (i) need not be made public. Each Runnable gets its own taskId, which is implicitly final.
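The difference between the broken loop and the fixed loop can be reproduced in Java 5 by emulating the shared variable with a one-element array. The sketch below is an illustration only: LoopCaptureDemo, sharedVariable, and perIterationCopy are hypothetical names, and the tasks are collected and run after the loop (rather than handed to an executor) so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo contrasting one shared loop variable with per-iteration copies.
class LoopCaptureDemo {
    static final int NUM_TASKS = 3;

    // Emulates a shared, mutable loop variable with a one-element array.
    // Every task reads the same cell, so each sees the final value.
    static List<Integer> sharedVariable() {
        final List<Integer> seen = new ArrayList<Integer>();
        List<Runnable> tasks = new ArrayList<Runnable>();
        final int[] taskId = new int[1];
        for (taskId[0] = 0; taskId[0] < NUM_TASKS; taskId[0]++) {
            tasks.add(new Runnable() {
                public void run() { seen.add(taskId[0]); }
            });
        }
        for (Runnable r : tasks) r.run();   // run only after the loop has ended
        return seen;
    }

    // The fixed version: each iteration captures its own final copy.
    static List<Integer> perIterationCopy() {
        final List<Integer> seen = new ArrayList<Integer>();
        List<Runnable> tasks = new ArrayList<Runnable>();
        for (int i = 0; i < NUM_TASKS; i++) {
            final int taskId = i;
            tasks.add(new Runnable() {
                public void run() { seen.add(taskId); }
            });
        }
        for (Runnable r : tasks) r.run();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(sharedVariable());    // prints [3, 3, 3]
        System.out.println(perIterationCopy());  // prints [0, 1, 2]
    }
}
```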
IV. Library Support
At the same time as this facility is added to the language, it would make sense to add a few more single-method interface types, such as Predicate, Function, and Builder. Perhaps a few utility methods in java.util.Collections would not be amiss.
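As a rough illustration of what such library support might look like, here is a sketch in Java 5. Only the interface names Predicate, Function, and Builder come from the text; the method names (accept, apply, build), the type parameters, and the filter and map utilities are assumptions made for this example, and everything is nested in a hypothetical class SamLibrarySketch rather than placed in java.util.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; signatures are assumptions, not part of the proposal.
class SamLibrarySketch {
    interface Predicate<T> { boolean accept(T t); }
    interface Function<A, R> { R apply(A a); }
    interface Builder<T> { T build(); }

    // Assumed utility: keeps the elements for which the predicate holds.
    static <T> List<T> filter(Iterable<? extends T> source, Predicate<? super T> p) {
        List<T> result = new ArrayList<T>();
        for (T t : source) {
            if (p.accept(t)) result.add(t);
        }
        return result;
    }

    // Assumed utility: applies the function to every element, in order.
    static <A, R> List<R> map(Iterable<? extends A> source, Function<? super A, ? extends R> f) {
        List<R> result = new ArrayList<R>();
        for (A a : source) {
            result.add(f.apply(a));
        }
        return result;
    }

    // With the proposed syntax the argument could shrink to
    //   Predicate<String>(String s){ return s.length() > min; }
    static List<String> longerThan(List<String> words, final int min) {
        return filter(words, new Predicate<String>() {
            public boolean accept(String s) { return s.length() > min; }
        });
    }

    static List<Integer> lengths(List<String> words) {
        return map(words, new Function<String, Integer>() {
            public Integer apply(String s) { return s.length(); }
        });
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(longerThan(words, 1)); // prints [bb, ccc]
        System.out.println(lengths(words));       // prints [1, 2, 3]
    }
}
```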
We can also introduce alternatives to existing concrete classes with methods that are meant to be overridden. For example, consider the following typical Java 5 use of ThreadLocal:
private static final AtomicInteger nextId = new AtomicInteger(0);
private static final ThreadLocal<Integer> threadId =
    new ThreadLocal<Integer>() {
        @Override protected Integer initialValue() {
            return nextId.getAndIncrement();
        }
    };
Because ThreadLocal is a concrete class rather than a SAM type, it is not amenable to the CICE. If we introduce AbstractThreadLocal, a subclass of ThreadLocal with an abstract initialValue method, the following code would be equivalent:
private static final AtomicInteger nextId = new AtomicInteger(0);
private static final ThreadLocal<Integer> threadId =
    AbstractThreadLocal<Integer>(){ return nextId.getAndIncrement(); };
A similar treatment might be desirable for LinkedHashMap and its removeEldestEntry method.
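The abstract subclasses themselves are short, because Java permits a subclass to redeclare an inherited concrete method as abstract. The sketch below shows one possible shape, under assumptions: AbstractThreadLocal is the name used above, while AbstractLinkedHashMap, AbstractSamDemo, currentThreadId, and boundedMap are hypothetical names introduced for this example. The instances are created with today's anonymous-class syntax, since the concise form is not yet available.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// A ThreadLocal whose initialValue must be supplied, making it a SAM type.
abstract class AbstractThreadLocal<T> extends ThreadLocal<T> {
    @Override protected abstract T initialValue();
}

// Hypothetical analogue for LinkedHashMap and removeEldestEntry.
abstract class AbstractLinkedHashMap<K, V> extends LinkedHashMap<K, V> {
    @Override protected abstract boolean removeEldestEntry(Map.Entry<K, V> eldest);
}

// Hypothetical demo class.
class AbstractSamDemo {
    private static final AtomicInteger nextId = new AtomicInteger(0);
    private static final ThreadLocal<Integer> threadId =
        new AbstractThreadLocal<Integer>() {
            @Override protected Integer initialValue() {
                return nextId.getAndIncrement();
            }
        };

    // Returns the id assigned to the calling thread on its first call.
    static int currentThreadId() {
        return threadId.get();
    }

    // Returns a map that evicts its oldest entry once it exceeds maxEntries.
    static Map<String, Integer> boundedMap(final int maxEntries) {
        return new AbstractLinkedHashMap<String, Integer>() {
            @Override protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> m = boundedMap(2);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        System.out.println(m.keySet()); // prints [b, c]
    }
}
```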
V. Further Ideas
It is, in many cases, technically feasible to infer types from the formal parameter list and method body of a CICE. It is worth exploring the pros and cons of doing so.
If we do end up allowing public local variables, we should consider allowing them to be made volatile. Consistency dictates that this should be legal, as all other variables that are accessible by multiple threads can be made volatile. On the other hand, we don't necessarily want to encourage multiple threads to access local variables without proper synchronization.
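Whatever is decided about volatile locals, state shared between an enclosing scope and instances running on other threads can already be handled in Java 5 with the java.util.concurrent.atomic classes. The following is a minimal sketch of that approach and not part of the proposal; SharedCounterDemo and countFromThreads are hypothetical names. A final AtomicInteger local is captured by each Runnable, so no public or volatile local variable is needed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: several threads update one counter owned by the enclosing method.
class SharedCounterDemo {
    static int countFromThreads(int numThreads, final int incrementsPerThread) {
        // The reference is final; the value it holds is updated atomically.
        final AtomicInteger counter = new AtomicInteger(0);
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < incrementsPerThread; j++) {
                        counter.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        try {
            // join() ensures every thread's updates are visible before the read below.
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countFromThreads(4, 1000)); // prints 4000
    }
}
```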