GWT Benchmarking Design
GWT touts its capability to produce extremely optimized JavaScript
applications as one of its main strengths. To keep ourselves true to
that goal, we need a suite of benchmarks which can help us continuously
drive GWT optimizations and spot performance regressions that otherwise
might leak through.
Goals
-
Support micro-benchmarks (for example, similar to speed-tests).
-
Support benchmarking both Java and JSNI.
-
Automatically avoid slow script warnings in benchmarks.
-
Be as accurate as reasonably possible. Don't include overhead from
JUnit infrastructure or setup and teardown functionality.
-
Make simple benchmarks simple to write and complex benchmarks as simple as possible to write.
-
Provide complete report data files. The reports should contain
configuration information (host machine, version of browser, version of
GWT), full benchmark results, and source code. These can be saved and
analyzed offline to compare results over time (e.g. regressions),
shared with others for comparison, etc...
-
Provide a reporting application. The application should provide useful visualizations of the data in the reports.
-
Support benchmarking on any browser and platform that GWT apps can be
deployed to. Specifically, don't limit support to just the hosted
browsers.
- Support distributed benchmarking. We should be
able to run benchmarks across several different browsers in parallel,
automatically.
Non-Goals
-
Compilation-speed benchmarks (i.e. how long it takes to compile an application).
-
Load testing. This can be performed with existing web-application load testing tools.
-
Hosted mode benchmarking. Benchmarking should operate in hosted mode,
but only so that benchmarks can be debugged. Hosted mode benchmarking
results will not be useful, because they include the wrong information:
execution of Java rather than JavaScript, JSNI boundary crossing, etc...
For future consideration
- Memory usage - There isn't a standard method for retrieving
memory usage from JavaScript. After some investigation, it appears that
we'd only able to measure memory coarsely for the entire browser
process without resorting to intrusive techniques (e.g. hosted browser,
special browser plugins, etc...).
-
Code size (e.g. how large is the generated JavaScript for a function)
Features
We add a new
Benchmark class as a subclass of
GWTTestCase. Users extend this class when creating their own
Benchmark.
Benchmarks differ from regular test cases in the following ways:
-
Benchmark results are collected and stored in XML reports viewable with a benchmarkViewer application.
-
Benchmark test methods are automatically timed.
-
Each Benchmark test method can have it's own corresponding lifecycle methods. Functions named begin[MethodName] and end[MethodName]
will be executed before and after every execution of the corresponding
test method. The timings of these lifecycle methods are specifically
excluded from the overall timing of the test method.
-
To ensure accurate timing, Benchmark
test methods are automatically run for a minimum period of time
(150ms). This alleviates the problems associated with coarse-grained
timers in JavaScript.
-
Benchmark test methods can have parameters. GWT will execute each Benchmark
method multiple times in order to exhaustively test all the possible
permutations of the parameter values. Each parameter must have its
range of values documented using the GWT annotation, @gwt.benchmark.param. The syntax for gwt.benchmark.param is <param name> = <Range>. For example,
public static final Range positions = new Range() {
public Iterator iterator() {
return Arrays.asList( new Position[] {BEGIN, END, NONE, VARIED } ).iterator();
}
};
final IntRange insertRemoveRange = new IntRange(64, Integer.MAX_VALUE, Operator.MULTIPLY, 2);
/**
* @gwt.benchmark.param where = positions
* @gwt.benchmark.param size -limit = insertRemoveRange
*/
public void testArrayListRemoves(Position where, Integer size) { ... }
In this example, testArrayListRemoves is executed with all the possible permutations of positions and insertRemoveRange.
-
Limit benchmark execution to a maximum period of time (1000ms). The
above example demonstrates an automatic limit on the number of
executions of a benchmark method. The final parameter of any benchmark
method (in this example, size) can optionally be decorated with -limit
to stop execution of additional permutations of the test when the
execution time becomes too long (over 1000ms). So, in this example, for
each value of Position, testArrayListRemoves will be executed for increasing values of size (beginning with 64 and increasing in steps of 2), until either it reaches Integer.MAX_VALUE or the execution time for the last permutation is > 1000ms.
-
Both Benchmark test methods and classes can be decorated with a category annotation to indicate the reporting Category they belong to:
@gwt.benchmark.category - The class name of the Category the benchmark test method belongs to. This property may also be set at the Benchmark class level.
Design
Extending GWT's JUnit support
GWT has existing support for executing JUnit tests within a browser in
a reasonably transparent fashion. In general, the requirements around
executing performance tests are very similar to the requirements for a
general testing mechanism: We create suites of tests which we can
execute both locally and remotely in different environments in an
automated fashion. Projects such as
JUnitPerf
demonstrate the common desire to extend testing frameworks to support
performance testing. Based on the similarity between functional unit
testing and performance testing, it makes the most sense to modify
GWT's existing JUnit infrastructure to support performance testing's
special needs (rather than implementing entirely separate support from
scratch).
GWT's existing support for JUnit involves several different working pieces:
-
Translatable JUnit classes (found in com.google.gwt.junit.translatable.junit.framework)
-
GWTTestCase - a special base class for test cases which also has a special "translatable" version for the browser.
-
JUnitShell, which is a specialization of GWTShell. This class generally drives the unit tests and is invoked through GWTTestCase.
-
The JUnitHost service which supports GWTTestCase by answering requests for test cases and accepting test results.
-
JUnitMessageQueue, which acts as a mediator between JUnitShell and JUnitHost.
-
JUnitTestCaseStubGenerator, which is a Generator that provides some extra support for running test cases in a browser.
Benchmarking extends JUnit support by:
-
Introducing a new Benchmark class as a subclass of GWTTestCase. Benchmarks have special behavior in regards to test method setup and teardown, timing, and parameters. These features of the Benchmark class are implemented in a new subclass of JUnitTestCaseStubGenerator, BenchmarkGenerator.
-
Introducing a new TestResults class which contains trial, timing, and configuration information. Generated subclasses of Benchmarks populate these objects and send them via the JUnitHost RPC service, where they are eventually received by a BenchmarkReport.
-
Modifying JUnitShell so that it creates a BenchmarkReport and adds the TestResults of each Benchmark execution to the report.
-
Modifying GWTTestCaseImpl to accept TestResults from Benchmarks.
-
Creating a ReportViewer - A GWT application that displays the generated XML benchmark reports. The ReportViewer depends on two services:
-
ReportServer RPC service - Serves up reports from a folder.
-
ReportImageServer servlet - Generates graphs for Benchmarks
Benchmark Generation
Benchmark
This base class doesn't contain any new functionality in and of itself. It's simply a marker subclass for the BenchmarkGenerator.
BenchmarkGenerator
The BenchmarkGenerator is the heart of the benchmarking subsystem. It
generates the logic which provides the benchmark execution
functionality. Rather than dive through the implementation details of
the generator, it's more instructive to examine the code it generates.
Given the following example test method:
/**
* Performs <code>size</code> removes at position, <code>where</code>, on an
* ArrayList of size, <code>size</code>.
* @gwt.benchmark.param where = Position.positions2
* @gwt.benchmark.param size -limit = insertRemoveRange
*/
public void testArrayListRemoves(Position where, Integer size) {
removeFromCollection(size, where, list);
}
The generator would create the following code:
public void testArrayListRemoves() {
privateDelayTestFinish(2000);
final List ranges = Arrays.asList( new Range[]{Position.positions2, insertRemoveRange});
final PermutationIterator permutationIt = new PermutationIterator(ranges);
DeferredCommand.addCommand(new IncrementalCommand() {
public boolean execute() {
privateDelayTestFinish(10000);
if (permutationIt.hasNext()) {
PermutationIterator.Permutation permutation =
(PermutationIterator.Permutation) permutationIt.next();
ArrayListAndVectorBenchmark.Position where =
(ArrayListAndVectorBenchmark.Position) permutation.getValues().get(0);
Integer size = (Integer) permutation.getValues().get(1);
double __setupTiming = 0;
double __testTiming = 0;
for (int numLoops = 1; true;) {
long start = System.currentTimeMillis();
for (int i = 0; i < numLoops; ++i) {
beginArrayListRemoves(where, size);
__emptyFunc();
endArrayListRemoves(where,size);
}
long duration = System.currentTimeMillis() - start;
if (duration < 150) {
numLoops += numLoops;
continue;
}
double durationMillis = duration * 1.0;
double numLoopsAsDouble = numLoops * 1.0;
__setupTiming = durationMillis / numLoopsAsDouble;
break;
}
for (int numLoops = 1; true;) {
long start = System.currentTimeMillis();
for (int i = 0; i < numLoops; ++i) {
beginArrayListRemoves(where, size);
testArrayListRemoves(where, size);
endArrayListRemoves(where,size);
}
long duration = System.currentTimeMillis() - start;
if (duration < 150) {
numLoops += numLoops;
continue;
}
double durationMillis = duration * 1.0;
double numLoopsAsDouble = numLoops * 1.0;
__testTiming = durationMillis / numLoopsAsDouble;
TestResults results = getTestResults();
Trial trial = new com.google.gwt.junit.client.Trial();
trial.setRunTimeMillis(__testTiming - __setupTiming);
java.util.Map variables = trial.getVariables();
variables.put("where", where.toString());
variables.put("size", size.toString());
results.getTrials().add(trial);
if (numLoops == 1 && duration > 1000) {
permutationIt.skipCurrentRange();
}
break;
}
return true;
}
privateFinishTest();
return false;
}
});
}
This code implements the following functionality:
-
Parametric test methods.
The generated benchmark executes all permutations of the values provided for each parameter. The values are specified as Ranges in the @gwt.benchmark.param annotation.
-
Slow script warning avoidance.
The generator ensures that only one test permutation is executed before
yielding back to the event loop. The implementation of this involves
GWT's asynchronous test support, the new IncrementalCommand functionality, and a specialized PermutationIterator which turns iterations of the permutations inside-out into an Iterator. Because this functionality uses GWT's asynchronous test support, users can not currently benchmark asynchronous operations.
-
Removal of lifecycle overhead.
The generated benchmark determines how much overhead the begin and end
methods are introducing and removes it from the total execution time.
Lifecycle methods are optional and are only invoked if the user
provided them in their Benchmark.
-
Accuracy in results.
The generated benchmark iterates over both the lifecycle and test
methods in a tight loop until a minimum of 150ms has passed. This
ensures that enough time has elapsed that the coarse JavaScript timer
(typically 15ms) doesn't have a large impact upon the final results.
-
Time bounded Ranges.
Specifying -limit in the metadata for the last parameter causes the BenchmarkGenerator to introduce a time limit on the range of values for that last parameter:
if (numLoops == 1 && duration > 1000) {
permutationIt.skipCurrentRange();
}
If any particular permutation takes longer than 1000ms to execute, the
benchmark skips the rest of the values in the last range.
-
Results reporting. The results of each permutation are added as a trial on the test results:
TestResults results = getTestResults();
Trial trial = new com.google.gwt.junit.client.Trial();
trial.setRunTimeMillis(__testTiming - __setupTiming);
java.util.Map variables = trial.getVariables();
variables.put("where", where.toString());
variables.put("size", size.toString());
results.getTrials().add(trial);
Report Viewing
The ReportViewer application is composed of the following pieces:
-
The ReportViewer GWT client application
-
The ReportDatabase
-
The ReportServer RPC service
-
The ReportImageServer
ReportViewer
This GWT application presents a summary list of all available
reports in a folder, and allows the user to select one for detailed
viewing. In the detailed view, the application presents the code for
each benchmark method along with its results on each browser. The
results are displayed in a graph and also in tabular format.
ReportDatabase
The
ReportDatabase is the source of benchmark report data. It scans a folder periodically and provides an API to retrieve the available reports:
public List<ReportSummary> getReportSummaries();
public Report getReport(String reportId);
ReportServer
The
ReportServer provides an RPC interface for the
ReportViewer to retrieve benchmark data. It is mostly a remote wrapper around the
ReportDatabase.
public List<ReportSummary> getReportSummaries();
public Report getReport(String reportId);
ReportImageServer
The ReportImageServer provides the graphs for a report. Given a URL identifying a specific benchmark result, the ReportImageServer reads the ReportDatabase, constructs a graph, and writes it to the HTTP stream as a formatted image.
The ReportImageServer accepts URLs in the following format:
/com.google.gwt.junit.viewer.ReportViewer
/test_images
/report-12345.xml
/RemoveCategory
/com.google.gwt.junit.client.ArrayListAndVectorBenchmark
/testArrayListRemoves
/Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050920/
It indicates the report id, the category of the benchmark, the
benchmark class, the benchmark method, and the agent for the results.