warmup_stats
This system was originally developed as part of the Virtual Machine Warmup
Blows Hot and Cold paper. It analyses VM performance data and outputs various
statistics (e.g. whether the VM warmed up or slowed down; how long it took to
warm up; what, if any, the steady state performance is).
The system can be found at the warmup_stats repository.
Running benchmarks
High quality VM benchmarking is hard to achieve: it is easy to accidentally end
up measuring something other than what was intended. We wrote Krun to help make
reliable VM measurement easier. Krun is
not suitable for all purposes, but is probably the easiest way to obtain high
quality benchmarking data.
A large-scale example
The Virtual Machine Warmup Blows Hot and Cold paper also serves as an example
of a large-scale VM experiment. You can find that experiment at the
warmup_experiment repository.
Generating output
The warmup_stats repository provides scripts for generating tables of
benchmarking results
(similar to those in the paper), plots, and 'diffs' between different versions
of a VM. Results and diff tables can be rendered as either HTML (to integrate
with a continuous integration system), or LaTeX / PDF. Plots can be produced in
PDF format only.
Example results from warmup_stats
The first table below shows example results. The second shows a diff between
these results and the same benchmarks run on a variant of the same VM.
Benchmark results
Symbol key: bad inconsistent, flat, good inconsistent, no steady state,
slowdown, warmup.
Results for Normal
Benchmark | Classification | Steady iteration (#) | Steady iteration (secs) | Steady performance (secs) |
binarytrees | | | | |
capnproto_decode | | | | |
capnproto_encode | | | | |
fannkuch_redux | | | | |
fasta | | | | |
jsonlua_decode | (7 , 2 , 1 ) | | | |
jsonlua_encode | | | | |
luacheck | | | | |
luacheck_parser | (7 , 2 , 1 ) | | | |
luafun | | | | |
md5 | | | | |
nbody | | | | |
richards | | | | |
series | | | | |
spectralnorm | | | | |
Benchmark results
Diff against previous results: improved, worsened, different, unchanged.
Results for Normal
Benchmark | Classification | Steady iteration (#) | Steady iteration (secs) | Steady performance (secs) |
binarytrees | (13 , 2 ) | | | |
capnproto_decode | | | | |
capnproto_encode | | | 0.105 δ=-18.177 (0.105, 0.118) | 0.03364 δ=-0.08744 ±0.000528 |
fannkuch_redux | | | | 0.12620 δ=0.00669 ±0.001134 |
fasta | | | | |
jsonlua_decode | | | | |
jsonlua_encode | | | | |
luacheck | | | | |
luacheck_parser | (14 , 1 ) | | | |
luafun | (14 , 1 ) | | | 0.15423 δ=-0.01332 ±0.000904 |
md5 | | | | 0.11168 δ=-0.00112 ±0.000947 |
nbody | | | | 0.18291 δ=0.02386 ±0.004045 |
richards | | | | |
series | | | | |
spectralnorm | | | | |
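The populated cells in the diff table follow a recognisable textual pattern: a value, a δ figure, and then either a ± figure or a bracketed pair of numbers. As a minimal sketch for readers who want to post-process such text, the Python below parses a single cell. It is not part of warmup_stats: the names `DiffCell` and `parse_diff_cell` are hypothetical, and the reading of δ as a change against the previous results, of ± as an error bound, and of the bracketed pair as an interval is an assumption drawn only from the cells shown above.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Hypothetical container for one diff-table cell (not a warmup_stats type).
class DiffCell(NamedTuple):
    value: float                            # leading number in the cell
    delta: float                            # number after "δ=" (assumed: change vs. previous results)
    error: Optional[float]                  # number after "±", if present (assumed: error bound)
    interval: Optional[Tuple[float, float]] # "(low, high)" pair, if present (assumed: interval)

_NUM = r"[-+]?\d+(?:\.\d+)?"
_CELL = re.compile(
    rf"^\s*(?P<value>{_NUM})\s+δ=(?P<delta>{_NUM})"
    rf"(?:\s+±(?P<error>{_NUM}))?"
    rf"(?:\s+\((?P<low>{_NUM}),\s*(?P<high>{_NUM})\))?\s*$"
)

def parse_diff_cell(text: str) -> Optional[DiffCell]:
    """Parse one cell; return None for empty or unrecognised cells."""
    match = _CELL.match(text)
    if match is None:
        return None
    error = match.group("error")
    low, high = match.group("low"), match.group("high")
    return DiffCell(
        value=float(match.group("value")),
        delta=float(match.group("delta")),
        error=float(error) if error is not None else None,
        interval=(float(low), float(high)) if low is not None else None,
    )

if __name__ == "__main__":
    # Cells copied from the capnproto_encode row of the diff table.
    print(parse_diff_cell("0.03364 δ=-0.08744 ±0.000528"))
    print(parse_diff_cell("0.105 δ=-18.177 (0.105, 0.118)"))
```

Empty cells return `None`, so a caller can skip benchmarks for which no numeric change is shown.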