warmup_stats

This system was originally developed as part of the Virtual Machine Warmup Blows Hot and Cold paper. It analyses VM performance data and outputs various statistics (e.g. whether the VM warmed up or slowed down; how long it took to warm up; what, if any, the steady-state performance is).

The system can be found at the warmup_stats repository.

Running benchmarks

High-quality VM benchmarking is hard to achieve: it is easy to accidentally end up measuring something other than what was intended. We wrote Krun to help make reliable VM measurement easier. Krun is not suitable for all purposes, but is probably the easiest way to obtain high-quality benchmarking data.

A large-scale example

The Virtual Machine Warmup Blows Hot and Cold paper also serves as an example of a large-scale VM experiment. You can find that experiment at the warmup_experiment repository.

Generating output

The warmup_stats repository provides scripts for generating tables of benchmarking results (similar to those in the paper), plots, and 'diffs' between different versions of a VM. Results and diff tables can be rendered as either HTML (to integrate with a continuous integration system), or LaTeX / PDF. Plots can be produced in PDF format only.

Example results from `warmup_stats`

The first table below shows the results. The second table shows a diff between these results and the same benchmarks run on a variant of the same VM, with changed values marked by δ.

Benchmark results

Symbol key: bad inconsistent, flat, good inconsistent, no steady state, slowdown, warmup.

Results for Normal

Benchmark	Classification	Steady iteration (#)	Steady iteration (secs)	Steady performance (secs)
binarytrees				0.12490 ±0.002494
capnproto_decode				0.12721 ±0.017089
capnproto_encode		119.5 (64.2, 268.1)	18.282 (7.799, 46.709)	0.12109 ±0.050785
fannkuch_redux				0.11952 ±0.000665
fasta				0.12133 ±0.000200
jsonlua_decode	(7 , 2 , 1 )
jsonlua_encode				0.13302 ±0.002619
luacheck				1.02083 ±0.044730
luacheck_parser	(7 , 2 , 1 )
luafun		52.0 (36.6, 68.5)	8.657 (5.969, 11.388)	0.16755 ±0.007539
md5				0.11280 ±0.000011
nbody				0.15906 ±0.001202
richards		2.0 (2.0, 2.0)	0.148 (0.137, 0.150)	0.14340 ±0.006683
series		2.0 (2.0, 2.0)	0.345 (0.345, 0.345)	0.33401 ±0.000171
spectralnorm				0.13988 ±0.000002

Benchmark	Classification	Steady iteration (#)	Steady iteration (secs)	Steady performance (secs)
binarytrees	(13 , 2 )	1.0 (1.0, 701.3)	0.000 (0.000, 88.313)	0.12555 ±0.001198
capnproto_decode				0.12813 ±0.001797
capnproto_encode		4.0 δ=-115.5 (4.0, 4.3)	0.105 δ=-18.177 (0.105, 0.118)	0.03364 δ=-0.08744 ±0.000528
fannkuch_redux		2.0 (2.0, 71.3)	0.129 (0.129, 9.026)	0.12620 δ=0.00669 ±0.001134
fasta				0.12122 ±0.000403
jsonlua_decode				0.11448 ±0.002019
jsonlua_encode				0.13125 ±0.002070
luacheck				1.00955 ±0.004525
luacheck_parser	(14 , 1 )
luafun	(14 , 1 )	42.0 (29.7, 42.3)	6.364 (4.431, 6.447)	0.15423 δ=-0.01332 ±0.000904
md5				0.11168 δ=-0.00112 ±0.000947
nbody				0.18291 δ=0.02386 ±0.004045
richards		2.0 (2.0, 2.0)	0.147 (0.146, 0.149)	0.14306 ±0.002472
series		2.0 (2.0, 2.0)	0.347 (0.347, 0.347)	0.33403 ±0.000320
spectralnorm				0.13988 ±0.000001
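
The δ values in the second table can be read as "variant minus original". The sketch below reproduces that reading for the steady performance column. It is an illustration only, not the warmup_stats implementation: the function name `steady_perf_delta` is hypothetical, and the rule that a δ is reported only when the two ± intervals do not overlap is an assumption inferred from the example tables above.

```python
def steady_perf_delta(before, after):
    """Return after - before, or None if the two intervals overlap.

    Each argument is a (mean, error) pair in seconds, read from the
    'Steady performance (secs)' column as 'mean ±error'.
    ASSUMPTION: a delta is only reported when the intervals
    [mean - error, mean + error] do not overlap. This rule is inferred
    from the example tables, not taken from the warmup_stats source.
    """
    b_mean, b_err = before
    a_mean, a_err = after
    # Two closed intervals overlap if each starts before the other ends.
    overlap = (b_mean - b_err) <= (a_mean + a_err) and \
              (a_mean - a_err) <= (b_mean + b_err)
    if overlap:
        return None
    return a_mean - b_mean


# Values copied from the tables above: (original, variant).
rows = {
    "binarytrees": ((0.12490, 0.002494), (0.12555, 0.001198)),
    "md5": ((0.11280, 0.000011), (0.11168, 0.000947)),
    "nbody": ((0.15906, 0.001202), (0.18291, 0.004045)),
}

for name, (before, after) in rows.items():
    delta = steady_perf_delta(before, after)
    shown = "no delta" if delta is None else f"delta={delta:+.5f}"
    print(f"{name}: {shown}")
```

Because the inputs here are already rounded to five decimal places, the last digit of a computed δ can differ by one from the value printed in the table (for example, nbody).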
