Benchmarking of different NetBSD versions

Discussion:

Martin J. Laubach

2005-12-17 14:34:53 UTC

I'd like to redo fefe's benchmarks, so that we have a stable
baseline from which we can re-run the benchmarks after major
changes have gone in.

My question is, is it worthwhile to do that under Qemu, or
will I just be measuring emulator overhead and the real results
will be lost in the noise?

mjl

Manuel Bouyer

2005-12-17 15:55:37 UTC

Post by Martin J. Laubach
I'd like to redo fefe's benchmarks, so that we have a stable
baseline from which we can re-run the benchmarks after major
changes have gone in.
My question is, is it worthwhile to do that under Qemu, or
will I just be measuring emulator overhead and the real results
will be lost in the noise?

I don't think that will work well, because the time to execute an
instruction may vary depending on the load of the system hosting
qemu.

A Xen guest may be a better choice, because in that case the
user and system times should be accurate (the time isn't incremented
while the CPU is scheduled for another domain or for the hypervisor).
Of course, the elapsed time won't be accurate under Xen either.
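
To make the distinction between user/system time and elapsed time concrete, here is a minimal sketch in Python. It is not part of fefe's benchmark suite; `timed` and `busy_loop` are hypothetical names, and the loop is only a stand-in for a real benchmark step.

```python
import os
import time


def timed(workload):
    """Run workload() and return (user, system, elapsed) seconds for this process."""
    t0 = os.times()
    w0 = time.perf_counter()
    workload()
    w1 = time.perf_counter()
    t1 = os.times()
    # user/system are CPU time charged to this process;
    # elapsed is wall-clock time, which also includes time spent descheduled.
    return (t1.user - t0.user, t1.system - t0.system, w1 - w0)


def busy_loop(n=200_000):
    # Hypothetical CPU-bound stand-in for a real benchmark step.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    user, system, elapsed = timed(busy_loop)
    print(f"user={user:.3f}s system={system:.3f}s elapsed={elapsed:.3f}s")
```

On a loaded host, elapsed would be expected to grow while user and system stay roughly the same, provided the guest accounts CPU time as described above.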

The NetBSD CVS tree has Xen2 support only for 3.0 and current,
but there are patches for NetBSD-2.0 in the Xen svn repository (and in the
source tarballs).

--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--

Hubert Feyrer

2005-12-17 18:59:54 UTC

Post by Martin J. Laubach
My question is, is it worthwhile to do that under Qemu, or
will I just be measuring emulator overhead and the real results
will be lost in the noise?

Given what fefe's benchmark does, you definitely do not want to run two
operating systems on your CPU. -> no qemu.

- Hubert

Thor Lancelot Simon

2005-12-17 20:20:07 UTC

Qemu is a bad idea, and so is Xen. Both introduce multiple confounds
into the benchmarking process.

The best thing to do is to go out and get a few identical, cheap
hard drives, and simply tuck them away with each OS and revision to
be benchmarked installed in the test configuration. Then you can
rerun your old benchmarks any time. Of course, you can use ghost
to do something similar to this without the physical disks -- but
they are cheap, and convenient, so I just use the disks.

--
Thor Lancelot Simon ***@rek.tjls.com

"The inconsistency is startling, though admittedly, if consistency is to be
abandoned or transcended, there is no problem." - Noam Chomsky

David Maxwell

2005-12-18 06:56:16 UTC

Post by Thor Lancelot Simon

Post by Martin J. Laubach
I'd like to redo fefe's benchmarks, so that we have a stable
baseline from which we can re-run the benchmarks after major
changes have gone in.

The best thing to do is to go out and get a few identical, cheap
hard drives, and simply tuck them away with each OS and revision to
be benchmarked installed in the test configuration. Then you can
rerun your old benchmarks any time. Of course, you can use ghost
to do something similar to this without the physical disks -- but
they are cheap, and convenient, so I just use the disks.

Just a note - I recently did a bunch of profiling on 200GB drives. I found
that even hard drives of the same brand and model, when installed on the
same channel of the same controller (i.e. shutdown, swap drives, boot), can
show a significant variance in throughput under test.

The results spanned a range of 5MB/s. If your tests depend much on disk, you
may want to profile the bare drives first.
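
One rough way to profile a drive before trusting it as a baseline is sketched below in Python. This is not the method used for the measurements above; `read_throughput` and `spread` are hypothetical helpers, the block size and read limit are arbitrary assumptions, and reading through a filesystem rather than a raw device will let the buffer cache distort the numbers.

```python
import time


def read_throughput(path, block_size=1 << 20, max_bytes=256 << 20):
    """Sequentially read up to max_bytes from path; return MB/s (MB = 10**6 bytes)."""
    done = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; OS caching still applies.
    with open(path, "rb", buffering=0) as f:
        while done < max_bytes:
            chunk = f.read(min(block_size, max_bytes - done))
            if not chunk:
                break  # end of file or device
            done += len(chunk)
    elapsed = time.perf_counter() - start
    return done / 1e6 / elapsed if elapsed > 0 else float("inf")


def spread(samples):
    """Return (min, max, max - min) for a list of throughput samples."""
    lo, hi = min(samples), max(samples)
    return lo, hi, hi - lo
```

Calling `read_throughput` a few times per drive on the same path and passing the results to `spread` gives a per-drive range to compare against any differences seen between OS versions.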

--
David Maxwell, ***@vex.net|***@maxwell.net -->
(About an Amiga rendering landscapes) It's not thinking, it's being artistic!
- Jamie Woods