Discussion:
mmap(2) performance netbsd-1-6 vs. -current
Bang Jun-Young
2003-10-19 16:51:19 UTC
Hi,

After reading a Slashdotted article that is currently drawing flames, I
ran a quick and dirty benchmark to see the mmap(2) performance difference
between netbsd-1-6 and -current. The result is interesting:

netbsd-1-6 (latest as of this writing):
real 25.357s user 0.150s sys 24.984s

-current (1.6ZD, updated yesterday):
real 22.603s user 0.180s sys 22.404s

As you can see, -current mmap(2) was faster than the netbsd-1-6 one
by 11% in the test. The numbers hardly varied during multiple tests.

The test program used is as follows (please don't blame me for its
silliness, I just wanted to see a rough difference in minutes :-):

#include <stdio.h>
#include <sys/mman.h>

main()
{
    void *ptr[15];
    int i, j;
    size_t size;

    for (j = 0; j < 40960; j++) {
        for (i = 0; i < 15; i++) {
            size = 4096 << i;
            ptr[i] = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_ANON, -1, 0);
            if (ptr == NULL)
                err("NULL returned, i=%d\n", i);
            *((int *)ptr) = 0xdeadbeef;
        }
        for (i = 0; i < 15; i++) {
            size = 4096 << i;
            munmap(ptr[i], size);
        }
    }
}

Jun-Young
--
Bang Jun-Young <***@NetBSD.org>
Jason Thorpe
2003-10-19 18:18:18 UTC
Post by Bang Jun-Young
The test program used is as follows (please don't blame me for its
Ok, I'll ignore the silliness, but I will point out a couple of bugs :-)
Post by Bang Jun-Young
#include <stdio.h>
#include <sys/mman.h>
main()
{
void *ptr[15];
int i, j;
size_t size;
for (j = 0; j < 40960; j++) {
for (i = 0; i < 15; i++) {
size = 4096 << i;
ptr[i] = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_ANON, -1, 0);
if (ptr == NULL)
err("NULL returned, i=%d\n", i);
There are two problems with this code:

- mmap() returns MAP_FAILED, not NULL, on failure.
- "ptr" will never be null anyway! I think you really want to
  be testing ptr[i].
Post by Bang Jun-Young
*((int *)ptr) = 0xdeadbeef;
...and, here again, you want to be dereferencing ptr[i].
Post by Bang Jun-Young
}
for (i = 0; i < 15; i++) {
size = 4096 << i;
munmap(ptr[i], size);
}
}
}
Jun-Young
--
-- Jason R. Thorpe <***@wasabisystems.com>
Bang Jun-Young
2003-10-19 20:06:33 UTC
Post by Jason Thorpe
Post by Bang Jun-Young
The test program used is as follows (please don't blame me for its
Ok, I'll ignore the silliness, but I will point out a couple of bugs :-)
Post by Bang Jun-Young
#include <stdio.h>
#include <sys/mman.h>
main()
{
void *ptr[15];
int i, j;
size_t size;
for (j = 0; j < 40960; j++) {
for (i = 0; i < 15; i++) {
size = 4096 << i;
ptr[i] = mmap(NULL, size, PROT_READ|PROT_WRITE,
MAP_ANON, -1, 0);
if (ptr == NULL)
err("NULL returned, i=%d\n", i);
- mmap() returns MAP_FAILED, not NULL, on failure.
- "ptr" will never be null anyway! I think you really want to
  be testing ptr[i].
Post by Bang Jun-Young
*((int *)ptr) = 0xdeadbeef;
...and, here again, you want to be dereferencing ptr[i].
Oops, I was too lucky... :-)

After fixing the bugs you pointed out, the result is a bit more
interesting (and rather disappointing):

-current:
real 17.606s user 0.580s sys 17.026s

netbsd-1-6:
real 17.831s user 0.648s sys 16.985s

This time the numbers varied slightly, so I took the median values of 10
trials. Here, -current was only marginally faster than netbsd-1-6
(about 1%), and it's obvious that it had nothing to do with improvements
made in UVM in -current. This result makes me wonder why the previous
silly test showed such a noticeable difference.

Jun-Young
--
Bang Jun-Young <***@NetBSD.org>
Jason Thorpe
2003-10-19 20:19:27 UTC
Post by Bang Jun-Young
This time numbers varied slightly, so I took the median values of 10
trials. Here, -current was only marginally faster than netbsd-1-6,
and it's obvious that it had nothing to do with improvements made in
UVM in -current. This result makes me wonder why the previous silly
test showed such a noticeable difference.
I think what it tells us is:

- If you only count the mapping of the region, -current is
faster. This is probably due to Andrew Brown's map entry
merging changes.

- Servicing the page fault later is not faster in -current
than it was in 1.6.

(In your first test, you were still performing the mapping operations,
you just weren't faulting in any pages except the first one.)

-- Jason R. Thorpe <***@wasabisystems.com>
enami tsugutomo
2003-10-19 22:21:08 UTC
Post by Jason Thorpe
(In your first test, you were still performing the mapping operations,
you just weren't faulting in any pages except the first one.)
There was no fault even in the first mapping. And since the first one
isn't unmapped (ptr[0] is 0xdeadbeef), the number of mappings increases
on each outer iteration. So, the time difference in the first test is
also due to map entry merging, I guess.
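
The overwrite of ptr[0] can be reproduced without any mapping at all. The
sketch below is an illustration only: it uses a union (an assumption made
here to keep the store well-defined C) in place of the original cast, and
the helper name store_clobbers_first_slot() is hypothetical.

```c
#include <string.h>

/* The word member overlays the first bytes of ptr[0]. */
union slots {
	void *ptr[15];
	unsigned int word;
};

/*
 * Hypothetical helper: returns 1 if storing 0xdeadbeef at the start of
 * the array changes the pointer saved in ptr[0], 0 otherwise.
 */
int
store_clobbers_first_slot(void)
{
	static char region;	/* stand-in for an address returned by mmap() */
	union slots u;
	void *saved = &region;

	u.ptr[0] = saved;
	/* Same effect as the original "*((int *)ptr) = 0xdeadbeef;". */
	u.word = 0xdeadbeefU;

	/* Compare the raw bytes so the clobbered pointer is never used. */
	return memcmp(&u.ptr[0], &saved, sizeof(saved)) != 0;
}
```

Because the saved address is gone, the later munmap(ptr[0], size) in the
original program is handed the wrong address and the first region stays
mapped.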

enami.
enami tsugutomo
2003-10-20 01:41:41 UTC
So, the time difference in the first test is also due to map entry
merging, I guess.
Hmm, but the direction is forward, so there is a merge even in 1.6.
Actually, if I put a guard page in to prevent a merge with an existing
anon mapping (like shlib's), it behaves similarly on both 1.6 and -current.

enami.
