NetBSD 2 vs the rest with MySQL

Post by Joel Macklow
We do really well with the select-key test on a single CPU. Any
comments regarding the other tests from those who know? This is the kind
of info the advocacy group needs - as long as we look good in it ;)

Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

- Frank

Jason Thorpe

2005-02-09 21:20:38 UTC

Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

I would definitely suggest to the author of the article to increase
PTHREAD_CONCURRENCY for his tests and run them again.
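
For reference, a minimal sketch of how the variable could be set for a single run, assuming PTHREAD_CONCURRENCY is read from the environment when the threaded process starts. The helper name run_with_concurrency is hypothetical, and the echo child only stands in for whatever command starts mysqld on the test machine.

```shell
#!/bin/sh
# Sketch: run a command with PTHREAD_CONCURRENCY set in its environment.
# run_with_concurrency is a hypothetical helper name, not part of NetBSD.
run_with_concurrency() {
    n=$1
    shift
    # env sets the variable only for the child process, not for this shell.
    env PTHREAD_CONCURRENCY="$n" "$@"
}

# Demonstration with a harmless child process; for the benchmark the
# command would be whatever starts mysqld on the test machine.
run_with_concurrency 2 sh -c 'echo "PTHREAD_CONCURRENCY=$PTHREAD_CONCURRENCY"'
# prints: PTHREAD_CONCURRENCY=2
```

Setting it per command like this keeps the 1-CPU and 2-CPU runs easy to compare, since nothing else in the environment changes.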

-- thorpej

Frank van der Linden

2005-02-09 21:45:22 UTC

Post by Frank van der Linden
Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

I would definitely suggest to the author of the article to increase
PTHREAD_CONCURRENCY for his tests and run them again.

Yeah, I dropped him a note. I'm curious how well it will perform
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

- Frank

Hubert Feyrer

2005-02-09 21:46:30 UTC

Post by Frank van der Linden
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?

- Hubert

--
NetBSD - Free AND Open! (And of course secure, portable, yadda yadda)

Frank van der Linden

2005-02-09 21:58:33 UTC

Post by Hubert Feyrer
Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?

I was talking about the first test, and the 1- to 2-CPU comparison,
which is interesting to look at if PTHREAD_CONCURRENCY=2 should work,
since we do so well in the 1-CPU case.

The I/O problem is a whole different matter, I don't know what's going
on there.

- Frank

Andy Ruhl

2005-02-09 22:16:15 UTC

Post by Frank van der Linden
The I/O problem is a whole different matter, I don't know what's going
on there.

Not sure if this is related...

I work on a database product and the forcedirectio option on Solaris
is very well known by us to improve performance.

The writeup sort of shows a parallel between Solaris 10 and NetBSD in
performance with the large row... That is until they set forcedirectio
on Solaris.

Maybe this is more specifically a filesystem problem, not a general I/O problem?

Might be interesting to see this done again with various different
parameters on the filesystem, or just a different filesystem.

Andy

Rui Paulo

2005-02-10 00:19:47 UTC

Post by Frank van der Linden
The I/O problem is a whole different matter, I don't know what's going
on there.

Maybe NEW_BUFQ_STRATEGY helps ?

"The FFS2 file system was used for all of the NetBSD 2.0 partitions,
with soft updates enabled. I built two separate kernels, one for
single-CPU and one for dual-CPU. They were based on GENERIC and
included the process size increases I mentioned above."

The author doesn't mention if it was enabled or not.

--
Rui Paulo <***@netbsd-pt.org> http://www.netbsd-pt.org/users/rpaulo/

Andy Ruhl

2005-02-10 05:18:54 UTC

On Thu, 10 Feb 2005 00:19:47 +0000 (UTC), Rui Paulo

Post by Rui Paulo

Post by Frank van der Linden
The I/O problem is a whole different matter, I don't know what's going
on there.

Maybe NEW_BUFQ_STRATEGY helps ?
"The FFS2 file system was used for all of the NetBSD 2.0 partitions,
with soft updates enabled. I built two separate kernels, one for
single-CPU and one for dual-CPU. They were based on GENERIC and
included the process size increases I mentioned above."
The author doesn't mention if it was enabled or not.

If I wanted to set up some simulation of the problem in a simple way,
is there any particular way I could do this? I want to test a few
things. If I can show that some small write is proportionally a lot
faster than a larger one, that would be all it would take.

Is it as easy as doing something like this:

dd if=/dev/zero of=somefile bs=1024k count=1

versus

dd if=/dev/zero of=somefile bs=1024k count=10

?

Not sure if block sizes have anything to do with this...

If someone gives me something to chew on, I can test this on 2.0 on
i386 or amd64, both on the same amd64 machine. It's only 1 processor
but I don't think that makes any difference here. Figured I'd try
various filesystem options, or even a different filesystem.
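
A rough sketch of how the two dd runs above might be wrapped and timed, offered as a starting point and not a validated benchmark. The helper name write_test and the scratch file name are made up for illustration, and timing uses date +%s, so resolution is one second.

```shell
#!/bin/sh
# Sketch: time sequential writes of different sizes with dd.
# write_test and the scratch file name are hypothetical, for illustration.
write_test() {
    count=$1                    # number of 1 MB blocks to write
    file=${2:-./ddtest.$$}      # scratch file on the filesystem under test
    start=$(date +%s)
    dd if=/dev/zero of="$file" bs=1024k count="$count" 2>/dev/null
    sync                        # ask the kernel to flush dirty buffers
                                # (may return before the flush completes)
    end=$(date +%s)
    rm -f "$file"
    echo "$count MB: $((end - start)) s"
}

# Same sizes as the two dd commands above.
for n in 1 10; do
    write_test "$n"
done
```

Writes this small will probably be absorbed by the buffer cache and report 0 s, so the counts would likely need to grow well beyond the machine's RAM before any difference shows up.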

Andy

Jeff Rizzo

2005-02-09 22:10:42 UTC

Post by Frank van der Linden
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?
- Hubert

Well, I'm not really sure, since this is pretty far outside my area of
expertise, but since he claims the Solaris numbers on the 10M row test
were also horrid until turning off some FS caching, I wonder if tuning
the VM system might help some. I just sent the author a link to Arto
Selonen's VM tuning page
(http://www.selonen.org/arto/netbsd/vm_tune.html) in hopes that he might
be able to squeeze some better numbers out of it...
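
As a starting point for that kind of tuning, here is a small sketch that only prints the current settings. It assumes the relevant knobs are the vm.anonmin/anonmax, vm.filemin/filemax and vm.execmin/execmax sysctls; the helper name show_vm_tunables is hypothetical, and no values are changed.

```shell
#!/bin/sh
# Sketch: print the current VM balance sysctls without changing them.
# show_vm_tunables is a hypothetical helper name; the list of sysctl
# names is an assumption about which knobs matter here.
show_vm_tunables() {
    for name in vm.anonmin vm.anonmax vm.filemin vm.filemax \
                vm.execmin vm.execmax; do
        # Fall back to a marker if the sysctl does not exist on this system.
        value=$(sysctl -n "$name" 2>/dev/null) || value="(not available)"
        echo "$name = $value"
    done
}

show_vm_tunables
```

Recording these before and after a benchmark run makes it clear which settings a given result was measured with.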

Of course, someone who knows better than I should feel free to let me
know if I'm full of it...

+j

Pavel Cahyna

2005-02-09 22:53:56 UTC

Post by Jeff Rizzo

Post by Frank van der Linden
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?
- Hubert

I don't think that's the reason. He explicitly says that NetBSD wasn't
swapping at all, so turning the caching off wouldn't help IMHO.

But, doesn't NetBSD need to read data to cache when they are rewritten,
since UBC was introduced? Could this explain that behaviour?

Bye Pavel

SODA Noriyuki

2005-02-10 23:49:50 UTC

Post by Pavel Cahyna

On Wed, 09 Feb 2005 23:53:56 +0100,

Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?

But, doesn't NetBSD need to read data to cache when they are rewritten,
since UBC was introduced? Could this explain that behaviour?

This sounds right, indeed.

Isn't anyone working to implement chuq's idea?

http://mail-index.netbsd.org/tech-kern/2003/05/27/0000.html
--
soda

Chuck Silvers

2005-02-15 17:34:56 UTC

Post by SODA Noriyuki

Post by Pavel Cahyna

On Wed, 09 Feb 2005 23:53:56 +0100,

Um, have you looked at the 10M Row test?
http://www.newsforge.com/blob.pl?id=82311a9e7896a841032c395f270d6a0f
Why do we not kick ass there at all?

But, doesn't NetBSD need to read data to cache when they are rewritten,
since UBC was introduced? Could this explain that behaviour?

This sounds right, indeed.
Isn't anyone working to implement chuq's idea?
http://mail-index.netbsd.org/tech-kern/2003/05/27/0000.html

well, the reason I haven't done it myself yet is that it'll be a fair bit
of work to implement. but it occurs to me that there's an easier way to
achieve most of the same effect: use UVM loaning instead of uiomove()
to transfer the data in write(). this would be a bit more restricted
than what I described earlier, since it would require that the address
of the memory being written to the file be page aligned (whereas there's
no restriction on the memory address in the previous scheme), but I bet
that many applications that do multi-page write()s will happen to be using
page-aligned memory addresses as well. I verified that this is the case
for mysql, at least in the default configuration.

the past few days I've been finishing my code for loaning-for-read()
that I've been infrequently fiddling with for probably a couple years now,
I'll post about it on tech-kern shortly. loaning for write() shouldn't
be much harder than for read().

I suspect that in addition to loaning-for-write() we would also need to
improve the flush-behind and page reuse policies some more, but we'll
find out if we go down that path.

(of course, what databases that do their own caching really want is
unbuffered I/O, but mysql appears to rely on the OS file cache since
the mysqld process is only 60MB on this machine with 3GB of RAM.
maybe that's just how it's configured by default, I don't know much
about mysql. unbuffered I/O would also be a fair bit of work to
implement, but I'd like to see that in netbsd eventually.)

-Chuck

SODA Noriyuki

2005-02-15 19:18:41 UTC

Post by Chuck Silvers
but it occurs to me that there's an easier way to
achieve most of the same effect: use UVM loaning instead of uiomove()
to transfer the data in write().
but I bet
that many applications that do multi-page write()s will happen to be using
page-aligned memory addresses as well. I verified that this is the case
for mysql, at least in the default configuration.
the past few days I've been finishing my code for loaning-for-read()
that I've been infrequently fiddling with for probably a couple years now,
I'll post about it on tech-kern shortly. loaning for write() shouldn't
be much harder than for read().

Mmmm, great.

Post by Chuck Silvers
(of course, what databases that do their own caching really want is
unbuffered I/O, but mysql appears to rely on the OS file cache since
the mysqld process is only 60MB on this machine with 3GB of RAM.
maybe that's just how it's configured by default,

The MySQL configuration that the author of the report used is
described in the previous article:
http://software.newsforge.com/software/04/12/27/1238216.shtml?tid=72&tid=29

i.e.
innodb_buffer_pool_size=256M
innodb_log_file_size=128M
innodb_log_buffer_size=8M
innodb_flush_log_at_trx_commit=1

Since the machine only has 512MB RAM, more than half of the memory
(256MB+8MB) is allocated to the database cache.

Post by Chuck Silvers
unbuffered I/O would also be a fair bit of work to
implement, but I'd like to see that in netbsd eventually.)

Yeah, direct I/O must be better for database applications.
--
soda

Chuck Silvers

2005-02-16 02:32:10 UTC

Post by SODA Noriyuki

The MySQL configuration that the author of the report used is
http://software.newsforge.com/software/04/12/27/1238216.shtml?tid=72&tid=29
i.e.
innodb_buffer_pool_size=256M
innodb_log_file_size=128M
innodb_log_buffer_size=8M
innodb_flush_log_at_trx_commit=1
Since the machine only has 512MB RAM, more than half of the memory
(256MB+8MB) is allocated to the database cache.

hmm, I had applied those settings without paying attention to what they were.
even with those, the mysqld process never got above 60-odd MB in size.
it's not using any sysv shared memory, so where is the cached data going?
I'm sure I'm just being dense here. :-)

-Chuck

Geert Hendrickx

2005-04-03 08:51:47 UTC

Post by Frank van der Linden
Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

I would definitely suggest to the author of the article to increase
PTHREAD_CONCURRENCY for his tests and run them again.

Yeah, I dropped him a note. I'm curious how well it will perform
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

Have these benchmarks been re-run yet? The author promised to do so, and
post the results, but I haven't heard anything about it anymore. Which
is a pity.

GH

--
:wq

Chuck Silvers

2005-04-10 15:22:35 UTC

Post by Geert Hendrickx

Post by Frank van der Linden
Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

I would definitely suggest to the author of the article to increase
PTHREAD_CONCURRENCY for his tests and run them again.

Yeah, I dropped him a note. I'm curious how well it will perform
(if stable). I'm glad to see that we kick ass in the single-CPU
case.

Have these benchmarks been re-run yet? The author promised to do so, and
post the results, but I haven't heard anything about it anymore. Which
is a pity.

tony has posted an addendum to his benchmark articles on his blog:

http://vegan.net/tony/blog/index.php?Type=Article&ArticleID=2

-Chuck

James Chacon

2005-02-10 07:28:49 UTC

Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

I would definitely suggest to the author of the article to increase
PTHREAD_CONCURRENCY for his tests and run them again.

Is there any reason the current system doesn't auto adapt to # cpus for
default concurrency? Seems very non-intuitive to have to do this manually..

James

Mike M. Volokhov

2005-02-10 08:15:45 UTC

On Wed, 9 Feb 2005 20:48:16 +0100

Hm.. is this a threaded test? It looks like it. If so, the lack of
PTHREAD_CONCURRENCY > 1 explains the crappy 2-CPU results.

Just wondering, why not set PTHREAD_CONCURRENCY to number of CPUs by
default?
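
Until the default changes, something like the following could derive the value by hand. This is only a sketch: it assumes hw.ncpu reports the CPU count, and the helper name default_concurrency is hypothetical.

```shell
#!/bin/sh
# Sketch: pick a PTHREAD_CONCURRENCY value from the CPU count.
# default_concurrency is a hypothetical helper name.
default_concurrency() {
    # hw.ncpu is the BSD sysctl for the number of CPUs; fall back to 1
    # if the query fails or returns something that is not a number.
    n=$(sysctl -n hw.ncpu 2>/dev/null)
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

PTHREAD_CONCURRENCY=$(default_concurrency)
export PTHREAD_CONCURRENCY
echo "PTHREAD_CONCURRENCY=$PTHREAD_CONCURRENCY"
```

Any threaded program started from this shell afterwards inherits the exported value.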

--
Mishka.

Hubert Feyrer

2005-02-10 12:21:59 UTC

Post by Mike M. Volokhov
Just wondering, why not set PTHREAD_CONCURRENCY to number of CPUs by
default?

IIRC there were (are?) problems with that.
But as they won't be found with the current default, that's useless, and
AFAIK that default was changed yesterday (in -current, no idea if it will
be pulled into 2.1).

- Hubert

--
NetBSD - Free AND Open! (And of course secure, portable, yadda yadda)

Jeff Rizzo

2005-02-09 20:21:02 UTC

Post by Joel Macklow
Anyone seen this?
http://software.newsforge.com/article.pl?sid=04/12/27/1243207&from=rss
We do really well with the select-key test on a single CPU. Any
comments regarding the other tests from those who know? This is the
kind of info the advocacy group needs - as long as we look good in it ;)

I wonder if the crappy performance on the 10M row test would be affected
by the tuning of vm.* sysctls, especially since Solaris problems on the
same test were mitigated by forcedirectio...

Hm.

+j

Dmitri Nikulin

2005-04-10 17:28:45 UTC

I think he invalidated himself when he said:
"The other kernel only included the i686 definition, making it specific
to Pentium III processors or higher."

I can only assume that this is the kind of thing Hubert was talking
about when his blog mentioned "Aaah, the joy of people who have no idea
of what they're writing about still being bold enough to write about
things they have no idea about."

NetBSD's performance is anomalous, since I have never found anywhere any
evidence that it is measurably slower than Linux: in fact on some of my
machines it runs circles around it. MySQL is a horrible benchmark
platform since it is highly Linux-centric. It's not even good software.
Postgres could have given a more balanced result.

I threw www/apache2 + lang/php5 + databases/mysql4-server +
databases/php-mysql onto a NetBSD 3.0-beta machine to run Invision Power
Boards (not my idea, just my server). The machine also does a thorough
pf and IPsec setup, and is an NFS server. It happens to be a P3 1GHz
laptop. Everyone agrees it's blazing fast. Whatever "poor performance"
NetBSD has in this kind of setup must be restricted to their tests. Out
of curiosity: did the investigation into increasing (quadrupling..)
performance for the (IIRC) 10M Rows benchmark actually include testing
Linux on the same machine and seeing if the performance difference was
just as the public benchmark showed? I have a lot of trouble believing
the findings without evidence from the other side. But I admit my time
spent browsing the mailing list archives isn't that extensive.

The worse shame is that benchmarks like this are often taken as gospel and
used to judge the entire merits of an operating system. People don't
seem to care about quality, security, stability, or indeed anything you
can't measure with a few lines of someone's language of choice. I have
heard people say "Why would I use BSD for a server? Linux 2.6 is O(1)
everywhere!", even though the kinds of loads they would deal with would
never emerge into the zone where Linux' O(1) becomes faster than an O(n)
with a lower starting point (which is often the case). So not only are
they losing micro performance (and macro: has anyone here actually tried
RUNNING a real server on Linux? Nightmare) but they are throwing away
quality assurance and security. Their loss, but it's hard to deny that
public opinion also sways donors /and/ developers, possibly deducting
from the foundation's available resources. A little bad PR is all it takes.

It's good, on the other hand, that NetBSD is gaining recognition as a
competitor for FreeBSD's vaunted performance titles, but the real
corporate- and government-recognition giant is Linux. You know what I
heard on Slashdot? That some people are condemning NetBSD for moving to
replace Linux, for instance, implementing PAM (because, you know, only
Linux has PAM, and there are no real uses for it. Right?), having a
corporate-friendly logo, new numbering scheme, and so on. I only hope
potential future developers don't see it the same way. Even if it is so,
good, it's high time Linux' most popular advantage (basically, that it's
well known) got some real competition too.

Wow, long rant.

-Dmitri Nikulin

Hubert Feyrer

2005-04-10 21:10:12 UTC

Post by Dmitri Nikulin
"The other kernel only included the i686 definition, making it specific
to Pentium III processors or higher."
I can only assume that this is the kind of thing Hubert was talking
about when his blog mentioned "Aaah, the joy of people who have no idea
of what they're writing about still being bold enough to write about
things they have no idea about."

I'd appreciate it if you would not cite me completely out of context to
support your flames. I recommend you go back and read the article about
which I wrote what you cite.

FWIW, I think Tony definitely knows a fair bit about what he's writing,
and maybe you should start thinking about how NetBSD could make things easier
to do what he really wanted (add cpu-specific compiler flags).

Post by Dmitri Nikulin
Wow, long rant.

Indeed. If you want to start changing things for the better
-> http://www.netbsd.org/contrib/howto.html (patches preferred :)

- Hubert

--
NetBSD - Free AND Open! (And of course secure, portable, yadda yadda)

Dmitri Nikulin

2005-04-11 04:16:49 UTC