Multithreaded processors - DB Tuning Best Practices

Posted by chrisrichardson on 24-Jul-2009 08:38

Hi All,

Does anyone have anytips for best tuning the Database to cope with a multthreaded environment?  We are testing the latest release on a new server with the following spec: Sun Sparc T5440 4*1.2 ghz 8core (256 threads), 64gb ram.  Data is accessed over NFS locally (box to box direct 1gb connection).

Read performance is very good - 10+ fold performance increase compared to an older dual processor box.  We are running a binary dump and load, the load is averaging 20,000 records a minute which seems incrediblly slow.  The older box is more like 20,000 per 20secs.  I wondered if the binary load could thread properly or not?

I know NFS is know not ideal however Sun insist that the performance is better than running over iscsi (this will be my next test), iowaits are 0 so this leads me to assume that the binary dump/load is not (or cannot) taking full advantage of the available processing power.

The currenct DB params are:

-db /data/live/
-L 56000
-n 200
-S hirevp
-bibufs 35
-B 262144
-spin 75000
-tablerangesize 527
-indexrangesize 881

I'm unsure why -spin is so high (this is probably historical)....documentation suggests that is should be more like 40000.  Any help/advice on any of the above greatly appreciated.

Chris

edit: Storage box is a Sun 7110 Unified Storage.  1x1.90GHz Quad-Core AMD 8gb Ram.

All Replies

Posted by chrisrichardson on 24-Jul-2009 08:48

OK, just reading the manual - binary dump/load is indeed a single threaded operation.  Still, any info/tips regarding the config for multi-threaded setups would be very useful.

Chris

Posted by Admin on 24-Jul-2009 10:56

You d'ont specify the mos important thing: your Progress version,

The version >10.1A

-thread n

For databases with an Enterprise license, indicate if an online dump is threaded. Specify zero (0) for a single-threaded dump, one (1) for a threaded dump.

-threadnum nthreads

For a threaded dump, specify the maximum number of threads to create. The default value is the number of system CPUs. The actual number of threads created may be less than nthreads. PROUTIL DUMP determines the number of threads to create based on the complexity of the index the DUMP follows

also you can use:

proutil mydb –C dumpspecified order.orderdate LT 01-01-1998 .

with this option you can start more dump process  on the same table, in parallel.

Posted by chrisrichardson on 24-Jul-2009 11:16

Thanks.... forgot to add this in, the version is 10.2a.

Does the binary load process thread - or by design does it load sequentially to populate the records in order?

Posted by Admin on 24-Jul-2009 11:34

this is good question question for Gus, from this point only God can answer

Reading the documentation, I think that the "load" is a single-thread process.

Posted by ChUIMonster on 24-Jul-2009 12:07

Any individual proutil -load is single threaded.

But you can run multiple load processes simultaneously.  You probably only want one per storage area.

You might want to check out my "Highly Parallel Dump & Load" presentation from past Exchanges.  http://communities.progress.com/pcom/docs/DOC-15841

Posted by ChUIMonster on 24-Jul-2009 12:13

-spin is almost certainly way too high.

If, by 40000, you are referring to various bits of advice that try to relate -spin to number of processors you are on the wrong track.  That advice is not useful and has been retracted.

In Paris last summer Rich Banville gave a great presentation on latches and provided some benchmarks related to -spin on modern releases of Progress.  The essence of it is that much lower values than previously used are probably best.  Settings like 2000, 5000 and 10000 are probably reasonable.  The "best" value has nothing to do with number of processors -- if it were that simple it would be automatically set and we wouldn't need a parameter for it.

Having said all of that -- for most people who are not running benchmarks setting ti to some value such as 10000 and leaving it alone is almost always going to be perfectly fine.

Posted by ChUIMonster on 24-Jul-2009 12:19

-B is probably way too high for loading data.

I find that anything more than about 25000 for a load is, at best, a waste and at worst actually hurts.

When loading it is also very helpful to run with no-integrity (-i).  Yes, if the load crashes your db is hosed but if it crashes you probably don't really care, you're going to start over anyway.

It is also useful to set the bi cluster size to a largeish value (65536 for example) and to make sure that there are plenty of APWs (and a BIW) running.

NFS is something that I use with success on occasion when the source is a very old server with terrible IO -- but I wouldn't expect it to be a very good option any other time and I'd be very surprised if putting the new database on NFS were a good thing.

You can also make a huge difference dumping and loading by carefully designing the storage area configuration.

Posted by chrisrichardson on 25-Jul-2009 04:52

Thanks both for the advice....this is turning out to be on of those "tip of the iceburg" projects!  I am finding out that I'm only just scratching the surface of db tuning/design, regardless of the processor type.

Will let you know how I get on with the testing.  Thanks, Chris

Posted by ChUIMonster on 27-Jul-2009 09:20

You might want to consider

some experienced professional help

.

Posted by chrisrichardson on 31-Jul-2009 16:57

Does anyone know of any documentation about the thread-awareness of Openedge?  I am making the assumption that _progress process is thread-aware, without really knowing the multi-threading capabilities of OpenEdge.  Can anyone shed any light on this?

Thanks, Chris

Posted by ChUIMonster on 01-Aug-2009 07:21

_progres is aggressively single threaded.

Multi-threading in Progress is restricted to a few db utilities (proutil, dbtool).

Posted by Thomas Mercer-Hursh on 03-Aug-2009 11:04

http://www.oehive.org/MultiThread

This thread is closed