Low Performance on SAN

Posted by aszameitat on 05-Dec-2011 09:52

Many of our premier customers (especially those with databases > 50 GB) want to move their DB servers from local disk drives to state-of-the-art SAN storage.

We recommend 4-8 Gbit Fibre Channel connections, but we are observing very low TPS values with the ATM benchmark.
For comparability reasons we are still using the "old" benchmark with Type I storage areas and the parameters listed below:

set NUMAPW=1
set BIW=yes
set AIW=no
:
: Server options
:
: -n = max users, -L = lock table entries
set SVOPT1=-n 20 -L 1024
: -B = database buffers (in blocks), -spin = spin-lock retries
set SVOPT2=-B 8000 -spin 4000
: -bibufs = before-image buffers
set SVOPT3=-bibufs 20
:
: Client options (-l = local buffer size)
:
set CLOPT=-l 5

Whereas systems with local drives reach up to 7,500 TPS, the SAN tops out at about 3,500 TPS (using the same benchmark parameters). The SAN manufacturers state that the SAN isn't heavily loaded during the tests, yet the TPS stays very low.
We tried using more hard drives in the SAN (16 in RAID 10 instead of 8), but the TPS didn't change. We used a better FC controller with more cache, but the TPS didn't improve either.
It seems that the time between writing data to the database and getting the acknowledgement from the drive ("everything is on the disk") is much longer on the SAN. On local drives, the TPS difference between using and not using the write cache can also be a factor of 20 (e.g. 6,000 TPS instead of 300 TPS on one of our customers' systems).
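
For what it's worth, a crude way to compare that raw synchronous-write behaviour outside of Progress might be something like this (Linux example, purely illustrative -- oflag=dsync forces every 8 KB write to be acknowledged by the storage before the next one is issued):

# illustrative only: 10000 synchronous 8 KB writes; dd reports the resulting throughput
dd if=/dev/zero of=ddtest.tmp bs=8k count=10000 oflag=dsync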

Can anyone give us a hint as to how this can be explained and what the possible causes could be?
Is it possible to log something to get a clue as to why a 30,000 Euro SAN is so much slower than a 3,000 Euro local RAID 10?
Are there parameters that should be set when running the Progress database on a SAN to improve throughput?

I have attached an MS Excel chart that shows how the IOMeter benchmark results rise as the number of disks increases, while the ATM benchmark stays almost the same.

Any hint would help us...

Thanks,

  Alex

SAN_PerformanceTest.xlsx

All Replies

Posted by Tim Kuehn on 05-Dec-2011 12:22

aszameitat wrote:

set SVOPT2=-B 8000

Assuming 8K block size, this is only 65MB. I'll bet your cache hit % is extremely poor too.

Depending on your available RAM, I'd bump this to a much larger number - say 400000 (or 3.2GB) at the minimum.
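
In the benchmark's option file that would look something like this (keeping the original -spin value):

: 400000 buffers x 8 KB blocks = roughly 3.2 GB of buffer pool
set SVOPT2=-B 400000 -spin 4000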

Posted by aszameitat on 06-Dec-2011 00:45

Thank you for your tip. It's clear that the TPS will rise if the -B parameter is higher, because more of the work will be satisfied from memory. But we want to keep the effect of memory small in order to measure the capabilities of the disk subsystem.

Because the disk is much slower than memory, we want to concentrate on the disk first; once the disk is fast we will go on and tune the rest (on the production system), including the -B parameter.

The point is: I know how to get more TPS out of the benchmark (higher -B, a larger BI cluster size, more APWs, Type II storage areas, etc.), but I can't understand why the ATM benchmark doesn't get faster when the IOMeter throughput doubles or quadruples.
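
For reference, that sort of tuning would look roughly like this (illustrative values only; "atm" is just a placeholder for the benchmark database name):

: illustrative values -- more page writers and a much larger buffer pool
set NUMAPW=4
set SVOPT2=-B 400000 -spin 4000
: larger before-image cluster size, set offline with proutil
proutil atm -C truncate bi -bi 16384

Type II storage areas would additionally need a blocks-per-cluster setting in the structure (.st) file, which I have left out here.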

Posted by gus on 06-Dec-2011 08:43

Hard to say what is going on without more information. You have a pretty old version of the benchmark and I cannot remember too much about its default configuration. If the database you built is small enough, there will be lock contention.

There is a more recent version of atm on the OpenEdge > RDBMS page. It uses a "standard workload" I came up with some years ago as its default configuration. You can read about that in the atm documentation pdf.

Posted by Tim Kuehn on 06-Dec-2011 08:45

In that case I'd suggest talking to your SAN vendor, as they should already have whitepapers on getting the most out of your SAN.

My _suspicion_ is there's a setting somewhere which needs to be tweaked, or what you thought was a RAID 10 configuration is really some odd variation of RAID 5.

Posted by ChUIMonster on 21-Dec-2011 13:30

Looking at your spreadsheet I see two tests that involve internal disks.  One of those tests (line 14, "2nd server") seems roughly equal to the SAN performance -- the difference is very small.  The other seems a lot better, in spite of appearing to have the same disk configuration.  So I would wonder: what's different?

You also have an IOMeter test with "SVC" (line 11) that seems far superior to all of the others.  But no corresponding ATM testing.  Why is that?

Posted by aszameitat on 23-Dec-2011 01:30

Thank you for your input on the problem.

You're right: the differences between the results in lines 13 and 14 of the spreadsheet could be the key that leads us to some new information. The company that provided the two servers is investigating that point.

Perhaps that will also give me some information on how to design an IOMeter test that behaves like the Progress ATM benchmark (that's one of my secondary goals).

We couldn't run an ATM benchmark on the machine in line 11 because it is a production machine of another customer of the hardware company, which, unfortunately, doesn't use Progress. That test was made to demonstrate that a SAN with 32 disks is faster than one with 16 disks in the IOMeter benchmark.

Posted by aszameitat on 23-Dec-2011 01:44

I took a look at the differences between version 5.1 of the ATM benchmark and "my" version, but could only find cosmetic differences, plus the fact that it uses 150 clients and runs for 10 minutes per test.

So I believe that this test would not lead to very different results.

If we had a locking issue, it should also occur in the test with local disks...

Do you know whether a Progress whitepaper exists that explains, at a very technical level, how data is written from the broker to disk (for the BI and DB files)? Perhaps this information could help us point the hardware vendor at the right spot.

Posted by ChUIMonster on 28-Dec-2011 09:42

My best guess is that the SAN is not actually configured the way that you have been told that it is configured.  It is not exactly unheard of for SAN vendors to obfuscate the true configuration of their systems.  Getting a straight answer out of them is sort of like getting a congressman to explain his position on congressional pay raises.  The bit about "the SAN isn't heavily loaded during the tests" is interesting.  What proof do they have to offer and share?  Understanding why they say that may help to identify where the problem is.  (All too often these sorts of assertions are based on aggregate measures of total SAN utilization over long periods of time -- IOW they do not look specifically at the LUNs that you are using during the test period, they just look at something like average utilization during the week of the test and tell you "everything is fine".)

Also -- is ATM actually the most relevant benchmark to your needs?  In my experience applications are more often constrained by read performance than transaction throughput.

Posted by gus on 03-Jan-2012 09:05

As you said, the code for the transactions is very much the same. The workload applied by the newer version is quite a bit different though, as is the database configuration.

There is no paper that goes into detail on the low-level I/O operations, which vary a bit by operating system. I can summarise, though it is unlikely to help you much.

The RDBMS uses write-ahead logging with a buffer replacement policy of no-force, steal. Updates are performed in place on buffers at the block level, with log records generated for each change. Log records are spooled to the "before-image log" and must be written to disk /before/ the corresponding data block can be written to disk. No-force, steal means that modified blocks are not forced to disk at transaction commit, and a buffer holding a block modified by one transaction can be reused by another transaction once the block has been written to disk.

Modified buffers that are not otherwise written are periodically forced to disk in asynchronous checkpoint cycles. Page-writer processes are used to schedule writes in an orderly fashion during a checkpoint cycle.

Writes to the before-image log are synchronous and writes to data extents are asynchronous but flushed at the end of a buffer pool checkpoint cycle.

Posted by gus on 03-Jan-2012 09:10

The atm benchmark is not like most applications but it is heavily I/O intensive when configured correctly. Though it writes a bit more than it reads, it has a high rate of random reads. So it should be at least somewhat meaningful when looking at storage subsystem I/O performance.

This thread is closed