proutil bigrow performance in 11.6

Posted by Rob Fitzpatrick on 17-May-2017 15:55

Today I have been doing a little testing of proutil bigrow on various platforms in 11.6.0 and 11.6.3.  The results aren't what I expect.

The test:
- create a sports DB
- set BI cluster size to 32 MB
- bigrow to add another 4 clusters (writing 256 MB total to BI)
- delete database
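
The steps above can be sketched as a small shell function. This is only a sketch, not the script that produced the timings below: the `run` helper, the `DRY_RUN` variable, and the database name are hypothetical, and it assumes the usual `prodb`, `proutil`, and `prodel` commands are on the PATH. Elapsed time is taken with `date +%s`, so it has whole-second resolution only.

```shell
# Hypothetical helper: echo the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Usage: run_bigrow_test <dbname> [extra bigrow arguments, e.g. -zextendSyncIO]
run_bigrow_test() {
    db=$1; shift
    run prodb "$db" sports                      # create a copy of the sports DB
    run proutil "$db" -C truncate bi -bi 32768  # BI cluster size in KB (32 MB)
    start=$(date +%s)
    run proutil "$db" -C bigrow 4 "$@"          # add 4 more BI clusters
    end=$(date +%s)
    echo "bigrow elapsed: $((end - start)) s"
    echo y | run prodel "$db"                   # assumes prodel asks for a y/n confirmation
}
```

Running `DRY_RUN=1 run_bigrow_test bigrowtest -zextendSyncIO` prints the commands without executing them, which is a cheap way to check the sequence before pointing it at a real system.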

I expect this to be fast in 11.6 due to the changes made in 11.3, after which the BI writes for bigrow should be buffered.  I tested on each platform both with and without -zextendSyncIO.

Results, in seconds, on Server 2012 R2 Datacenter (11.6.3, VM, SAN storage):
default:
26
25
25
with -zextendSyncIO:
26
26
25

Results on RHEL 6.3 (11.6.3, bare metal, local disks):
default:
9.14
8.54
7.25
9.55
with -zextendSyncIO:
9.07
9.88
9.78

Results on AIX 7.1.0.0 (11.6.0, bare metal, SAN storage):
default:
1.15
1.05
0.93
with -zextendSyncIO:
16.54
14.28
13.54

On AIX, the difference with and without -zextendSyncIO is dramatic, and is what I expect based on my understanding of the 11.3 enhancement.  On Windows and Linux I see no significant differences.

Does anyone else see these platform differences?  Or do you have different results you can share?  Should this work the same on all platforms?

Posted by mfurgal on 17-May-2017 16:28

Rob.  

Windows and Linux extend files unbuffered.  Those operating systems do not have the support in place to extend files buffered, hence -zextendSyncIO will have no effect.

Mike
-- 
Mike Furgal
Director – Database and Pro2 Services
PROGRESS Bravepoint
678-225-6331 (office)
617-803-2870 (cell)


Posted by Rob Fitzpatrick on 17-May-2017 17:54

Thanks Mike for the explanation.

Posted by George Potemkin on 17-May-2017 23:45

Is there any difference between bigrow test and dd test?

dd if=/dev/zero of=./test.out bs=8k count=32768 oflag=dsync
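
For reference, the dd command above writes the same volume as the bigrow test (256 MB). A quick sketch of the arithmetic, where `dd_total_mb` is a hypothetical helper name:

```shell
# Usage: dd_total_mb <block size in KB> <block count>
dd_total_mb() {
    echo $(( $1 * $2 / 1024 ))
}

dd_total_mb 8 32768   # bs=8k count=32768
```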

Posted by mfurgal on 18-May-2017 07:22

I use the dd command as well on UNIX when I am told that it must be a Progress issue.  The problem is that the osync and dsync options are not available on all platforms, but proutil -C bigrow is.

My team and I have probably run the bigrow command on 500+ systems.  Based on this experience, we know without looking much further when the disk is the bottleneck.  My recommendation: a minimum of 10 MB/second sequential write speed when measured using the bigrow command.
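
To compare a bigrow timing against that 10 MB/second guideline, the elapsed time has to be converted to a write rate. A minimal sketch follows; `mb_per_sec` and `meets_minimum` are hypothetical helper names, and it assumes the timings earlier in the thread cover only the 256 MB of BI writes.

```shell
# Usage: mb_per_sec <megabytes written> <elapsed seconds>
mb_per_sec() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

# Usage: meets_minimum <rate in MB/s> [minimum, default 10]
# Exit status is 0 when the rate is at or above the minimum.
meets_minimum() {
    awk -v r="$1" -v min="${2:-10}" 'BEGIN { exit !(r + 0 >= min + 0) }'
}

mb_per_sec 256 26      # a Windows run from the first post
mb_per_sec 256 9.07    # a RHEL run with -zextendSyncIO
mb_per_sec 256 16.54   # an AIX run with -zextendSyncIO
```

Under that assumption, the 25 to 26 second Windows runs land right around the 10 MB/second line, while the RHEL runs and the AIX runs with -zextendSyncIO are comfortably above it.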

Mike


Posted by ChUIMonster on 18-May-2017 07:37

Adding -r to the bigrow command seems to radically speed it up on Linux.

Posted by mfurgal on 18-May-2017 07:44

Yes, but what is the point?

If you want to measure disk speed, then adding “-r” defeats that purpose since all writes go to the OS cache.  However adding “-r” to the bigrow command to preformat clusters is a very useful trick.

Mike


Posted by Rob Fitzpatrick on 18-May-2017 07:47

Thanks for the replies.  My aim in this case was to use bigrow as a tool to measure I/O performance.  When I ran it on AIX and it was faster than I expected, it reminded me that I haven't been seeing the anticipated performance improvement on other platforms in 11.3+.  I'll just modify my shell script to always use -zextendSyncIO so the results are consistent between AIX and Linux.  

I'll compare it with George's dd command and see how they differ.

Posted by George Potemkin on 18-May-2017 08:01

Mike, I meant that bigrow /might/ use additional calls, like sync() or fdatasync(). Can we expect that the results of bigrow and dd commands will always match one another? In other words, do both commands always test only the write() system call?

Posted by gus bjorklund on 24-May-2017 07:04

> On May 18, 2017, at 9:04 AM, George Potemkin wrote:
>
> Can we expect that the results of bigrow and dd commands will always match one another?

no. that cannot be guaranteed.

0) the dd program has different options on different systems. some do not have an option for synchronous writes.

1) to further complicate things, the behaviour of the O_SYNC and O_DSYNC options to the open() system call varies by filesystem type. ext4 appears to ignore them, at least according to my experiments on CentOS 7.

2) if you use the dd program with /dev/zero as input, you may get a sparse file as output, depending on operating system and filesystem type. if so, nothing is written at all, just metadata, because the blocks with all zeros are optimised away.
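
One way to check whether a dd output file ended up sparse is to compare its apparent size with the space actually allocated for it. The following is a rough sketch only: `is_sparse` is a hypothetical helper name, and a filesystem with compression or delayed allocation can also report fewer allocated blocks, so a positive result is a hint rather than proof.

```shell
# Usage: is_sparse <file>
# Exit status is 0 when fewer KB are allocated than the file size implies.
is_sparse() {
    apparent_kb=$(( ($(wc -c < "$1") + 1023) / 1024 ))   # size in KB, rounded up
    allocated_kb=$(du -k "$1" | awk '{ print $1 }')      # KB actually allocated
    [ "$allocated_kb" -lt "$apparent_kb" ]
}
```

For example, `is_sparse ./test.out && echo "test.out looks sparse"` after the dd run; if it reports sparse, the timing probably says little about disk write speed.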

the original purpose of the bigrow test was to check for potential disk write throughput shortcomings. for this purpose, it has worked rather well (though many arguments have ensued with storage IT people who think they know better). from long experience with hundreds of systems, we know that there was a strong correlation between long bigrow times and poor application/database performance.

unfortunately, with the changes in operating systems and filesystems, this has now become a bit harder to determine. i’ve been looking for another solution. so far i have nothing better.

This thread is closed