Checkpoint Duration

Posted by byoung2735 on 04-Feb-2015 10:32

Has anyone found a correlation between the Duration of a checkpoint and poor database performance? No buffers are being flushed during this period, but occasionally we'll see longer durations of 5+ seconds, which match the Sync Time; normally they are under 2 seconds. Solaris, 10.2B08. Thanks.

All Replies

Posted by James Palmer on 04-Feb-2015 10:35

How frequently are you checkpointing? As far as I know, all db activity is suspended whilst the checkpoint happens, so yes, if your checkpoints are taking a long time you will see poor performance.
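You can check this with promon against the database (the path below is just a placeholder); if memory serves, the screen is under R&D -> 3. Other Displays -> 4. Checkpoints:

promon /path/to/dbname

That screen shows checkpoint length, frequency, Duration and Sync Time, so you can see how often they fire and how long each one takes.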

Posted by TheMadDBA on 04-Feb-2015 12:42

Several seconds for a sync time is pretty bad and probably an indication of a hardware issue of some kind... disk configuration or swapping are the most likely candidates.

Check out the BI disks with iostat looking for queuing and/or latency issues. Make sure you aren't over committed on the memory side.
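For example (Solaris syntax, and your device names will differ):

iostat -xnz 5

Watch the wait/actv queue columns and the asvc_t service times for the devices holding the BI extents. A quick vmstat 5 showing a non-zero sr (scan rate) column would point to memory pressure/swapping.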

Posted by gus on 04-Feb-2015 13:00

nearly all of the I/O at the end of a checkpoint cycle is going to the data extents, NOT the bi extents. bi extents are written in synchronous mode. the data extents are fdatasync()'ed to get the changed data blocks written to disk from the operating system's filesystem cache if they have not already been written.

there are probably Solaris tuning parameters to control the pace of these writes by the filesystem. so you want to tune those to get to a state where, when we fdatasync(), there is not much left to write.
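for example (ufs only, purely illustrative values, and changes to /etc/system need a reboot), lowering autoup makes fsflush push dirty pages out sooner so there is less left over for the fdatasync():

set autoup=10
set tune_t_fsflushr=1

zfs and vxfs have their own equivalent knobs.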

Posted by gus on 04-Feb-2015 13:09

There is a correlation: while the data extents are being fdatasync()'ed, no database changes or transaction state changes can occur. I've seen an extreme case where this was consuming more than 25 percent of the total time.

Posted by George Potemkin on 04-Feb-2015 15:22

Is your database large? Are you using VxFS? If so, set the write_throttle parameter.

Posted by George Potemkin on 04-Feb-2015 15:27

A customer of ours had the following picture before they set:

vxtunefs -o write_throttle=32 /mount

04/29/14        Checkpoints
 No. Time        Len   Freq   Dirty   CPT Q    Scan   APW Q Flushes   Duration  Sync Time
 241 14:05:51   4098   4532   33382    6512  103316     902       0       0.72       0.16
 240 12:47:43   4081   4688   10265    6658   50981     540       0       0.88       0.11
 239 12:12:21    217   2122   21277   20394   15303       9       0       1.45       0.90
 238 12:09:19    174    182   23497   23041    2227       1       0      20.92      20.20
 237 12:05:33    217    226   34923   34230    2024       0       0       0.72       0.09
 236 12:01:00    264    273   36302   35557    2898       0       0       8.48       7.92
 235 11:56:10    278    290   34283   33577    3321       0       0       5.77       5.22
 234 11:51:23    277    287   31426   30614    3302      21       0       0.95       0.37
 233 11:46:31    280    292   36103   35414    2816       2       0       0.67       0.12
 232 11:41:04    306    327   31330   30601    3646       0       0      12.17      11.61
 231 11:36:18    274    286   36795   35993    3042      65       0       1.06       0.51
 230 11:30:35    326    343   34089   33519    3449      74       0       0.86       0.34

After the change the Sync Time is no higher than 0.01-0.02 sec.

Posted by byoung2735 on 25-Feb-2015 15:54

Thanks to all for the feedback. We are running ZFS, and after eliminating compression (which may have helped a bit), changing logbias to "throughput" from the default of "latency" has cut the checkpoint duration to about a quarter of what it was before.
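For reference, the equivalent ZFS commands are along these lines (the dataset name here is just a placeholder for ours):

zfs set compression=off dbpool/dbdata
zfs set logbias=throughput dbpool/dbdata

zfs get logbias,compression dbpool/dbdata confirms the current values.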

Posted by shaske1 on 27-Feb-2015 13:02

To provide a bit more info here...  there is still room for improvement.

Original, pre-ZFS sync times were averaging ~50-60ms.  Prior to flipping the logbias we were getting daily averages around 1300-1400ms.  After switching logbias to throughput we're around 250-300ms.

There are some other ZFS options (dedup and compression) that we suspect might be contributing. 

(Edited to add: Compression was previously adjusted for the BI filesystem, but not for the data/index areas. There is also an open question about whether a separate /bi disk pool or intent log (ZIL) device would be beneficial in this specific case; a rough command sketch is at the end of this post. There does not seem to be much guidance about tuning ZFS for Progress, so we have been taking cues from analogous Oracle DB best practices. We are not using -directio on this database, but we are somewhat skeptical that it would improve the situation.)

Disk in this case is local SSD.  (Pre-ZFS was SAN storage.)
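If we do end up testing a separate intent log device, the change would presumably be something like the following (pool, dataset and device names are placeholders; mirroring the log device is the usual advice):

zpool add dbpool log mirror c1t1d0 c1t2d0
zfs get dedup,compression,logbias dbpool/dbdata

No promises that it helps in our case, which is why it is still an open question.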
