Has anyone found a correlation between the duration of a checkpoint and poor database performance? No buffers are being flushed during this period, but occasionally we'll see longer durations of 5+ seconds, which match the Sync Time; normally they are under 2 seconds. Solaris, 10.2B08. Thanks.
How frequently are you checkpointing? As far as I know, all db activity is suspended whilst the checkpoint happens, so yes, if your checkpoint is taking a long time you will see poor performance.
Several seconds for a sync time is pretty bad and probably an indication of a hardware issue of some kind... disk configuration or swapping are the most likely candidates.
Check out the BI disks with iostat, looking for queuing and/or latency issues. Also make sure you aren't over-committed on memory.
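For example, here's a quick way to spot queuing. The 10 ms threshold and the device names are illustrative, and the field positions assume the Solaris `iostat -xn` column layout:

```shell
# Assumed Solaris 'iostat -xn' column order:
#   r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# wsvc_t is time spent in the wait queue (ms); sustained values in the
# tens of milliseconds on the BI/data disks point at queuing.
flag_slow_disks() {
  awk 'NR > 1 && $7 + 0 > 10 { print $11 " wsvc_t=" $7 }'
}

# Normal (Solaris-only) use -- sample every 5 seconds, 3 samples:
# iostat -xn 5 3 | flag_slow_disks

# Sanity check against a captured sample:
flag_slow_disks <<'EOF'
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.0  340.0   96.0 2720.0  4.1  1.2   14.3    3.5  22  41 c1t0d0
    3.0   55.0   24.0  440.0  0.0  0.3    0.2    5.1   0  12 c1t1d0
EOF
```

For the memory side, a persistently non-zero scan rate (the `sr` column) in `vmstat 5` output is the usual sign of over-commitment on Solaris.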
Nearly all of the I/O at the end of a checkpoint cycle is going to the data extents, NOT the BI extents. BI extents are written in synchronous mode. The data extents are fdatasync()'ed to get the changed data blocks written to disk from the operating system's filesystem cache if they have not already been written.
There are probably Solaris tuning parameters to control the pace of these writes by the filesystem, so you want to tune those to reach a state where, by the time we fdatasync(), there is not much left to write.
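On UFS, for instance, that pacing is done by the fsflush daemon, and the classic knobs live in /etc/system. The values below are illustrative only, not a recommendation, and require a reboot:

```
* /etc/system fragment (UFS only; illustrative values, reboot required).
* fsflush wakes every tune_t_fsflushr seconds and tries to cycle through
* all dirty pages every autoup seconds; a smaller autoup spreads the
* writes out so less is left for fdatasync() to flush at checkpoint.
set tune_t_fsflushr = 1
set autoup = 10
```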
There is a correlation: while the data extents are being fdatasync()'ed, no database changes or transaction state changes can occur. I've seen an extreme case where this was consuming more than 25 percent of the total time.
Is your database large? Are you using VxFS? Then set the write_throttle parameter.
A customer of ours had the following picture before they set:
vxtunefs -o write_throttle=32 /mount
04/29/14                              Checkpoints
No.  Time      Len   Freq  Dirty  CPT Q    Scan  APW Q  Flushes  Duration  Sync Time
241  14:05:51  4098  4532  33382   6512  103316    902        0      0.72       0.16
240  12:47:43  4081  4688  10265   6658   50981    540        0      0.88       0.11
239  12:12:21   217  2122  21277  20394   15303      9        0      1.45       0.90
238  12:09:19   174   182  23497  23041    2227      1        0     20.92      20.20
237  12:05:33   217   226  34923  34230    2024      0        0      0.72       0.09
236  12:01:00   264   273  36302  35557    2898      0        0      8.48       7.92
235  11:56:10   278   290  34283  33577    3321      0        0      5.77       5.22
234  11:51:23   277   287  31426  30614    3302     21        0      0.95       0.37
233  11:46:31   280   292  36103  35414    2816      2        0      0.67       0.12
232  11:41:04   306   327  31330  30601    3646      0        0     12.17      11.61
231  11:36:18   274   286  36795  35993    3042     65        0      1.06       0.51
230  11:30:35   326   343  34089  33519    3449     74        0      0.86       0.34
After the change, the Sync Time is no higher than 0.01-0.02 sec.
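For reference, the change can be verified and made persistent across remounts. The mount point and volume path below are illustrative; substitute your own:

```
# Print the current VxFS tunables for the filesystem:
vxtunefs -p /mount

# Persist the setting across remounts with an /etc/vx/tunefstab entry:
/dev/vx/dsk/datadg/datavol   write_throttle=32
```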
Thanks to all for the feedback. We are running ZFS, and after eliminating compression (which may have helped a bit), changing logbias to "throughput" from the default of "latency" has cut the checkpoint duration to about a quarter of what it was before.
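For anyone following along, these are the ZFS properties involved; the dataset name is illustrative:

```
# logbias=throughput stops routing synchronous writes through the ZIL,
# letting them ride the normal transaction-group flush instead.
zfs set logbias=throughput dbpool/db

# Check what's currently in effect:
zfs get logbias,compression,dedup dbpool/db
```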
To provide a bit more info here... there is still room for improvement.
Original, pre-ZFS sync times were averaging ~50-60ms. Prior to flipping the logbias we were getting daily averages around 1300-1400ms. After switching logbias to throughput we're around 250-300ms.
There are some other ZFS options (dedup and compression) that we suspect might be contributing.
(Edited to add: Compression was previously adjusted for the BI filesystem, but not the data/index areas. There is also an open question about whether a separate BI disk pool or intent log (ZIL) device would be beneficial in our specific case. There does not seem to be much guidance about tuning ZFS for Progress, so we have been taking cues from analogous Oracle DB best practices. We are not using -directio on this database, but we are somewhat skeptical that it would improve the situation.)
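The Oracle-on-ZFS cues mentioned above usually boil down to a few settings. A sketch of what we're weighing; the dataset and device names are illustrative, and the 8 KB recordsize assumes the database blocksize is 8 KB (check yours):

```
# Match recordsize to the database block size before loading data:
zfs set recordsize=8k dbpool/db

# A dedicated slog device would absorb the synchronous BI writes:
zpool add dbpool log c2t0d0

# Keeping the BI extents in their own dataset lets logbias and
# recordsize be tuned independently of the data/index areas.
```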
Disk in this case is local SSD. (Pre-ZFS was SAN storage.)