dump_d.p + bulkload vs binary dump and load

Posted by Grant Holman on 30-Nov-2015 04:36

Hi

My workplace has an automated script that is run periodically to do a complete dump and load using dump_d.p and bulkload.  I've used binary dump and load for this in the past but never bothered with bulkload as my understanding was that binary is the optimal choice.
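
For reference, the existing flow looks roughly like the outline below; the database, table and directory names are just placeholders, and step 1 is my summary of what the script does rather than its actual code:

  # 1) ASCII dump: run the Data Dictionary dump program (prodict/dump_d.p)
  #    per table to produce .d files in a dump directory
  # 2) Bulk load the .d files, driven by a bulk loader description file (.fd)
  proutil mydb -C bulkload mydb.fd
  # 3) Rebuild all indexes afterwards, since the Bulk Loader bypasses
  #    index maintenance
  proutil mydb -C idxbuild all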

Would there be a reason to choose bulkload over binary in certain circumstances?

Thanks

All Replies

Posted by James Palmer on 30-Nov-2015 04:59

Personally I'd always use binary for a full database dump and load. I've seen people use bulkload for things such as SQL permissions and user data from the VSTs.
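
For comparison, the binary route is also just proutil; roughly like this, with placeholder database and table names (in practice you'd loop over every table):

  # Binary dump of one table to a directory of .bd files
  proutil mydb -C dump customer /dumpdir
  # Binary load of that .bd file into the target database
  proutil newdb -C load /dumpdir/customer.bd
  # Rebuild indexes once everything is loaded
  proutil newdb -C idxbuild all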

A slight aside: how frequently are you dumping and loading, why are you doing it, and what are you hoping to achieve? Are you using Type II storage areas, and which Progress version?

I suspect there is a high chance you are trying to fix a problem that no longer exists, or would be better solved another way.

Posted by Grant Holman on 30-Nov-2015 07:25

Hi James, this is something that's been in place for many years on a 9.1D system.  I don't know how frequently it's run, but I'd guess maybe every 6-12 months.  It's done for the usual scatter and fragmentation reasons.

Posted by James Palmer on 30-Nov-2015 07:30

I thought as much. If you're running a more modern version of Progress (10.x or later), and you should be moving that way if you're not already, as 9.x is now unsupported, then your data should be in Type II storage areas. Type II storage areas completely remove the necessity to dump and load on such a regular basis. Ours hasn't been done for at least 5 years, and the only real reason I'm looking to do it now is to restructure the storage areas. On top of that, in 11.x (I'm not sure of the exact version) you can compress indexes online, which pretty much does the same as an index rebuild, so you can maintain index fragmentation online as well.
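
If it's useful, the online option is the idxcompact utility. From memory the call looks something like the following, with placeholder table and index names; the trailing number is the target percentage of index block utilisation and can be omitted:

  proutil mydb -C idxcompact PUB.customer.cust-num 80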

Posted by George Potemkin on 30-Nov-2015 09:11

James,

> Ours hasn't been done for at least 5 years, and the only real reason I'm looking to do it now is to restructure the storage areas.

Enable the timing for dbanalys. Run ixanalys for the whole database, then run tabanalys, again for the whole database. Compare the times of the "area block analysis" phase for the two runs. The results may surprise you.
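
A minimal sketch of the comparison, assuming the timing is enabled and simply redirecting each report to a file (the database name is a placeholder):

  proutil mydb -C ixanalys  > ixanalys.out
  proutil mydb -C tabanalys > tabanalys.out
  # then compare the elapsed time reported for the "area block analysis"
  # phase in the two output files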

Best regards,
George

Posted by Grant Holman on 30-Nov-2015 09:45

Thanks for that James.  Interesting to hear that Type II needs no (or at least far less) dump and loading.

Posted by George Potemkin on 30-Nov-2015 10:05

Nowadays it's much harder to do a D&L: the databases are larger, but the downtime available for a D&L is shrinking. And for SAT2 we don't have an indicator that would tell us it's time for a D&L. SAT2 does reduce the need for D&L, but SAT2 plus the optimal toss/create limits is the best solution. ;-)
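
For the toss/create limits, if memory serves, proutil gained per-table qualifiers for this around 10.2B; a rough sketch with placeholder table names and byte values:

  proutil mydb -C settabletosslimit PUB.customer 300
  proutil mydb -C settablecreatelimit PUB.customer 150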

Posted by George Potemkin on 30-Nov-2015 12:34

> you can compress indexes online, which pretty much does the same as an index rebuild

This is not entirely correct. Idxbuild rebuilds the free chain while idxcompact does not. In SAT2, blocks that were owned by an index in the past can be added to a free chain (for a unique index: to the index delete chain and then to a free chain), but later these free blocks can be re-used only by the same index. The order of blocks on the free chain does matter for read speed.

It's easy to test. Run idxfix option 2 (index scan /without/ record validation) and compare its running time with the time of the "area block analysis" phase of ixanalys. Or make a copy of your production database and use the Data Dictionary to delete the indexes in a large index area. The clusters owned by those indexes will be added to the area's free cluster chain. Then check the time that dbanalys spends scanning the free cluster chain. Dbanalys reads only two blocks from each cluster: the first and the last. Most likely the time to scan the chain will be a few tens of times slower than the sequential area scan during the "area block analysis" phase.

These tests only make sense for large indexes; otherwise the filesystem cache will hide the difference. It's also important to use a copy of a real production database.
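
A rough sketch of the first test; idxfix is menu-driven, so option 2 (the index scan) and the "no record validation" answer are chosen at its prompts, and the database name is a placeholder:

  # sequential area scan timing comes from ixanalys
  proutil mydb -C ixanalys > ixanalys.out
  # logical index scan: choose menu option 2 and skip record validation,
  # then compare the elapsed times of the two runs
  proutil mydb -C idxfix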

IIRC, idxcompact was introduced in V9.0A.

This thread is closed