Storage Area configuration: What is the best approach?

Posted by davidkerkhofs3 on 27-Aug-2009 05:45

Hello,

I'm redesigning part of our databases into Type 2 storage areas and want to do the job correctly.

Following the guidelines from Progress' DBA handbook I should:

- skim off big tables (high on rows or high on volume) and give them (+ their indexes) their own SA

- put the rest in logical areas (ease of administration), divided into data SA and index SA

However, being in the luxurious position of having a production database with production data, I can dbanalys the database and get actual figures before designing the SA's.

Which brings me to the records per block setting for an area.

In Tom Bascom's document Highly parallel D & L you'll find that he organized the db so that tables that have the same RPB (= db block size - create limit / mean size) sit together in a SA.

Is this the best approach? Why doesn't Progress emphasize on the added value of this RPB setting?

When size does matter a DBA should not only be in the neighboorhood to watch my Oracle database. Progress needs some governance as well.

I can imagine that ease of administration is nice, but what do you get with a SA "orderdata" when it contains tables with a RPB ranging from 4 to 256?

Or am I making a too big a deal out of these settings?

Just make everything 128 (RPB, not the default btw) and 64 (cluster-size), with ease of administration in logical SA's, and only do the metrics for the big tables?

All suggestions / discussions welcome.

All Replies

Posted by ChUIMonster on 28-Aug-2009 11:29

Progress Corp doesn't run real-life production databases    "ease of administration" is all well and good but, in reality, I don't think that you gain any ease with the "functional" approach to storage areas.

The problem with picking a single rows per block size, such as 128, is that it may cause fragmentation of records that are "too large" for that size.  That will then result in more IO than you would otherwise have which harms performance.

Organizing storage areas around a common rows per block allows the best "packing" without risk of fragmentation.  This optimizes your IO operations and provides the best performance.  Personally, I'm fairly aggressive about this.  Getting as much data as possible into a block is more important to me than avoiding fragmentation -- so I usually take the next higher RPB than one would calculate based on the white paper that you might find.  (There are exceptions but, in general, I find that fragmentation is not a *big* problem for *most* tables in *most* databases.)

Posted by davidkerkhofs3 on 29-Aug-2009 14:55

Sounds like a practical answer; the ones we all love.

I'll give it a try and hopefully fragmentation doesn't hit me hard.

I'll let you know if it does.

Thx

Posted by davidkerkhofs3 on 31-Aug-2009 08:20

And what about the cluster size for indexes?

Is there any benefit in using 512 blocks / cluster for a single index?

The advise on sizing SA's in KB p81745 doesn't really address the difference between a large index and a large table with its indexes.

In reading the article I would conclude that I need to focus on large tables.

But I can imagine that indexes that use up a lot of space in a SA are best separated.

Is this a correct assumption?

So for ease of admin:

d "RPB128":7,128;64 .

d "MyBigTable":8,256;512 .

d "RPBIDX":17,1;8 .

d "MyBigTableIndexes":18,1;64 .

Let's say I have one very big index (> 1 GB) in MyBigTable. Shouldn't I best purge that one out of the MyBigTableIndexes SA and give it its own SA?

And, are the other indexes for MyBigTable still best served with their "MyBigTableIndexes" SA, or can I add them back to a catch all "RPBIDX" SA?

Posted by ChUIMonster on 31-Aug-2009 13:10

Indexes tend to be much smaller than the data tables that they index.  Furthermore in many cases where a table has multiple indexes there might be a mix of very small and relatively large indexes.  Run a dbanalys to see what I mean...

The upshot of that is that 512 is not usually a good cluster size for a storage area which holds indexes.  Every index will consume at least one cluster.  If you are using 8k blocks that will cost you 4GB per index.... In an empty database (more than one person has discovered this the hard way...).  Many (most) indexes will consume far less than a whole 512 block cluster.

Finally, indexes grow much more slowly than the corresponding data.  So the "less frequent space allocation" advantage to type 2 areas is less significant for indexes than it is for data.

Sooooo, IMHO:

1) split indexes from data, they should not share a storage area.

2) unless you have an index which you know is very large use 8 blocks per cluster for index areas, the waste for small indexes is minimal but you still get the advantages of a type 2 area

This thread is closed