Cluster Size 64 for Index Areas

Posted by James Palmer on 24-Nov-2014 06:25

Could someone who was at the DBA Cage Match (or someone who knows why) please explain again the reasoning behind setting the cluster size on index areas to 64? I was trying to explain to a colleague and realise it's not gone in and stayed in my head! 

All Replies

Posted by Stefan Marquardt on 24-Nov-2014 06:46

I deleted my first answer because I get mixed up with RPM, a higher value than 1 should be better to avoid large index areas if somebody adds an table by accident.

And I think the enhancement request was (wish):

Areas which are reserved by the DB engine for indexes only to avoid the mix index/table by a database setting.

As a result my first result was OT, sorry.

Posted by James Palmer on 24-Nov-2014 06:49

Thanks Stefan,

Yes I remember that point too. The Cluster Size of 64 for index areas was to do with splitting of indexes, but the exact explanation escapes me.

Posted by ChUIMonster on 24-Nov-2014 06:52

Accomodating the possibility for accidentally adding tables to an index area is the reasoning for setting rows per block > 1.  That is a somewhat different discussion.

The cluster size discussion was that when Progress needs to split an index block the engine needs to be certain that there are at least 16 free blocks.  With the cluster size set to 8 you might end up allocating 2 clusters on disk in the middle of a task that is already very time sensitive.

That would be unfortunate for the users who have to wait for that ;)


On 11/24/14, 7:27 AM, Stefan Marquardt wrote:
Reply by Stefan Marquardt

To avoid large index areas when somebody adds a table by accident

Stop receiving emails on this subject.

Flag this post as spam/abuse.



-- 
Tom Bascom
603 396 4886
tom@greenfieldtech.com

Posted by Paul Koufalis on 24-Nov-2014 06:54

If there are not enough free blocks in the cluster for an index split a new cluster must be added before the index split operation can finish. . This results in a couple of latches being held for a loooong time including MTX IIRC.

Posted by James Palmer on 24-Nov-2014 07:03

Thanks for the answers. And what scenario would trigger an index split?

Posted by ChUIMonster on 24-Nov-2014 07:09

As you add records indexes grow...  as they grow the blocks need to occasionally split to keep the b-tree balanced.

On 11/24/14, 8:03 AM, James Palmer wrote:
Reply by James Palmer

Thanks for the answers. And what scenario would trigger an index split?

Stop receiving emails on this subject.

Flag this post as spam/abuse.



-- 
Tom Bascom
603 396 4886
tom@greenfieldtech.com

Posted by James Palmer on 24-Nov-2014 07:19

Thanks Tom. Much appreciated. Makes sense again!

Posted by gus on 24-Nov-2014 11:11

@palmer: index entries are sorted. when you add a new one, there is only one place to put it. it normally has to go between two existing entries and if that block has insufficient room, it must be split into two and entry for new block added to parent. if parent has insufficient room then /it/ must be split too. but i see no need to have at least 16 free blocks.

Posted by ChUIMonster on 24-Nov-2014 11:19

As I recall it wasn't so much "16" as "more than 8". The maximum depth
of index blocks being 6 plays into it. You need to reserve that many
"just in case" (on both sides?) when a block splits. Rich explained it
much better than me.

--
Tom Bascom
603 396 4886
tom@greenfieldtech.com

Posted by James Palmer on 25-Nov-2014 04:27

[mention:9617a07f61934bc98f6fccb3b7fabfae:e9ed411860ed4f2ba0265705b8793d05] Thanks for replying. That helps to make sense of it.

Posted by George Potemkin on 27-Nov-2014 14:00

If I recall correctly "64" was mentioned as the max number of the index tree levels. But documentation says the max value is defined by the -cs parameter and its maximum is 256. BTW, the largest single index I saw was 0.5 TB in size and consisted of 5 levels only. Also Richard has mentioned a relation to some bug that was recently fixed. It could be the bug PSC00262763 fixed in V10.2B08 but I can be wrong.

Posted by Richard Banville on 01-Dec-2014 14:01

The maximum number of index b-tree levels currently supported in OpenEdge 11.5 is 16.  It has been this way for a very long time.  With the compression algorithms in our index b-trees, I'd be very interested to hear who has the deepest index b-tree to see if this should be improved on in the future.

The change requiring pre-allocation of free space just prior to an index block split was introduced 9 years ago in 10.0B04 and 10.1A to solve a bug that would cause the database to hang waiting on a buffer lock. The hang occurred due to a buffer locking protocol inconsistency/conflict when additional free space is required by an index object when in the middle of a block split operation.

Due to the highly concurrent optimistic locking protocol in place when traversing down the index b-tree (a technique  called "crabbing") there is currently no way to know how deep the index really is when beneath the root block since another user could have performed a split along a different path in the index b-tree which cascaded upward, above where you currently are in the tree and not involving any index buffers you have a lock on.  Of course this could be improved upon since at some point you have all the index blocks you need locked to perform the split operation but it is more complicated than just that.

There are many ways to solve this problem, some riskier than others, and I have no plans to debate the approach previously taken in this forum.  Suffice it to say that ensuring there are at least 16 blocks available to the index object just prior to the split operation ensures that it will complete without needing to request additional space from the storage area (or the file system).  You wouldn't want to be allocating new space when in the middle of split operation anyway because of the performance impact on the rest of the application while in the middle of a multi-block index split operation.

This explains why the suggestion for using a cluster size of 64 for index areas was made.  Allocation of space just prior to an index split operation will occur much less frequently than with a cluster size of 8.  How much it improves your performance is really dependent on your application and for many deployments as is often the case with such things, we are probably just splitting hairs here anyway.  However, changing the cluster size in an area containing data requires some type of data movement so it is best to use decent values to start with.

Posted by George Potemkin on 02-Dec-2014 03:17

> With the compression algorithms in our index b-trees, I'd be very interested to hear who has the deepest index b-tree to see if this should be improved on in the future.

I don't think that the number of index levels can be higher than 7.

For large tables we will get new level in index tree when the number of records will reach:

recs ~(1000-10000)^levels

where 1000-10000 is the average number of index keys per leaf block.

Max number of records (per SAT2) is 2^64.

2^64=1024^level => max level=6.4 and it's an overestimated value.

Posted by James Palmer on 02-Dec-2014 03:50

Probably me being dense but how do I see how many levels an index has?

Posted by James Palmer on 02-Dec-2014 04:03

Ignore me - I've found the relevant section in a DBAnalys.

Posted by James Palmer on 02-Dec-2014 04:11

[mention:7dc6ee8d33b54aa696ccb0bdd581e3b1:e9ed411860ed4f2ba0265705b8793d05] Our deepest level is 4 (based on a visual scan of the DBANALYS output). We've got a heck of a lot at 3. 

PUB.InternetPerformance
  InternetBudgetDataKeyIdx
                             612       2      4          23546   68.0M    37.1     2.3
  InternetPerformanceKeyIdx
                             611       1      3          11928   70.1M    75.5     1.5
  IpeRunDateIdx              613       3      4          27482   82.5M    38.6     2.2


Posted by George Potemkin on 02-Dec-2014 14:47

A few large tables and their indexes:

                                            -Record Size (B)-           ---Fragments--- Scatter

Table                     Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.Table1           23121895150    2.2T    38 24707   104          23125036358    14.0     1.0
 
Table       Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.Table1
  Index1       22       5      5      126307653  594.5G    61.9     1.8
  Index2       23       5      5      117620303  519.3G    58.1     1.8
-------------------------------------------------------
                                            -Record Size (B)-           ---Fragments--- Scatter
Table                     Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.Table2            16210782602    2.0T    89 31975   137          16212363909    4.9     1.0
 
Table       Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.Table2
  Index1      103       2      4        1674840    7.3G    57.2     1.9
  Index2      104       7      5       30604893  132.9G    57.1     1.9
  Index3      105       5      4        8992056   36.1G    52.8     1.9
  Index4      106       3      4        1347838    6.2G    60.3     1.8
  Index5      102       6      5       49186263  225.3G    60.2     1.8
-------------------------------------------------------
                                            -Record Size (B)-           ---Fragments--- Scatter
Table                     Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.Table3           8644278754    1.3T    44 31966   170           8644502338    159.0     1.0
 
Table       Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.Table3
  Index1        9       3      4        8335147   19.9G    31.4     2.4
  Index2       10       3      4        3982408   16.9G    55.7     1.9
  Index3        8       6      5       52376577  241.2G    60.6     1.8
  Index4       11       2      4       13891121   56.6G    53.6     1.9
-------------------------------------------------------
                                            -Record Size (B)-           ---Fragments--- Scatter
Table                     Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.Table4           8835112454  598.5G    34 31903    72           8835474203    36.0     1.0
 
Table       Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.Table4
  Index1      101       4      5       30743601  137.3G    58.7     1.8
  Index2      100       3      5       40316244  189.0G    61.6     1.8
 

Posted by George Potemkin on 04-Dec-2014 04:06

Found db with 6 levels in an index tree:

-Record Size (B)- ---Fragments--- Scatter
Table Records Size Min Max Mean Count Factor Factor
PUB.table3 8057647115 1.3T 47 25481 173 8057759806 2.1 1.0

Table Index Fields Levels Blocks Size % Util Factor
PUB.table3
Index1 9 3 5 13190287 18.4G 36.8 2.3
Index2 10 3 5 8302850 16.0G 50.8 2.0
Index3 8 6 6 92560806 219.1G 62.5 1.8

It's the different instance of the same table as Table3 above where:
Index3 8 6 5 52376577 241.2G 60.6 1.8

The difference is db blocksize: 4K vs 8K.

Posted by Richard Banville on 17-Feb-2015 14:35

I think that is the deepest one found so far! 
It does seem that a 16 level index as an absolute maximum is good enough.
It also seems that the worst case of a 16 level index many never occur in the real world at least not given today’s sizings.
Unfortunately, we often just deal with the theoretical absolute worst case as opposed to the probably or practical worst case.
 
[collapse]
From: George Potemkin [mailto:bounce-GeorgeP12@community.progress.com]
Sent: Thursday, December 04, 2014 5:07 AM
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Cluster Size 64 for Index Areas
 
Reply by George Potemkin

Found db with 6 levels in an index tree:

[/collapse]

Posted by gus on 20-Feb-2015 10:23

good that we have plenty of headroom so the practical worst case can increase.

This thread is closed