Primary index

Posted by e.schutten on 02-Mar-2016 06:28

Hi,

What kind of fields are eligable for a primary index?

Fields (value) that doesn't change, like item number, order number?

Does OpenEdge stores the data in order of the primary index?

OpenEdge 10.2B

Kind Regards,

Edwin

All Replies

Posted by George Potemkin on 02-Mar-2016 06:45

> What kind of fields are eligable for a primary index?

The most frequently used index - the one that affects the read performance the most.  Primary index in Progress means a favorite index.

> Does OpenEdge stores the data in order of the primary index?

Not always. Only after D&L.

Posted by Mike Fechner on 02-Mar-2016 06:46

See knowledgebase.progress.com/.../000012195 for how Progress selects indexes.

Posted by e.schutten on 02-Mar-2016 07:21

> Does OpenEdge stores the data in order of the primary index?

> Only after D&L

So it would be wise to use a field that doesn't change. Because otherwise OpenEdge has to re-order every time the primary key change, correct?

Posted by George Potemkin on 02-Mar-2016 07:58

> So it would be wise to use a field that doesn't change. Because otherwise OpenEdge has to re-order every time the primary key change, correct?

No. It's totally unrelated to a choice of a primary index.

Think about the indexes as the walkways. If you don't have enough resources to keep all of them in the perfect conditions then you should choose the one that is just more important than others. You can choose a walkway that it will be easy personally for you to maintain. Or you can choose a walkway that is used by the most people.

Posted by e.schutten on 02-Mar-2016 09:15

Ok. Thanks for youre reply. Appreciate it!

Posted by Thomas Mercer-Hursh on 02-Mar-2016 09:59

A couple of comments.

Ultimately, a "primary" index is really nothing more than the default index when no other index is specified.  However, it is also often used to do things like enforcing uniqueness and such.

Historically, many ABL applications used something like a customer number as the primary index.  This works, if customer numbers are assigned serially and don't have any other meaning (like type of customer by range of number), but best practice indicates that the primary key for a record should be a meaningless field like a GUID or sequence.  That way, if there is ever a need to change one of the meaningful fields in the record, any connections based on that GUID still hold without repair.  

None of the indices has anything to do with the sequence of physical storage.  In a table with all additions and no deletions, the physical ordering *might* correspond to the order of creation, but you shouldn't count on it.   A dump and load will physically reorder the records, but that shouldn't be part of your design considerations since dumping and loading should be very rare.

Posted by George Potemkin on 02-Mar-2016 10:41

The article that Mike Fechner pointed out says:

<quote>

The compiler uses the following rules for ranking the available indexes when analyzing brackets and deciding which indexes to make use of:

1) An index that was specified in USE-INDEX

2) A unique index when all key components are used for equality matches.

3) The index with the most equality matches on leading index fields

4) The index with the most range matches

5) Word indexes referenced by the contains operator

6) The index with the most sort matches (BY)

7) A table's primary index

8) The index that comes first alphabetically, by index name

</quote>

If a primary index is unique then 2) will work and 7) will not (except when the WHERE clause is empty). In other words a compiler will not choose an index based on the "primary" attribute - like if the "primary" attribute does not exist at all.

Also the unique indexes used to be the bad candidates to read the bunch of the records. Probably it would be better if  the compiler will avoid to choose the unique indexes without equality matches - even when it's a primary index.

Posted by Dmitri Levin on 04-Mar-2016 15:55

The existence of primary index in OE is the result of the absence of cost based optimization. Some index has to be chosen for any 4GL query. Even if the table does not have an index at all, it will still have a designated "default" index that will be chosen for all queries. I believe it will be index based on recid.

For good written 4GL code it does not matter what index is Primary, because the choice should not come to number 7) in the quote above that George provided.

On a decent modern hardware with SSDs and huge memory available now last thing to worry is about the data physically stored in some index order. So the primary index choice boils down to what sorting order is most logically natural for the query without a WHERE condition.

FOR EACH table-name NO_LOCK:

Posted by gus on 04-Mar-2016 16:34

> On Mar 4, 2016, at 4:56 PM, Dmitri Levin wrote:

>

> The existence of primary index in OE is the result of the absence of cost based optimization.

With the utmost respect, I disagree. That is not at all true.

For a long time, it has been thought (by database theorists and the like) that a table /must/ have a primary key. We just did what was perceived to be right. But if we were starting over, I would not.

Posted by Thomas Mercer-Hursh on 04-Mar-2016 16:37

So, what would you do, starting over?

Posted by Dmitri Levin on 04-Mar-2016 16:43

Gus, I completely agree that a table /must/ have a primary key. In a good design.

But Primary index is not the same as a primary key.

Primary key has to be unique, and Primary index does not have to be unique.

Primary index is just simply a Favorite index.

When the engine chooses the index because it is the primary index ( rule # 7 above) it would be better not to use an index at all. I believe that is what cost based optimization would suggested.

This thread is closed