Index Efficiency: Unique or Not unique? - OpenEdge Development - Forum

Posted by S33 on 21-Oct-2014 09:45

If I recall correctly, there was a buzz a few versions ago, when Progress declared the ability to use more than 1 index to satisfy a query. Code like:

“for each cust where state = ‘OH’ by uid:”

“for each cust where city = ‘Chicago’ by state by uid:”

“for each cust where state = ‘OH’ and city = ‘Chicago’ by uid:”

“for each cust by state by city by uid:”

Would all be served by indexes:

Key1: state (non-unique)

Key2: city (non-unique

Key3: uid (unique, primary)

And this would avoid table scans and anything but efficient index scans. In (minimal) practice, I was never convinced/satisfied with the performance (at the time), and began following my own rule of making all indexes unique, by tacking the uid field(s) on the end. That seemed to do what I wanted (except of course in example #3 above, where I would have created a new index: “Key 4: state, city, uid”).

To make a long question short (if that’s still possible), what are the pros/cons and “best practices” today, regarding making all indexes unique?

TIA --

Jim Shepherd, Level 7

OpenEdge Development - Forum

All Replies

Posted by ChUIMonster on 21-Oct-2014 10:51

The case for resolving queries using multiple indexes was overblown.

The worst cases are those where people decided to try to define all of
their indexes with just a single field. That simply doesn't work.

In practice there are very few cases where multiple indexes are useful.
At least as far as I have seen.

In most actual business usage having a composite index with the
appropriate components is much, much better.

--
Tom Bascom
603 396 4886
tom@greenfieldtech.com

Posted by Mike Fechner on 21-Oct-2014 10:59

I don't see any value in making indices artificially unique by adding your uid field to it.

The Index on state is simply not unique.

Also for the compiler/dynamic query runtime index selection this won't make any difference - as you'd never specify and uid with a query for state and thus the index would always be considered like a non unique index when it comes to index selection/execution strategy.

The rules of combining indices are still like those described here:

knowledgebase.progress.com/.../000012195

Posted by TheMadDBA on 21-Oct-2014 11:45

One of the huge downsides of making your indexes unique is you make the indexes much larger than they need to be. This can cause performance issues all by itself depending on your table sizes/transaction volume.

Plus like Tom said the value of multiple index usage (especially with sorting) is vastly overblown. There are of course exceptions... especially if you end up at a place where they drank the Kool-Aid and put a ton of single column indexes out there.

Posted by Lieven De Foor on 21-Oct-2014 13:01

One benefit of artificially unique indexes is to provide a consistent sorting when displaying/iterating these records, so they could have some value...

Posted by TheMadDBA on 21-Oct-2014 13:20

Possibly, but in general a unique ID isn't going to mean anything to the end users or anything else observing the sorted results.

Posted by Lieven De Foor on 24-Oct-2014 09:37

I'm not saying you should ever show the unique id to the user, just using it to provide a consistent sorting without showing it would do.

Posted by TheMadDBA on 24-Oct-2014 11:35

But what would the point be if you never showed it? I guess I am missing the value of sorting it by a hidden value.

Posted by Thomas Mercer-Hursh on 24-Oct-2014 11:47

The value would be that the results would always be the same, thus not confusing the user. Admittedly, having the data sorted by a visible value tends to be the best way to do that, but I see a point in having a consistent order even when the records are not sorted by a visible value.

Think of trying to compare a list from today with one from yesterday to see what has changed. Easiest is if they are sorted by a visual value because one can focus on that field for additions and deletions. But, if there is no such sensible value, then having them consistently sorted would be tons easier than having them arbitrarily ordered.

Posted by TheMadDBA on 24-Oct-2014 12:59

I suppose so, but it still sounds like a bad design to me :)

Posted by agent_008_nl on 27-Oct-2014 01:43

Only on the question of should the primary unique key be natural or artificial. Whenever I made a recommendation it was the artificial key. Some thoughts: www.techrepublic.com/.../10-tips-for-choosing-between-a-surrogate-and-natural-primary-key

Moreover this artificial key is not a sequence but a guid (you can assign the guid on whatever client-type without a connection to the appserver etc.). I would prefer the

sequential guid www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database

www.siepman.nl/.../ID-Sequential-Guid-COMB-Vs-Int-Identity-using-Entity-Framework.aspx

as it gives beter performance, but we would need the clustered index in the rdbms for that.

Kind regards,

Stefan Houtzager

Houtzager ICT consultancy & development

www.linkedin.com/in/stefanhoutzager

This thread is closed