For each customer...

Posted by ojfoggin on 13-Oct-2009 09:25

If customer has 2 fields in it say mobno and telno (stored as chars for this example) and the telno field is indexed (the mobno field is not indexed).

If I want to find all the customers whose mobno is the same as their telno and I use this...

FOR EACH customer WHERE customer.telno = customer.mobno

NO-LOCK:

then it's not going to be able to use any index for that is it?

It will have to go through the entire database?

Is there any other way of getting this to work any quicker?

Thanks for the help!

All Replies

Posted by ChUIMonster on 13-Oct-2009 09:58

The query is going to have to look at every record in order to establish if mobno = telno.

Posted by ChUIMonster on 13-Oct-2009 10:16

To make it work quicker you want it to only scan the table as opposed to the whole database.  To ensure that that happens you very much want the table to be in a type 2 storage area.  That way only blocks that are relevant to the query will be scanned.

You also want to make each and every IO operation count for as much as possible -- so you want to pack those records as tightly as is reasonable (without causing excessive record fragmentation) into each block.  So you want to choose a rows per block setting for the storage area that will support that.  You can get quite complex about calculating the ideal but, for starters, I suggest dividing the block size by the average record size and using the next highest available rows per block as rough "rule of thumb",  Progress' semi-official suggestion (there is a white paper floating around out there) amounts to using the next lowest rows per block (in order to minimize the chances of fragmentation) but IMHO that is overly conservative in most cases.

Lastly you want to try to scan the data in the order that it is actually in the blocks.  That one is tricky and it depends on how the records get created.  But if you are going to do a table scan and the order of the result set does not matter you want to do it using the index that is least logically scattered.  Unfortunately determining which index that is is not always straightforward.

You also want to contact your sales person and ask them when NO-INDEX table scans will be available for the 4GL.  The SQL-92 engine has them and it is high time that the 4GL engine supported them.  This use case is exactly what they are good for.  Make sure that the sales weasel knows that this is a feature that a customer would find valuable (you might try linking it to any possibility of future commissions...) and make sure that they take it back to marketing and development.  Feel free to vote for the enhancement request in the enhancement request system and bend the ear of anyone else who will listen while you're at it.

Posted by ojfoggin on 13-Oct-2009 10:41

Sorry, I meant table not database

Thanks for the tips with this.  It will definitely come up in our next dev meeting.

Can I just ask an additional question

If I know that I only wat to check customers with certain aspects say customer who has customer.city = Leeds (and customer.city is indexed).

If I then did...

FOR EACH customer WHERE customer.city = "leeds"

AND customer.telno = customer.mobno

NO-LOCK:

Would this then just scan the customer returned by the city index?  Or would this again scan the entire table?  If this is the case then I'd be best trying to minimise the total number of customers left before going onto the telno = mobno line?

Posted by jmls on 13-Oct-2009 10:54

Depending on the scope of your design, you could add a logical field to the table called MobileSameAsHome which is updated whenever your data is saved to the db, and if that field is indexed you could simply say

for each customer where MobileSameAsHome eq yes no-lock:

end.

If you have more than 2 telephone number fields, consider an integer field (PhoneMatch) where each individual bit represents a match

Say you have home, work, and mobile

home = 1

work = 2

mobile = 4

therefore

home == work would be 3

home == mobile would be 5

home == work == mobile would be 7

work == mobile would be 6

you can then do a

for each customer where PhoneMatch eq 5

to find all records where home==mobile

etc etc

Posted by ChUIMonster on 13-Oct-2009 10:56

Yes.  If you can narrow it down ("bracket the query") then you can avoid the table scan, substantially* reduce IO and greatly improve performance.  You can tell if you have a bracket or not by compiling with XREF and looking for WHOLE-INDEX in the output.  You cannot, however, tell if the bracket is a "good" bracket or not without a detailed examination of the index selection and the data distribution within the table.  Checking for WHOLE-INDEX is just a red flag check.  (If you don't find it you aren't necessarily completely safe.  It's just a starting point.)

* Where "substantially" and "greatly" are fairly subjective terms .  It is, of course, possible that 98% of the table consists of persons in Leeds in which case bracketing on Leeds isn't going to get you much.

Posted by ojfoggin on 13-Oct-2009 10:57

That's not a ad idea

However, I think this is the first time I've come across this particular search and I think I've found a better way to index the search anyway (leaving around 30-50 records for the telno = mobno bit).

If this doesn't work though I'll defo have a look at the logical check.

Posted by ojfoggin on 13-Oct-2009 10:58

True true

I'll let you know how it goes.

About to run the report again.  If it take less than 2 and a half hours then it's better than it was (and I killed the last one after that time, it still hadn't finished).

Posted by jmls on 13-Oct-2009 11:02

No report should ever run for 2.5 hours

Posted by ojfoggin on 13-Oct-2009 11:04

LOL, I came to look at this today to fix a small issue with it and found that someone had started one running ... 4 days ago.

LOL, that's why I'm now trying to reindex it.

I did one the other day that was taking around an hour to run and by swapping a couple of lines around and refactoring the report I had it running in aroun 2 and a half seconds

Posted by ojfoggin on 13-Oct-2009 11:17

WOO!

It now runs in about 5 minutes (due to the first part of the report being stupid also).

Time to work on the first part of the report.

Stupid thing!

I just can't believe it has been unchanged since 1996?!?!

Has no one complained before?

Posted by jmls on 13-Oct-2009 11:21

Probably, but this is a typical situation where Progress (the language) will be blamed by bad developers.

"It's progress, it's not very good at reports. If we had Oracle ...."

I can't tell you how many times I've done some troubleshooting for a client and fixed things like this. The boss then says, "Oh, so-and-so told us it was down to Progress."

In one particular case, the company was about to ditch Progress because of poor performance.

This thread is closed