Indexing and MATCHES

Posted by Admin on 09-Dec-2009 14:16

MATCHES never works on an index and so will always cause the full-table-scan (whole index).

CONTAINS uses a word index, BEGINS can use a standard index (when used on the last element, as it's effectively a range match).

Posted by ericg on 09-Dec-2009 14:55

Okay thanks. I have only 24,000 rows in a table that a user would perform a partial string search on a certain field (not in a grid but a back-end call). This search would be used frequently in the app. So would there be much of a performance difference between using a MATCHES and a word index if only 24,000 rows? I heard word indexes degrade performance when updating the table?

Posted by Admin on 09-Dec-2009 22:28

externalresearch schrieb:

Okay thanks. I have only 24,000 rows in a table that a user would perform a partial string search on a certain field (not in a grid but a back-end call). This search would be used frequently in the app. So would there be much of a performance difference between using a MATCHES and a word index if only 24,000 rows? I heard word indexes degrade performance when updating the table?

I need to say, a lot depends....

24.000 records is something, that when it is not passed across the network may be capable for a server (i.e. when the search is executed by a character client or AppServer process on the DB servers machine). Especially when this table is heavily used and so the chance that it's in -B is high - and the server has enough CPU time left :-)

In client server situations there will be a much bigger difference between running that query indexed or non-indexed. Also when the table grows and it won't stay in -B completely.

But there's also a difference in the search result - as contains only allows the asterisk at the end of a word:

CONTAINS "test*"

returns different results then

MATCHES "*test*" .

The MATCHES version will return more records as it also searches in the middle of a word, word index will only search at the beginning. So that may result in your users having a different preference.

Anyway, I'd take the potential performance loss because of the updates on that field out of the equation (unless there are 100s of updates every minute but the word index is only queried once a day).

Posted by aspotti on 10-Dec-2009 05:07

Hi Eric,

the matches generate always a Whole-index search on XREF analisys, but in some cases you can't do more to search against a field.

Aniway my suggestion is trying (as much as possible) to create a word-index in the DB-Schema for fields you have to search against and then use the CONTAINS statement within your query.

Keep in mind that in OE indexing algorithm the Word-index is the first that search engine use against a table, even earlier the primary index.

Some customer implements the BEGIN or the GE statement with at least three/four letters in their query and a standard index

sample: you may need the word "basketball" they implemented the query in this way => FOR EACH table WHERE table.field BEGINS "bas" or FOR EACH table WHERE table.field GE "bas".

but you know, this is valid if you know how the sentence starts.

By a performance perspective it depends mainly if you have to cope with all 24k records or just a subset of these in the query and also if all these records have to be moved back and forth through the LAN, but in anycase searching against strings is pretty much honerous.

Regards,

Alex

This thread is closed

All Replies