Compare SQL query with ABL query, usage of the indexes

Posted by Stefan Marquardt on 27-Feb-2019 12:12

I think I'm running a 100% identically query, one with SQL and the other in ABL.

TABLE has Index for LastChanged and FIELD. Both contains one field only

select * from TABLE WHERE LastChanged < '2019-01-03 18:17:31:491 + 01:00' and FIELD < 1000 order by LastChanged desc  (limited to 1000 records by SQL tool used -> Squirrel)
>> This uses the index of FIELD -> very bad because the order is by LastChanged!!

FOR EACH TABLE WHERE LastChanged < DATETIME-TZ('2019-01-03 18:17:31:491 + 01:00') AND FIELD < 1000 by LastChanged desc
>> This uses the index of LastChanged -> result very fast

What is the problem that the SQL query uses the wrong index and therefor runs endless on a large table (like fullscan)?

All Replies

Posted by Mike Fechner on 27-Feb-2019 13:08

Just a wild guess. Are the SQL Statistics up to date?

Posted by gus bjorklund on 27-Feb-2019 16:14

The SQL query processor uses completely different algorithms for deciding which indexes to use than the 4GL does. Among other things, it uses statistics of the number of rows, key distributions, and other info in its choices.

You can find out what the query tree looks like by querying the _Sql_Qplan table. its contents are slightly difficult to understand at firsat, but not too bad when you get used to it.

Posted by Richard Banville on 27-Feb-2019 16:20

Was the sql command issued from the ABL or did it connect to OE SQL

Posted by Stefan Marquardt on 28-Feb-2019 08:32

Thanks for your suggentions, very appreciated.

I ran the SQL with the JDBC tool Squirrel and after my post i tried statistics as Mike mentioned too

UPDATE STATISTICS

UPDATE INDEX STATISTICS

Then the Index for LastChanges was used and I got the results in the same time.

"The SQL query processor uses completely different algorithms for deciding which indexes to use than the 4GL does. Among other things, it uses statistics of the number of rows, key distributions, and other info in its choices."

As a SQL dummy I would prefer to ignore statistics and use the same logic as ABL does

This means SQL Performance is poor without running the maintenance commands?

Shall i run this one time a week (maybe Sunday without users)?

UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;

Posted by Mohd Sayeed Akthar on 28-Feb-2019 09:43

Hi Stefan,

Please find the responses below.

>> This means SQL Performance is poor without running the maintenance commands?

In general yes, since SQL evaluates different execution plans for a statement and selects the one with lease cost. So, without statistics, cost of all execution plans might look alike.

However, starting from OE 12.0, SQL will be introducing a feature called "Autonomous Update Statistics" (AUS), which updates the statistics without user intervention.

>> Shall i run this one time a week (maybe Sunday without users)?

In general, statistics should be updated as and when good amount of data changes in tables. If it happened to be the case that, only 5% of total tables has data changes, then only those tables requires new statistics.

>> UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;

Above command updates statistics for all tables, indexes and columns present in the database. Below command allows you to update statistics for specific table.

UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR <table_name>;

Thanks,

Akthar.

Posted by Richard Banville on 28-Feb-2019 13:57

SQL stats can be maintained automatically in OE release 12.0 by a special server process.

Posted by Stefan Marquardt on 28-Feb-2019 14:15

I ran now the statistic command : UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;

Then I have this SQL query, not very difficult:

SELECT field0 from table WHERE field1 = 201903
AND field2 = 'XXX'
AND field3 = 999
AND field4 = 'USD'
AND (field5 = 6100
 OR field5 = 6200)

Indexes: One for field 1, one for field2, one for field 3, one for field5 and many others

I started the query with my JDBC tool and I'm waiting still, protop shows it's using index for field3.

But field1 reduces the amount of records dramatically.

The same in ABL displays the values at once! Really, at once!
Using 3 Indexes of field1, field2 and field3

Meanwhile I got the result:
Query 1 of 1, Rows read: 42, Elapsed time (seconds) - Total: 1,649.513, SQL query: 0.026, Reading results: 1,649.487
Query Plan Shows the same as I got from protop (vst): Only one index is used, a bad one not efficent enough

Posted by Mohd Sayeed Akthar on 04-Mar-2019 13:22

Hi Stefan,

Clearly, SQL optimizer should have used index of field1, if it reduces number of DB reads. I see following two possibilities on why it did not pick that index.

1. Once we execute a query, the query plan (access path) will cached, so even after updating statistics, SQL might be using cached plan. In order for SQL to use new plan (based on statistics), we need to re-connect and run the query again.

2. There could be a bug in SQL optimizer, due to which it picked non-efficient index.

If above point #1 is not the reason, then, It would be appreciated (if possible), if you can share data definition (.df) and data (.d) for the above mentioned table. This might help us in understanding the issue and improve SQL access.

And on which version of OE that these queries are being exercised?

Thanks,

Akthar.

Posted by gus bjorklund on 04-Mar-2019 18:20

note that aside from the ordering available by the key. each index can have a radically different i/o cost, as much as 500 to 1 difference, depending on the blocks where the rows are stored as well as the (less important) seek time between index blocks.

this cost should be measured (probably again when significant changes happen to storage arrangement) on each deployed system.

Posted by Dmitri Levin on 06-Mar-2019 19:04

>SQL stats can be maintained automatically in OE release 12.0 by a special server process.

Thank you very much Richard !

What is the threshold when that "special" server process decides to automatically rebuild statistics? Is that 5% changed? And will it be described in details in 12.0 documentation ?

I had some problems with that process in Oracle. When a new partition starts on the 1-st day of the month statistics that is automatically generated is off, because there is not enough data in a partition. Oracle has a way to copy statistics, so I was copying statistics from the previous month partition to overwrite the generated one on the 1st and 2d day of the month.

So there may be issues with automatically generated statistics ( like with autonomous driving cars)

Posted by Thomas Mercer-Hursh on 06-Mar-2019 19:16

like with autonomous driving cars

You mean ... safer than leaving it to humans?  :)

Posted by Mohd Sayeed Akthar on 12-Mar-2019 05:04

Hi Dmitri Levin,

>> What is the threshold when that "special" server process decides to automatically rebuild statistics? Is that 5% changed? And will it be described in details in 12.0 documentation ?

Threshold percentage will be configurable. Yes, details will be described in 12.0 documentation.

Thanks,

Akthar.

Posted by Mohd Sayeed Akthar on 12-Mar-2019 05:04

Hi Dmitri Levin,

>> What is the threshold when that "special" server process decides to automatically rebuild statistics? Is that 5% changed? And will it be described in details in 12.0 documentation ?

Threshold percentage will be configurable. Yes, details will be described in 12.0 documentation.

Thanks,

Akthar.

This thread is closed