Query performance

Posted by Stefan Drissen on 19-Dec-2013 07:46

OpenEdge 10.2B08 / Windows 7 (laptop and desktop)

Progress database

The following dynamic query which is executed NO-LOCK

FOR EACH <table1> FIELDS ( some )
   WHERE <table1>.<key1> = 1
   AND   ...

, EACH <table2> FIELDS ()
   WHERE <table2>.<key1> = <table1>.<key1>
   AND   <table2>.<key2> = <table1>.<key2>
   AND   <table2>.<key3> = 1
, EACH <table3> FIELDS ( <field1> )
   WHERE <table3>.<key1> = <table2>.<key1>
   AND   <table3>.<key2> = <table2>.<key4>
BY <table1>.<key1>
BY <table1>.<key2>
BY <table1>.<key3>
BY <table3>.<field1>

The selection on <table1> returns approx 2000 records, for each of these records there at least one <table2> record and one <table3> record based on <table2>.

The last by results in client side sorting preventing forward-only from having any effect. It takes 22 seconds (of which 5 seconds building up result list in query-open) to fetch the records from a networked database with a big enough B from a Progress client. 

When executed on a shared-memory connection the query takes 0.1 seconds.

We all know that shared-memory connections are much faster than remote connections - but this much faster for something so trivial?

Excecuting a comparable remote SQL statement via JDBC that returns the same results takes 0.3 seconds. This shocked me.

SQL DataServer

When directly selecting from the SQL database remotely, results are returned within 0.1 seconds.

When using the Progress client in combination with the DataServer it takes the Progress client 18 seconds to get the result, if I forget forward-only it takes 22 seconds.

When using RUN STORED-PROCEDURE send-sql-statement LOAD-RESULT-INTO ht ( cquery ) the result is there within 0.1 seconds.

Conclusions

For reporting from a Progress client connected to SQL Server the send-sql-statement seems a necessity.

For reporting from a Progress client connected to a remote Progress database setting up an ODBC connection is very tempting but seems so silly - a send-sql-statement for non-DataServers could have simplified things...

All Replies

Posted by Thomas Mercer-Hursh on 19-Dec-2013 09:24

One of the things on my wish list is a SQL connection object.  There are times when the SQL set orientation is just the right thing.

Have you tried structuring the data retrieval as a dataset FILL().  It would be interesting to see that comparison.

Posted by Stefan Drissen on 19-Dec-2013 09:39

Agreed.

The dataset FILL seems slightly (a few %) quicker. It may make a bigger difference on a larger result set (this is a peanuts 2000 records)

Posted by Thomas Mercer-Hursh on 19-Dec-2013 09:54

A few percent quicker than the FOR EACH?

I suppose the other question here is what is the data being retrieved for?  E.g., is it something where the retrieval should be happening on an AppServer and only a summary result passed to the client?

Posted by Stefan Drissen on 19-Dec-2013 10:31

A few percent quicker than the dynamic query with:

DO WHILE hq:GET-NEXT( NO-LOCK ):

END.

The data needs to be fetched.

It is an AppServer.

AppServers do not always live on the same machine as the database. From a scalability point of view you can wonder if you even want the AppServers on your database server (can only scale up, not out).

When using SQL Server it matters little if the AppServer is on or off the SQL Server - the 'client' query performance is always comparable to remote OpenEdge databases.

Posted by Thomas Mercer-Hursh on 19-Dec-2013 10:47

There are differences of thought on the placement of AppServers and DBs.  I have known customers who routinely put the AppServer on a different box and would sometimes dedicate more than one box to the task for larger user groups.  That approach does allow the use of smaller and cheaper machines, but I think at a significant cost of performance.  I have since been convinced that using the same box is desirable whenever possible ... in fact, I recall some benchmark results with that customer which now remind me a great deal of the sorts you are getting.

Posted by Marko Myllymäki on 09-Jan-2014 09:45

Hi Stefan, I have also been shocked by the poor performance when using joins over a network.

If you have not seen this, I have made an enhancement request regarding this issue:

https://community.progress.com/businessusers/i/openedge_database_enhancements_-_tell_us_what_youd_like_to_see/add_server-side_joins__better_networking_performance.aspx

Alternatively (or additionally), being able to send sql statements or run stored procedures from within ABL against a Progress DB would be great to have. Nice to see I'm not the only one being shocked :-)

This thread is closed