table row count - OpenEdge General - Forum - Progress Community Archive

All Replies

Posted by Thomas Mercer-Hursh on 17-Jun-2008 15:01

Depends entirely on the indexing of the table. If you are essentially doing full table scans, there isn't much that SQL can do to help you.

Posted by Admin on 17-Jun-2008 15:12

Here is the issue I have, if I have a MS SQL table and run a count in 3.5 million rows it comes back as soon as the execute button is pushed (no index on the table)

In Progress, if I have 450k rows, it can take 20 minutes.

Is there a better way to get row counts back?

Posted by Thomas Mercer-Hursh on 17-Jun-2008 15:55

SELECT COUNT(*) tends to be one of the faster methods. One suspects that this is not an equivalent test. I.e., same box, equivalent tuning, etc.

In particular, if you have a nice high -B on the database, the first count may not be fast, but subsequent ones will be very fast because the data is in memory.

Posted by Admin on 18-Jun-2008 07:21

different boxes...yes, but neither is a dog. They are both production boxes. It worries me that a simple table count, with no filter, is a query that won't return in under 20 minutes. Is there anybody than can test this on their system to see how long a simple takes. I am new to progress and have nothing to compare it against. If any body can test it and let me know how many records and how long it would be very much appreciated

select count(*)
from XXXX

Posted by Tim Kuehn on 18-Jun-2008 07:31

Posted by bwerne on 18-Jun-2008 10:02

Both Tim and Thomas have identified the underlying circumstance in OpenEdge and that indexing may give you better performance. As they mentioned, MSSQL and Oracle maintain this information so it is available to you instantly from the other vendors.

As you may know, to maintain OpenEdge SQL query performance, it is necessary for the DBA to run the 'Update Statistics' commands periodically to generate statistical information used by the optimizer.

Keeping statistical information like this online as part of the engine is on the OpenEdge roadmap. Maintaining these type of approximate statistics online would help with TCO for OpenEdge and avoid having to do the manual process periodically. A good thing all round.

The secondary benefit of "statistics online" is that the same information necessary for SQL optimization/performance could also be leveraged in other areas like database utilities and select count(*).

Posted by Admin on 18-Jun-2008 10:43

Thank you for the information. Very helpful. The problem I am going to have is creating indexes. The overlaying application is very much against me creating new indexes because they claim it will cause a need to recompile the application. This I don't understand but I asked for clarification. Removing an index can definetly cause a problem but adding one should do nothing from an application stand point. Is this also a Progress issue that somebody can shed some light on? I am leaning towards the application just wanting me to stay out of the database

My other option is to move the needed data out of Progress and into MS SQL on a regular basis and manipulate it as needed. Is there a recommend method to do that? I need the data from 11 tables to run reports. Without the index, the reports can't be run

Posted by Thomas Mercer-Hursh on 18-Jun-2008 11:53

One of the big questions here is whether the application is performing well from the ABL side. It is fairly common to have a Progress database that is not well tuned work just fine for inputing individual transactions, but then to show its poor tuning when one tries big reports or, like you, tries to read all of the records in a good sized table.

While it is true that some other vendors maintain some on-line record counts and thus can produce instant responses, that doesn't really apply if you are counting a subset based on some arbitrary selection criteria ... everybody needs to read records to do that.

Certainly the first thing I would do is UPDATE STATISTICS. Then I would suggest finding out the server startup parameters and post those here to get suggestions for tuning ... or perhaps you should start a new thread in the DB forum for that in order to get the right audience. When you post those, give as much information as you can about the hardware, configuration, progress version and any measurements you have of disk activity.

As a baseline, I just tried a select count(*) on a table with 1.9 million records and it took 45 seconds. This is a development database with minimal tuning, but in a session on the same box as the DB, i.e., no network.

Posted by Admin on 19-Jun-2008 14:12

I am getting a very big pushback from the application owner that if I ADD an index to a table it will affect the application. From my standpoint, coming mainly out of SQL/Oracle world, that sounds like BS to me. You can argue the point that it might slow down writes slightly, but I am not worried about that at this time. I am worried that I can't run queries against my data.

From a Progress stand point, does this "affect the application" make any sense to anybody? Is this truely a Progress issue that adding an index will mess up an application? They went as far to say that if I add an index they will need to recomplie the application.

Posted by Thomas Mercer-Hursh on 19-Jun-2008 14:32

Adding an index can require a recompile. That is true, but it also isn't a big deal. Tools exist for doing intelligent recompiles based on the table, but frankly most applications can do a stem to stern recompile in not very long anyway. There is some potential impact on writes, if one adds a lot of indexes, but I'll bet it is rare that anyone can measure it. There is also some risk that a program will change behavior because a different index gets chosen ... shouldn't happen if the application is coded right, but we all know that isn't always the case.

But, I think the real issue here is that the numbers you are getting are way off scale, as illustrated by the figures I quoted. Mine are from a little desktop Linux box, single core, if I remember correctly. So, something is wrong in your environment. Focus on figuring out what that is first and you may find that you don't really need any changes. If you are just reporting and you can get speeds like the one I quoted for a whole table scan, then it probably doesn't matter whether you have the index or not. Conversely, if you are getting those kinds of numbers from other factors, then adding the index might not help.

All this is testable and analyzable. No need to guess.

Posted by Admin on 19-Jun-2008 14:38

Thomas - Thank you. I needed to hear what you said to make sure I wasn't crazy. If the application is coded correctly nothing should change. This application isn't a small shop application that I am talking about. It should be able to handle large loads in a corporate environment. I agree about writes, as long as I don't add 10 indexes to 1 table I shouldn't have a problem

I fully agree there is a problem with the database. An index is a band aid to get around the problem, nothing else.

Thanks again for the help

Posted by Thomas Mercer-Hursh on 19-Jun-2008 15:07

And, it is a bandaid that might not help.

Posted by Admin on 19-Jun-2008 20:03

I am being told that the application is compiled against the exact version of the database. if I add an index to a table, the table will have a different CRC and the application will no longer work with that table. Does Progress have CRCs on their tables that could break an application??? i am thinking I am getting the big run around

Posted by kevin_saunders on 20-Jun-2008 02:36

Posted by Thomas Mercer-Hursh on 20-Jun-2008 11:19

So, partly you are getting a run around since there are changes you could make which would not require a recompile, but partly you aren't since changing the index structure could change the behavior of the application, either immediately with dynamic queries or later when a program is recompiled, if it then picks a different index. I.e., one doesn't change data structure willy-nilly without some checking and testing and awareness, but by the same token there is no reason to freeze it in concrete is there are good reasons for change.

If there are good reasons for you to want to make a selection from a large table based on a field which is not currently indexed, it seems likely that somewhere else in the application that need would also exist and might benefit from the same index. Even if it is only for reporting, that is still a part of the application requirements and should be considered.

But, before you get very excited about whether you need this index or not and whether they will let you add it or not, you need to take care of figuring out the background problem. 20 minute table scans on tables that size indicate that there is something abnormal going on and since it is abnormal, an index may do nothing to help. But, fix whatever the problem is that is producing the long scans and you may find that the time to return a result is so short that you don't care about the index.

Have you done the UPDATE STATISTICS yet? Bumped up the fetch limit? Are you using READ UNCOMMITTED?

Posted by Admin on 20-Jun-2008 11:56

I have not updated the stats yet but I am planning on it. I asked the software vendor first about it because I don't want them coming back at this point saying I did something wrong.

I did run some more tests. I did a select off of the table, no WHERE clause, but I ordered by the PK. The query comes back quickly if the sort is ascending. I waited 5 minutes before I killed it for descending order.

Posted by Thomas Mercer-Hursh on 20-Jun-2008 12:15

UPDATE STATISTICS will only impact SQL, not ABL access (more's the pity).

Check out the documentation for server start up parameters that begin with -SQL ... the slow sort performance suggests that you aren't giving it enough room to work in, although you should be able to reverse sort on a field without a sort, if you know what I mean.

If you are getting rapid response from the whole table without the WHERE, then it seems likely that you are choking on the disk space to do the sort and select. It has been a while since I paid attention to these parameters ... although I need to do so again soon ... but there is one that is something like SQLTempPgSize which gives you sort space.

And, up the fetch limit in your ODBC connection.

You might have to move this discussion to the DB forum to get the best input on server parameters. But, the most important thing is to be systematic in your diagnosis. I.e.,

1. Start out by doing all the No Brainer things like READ UNCOMMITTED, UPDATE STATISTICS, fetch limit etc.

2. Make careful tests and observations at all stages. E.g., if a query is taking a while coming back, what is going on during that period? Are the server disks being hammered? Is it the database or temp space? How different is the speed if you do the same query from the server box? How is the ABL performance for something similar? Etc., Etc.

Posted by Admin on 20-Jun-2008 14:23

This is a dumb question but where do I find the startup parameters for Progress?

Posted by Thomas Mercer-Hursh on 20-Jun-2008 14:48

Ask the DB Admin?

If Unix, there is probably a parameter file, possibly with a .pf extension, but it would be referenced in the script that starts the database.

If you look in the .lg file you will see the values.

Posted by Admin on 20-Jun-2008 15:00

if [

$ -eq 0 ] && [ -f $/ifaces.stop
]

style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial">then

style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial">echo
"...Removing interface lock file:"

style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial">rm -f
$/ifaces.stop 2> /dev/nullcd

Posted by Thomas Mercer-Hursh on 20-Jun-2008 15:53

The key here is proadsv, which is another shell script in $DLC/bin ... but reading it won't tell you much either. It seems you are using the Admin Server, which I don't use, so you may have to start a thread on the DB forum. You can look in $.lg to see the values, but it might be a good idea to just start a thread in the DB forum based on the idea that you are having SQL performance problems and here's current settings and characteristics.

There are a lot of sites where the DBA is part time or outsourced because Progress takes a lot less management than most other databases ... but that also often means there is no one around who knows much. But, whoever setup the Admin Server parameters should be able to tell you what parameters are being used.

Posted by jtownsen on 23-Jun-2008 12:53

Given that the Admin Server is being used, you could use the Progress Explorere (GUI tool - Windows only) to modify the configuration. Alternatively, you can look in $DLC/properties for the conmgr.properties file and manually update the values. Using Progress Explorer is definitely easier though!

Posted by Thomas Mercer-Hursh on 23-Jun-2008 13:26

This thread has been taken up on the DB forum for a more appropriate audience.

Posted by Admin on 25-Jun-2008 10:52

I did the best thing. I have the software vendor's DBA doing a health check. Thank you for your help

I do have a question...what is the windows software of choice to use on a progress database for admin and developer work. winsql??

Posted by jtownsen on 26-Jun-2008 02:57

At the risk of this sounding like a sales pitch (which it's

not meant to be)...

Posted by Admin on 26-Jun-2008 07:19

I agree and that is what is exactly what I am doing. In fact, that is why I stepped back from advice here and I am letting the software vendor do it since nobody knows their software better than them.

I still would like to know what windows based tools people are using? I need some better tools than access or MS query tool

Posted by Thomas Mercer-Hursh on 26-Jun-2008 11:35

Posted by Admin on 27-Jun-2008 15:18

Update stats - I assume I can run this on one table at a time? Would the query be "update statisics <

>"?

Is there anything I should be worry about when I am doing this?

Posted by Thomas Mercer-Hursh on 27-Jun-2008 15:22

UPDATE ( TABLE STATISTICS )

... ;

Posted by Admin on 27-Jun-2008 15:29

So jsut a warning, I am very new to progress. Anyways, if the data has no index and you wish to count/filter on a column: Instead of addin g an index to teh tables and risking problems, can't you just make a TempTable with the index's you want and load the data into teh temp table. That way you can index as needed to spead of your report without having to mess with the original data layout.

Posted by Admin on 27-Jun-2008 15:31

I could do that but how do I select a subset of data into a temp table without the intial index? I have no idea how long an entire table select into temp would take but that sounds like a very long work around compared to adding an index.

Posted by Admin on 27-Jun-2008 15:40

I am assuming I only need to do indexes and not columns (update stats)? Do I need to do both?

Posted by Thomas Mercer-Hursh on 27-Jun-2008 15:41

For the problem you have described, this is not any help. Your issue is just getting the data from the table in the first place. Temp-tables are very helpful when reporting when one has situations such as "select all open invoices for customer X", but then the report is supposed to be sorted by dollar value or something.

Posted by Admin on 27-Jun-2008 15:59

Is there a rule of thumb on how long it takes to update stats. I have done it on MS SQL and I hated it because it took a long time. I am thinking I need to do this one table at a time but what would happen if I only got through half of the DB in a night?

Posted by Thomas Mercer-Hursh on 27-Jun-2008 16:11

There are too many variables and unknowns to make a guess. I would consider creating a script to feed into sqlexp and run some tests.

Posted by ChUIMonster on 28-Jun-2008 11:44

At the risk of sounding like a sales pitch... the vendor's DBA isn't always the best source for advice. Particularly not if that was also the source of the misinformation about indexes and CRCs. There's a lot to be said for an independent 3rd party point of view.

Posted by Admin on 28-Jun-2008 15:05

I agree. I am giving them a shot first but I am planning for a 3rd party also

This thread is closed