Odd Performance Question

Posted by Jeff Ledbetter on 06-Jul-2016 07:36

Hi.

We have a table with 4 key fields and a RAW field (16k max size).

We need to blast through all of those records and update one of the key fields. Ok, that's simple.  I'll just do a FOR..EACH.
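
Something like this, with placeholder names:

FOR EACH myTable EXCLUSIVE-LOCK: /* table and field names are placeholders */
    myTable.keyField = "new-value".
END.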

What I am surprised by is the performance. There are only 300 records in the table but it takes at least 11-12 seconds to update them all. If there is no update, it's only a matter of milliseconds to read them.

With a RAW field in the table, is there some extra overhead for an update (even though the RAW field itself is not getting updated)?

Thanks.

All Replies

Posted by Rob Fitzpatrick on 06-Jul-2016 07:45

Is this client/server or self-service?

Is it faster if you use the FIELDS option with just the key fields?
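
Something like this (placeholder names). Note that a record fetched with FIELDS is a partial record and can't be updated, so this only times the read side:

FOR EACH myTable FIELDS (keyField1 keyField2) NO-LOCK: /* placeholder names */
    /* read-only timing pass */
END.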

Posted by Brian K. Maher on 06-Jul-2016 07:46

Jeff,
 
Are you updating a field that is included in the index being used by the FOR EACH? If so, that may be the problem. We walk “live” through the index, so if you modify the fields in the index you can cause a loop.
 
Classic example:
 
for each customer:
    /* adding 10000 moves this record's index entry ahead of the cursor,
       so the FOR EACH finds it again: an endless loop */
    customer.custnum = customer.custnum + 10000.
end.
 
Brian

Posted by marian.edu on 06-Jul-2016 07:50

That’s probably what it is. If you can’t use a full table scan, see if there is another index you might use…


Marian Edu

Acorn IT 
+40 740 036 212

Posted by Jeff Ledbetter on 06-Jul-2016 07:54

Rob, I tried both CS and direct connect. I also tried a DEFINE QUERY with FIELDS and did not see a difference.

Posted by Jeff Ledbetter on 06-Jul-2016 07:57

Brian, yes it is a key field that I am updating. It is not explicitly used in the FOR..EACH but there is only a single index.

Is the better option to add the index as inactive, update the data, and then activate the index later? Or, is there another option that I am overlooking?

We have a similar table with text data instead of RAW and it is much, much faster. The records are smaller, but I was still surprised at the difference.

Posted by Roger Blanchard on 06-Jul-2016 07:58

PRESELECT EACH ?
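
Something like this (sketch with placeholder names). PRESELECT builds the result list up front, so updating an indexed field afterwards can't re-order or repeat the walk:

REPEAT PRESELECT EACH myTable:
    FIND NEXT myTable EXCLUSIVE-LOCK. /* fetch from the preselected list */
    myTable.keyField = "new-value".
END.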

Posted by Jeff Ledbetter on 06-Jul-2016 07:58

Marian, it's a FOR..EACH with no WHERE clause, so I assumed it was doing a full table scan, but I guess it is probably using the only index. I tried adding the TABLE-SCAN option but didn't see anything groundbreaking performance-wise.

Posted by Tim Kuehn on 06-Jul-2016 08:00

Transaction size? One per record or all at once?
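
i.e. (sketch, placeholder names) the default FOR EACH scopes one transaction per record, versus committing the whole run at once:

/* default: one transaction per iteration */
FOR EACH myTable EXCLUSIVE-LOCK:
    myTable.keyField = "new-value".
END.

/* one transaction for the whole run; needs a lock table (-L) big enough */
DO TRANSACTION:
    FOR EACH myTable EXCLUSIVE-LOCK:
        myTable.keyField = "new-value".
    END.
END.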

Posted by Brian K. Maher on 06-Jul-2016 08:02

Jeff,
 
Use the TABLE-SCAN option.
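
For example (sketch with a placeholder table name; assumes a release that supports TABLE-SCAN and the table in a Type II area):

FOR EACH myTable EXCLUSIVE-LOCK TABLE-SCAN: /* reads blocks directly, no index walk */
    myTable.keyField = "new-value".
END.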
 
Brian

Posted by Brian K. Maher on 06-Jul-2016 08:04

Jeff,
 
The key thing to remember is that by default you are walking down through an index. If you modify a record and change the value of a field that is part of that index, it is possible to move the updated index entry further down in the index tree such that you will end up reading it again (see my example code from my first post). I really think this is what is happening.
 
Brian

Posted by marian.edu on 06-Jul-2016 08:06

With no WHERE it’s using the default index, but if you say it doesn’t make any difference when using TABLE-SCAN, then the only thing left might be that the updated record doesn’t fit in the allocated space and has to be moved… what data type is the field that you’re updating?



Marian Edu

Acorn IT 
+40 740 036 212

Posted by Jeff Ledbetter on 06-Jul-2016 08:22

Makes sense. I'll try the TABLE-SCAN option again. Perhaps I was doing it wrong. I don't think I will be able to use it since my code has to work with 10.2B and higher, but I am curious. :)

Posted by Jeff Ledbetter on 06-Jul-2016 08:23

Marian, just a single character field.  If I do the same thing on a similar table with no RAW data, there's no real performance issue. It must be moving the RAW data around...

Posted by James Palmer on 06-Jul-2016 08:36

TABLE-SCAN presupposes Type II storage. If it's Type I then it just selects the default index to traverse. Worth pointing out.

Posted by James Palmer on 06-Jul-2016 08:38

If there aren't many records involved, can you cache the records' rowids you want to update in a temp-table and then find them individually to update?
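
Roughly like this (sketch, placeholder names). The first pass only reads, so the index walk is never disturbed; the second pass goes straight to each record by ROWID:

DEFINE TEMP-TABLE ttRow NO-UNDO
    FIELD rRow AS ROWID.

/* pass 1: read-only, cache the ROWIDs */
FOR EACH myTable NO-LOCK:
    CREATE ttRow.
    ttRow.rRow = ROWID(myTable).
END.

/* pass 2: update each record directly */
FOR EACH ttRow:
    FIND myTable WHERE ROWID(myTable) = ttRow.rRow EXCLUSIVE-LOCK NO-ERROR.
    IF AVAILABLE myTable THEN myTable.keyField = "new-value".
END.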

Posted by marian.edu on 06-Jul-2016 08:50

Hey Jeff, not much of a DBA, but I guess if the record doesn’t fit in the previously allocated space it has to be moved around… a character is variable-size, so it’s the perfect candidate when it comes to increasing the record size :(


I suspect there might be some database settings that can help ease the pain there, or maybe you could just move the RAW field into another storage area?

Otherwise, when TABLE-SCAN is not available, you might try a simple find/while-available loop instead…

/* placeholder names: loop until no un-updated record remains */
repeat:
    find first table where table.pk <> "new-value" exclusive-lock no-error.
    if not available table then leave.
    table.pk = "new-value".
end.


Marian Edu

Acorn IT 
+40 740 036 212

Posted by jamesmc on 06-Jul-2016 09:50

I'm not too up on newer ways of doing this (still on 10.2B) but in this scenario I would do the following:

DEFINE BUFFER bTable FOR table.

FOR EACH table NO-LOCK:

    FIND bTable
        WHERE ROWID(bTable) = ROWID(table)
        EXCLUSIVE-LOCK NO-WAIT NO-ERROR.

    IF AVAILABLE bTable THEN
        ASSIGN bTable.fieldName = "NewValue".
END.

HTH

Posted by James Palmer on 06-Jul-2016 10:42

Unfortunately that won't help in the case where the field is in the index used by the original FOR EACH, as it will still be updating an indexed field and will still have the original problem.

Posted by peggycole on 11-Jul-2016 02:07

Maybe there are some write-triggers active on that table?

Posted by George Potemkin on 12-Jul-2016 02:22

I have TestStat.i code that was written for my tests:

{TestStat.i}

RUN GetStat(1).
/* Insert a code to test */
RUN GetStat(2).
RUN ReportStat(1 /*vRepeats*/).

ReportStat reports all database resources used by the code between points 1 and 2.

I have uploaded TestStat.i to ftp:

ftp://ftp.progress-tech.ru/pub/Users/george/Programs/TestStat.i

P.S. This forum seems to use the worst editor in the world!

Posted by Jeff Ledbetter on 04-Aug-2016 13:16

I am running this on a larger database now and can't believe how slow it is. There are only 10,000 records in this table, and the only index is disabled.

I am reading the ROWIDs from a text file (dumped beforehand), finding the record by ROWID, and then updating a field with a value.
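
The loop is essentially this (sketch; file and table names are placeholders):

DEFINE VARIABLE cRow AS CHARACTER NO-UNDO.

INPUT FROM "rowids.txt".
REPEAT: /* ends when IMPORT hits end-of-file */
    IMPORT cRow.
    FIND myTable WHERE ROWID(myTable) = TO-ROWID(cRow) EXCLUSIVE-LOCK NO-ERROR.
    IF AVAILABLE myTable THEN myTable.keyField = "new-value".
END.
INPUT CLOSE.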

For a very similar table with hundreds of thousands of records (but no RAW data), it's fairly immediate. But for the table with RAW data, it took 10 minutes to do it.

:-\

Posted by Keith Sudbury on 04-Aug-2016 13:50

If the table definition isn't too sensitive, could you post it here? Mainly looking for validation expressions, triggers, or any other things special about this table.

I didn't have any luck reproducing the issue on my side. I am assuming you are running a relatively recent version of OE?

Posted by Fernando Souza on 04-Aug-2016 14:00

There is no extra processing on the RAW field. What can have an impact is how much data there is already in the record, where the field you are updating sits physically in the record, and how much data needs to be shifted to make space for the new field value.

When you read a record, we copy it into memory into the record buffer. When you update it, we need to reallocate memory for that record buffer, and depending on the memory layout, the system may need to get a bigger memory block somewhere else in the memory space, so the data could end up being copied twice. I wonder if you would see a difference if the RAW field was physically before the fields you are updating.

With all that said, 10 minutes does seem excessive.

Posted by Jeff Ledbetter on 04-Aug-2016 14:10

Keith, there are no triggers or validation. It's just 5 fields and a RAW field.

Hi Fernando. There is a new character field being added to the table and it has a lower order than the field that contains the RAW data, so there may be some data shifting going on.

My current testing is on OE 10.2B but the behavior is similar on 11.6.

Posted by OctavioOlguin on 04-Aug-2016 16:02

I've noted that before, also...

And what I do to get the original speed back is to do the FOR EACH with a DESCENDING modifier...
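
Something like this (sketch, placeholder names). Walking the index in reverse means an entry whose key moves "forward" lands behind the cursor instead of ahead of it, depending on how the new value sorts:

FOR EACH myTable EXCLUSIVE-LOCK BY myTable.keyField DESCENDING:
    myTable.keyField = "new-value".
END.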

Posted by Frank Meulblok on 05-Aug-2016 03:52

I'd be curious what the impact is of replacing the RAW with a BLOB.

If total record size is a factor, it should make a difference:

Since a BLOB only stores a 64-bit reference in the record, where a RAW field holds the full data, record sizes would be much smaller.

And if you're not actually reading or updating the BLOB field, the content wouldn't be fetched at all.

This thread is closed