Hi.
We have a table with 4 key fields and a RAW field (16k max size).
We need to blast through all of those records and update one of the key fields. Ok, that's simple. I'll just do a FOR..EACH.
What I am surprised by is the performance. There are only 300 records in the table but it takes at least 11-12 seconds to update them all. If there is no update, it's only a matter of milliseconds to read them.
With a RAW field in the table, is there some extra overhead for an update (even though the RAW field itself is not getting updated)?
Thanks.
Is this client/server or self-service?
Is it faster if you use the FIELDS option with just the key fields?
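For example, something like this (just a sketch; table and field names are made up):

```
/* The FIELDS phrase fetches only the listed fields, so the 16K RAW
   field is never copied into the record buffer on the read. */
FOR EACH myTable FIELDS (keyField1 keyField2) NO-LOCK:
    DISPLAY myTable.keyField1 myTable.keyField2.
END.
```

Note that a buffer fetched with a field list can't be updated, so this only isolates the read side of the problem.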
That’s probably what it is. If you can’t use a full table scan, see if there is another index you might use…
Rob, I tried both CS and direct connect. I also tried a DEFINE QUERY with FIELDS and did not see a difference.
Brian, yes it is a key field that I am updating. It is not explicitly used in the FOR..EACH but there is only a single index.
Is the better option to add the index as inactive, update the data, and then activate the index later? Or, is there another option that I am overlooking?
We have a similar table with text data instead of RAW and it is much much faster. The records are smaller but I was still surprised at the difference.
PRESELECT EACH ?
Marian, it's a FOR..EACH with no WHERE clause so I assume it was doing a full table scan but I guess it is probably using the only index. I tried adding the TABLE-SCAN option but didn't see anything groundbreaking with performance.
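For reference, this is the syntax I tried (table name is a placeholder):

```
/* TABLE-SCAN reads the table's blocks directly instead of walking
   an index; it requires the table to be in a Type II storage area
   and is only available in OE 11.0 and later. */
FOR EACH myTable NO-LOCK TABLE-SCAN:
    /* ... */
END.
```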
tx size? One per record or all at once?
With no WHERE it’s using the default index, but if you say it doesn’t make any difference when using TABLE-SCAN, then the only thing might be that the updated record doesn’t fit in the allocated space and has to be moved… what data type is the field that you’re updating?
Makes sense. I'll try the TABLE-SCAN option again. Perhaps I was doing it wrong. I don't think I will be able to use it since my code has to work with 10.2B and higher but I am curious. :)
Marian, just a single character field. If I do the same thing on a similar table with no RAW data, there's no real performance issue. It must be moving the RAW data around...
TABLE-SCAN presupposes Type II storage. If it's Type I then it just selects the default index to traverse. Worth pointing out.
If there aren't many records involved, can you cache the records' rowids you want to update in a temp-table and then find them individually to update?
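Something along these lines (just a sketch; table and field names are invented):

```
DEFINE TEMP-TABLE ttRowid NO-UNDO
    FIELD rRowid AS ROWID.

/* Pass 1: collect the ROWIDs with a cheap read-only scan. */
FOR EACH myTable NO-LOCK:
    CREATE ttRowid.
    ASSIGN ttRowid.rRowid = ROWID(myTable).
END.

/* Pass 2: find and update each record individually by ROWID,
   so the update isn't done while walking the table's index. */
FOR EACH ttRowid:
    FIND myTable WHERE ROWID(myTable) = ttRowid.rRowid
        EXCLUSIVE-LOCK NO-ERROR.
    IF AVAILABLE myTable THEN
        ASSIGN myTable.keyField1 = "NewValue".
END.
```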
Hey Jeff, not much of a dba but I guess if the record doesn’t fit in the previously allocated space it has to be moved around… a character is a variable-size field so it’s the perfect candidate when it comes to increasing the record size :(
I'm not too up on newer ways of doing this (still on 10.2B) but in this scenario I would do the following:
DEFINE BUFFER bTable FOR table.

FOR EACH table NO-LOCK:
    FIND bTable WHERE ROWID(bTable) = ROWID(table) EXCLUSIVE-LOCK NO-ERROR NO-WAIT.
    IF AVAILABLE bTable THEN
        ASSIGN bTable.fieldName = "NewValue".
END.
HTH
Unfortunately that won't help in the case where the field name is in the index used for the original for each as it will still be updating the field in the index and will still have the original problem.
Maybe there are some write-triggers active on that table?
I have TestStat.i code that I wrote for my tests:
{TestStat.i}

RUN GetStat(1).
/* Insert the code to test */
RUN GetStat(2).
RUN ReportStat(1 /*vRepeats*/).
ReportStat reports all database resources used by a code between points 1 and 2.
I have uploaded TestStat.i to ftp:
ftp://ftp.progress-tech.ru/pub/Users/george/Programs/TestStat.i
P.S. This forum seems to use the worst editor in the world! [:S]
I am running this on a larger database now and can't believe how slow it is. There are only 10,000 records in this table, and the only index is disabled.
I am reading the ROWIDs from a text file (dumped beforehand), finding the record by ROWID, and then updating a field with a value.
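Roughly like this (a simplified sketch of what I'm doing; the file, table, and field names are placeholders):

```
DEFINE VARIABLE cRowid AS CHARACTER NO-UNDO.

/* Read one ROWID string per line from the dump file,
   find the record directly, and update the one field. */
INPUT FROM VALUE("rowids.txt").
REPEAT:
    IMPORT UNFORMATTED cRowid.
    FIND myTable WHERE ROWID(myTable) = TO-ROWID(cRowid)
        EXCLUSIVE-LOCK NO-ERROR.
    IF AVAILABLE myTable THEN
        ASSIGN myTable.keyField1 = "NewValue".
END.
INPUT CLOSE.
```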
For a very similar table with hundreds of thousands of records (but no RAW data), it's fairly immediate. But for the table with RAW data, it took 10 minutes to do it.
:-\
If the table definition isn't too sensitive… could you post it here? Mainly looking for validation expressions, triggers, or anything else special about this table.
I didn't have any luck reproducing the issue on my side. I am assuming you are running a relatively recent version of OE?
There is no extra processing on the RAW field. What can have an impact is how much data there is already in the record, where the field you are updating physically sits in the record, and how much data needs to be shifted to make space for the new field value. When you read a record, we copy it into memory into the record buffer. When you update it, we need to reallocate memory for that record buffer and, depending on the memory layout, the system may need to get a bigger memory block somewhere else in the memory space, so the data could be copied twice. I wonder if you would see a difference if the RAW field were physically before the fields you are updating.
With all that said, 10 minutes does seem excessive.
Keith, there are no triggers or validation. It's just 5 fields and a RAW field.
Hi Fernando. There is a new character field being added to the table, and it has a lower order number than the field that contains the RAW data, so there may be some data shifting going on.
My current testing is on OE 10.2B but the behavior is similar on 11.6.
I've noted that before, also...
And what I do to get the original speed is to do the FOR EACH with a DESCENDING modifier...
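i.e. something along these lines (names are made up, and the explanation in the comment is my guess at why it helps):

```
/* Walk the index in reverse. If the new key value sorts higher,
   the moved index entries presumably land behind the cursor
   rather than in front of it, so they aren't revisited. */
FOR EACH myTable EXCLUSIVE-LOCK BY myTable.keyField1 DESCENDING:
    ASSIGN myTable.keyField1 = "NewValue".
END.
```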
I'd be curious what the impact is of replacing the RAW with a BLOB.
If total record size is a factor, it should make a difference:
Since a BLOB column only stores a 64-bit reference in the record, while a RAW field holds the full data inline, the record sizes would be much smaller.
If you're not actually reading or updating the BLOB field, the content wouldn't be fetched.