Optimal transaction size to delete large set of records

Posted by Lieven De Foor on 11-Jul-2017 10:05

Hi

Using a FOR EACH EXCLUSIVE-LOCK effectively comes down to a single transaction per iteration of the FOR loop (assuming there is no transaction active at a higher level).
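Roughly this shape, with "customer" standing in for the actual table:

    /* with no transaction active at a higher level, each iteration
       below starts and commits its own transaction */
    FOR EACH customer EXCLUSIVE-LOCK:
        DELETE customer.
    END.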

For deleting a large set of records this is noticeably slower than grouping records in sets of, say, 1000 records (in my experiments up to 25% slower).
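Something along these lines for the grouped variant (batch size and table name are illustrative, this is just a sketch of the pattern):

    DEFINE VARIABLE iBatchSize AS INTEGER NO-UNDO INITIAL 1000.
    DEFINE VARIABLE iCount     AS INTEGER NO-UNDO.

    DeleteLoop: REPEAT:
        iCount = 0.
        DO TRANSACTION:
            FOR EACH customer EXCLUSIVE-LOCK:
                DELETE customer.
                iCount = iCount + 1.
                /* leave the FOR EACH so the enclosing DO TRANSACTION commits this batch */
                IF iCount >= iBatchSize THEN LEAVE.
            END.
        END.
        /* fewer than a full batch found: nothing left to delete */
        IF iCount < iBatchSize THEN LEAVE DeleteLoop.
    END.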

Is there an optimal transaction size for this kind of action, or a way to calculate an optimal size?

Too large a transaction increases the risk of lock table overflow and possibly has a negative impact on database performance (an assumption, please correct me if I'm wrong).

Too small a transaction lowers performance...

I'm not worried about partial results in case of an error (the action is idempotent and can be rerun).

All Replies

Posted by ChUIMonster on 11-Jul-2017 10:27

What version are you experimenting with?

In the past my personal experiments have indicated that one record at a time works best for DELETE operations.  Grouping transactions has appeared to hurt delete performance.  Multi-threading does too.

When grouping records for updates I have not noticed any consistently significant difference between 50, 100, 500 or 1,000 - so I usually just do 100.

Posted by James Palmer on 11-Jul-2017 10:32

I've always grouped in blocks of 100 records, but that's an arbitrary number. Be interested to see what others say.

Posted by Lieven De Foor on 11-Jul-2017 10:33

Tests were done with 11.6.3 on my local machine (Windows 7), with a relatively small locally running db (started using default startup parameters).

I've tested deleting all records through a dynamic query using transaction sizes of 1, 500, 1000 and 2000 records.

From my experiments there was not much difference between 500, 1000 or 2000 records per transaction, but there was a significant difference when there was only 1 record per transaction...
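Roughly the shape of the test, for reference (table name and cleanup are illustrative, a sketch rather than the exact code):

    DEFINE VARIABLE hQuery     AS HANDLE  NO-UNDO.
    DEFINE VARIABLE hBuffer    AS HANDLE  NO-UNDO.
    DEFINE VARIABLE iBatchSize AS INTEGER NO-UNDO INITIAL 1000.
    DEFINE VARIABLE i          AS INTEGER NO-UNDO.

    CREATE BUFFER hBuffer FOR TABLE "customer".
    CREATE QUERY hQuery.
    hQuery:SET-BUFFERS(hBuffer).
    hQuery:QUERY-PREPARE("FOR EACH customer EXCLUSIVE-LOCK").
    hQuery:QUERY-OPEN().

    /* each iteration of the REPEAT is committed as one transaction */
    REPEAT TRANSACTION:
        DO i = 1 TO iBatchSize:
            hQuery:GET-NEXT().
            IF hQuery:QUERY-OFF-END THEN LEAVE.
            hBuffer:BUFFER-DELETE().
        END.
        IF hQuery:QUERY-OFF-END THEN LEAVE.
    END.

    hQuery:QUERY-CLOSE().
    DELETE OBJECT hQuery.
    DELETE OBJECT hBuffer.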

Posted by James Palmer on 11-Jul-2017 10:36

Interesting what you say about deletes, Tom. My testing in 11.6 showed that grouping deletes into blocks of 100 records was around 35% faster than deleting one at a time. This was deleting somewhere in the region of 10k records, so not a massive test.

Posted by gus bjorklund on 11-Jul-2017 14:06

the reason that 1 record per transaction is slow compared to 50 or 100 is that the transaction log records and the associated processing to start and end a transaction each cost about the same as, or a little more than, the update/delete/create of 1 record. the ratio also depends on how many indexes there are and on the row size.

there isn’t really any “optimal” size. the start/end overhead is the same for 1 record as for 100, so making the transaction size larger than 10 or 20 will usually do the trick. bigger than that leads to diminishing returns.
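to put rough, purely illustrative numbers on it: if the begin note and the end note each cost about as much as the notes for one deleted record, then 1 record per transaction writes roughly 3 units of log per record, while 50 records per transaction writes about 52/50, or roughly 1.04 units per record. by a few tens of records the per-record overhead is nearly fully amortized, which is where the diminishing returns come from.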
