Lock wait timeout error

Posted by romain.pennes@foederis.fr on 26-Jan-2015 08:41

Hello,

We are experiencing more and more "Lock wait timeout" errors using Rollbase.
We have 30 customers on the same Rollbase database and overall the application runs very smoothly.
(Saving a records usually only takes 1 second)

But sometimes after saving a record, we have to wait 30 seconds and then see a "Lock wait timeout" error. We are pretty sure that this is not a trigger code problem or infinite loop since we only experience this very rarely. All of our customers experience this "lock wait timeout" error on different objects, at least 2 or 3 times a day.

All of our customers share the same database in one server (MySQL), our Rollbase database is 14GB large. Why do we experience this issues? Do you have any configuration tips to avoid this?

Kind regards,

Romain.

All Replies

Posted by romain.pennes@foederis.fr on 26-Jan-2015 08:54

By the way, we have been using "READ UNCOMMITED" transaction isolation level by default. Should we use another one to avoid this problem? What is the best transaction isolation level to use in MySQL for Rollbase use?

On another note, we have more than 3.000.000 records in our rb_obj_data table.

At which point should we separate customers into another database?

Posted by Tom Kincaid on 26-Jan-2015 09:40

 
I assume you are on the private cloud correct? What version of Rollbase and what version of MySQL are you using?
 
 
[collapse]
From: romain.pennes@foederis.fr [mailto:bounce-romainpennesfoederisfr@community.progress.com]
Sent: Monday, January 26, 2015 9:42 AM
To: TU.Rollbase@community.progress.com
Subject: [Technical Users - Rollbase] Lock wait timeout error
 
Thread created by romain.pennes@foederis.fr

Hello,

We are experiencing more and more "Lock wait timeout" errors using Rollbase.
We have 30 customers on the same Rollbase database and overall the application runs very smoothly.
(Saving a records usually only takes 1 second)

But sometimes after saving a record, we have to wait 30 seconds and then see a "Lock wait timeout" error. We are pretty sure that this is not a trigger code problem or infinite loop since we only experience this very rarely. All of our customers experience this "lock wait timeout" error on different objects, at least 2 or 3 times a day.

All of our customers share the same database in one server (MySQL), our Rollbase database is 14GB large. Why do we experience this issues? Do you have any configuration tips to avoid this?

Kind regards,

Romain.

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Godfrey Sorita on 26-Jan-2015 11:05

Hi Romain,

On your logs, are you seeing a pattern on the time when "Lock wait timeout" errors are occurring? Are there other transactions made prior to the error like batch jobs, imports and reports?


Thanks,
Godfrey

Posted by pvorobie on 26-Jan-2015 11:15

We use READ COMMITTED isolation in production environment.

Posted by romain.pennes@foederis.fr on 26-Jan-2015 11:23

Hello, we are indeed using Rolbase private cloud 2.2.2.0.

We are using MySQL 5.5.25.

At some point we believed that "Lock wait timeout" errors occured more while there were import batch jobs running, but we managed to only have imports run at night and we still experience "Lock wait timeout" errors during the day.

The pattern of the error in the log files isn't always the same.

Can I send you log files by email?

Kind regards,

Romain.

Posted by Godfrey Sorita on 26-Jan-2015 11:34

Ok. Please make sure to include the main, event and job logs. My email address is gsorita@progress.com.

Posted by romain.pennes@foederis.fr on 26-Jan-2015 11:35

Thank you for your answers.

Do you think I can try and change from "READ UNCOMMITED" to "READ COMMITTED" without any risk?

Posted by pvorobie on 26-Jan-2015 11:45

Sorry but I'm not MySQL expert. I'll try to ask out DevOps.

Posted by romain.pennes@foederis.fr on 26-Jan-2015 11:52

Thank you.

Posted by murali on 26-Jan-2015 13:23

Romain,
 
Lock wait timeout error’ happens when a transaction A is starved by  another transaction B which holds a lock  which A needs to proceed.
It waits for  innodb_lock_wait_timeout  seconds; which is   50 secs by default. You can change it to a higher value if you want.  
 
What is the value in your environment?
 
In my dev box, it is 50
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| innodb_lock_wait_timeout       | 50    |
 
 
We need to find out the root cause. The kind of transactions that run for more than 50 seconds ( and which hold the locks ) , causing others to starve and time-out.
 
 
This is what I would suggest::
 
You should have an *automated script* which does the following when you encounter ‘Lock wait timeout’
 
The scripts does the following things:
 
1) Takes a few system snapshots on mysql DB   ( I mean, take a note of system variables, performance schema, innodb engine status  etc via queries)
 
a) show processlist  ( show active processes)
b) show engine innodb status   ( this one is useful… gives you a snapshot of active  transactions  ; with lock info )
….
You should be looking out for output like this  (a long running txn  (been active for 510 sec) with  row locks) …
 
MySQL thread id 94669, OS thread handle 0x45c93940, query id 1638525054 10.xxx.xx.xx rollbase
---TRANSACTION 3B82057A7, ACTIVE 510 sec fetching rows

mysql tables in use 1, locked 0

5 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 15

 

 c) And finally the query from this one:

http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-information-schema-examples.html

 

SELECT r.trx_id waiting_trx_id, 

       r.trx_mysql_thread_id waiting_thread,

       r.trx_query waiting_query,

       b.trx_id blocking_trx_id,

       b.trx_mysql_thread_id blocking_thread,

       b.trx_query blocking_query

   FROM       information_schema.innodb_lock_waits w

   INNER JOIN information_schema.innodb_trx b  ON 

    b.trx_id = w.blocking_trx_id

  INNER JOIN information_schema.innodb_trx r  ON 

    r.trx_id = w.requesting_trx_id;

 

 

This will likely help you catch the transaction which is causing starvation.

 

2) Take a thread dump of your RB server.

 

This may help you correlate the blocked transaction with a   waiting-on-db ( socket-read)   thread from the server.

 

 

 

Another suggestion:

 

Please turn mysql slow query log on ;   and see if you are generating really slow queries.  Please capture those queries and send them to support. We can analyze them.

 

set global slow_query_log_file='C:/temp/mysql_slow.log';

set long_query_time=20;   # 20 secs

set global  slow_query_log=1;

 

Posted by Godfrey Sorita on 26-Jan-2015 14:34

I have observed some of the "lock wait timeout" errors in your zone are caused by createRecord and updateRecord Object Script API. Take note that these APIs invokes triggers(e.g. After Create, After Update) the same way as the Rollbase user interface does and might cause unwanted chain reactions.

Unfortunately, the source object and trigger which fired it is not mentioned in the logs. Do you happen to know object records (created/updated using createRecord/updateRecord) which could possibly run another set of triggers?

Posted by Godfrey Sorita on 30-Jan-2015 10:55

Hi Romain,

We are curious on how you were able to set the isolation level to READ_UNCOMMITED. In the master zone UI, it only allows REPEATABLE READ or READ COMMITTED. Did you set it on your MySQL database?

Also, have you tried the suggestions of Murali?

Regards,

Godfrey

This thread is closed