Unusual interaction between -lkwtmo and STOP-AFTER

Posted by dbeavon on 25-Oct-2019 17:32

I'm seeing a misleading error message.  It claims that a lock-wait timeout expired when it did not.

(Lock wait timeout of 40 seconds expired (8812))

We start pasoe with -lkwtmo 40.  Consider that timeout in combination with the following program :

DEFINE VARIABLE m_Error AS CHARACTER NO-UNDO.

m_Error = "Unexpected error in " + PROGRAM-NAME(1) + " at line {&LINE-NUMBER}. ".

/* ************************************************************************ */
/* BlockForTransaction                                                      */
/* ************************************************************************ */
BlockForTransaction:
DO TRANSACTION

   STOP-AFTER 10
   
   ON STOP   UNDO BlockForTransaction,
      RETURN ERROR m_Error
   ON ERROR  UNDO BlockForTransaction,
      RETURN ERROR m_Error
   ON ENDKEY UNDO BlockForTransaction,
      RETURN ERROR m_Error:

   /* ********************************************************************* */
   /* Start contentious locking                                             */
   /* ********************************************************************* */
   BlockSqlTransLoop:
   FOR EACH   contentious_data
      WHERE   contentious_data.system  = p_System          
      EXCLUSIVE-LOCK
      ON STOP   UNDO BlockForTransaction,
         RETURN ERROR m_Error
      ON ERROR  UNDO BlockForTransaction,
         RETURN ERROR m_Error
      ON ENDKEY UNDO BlockForTransaction,
         RETURN ERROR m_Error:

 
   /* ********************************************************************* */
   /* End BlockSqlTransLoop                                                 */
   /* ********************************************************************* */
   END.
   
END.


   

If there is a long delay in retrieving a record lock, I get the message: Lock wait timeout of 40 seconds expired (8812)

But I get it after only ten seconds!  In effect, I am hitting the STOP-AFTER timeout ... but it is misrepresenting it as a lock-wait timeout with the full duration of the -lkwtmo parameter.  The PASOE agent logs look like so:

[19/10/25@13:18:47.447-0400] P-023260 T-025748 3 AS-24 AS -- TRACE: Internal Procedure 'FetchAsyncData' (in 'app/p/app0479.p') START (Proxy 1).
[19/10/25@13:18:47.591-0400] P-023260 T-025748 1 AS-24 APPL           \\Workspace\app\p\app0479.p 414    STARTING TEST 
[19/10/25@13:18:57.658-0400] P-023260 T-025748 1 AS-24 -- Lock wait timeout of 40 seconds expired (8812)
[19/10/25@13:18:57.658-0400] P-023260 T-025748 1 AS-24 -- ** ABL Debug-Alert Stack Trace **
[19/10/25@13:18:57.658-0400] P-023260 T-025748 1 AS-24 -- --> FetchAsyncDataTransaction app/p/app0479.p at line 617  (\\Workspace\app\p\app0479.p)
[19/10/25@13:18:57.658-0400] P-023260 T-025748 1 AS-24 --     FetchAsyncData app/p/app0479.p at line 507  (\\Workspace\app\p\app0479.p)
[19/10/25@13:18:57.661-0400] P-023260 T-025748 3 AS-24 AS -- TRACE: Internal Procedure (Proxy 1) END ERROR.

To make a long story short, it seems that the timeouts (both STOP-AFTER and lock-wait) are both in effect and working properly, but the resulting message in the agent log is confusing.  It did not take 40 seconds for the timeout to expire; and any person who is reading these logs would be very confused.

This agent log message was not reproducible in a _progres session for some reason.  It only happens for me in the context of the PASOE agent process on 11.7.5 windows.

Posted by Fernando Souza on 25-Oct-2019 19:04

There is something going wrong with the regular _progres client as you should see  the same behavior in either case. We will also look into the behavior you describe and figure out how to best address the confusion it can cause.

All Replies

Posted by Fernando Souza on 25-Oct-2019 19:04

There is something going wrong with the regular _progres client as you should see  the same behavior in either case. We will also look into the behavior you describe and figure out how to best address the confusion it can cause.

Posted by dbeavon on 25-Oct-2019 20:13

Thanks Fernando.  

When I first saw the message I assumed that my STOP-AFTER behavior was being "trumped" by the lock-wait.  But then I examined the logs more closely to see the actual amount of time that had elapsed.  Thankfully the actual timeout that was effective was the min of the two (stop-after and lock-wait).  I'd guess that the message is some sort of a default that is shown for any preemptive interruptions.

This thread is closed