In our QA environment we tried to upgrade from 11.6.3 to 11.7 (and to 11.7.3, the updated re-release) and we hit a wall.
Our .NET application uses state-free AppServers.
A couple of our scheduled async tasks started failing with the following protrace:
uttraceback : 0x0000001c
uttrace_withsigid : 0x00000150
utcoreEx : 0x00000130
drexit : 0x00000484
drSigFatal : 0x000000a8
fdGetTblPartLock : 0x0000010c
screadlock@AF51_24 : 0x00000194
scidxb : 0x00000124
proidxcache : 0x00000120
fdidxcache@AF202_40 : 0x00000044
scfild : 0x0000015c
IPRA.$rnreloc : 0x00001e08
rnproc_entry : 0x00001414
rninterpret : 0x00000064
IPRA.$rnudfunc_local : 0x000002d4
IPRA.$rnudfunc_Exec : 0x00000160
rnudfuncOO_Body : 0x0000022c
rnudfunc_run : 0x00000170
rnudfuncOOMethod : 0x00000158
fmoo4glRunMethod : 0x0000010c
fmoo4glGetAttr : 0x0000016c
IPRA.$rnwdatrEval : 0x000000b4
rnwdatrx : 0x00000034
fmEWDAX : 0x00000044
fmeval : 0x00000330
rnexpstmt : 0x00000230
rnexec_entry : 0x00000278
rninterpret : 0x00000064
cr_run_loaded : 0x00000028
IPRA.$execProc : 0x000005f8
IPRA.$execCall : 0x00000034
IPRA.$processRequest : 0x00000458
ReaderClose : 0x00000284
closeWrite : 0x0000001c
open4GLWriteLast : 0x00000098
IPRA.$ub_pushRq : 0x000004dc
ub_processRequest : 0x00000200
csd_dispatch_message : 0x00000530
IPRA.$do_serve_mainline : 0x00000308
IPRA.$do_serve : 0x00000038
classicapsv_main : 0x000000d0
main : 0x00000070
We also saw:
uttraceback : 0x0000001c
uttrace_withsigid : 0x00000150
utcoreEx : 0x00000130
drexit : 0x00000484
drSigFatal : 0x000000a8
fdseqval : 0x0000025c
fmENXTVAL : 0x0000004c
fmeval : 0x00000330
rnasgeasy : 0x00000268
rnexec_entry : 0x00000278
rninterpret : 0x00000064
cr_run_loaded : 0x00000028
IPRA.$execProc : 0x000005f8
IPRA.$execCall : 0x00000034
IPRA.$processRequest : 0x00000458
ReaderClose : 0x00000284
closeWrite : 0x0000001c
open4GLWriteLast : 0x00000098
IPRA.$ub_pushRq : 0x000004dc
ub_processRequest : 0x00000200
csd_dispatch_message : 0x00000530
IPRA.$do_serve_mainline : 0x00000308
IPRA.$do_serve : 0x00000038
classicapsv_main : 0x000000d0
main : 0x00000070
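For anyone comparing the two traces, here is a rough Python sketch (not a Progress tool, just a throwaway helper) that strips the frames both stacks share at the crash end and at the AppServer dispatch end, leaving only the part where they differ. The function names parse_frames, shared_len and diff_stacks are made up for this post, and the sample data is cut down from the traces above (TRACE_A skips several frames) to keep it short.

```python
import re

# Matches protrace frame lines of the form "function_name : 0x0000001c".
FRAME_RE = re.compile(r"^\s*(\S+)\s*:\s*(0x[0-9a-fA-F]+)\s*$")


def parse_frames(text):
    """Return the function names from a protrace stack, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
    return frames


def shared_len(a, b):
    """Count how many leading items two lists have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def diff_stacks(a, b):
    """Trim the shared innermost and outermost frames; return the differing middles."""
    head = shared_len(a, b)
    rest_a, rest_b = a[head:], b[head:]
    tail = shared_len(rest_a[::-1], rest_b[::-1])
    return rest_a[:len(rest_a) - tail], rest_b[:len(rest_b) - tail]


# Abridged from the first protrace (frames between fdGetTblPartLock and fmeval omitted).
TRACE_A = """
drSigFatal : 0x000000a8
fdGetTblPartLock : 0x0000010c
fmeval : 0x00000330
rnexpstmt : 0x00000230
rnexec_entry : 0x00000278
rninterpret : 0x00000064
"""

# Contiguous excerpt from the second protrace.
TRACE_B = """
drSigFatal : 0x000000a8
fdseqval : 0x0000025c
fmENXTVAL : 0x0000004c
fmeval : 0x00000330
rnasgeasy : 0x00000268
rnexec_entry : 0x00000278
rninterpret : 0x00000064
"""

mid_a, mid_b = diff_stacks(parse_frames(TRACE_A), parse_frames(TRACE_B))
print("only in first trace: ", mid_a)
print("only in second trace:", mid_b)
```

Run against the full traces, this leaves fdGetTblPartLock through rnexpstmt for the first crash and fdseqval through rnasgeasy for the second. Everything from rnexec_entry down to main, and from drSigFatal up to uttraceback, is identical in both.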
We are able to reproduce the error by re-running the programs. I have a case open with TS. If we switch the QA environment back to 11.6.3, it works without issues.
Running the same program directly (using mpro) instead of going through AppServer calls seems to work 99% of the time.
Adding some debug message statements before and after the procedure, to try to figure out which records it is stuck on, seems to make the procedure run successfully via AppServer.
Has anyone run into this? Any ideas? I would like to get over to 11.7.3 to make use of the new logging parameters, but this is blocking us. Our multi-tenant databases are running on AIX 7.1.
It seems like an odd lock timing bug, given that the procedure's success rate increases simply by adding debug message statements, even though this was run against a QA environment that has no activity.
I was also able to reproduce this in 11.7 (not just 11.7.3). Something changed between 11.6.3 and 11.7 that is causing our AppServers to crash with memory violations when running programs that previously worked.
Progress Case #00448724