Can proGetStack terminate an OpenEdge 11.6 batch process on Linux RHEL 7.3?
I'm asking the question because I just had a batch process that terminated shortly after issuing proGetStack.
I was trying to get info on the process because it was no longer doing what it is supposed to do.
The clientlog mentioned in the command line doesn't exist; I guess it got deleted by a temp file cleanup routine.
cat protrace.11399
PROGRESS stack trace as of Wed Jan 04 13:49:38 2017
Progress OpenEdge Release 11.6 build 1346 SP02 on Linux cce-dev01 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016
Command line arguments are
/usr/dlc/bin/_progres -pf cfg/lisa.common.pf -pf cfg/lisa.sm.pf -U mqcons -P -db /usr2/cce/opvolg1600/data/MQDB -T /appltmp/opvolg1600 -b -p ccetools/batch-b.p -param APPL=1,PROG=ccetools/mqcons.p,USER=ccesync,TRACE=no,ISDEV=yes -logentrytypes 4GLTRACE -clientlog /appltmp/opvolg1600/mqcons_prod.4gl -logthreshold 20000000 -numlogfiles 3
Startup parameters:
-pf /usr/dlc/startup.pf,-cpinternal utf-8,-cpstream utf-8,-cpcoll Basic,-cpcase Basic,-d dmy,-numsep 46,-numdec 44,-T /tmp,-Mm 10240,(end .pf),-pf cfg/lisa.common.pf,-yy 1960,-d dmy,-E,-useOsLocale,-inp 40960,-tok 8192,-noinactiveidx,-NL,-s 500,-l 750,-Mm 10240,-mmax 65000,-Bt 20000,-cpinternal utf-8,-cpstream utf-8,-cprcodeout undefined,-rereadnolock,-rand 2,-h 10,-errorstack,(end .pf),-pf cfg/lisa.sm.pf,-db ./data/LISA,(end .pf),-U mqcons,-P ******,-db /usr2/cce/opvolg1600/data/MQDB,-T /appltmp/opvolg1600,-b,-p ccetools/batch-b.p,-param APPL=1,PROG=ccetools/mqcons.p,USER=ccesync,TRACE=no,ISDEV=yes,-logentrytypes 4GLTRACE,-clientlog /appltmp/opvolg1600/mqcons_prod.4gl,-logthreshold 20000000,-numlogfiles 3
Log file entries for the process:
grep P-11399 LISA.lg
[2016/12/14@07:08:34.165+0100] P-11399 T-140511261087552 I ABL 74: (452) Login by proadmin on batch.
[2016/12/14@07:08:34.174+0100] P-11399 T-140511261087552 I ABL 74: (7129) Usr 74 set name to mqcons.
[2016/12/19@10:30:03.050+0100] P-11399 T-140511261087552 I ABL 74: (5410) WARNING: -D limit has been exceeded; automatically increasing to 150.
[2017/01/04@13:49:52.272+0100] P-11399 T-140511261087552 I ABL 74: (453) Logout by mqcons on batch.
proGetStack should not terminate a process... at least it is not intended to do so. The SIGUSR1 signal handler does not try to terminate the AVM. There have been situations where generating the protrace information stumbles over a memory stomp or some other unexpected circumstance, which would then result in a segfault, thus terminating the process.
Is this the entire content of the protrace file? There is no C stack trace, which might suggest the AVM crashed or terminated abnormally while trying to generate the stack trace. But the db log suggests the process disconnected from the database gracefully.
It is the entire protrace. I was hoping for an ABL stack trace.
The process is a batch process that reads from an ActiveMQ using ABL stomp client and writes records to the database.
Normally this process runs as long as the db is up & running, automatically reconnecting to the MQ if that disappears.
I'm trying to identify why it stops working after a few weeks.
I found that the following procedure terminates after a proGetStack:
_progres -b -p batch.p > batch.log &
/* batch.p */
message "start" now.
etime(yes).
pause 3600.
message "end" now "after" etime "ms".
quit.
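To make the reproduction repeatable, a small helper can check whether the batch process is still alive after the trace request. This is only a sketch: the function name `survives` and the 2-second wait are illustrative choices, and it assumes proGetStack is on the PATH and takes the PID as its only argument.

```shell
# Report whether a process is still alive after running a command against it.
# $1 = PID to watch, remaining arguments = command to run.
# Returns 0 if the process is still there, 1 if it is gone,
# and 2 if the command itself failed.
survives() {
    pid=$1; shift
    "$@" || return 2             # the signalling command itself failed
    sleep 2                      # arbitrary delay so the target can react
    kill -0 "$pid" 2>/dev/null   # signal 0 only tests for existence
}

# Hypothetical usage against the batch.p example:
#   _progres -b -p batch.p > batch.log &
#   pid=$!
#   survives "$pid" proGetStack "$pid" && echo "still running" || echo "gone"
```

If the process is reported as gone, batch.log should show whether the "end" message was written, which distinguishes a PAUSE that ended early from a crash.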
Interesting. proGetStack terminates the PAUSE statement. I did not know that. Looking at the implementation, I can see why it happens. I believe it is not intended behaviour, AFAIK proGetStack should not affect the operation of the AVM. I would suggest logging this with TS, so it can be evaluated by development.
FWIW, when I tested this, I did get C and ABL stack traces. But that is a different issue.
I also get a normal stack trace in the protrace with the sample code. It's only the IRL example that didn't produce a stack trace.
I created case 00381001 for the pause.
Next time I have this issue I'll try to locate the clientlog file before I run proGetStack.
Carl,
Not to jump in on your thread, but I noticed you said you were using ActiveMQ with an ABL stomp client. We have been using Sonic for years and we are looking at moving to ActiveMQ. Is there a reason you used a stomp client instead of the Sonic 4GL adapter? Did you write this ABL stomp client yourself or is it from bitbucket.org/.../overview ?
Thanks.
Free ActiveMQ was perfectly adequate for our needs; Sonic was too expensive when we looked at it.
The stomp client in this case is a descendant of www.oehive.org/.../1204 but we use Julian's client as well.
If I needed to enhance our stomp client, I'd rather put effort into Julian's client than further develop our own.
Thanks for the info.
We're moving from Sonic to Connectplaza
Tech support logged defect PSC00353478 for the fact that a KILL -1 signal (which is what proGetStack sends on Linux/Unix) causes a PAUSE <n> statement to end prematurely.
Something is not correct. It may just be a typo but proGetStack sends SIGUSR1, not -1. The numeric value of SIGUSR1 varies by platform ("kill -l" (lower case "ell") will list the local names and values) but signal 1 is always SIGHUP. If you sent SIGHUP then terminating a PAUSE would be normal and expected. If you sent SIGUSR1 you should get a stack trace.
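As a quick way to check the numbers on a given machine, the shell's own `kill -l` can translate a signal number into its name. The helper name `signame` below is just for illustration, and the value shown for signal 10 is what Linux x86_64 uses; other platforms may map it differently.

```shell
# Print the name of a signal number as this platform defines it.
signame() { kill -l "$1"; }

signame 1     # HUP on every POSIX system
signame 10    # USR1 on Linux x86_64; other platforms may differ
```

Sending the signal by name (kill -USR1 <pid>) avoids depending on the platform-specific number.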
I agree, it should state kill -10 or kill -USR1.
I copied it from the tech support reply; the same typo is in knowledgebase.progress.com/.../proGetStack-causes-AVM-to-end-PAUSE-n-pause
I'll reopen the TS case to get it fixed.