Can proGetStack terminate an OpenEdge batch process on Linux?

Posted by cverbiest on 04-Jan-2017 07:14

Can proGetStack terminate an OpenEdge 11.6 batch process on Linux (RHEL 7.3)?

I'm asking the question because I just had a batch process that terminated shortly after issuing proGetStack.

I was trying to get info on the process because it was no longer doing what it is supposed to do.
The clientlog mentioned in the command line doesn't exist; I guess it got deleted by a temp file cleanup routine.

cat protrace.11399

PROGRESS stack trace as of Wed Jan 04 13:49:38 2017
Progress OpenEdge Release 11.6 build 1346 SP02 on Linux cce-dev01 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016

Command line arguments are
/usr/dlc/bin/_progres -pf cfg/lisa.common.pf -pf cfg/lisa.sm.pf -U mqcons -P -db /usr2/cce/opvolg1600/data/MQDB -T /appltmp/opvolg1600 -b -p ccetools/batch-b.p -param APPL=1,PROG=ccetools/mqcons.p,USER=ccesync,TRACE=no,ISDEV=yes -logentrytypes 4GLTRACE -clientlog /appltmp/opvolg1600/mqcons_prod.4gl -logthreshold 20000000 -numlogfiles 3

Startup parameters:
-pf /usr/dlc/startup.pf,-cpinternal utf-8,-cpstream utf-8,-cpcoll Basic,-cpcase Basic,-d dmy,-numsep 46,-numdec 44,-T /tmp,-Mm 10240,(end .pf),-pf cfg/lisa.common.pf,-yy 1960,-d dmy,-E,-useOsLocale,-inp 40960,-tok 8192,-noinactiveidx,-NL,-s 500,-l 750,-Mm 10240,-mmax 65000,-Bt 20000,-cpinternal utf-8,-cpstream utf-8,-cprcodeout undefined,-rereadnolock,-rand 2,-h 10,-errorstack,(end .pf),-pf cfg/lisa.sm.pf,-db ./data/LISA,(end .pf),-U mqcons,-P ******,-db /usr2/cce/opvolg1600/data/MQDB,-T /appltmp/opvolg1600,-b,-p ccetools/batch-b.p,-param APPL=1,PROG=ccetools/mqcons.p,USER=ccesync,TRACE=no,ISDEV=yes,-logentrytypes 4GLTRACE,-clientlog /appltmp/opvolg1600/mqcons_prod.4gl,-logthreshold 20000000,-numlogfiles 3

Log file entries for the process:

grep P-11399 LISA.lg
[2016/12/14@07:08:34.165+0100] P-11399      T-140511261087552 I ABL    74: (452)   Login by proadmin on batch.
[2016/12/14@07:08:34.174+0100] P-11399      T-140511261087552 I ABL    74: (7129)  Usr 74 set name to mqcons.
[2016/12/19@10:30:03.050+0100] P-11399      T-140511261087552 I ABL    74: (5410)  WARNING: -D limit has been exceeded; automatically increasing to 150.
[2017/01/04@13:49:52.272+0100] P-11399      T-140511261087552 I ABL    74: (453)   Logout by mqcons on batch.

All Replies

Posted by Garry Hall on 04-Jan-2017 07:30

proGetStack should not terminate a process... at least it is not intended to do so. The SIGUSR1 signal handler does not try to terminate the AVM. There have been situations where generating the protrace information stumbles over a memory stomp or some other unexpected circumstance, which would then result in a segfault, thus terminating the process.

Is this the entire content of the protrace file? There is no C stack trace, which might suggest the AVM crashed or terminated abnormally while trying to generate the stack trace. But the db log suggests the process disconnected from the database gracefully.

Posted by cverbiest on 04-Jan-2017 08:09

It is the entire protrace. I was hoping for an ABL stack trace.

The process is a batch process that reads from ActiveMQ using an ABL stomp client and writes records to the database.

Normally this process runs as long as the db is up & running, automatically reconnecting to the MQ if that disappears.

I'm trying to identify why it stops working after a few weeks.

I found that the following procedure terminates after a proGetStack:

_progres -b -p batch.p > batch.log &

/* batch.p */
message "start" now.
etime(yes).
pause 3600.
message "end" now "after" etime "ms".
quit.

Posted by Garry Hall on 04-Jan-2017 08:23

Interesting. proGetStack terminates the PAUSE statement. I did not know that. Looking at the implementation, I can see why it happens. I believe it is not intended behaviour, AFAIK proGetStack should not affect the operation of the AVM. I would suggest logging this with TS, so it can be evaluated by development.

FWIW, when I tested this, I did get C and ABL stack traces. But that is a different issue.

Posted by cverbiest on 04-Jan-2017 08:32

I also get a normal stack trace in the protrace with the sample code. It's only the real-life example that didn't produce a stack trace.

I created case 00381001 for the pause.

Next time I have this issue, I'll try to locate the clientlog file before I run proGetStack.

Posted by Roger Blanchard on 04-Jan-2017 09:00

Carl,

Not to jump in on your thread but I noticed you said you were using ActiveMQ with an ABL stomp client. We have been using Sonic for years and we are looking at moving to ActiveMQ. Is there a reason you used a stomp client instead of the sonic 4gl adapter? Did you write this ABL stomp client yourself or is it from bitbucket.org/.../overview ?

Thanks.

Posted by cverbiest on 04-Jan-2017 09:56

Free ActiveMQ was perfectly adequate for our needs; Sonic was too expensive when we looked at it.

The stomp client in this case is a descendant of www.oehive.org/.../1204 but we use Julian's client as well.

If I needed to enhance our stomp client, I'd rather put effort into Julian's client than further develop our own.

Posted by Roger Blanchard on 04-Jan-2017 09:58

Thanks for the info.

Posted by gdb390 on 04-Jan-2017 09:59

We're moving from Sonic to Connectplaza.

Posted by cverbiest on 05-Jan-2017 03:59

Tech support logged defect PSC00353478 for the fact that a KILL -1 signal (which is what proGetStack sends on Linux/Unix) causes a PAUSE <n> statement to end prematurely.
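
The general effect described in the defect, a handled signal cutting a timed wait short, can be reproduced in a plain shell. This is only a rough analogy, assuming a POSIX sh: it does not use OpenEdge and says nothing about how the AVM implements PAUSE. The script and variable names (demo, waiter, handled) are made up for illustration.

```shell
# Analogy only: a POSIX shell standing in for a process that is waiting
# on a timer when a handled SIGUSR1 arrives. Not OpenEdge code.
demo=$(mktemp)
cat > "$demo" <<'EOF'
handled=no
trap 'handled=yes' USR1                       # handler that records the signal and does not exit

sleep 20 >/dev/null 2>&1 &                    # stands in for the long timed wait
waiter=$!
( sleep 1; kill -USR1 $$ ) >/dev/null 2>&1 &  # stands in for the tool sending SIGUSR1

begin=$(date +%s)
wait "$waiter"                                # returns as soon as the trapped signal arrives
wait_status=$?
elapsed=$(( $(date +%s) - begin ))
kill "$waiter" 2>/dev/null                    # clean up the background sleep

early=no;  [ "$elapsed" -lt 20 ] && early=yes
status_gt_128=no;  [ "$wait_status" -gt 128 ] && status_gt_128=yes
echo "handled=$handled early=$early status_gt_128=$status_gt_128"
EOF
demo_output=$(sh "$demo")
rm -f "$demo"
echo "$demo_output"
```

The wait comes back after about a second instead of twenty, even though the handler itself never asks the shell to stop waiting. Whether something similar happens inside the AVM is for development to confirm.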

Posted by ChUIMonster on 06-Jan-2017 05:29

Something is not correct.  It may just be a typo but proGetStack sends SIGUSR1, not -1.  The numeric value of SIGUSR1 varies by platform ("kill -l" (lower case "ell") will list the local names and values) but signal 1 is always SIGHUP.  If you sent SIGHUP then terminating a PAUSE would be normal and expected.  If you sent SIGUSR1 you should get a stack trace.
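
The local name and number mapping can be checked directly. A minimal sketch, assuming a POSIX shell whose kill accepts "-l <number>"; the variable names are illustrative only.

```shell
# Signal 1 should be reported as HUP.
sig1_name=$(kill -l 1)

# Find the local number of SIGUSR1 by asking for the name of each candidate number.
usr1_number=
n=1
while [ "$n" -le 31 ]; do
    if [ "$(kill -l "$n" 2>/dev/null)" = "USR1" ]; then
        usr1_number=$n
    fi
    n=$((n + 1))
done

echo "signal 1 is SIG$sig1_name; SIGUSR1 is signal $usr1_number on this machine"
```

Sending the signal by name (kill -USR1 <pid>) avoids depending on the platform-specific number.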

Posted by cverbiest on 06-Jan-2017 06:55

I agree, it should state kill -10 or kill -USR1.

I copied it from the tech support reply; the same typo is in knowledgebase.progress.com/.../proGetStack-causes-AVM-to-end-PAUSE-n-pause

I'll reopen the TS case to get it fixed.

Posted by Brian K. Maher on 06-Jan-2017 06:58

No need to reopen the case. I’ll take care of it.

Posted by Brian K. Maher on 06-Jan-2017 07:00

KB article has been changed to “kill -USR1”.

This thread is closed