OpenEdge 11.3.2
QAD2014EE
I have a problem with an app server agent using all of the CPU (user time specifically, not sys).
Running strace on the PID shows it repeatedly polling fd 137 for an event:
poll([{fd=137, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
poll([{fd=137, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
poll([{fd=137, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
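The last argument of those poll calls is the timeout in milliseconds, and here it is 0, so each call returns immediately instead of sleeping. That would fit with the agent burning user CPU in a tight loop. A minimal sketch of the difference, using a local socket pair as a stand-in for fd 137 (the function name and the durations are made up for illustration; this is not how the agent itself is implemented):

```python
import select
import socket
import time

def count_polls(timeout_ms, duration_s=0.2):
    """Poll an idle socket in a loop for duration_s seconds; return the number of calls."""
    a, b = socket.socketpair()          # stand-in for the agent/broker socket
    poller = select.poll()
    poller.register(a.fileno(), select.POLLIN | select.POLLPRI)
    calls = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            poller.poll(timeout_ms)     # returns [] on timeout, like "= 0 (Timeout)"
            calls += 1
    finally:
        a.close()
        b.close()
    return calls

if __name__ == "__main__":
    print("timeout 0 ms: ", count_polls(0))    # spins, many calls
    print("timeout 50 ms:", count_polls(50))   # sleeps in the kernel, few calls
```

With a zero timeout the loop count is limited only by CPU speed; with a non-zero timeout the process sleeps in the kernel between calls.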
Looking at the /proc file system tells me fd 137 is a socket.
lrwx------ 1 root root 64 Apr 13 08:06 137 -> socket:[134684538]
lsof and fuser let me track down the other end of the socket.
In this case it is the broker: its PID matches the Broker PID field displayed by asbman -query.
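For anyone without lsof or fuser to hand, the same lookup can be sketched from /proc alone. This assumes the socket is IPv4 TCP (nothing above confirms that; a Unix socket would be listed in /proc/net/unix instead), and the function names are hypothetical:

```python
import re
import socket
import struct

def parse_socket_link(target):
    """Extract the inode from a /proc/<pid>/fd link target such as 'socket:[134684538]'."""
    m = re.fullmatch(r"socket:\[(\d+)\]", target)
    return int(m.group(1)) if m else None

def decode_ipv4(hex_addr):
    """Decode an 'ADDR:PORT' field from /proc/net/tcp into (ip, port)."""
    addr_hex, port_hex = hex_addr.split(":")
    # The kernel prints the address as a native-endian 32-bit value.
    ip = socket.inet_ntoa(struct.pack("=I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)

def find_tcp_endpoints(inode, table="/proc/net/tcp"):
    """Return ((local_ip, port), (remote_ip, port)) for the socket inode, or None."""
    with open(table) as f:
        next(f)                          # skip the header row
        for line in f:
            cols = line.split()
            if int(cols[9]) == inode:    # column 9 is the inode
                return decode_ipv4(cols[1]), decode_ipv4(cols[2])
    return None
```

Feeding it the result of os.readlink("/proc/<agent pid>/fd/137") gives the local and remote address of the connection; the remote port can then be matched against the listening ports of the broker.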
So what is the broker doing? Running strace on that PID:
It appears that it too is waiting, in this case on a futex:
[root@auco-rh2 fd]# strace -p 32593
Process 32593 attached
futex(0x7f4aec5a19d0, FUTEX_WAIT, 32594, NULL
Unlike the poll strace which just keeps printing poll, the strace on the broker just prints out futex once and nothing more.
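One possible reading, which is an assumption and not something the output confirms: strace -p without -f attaches to a single thread, and a FUTEX_WAIT with a NULL timeout simply blocks, so a single line of output is what an idle main thread would look like. The value 32594 is the PID plus one, which looks like a thread ID, so the main thread may just be waiting for another thread that does the real work. A rough sketch for listing every thread of a process with its state and accumulated CPU ticks, to see whether any of them is busy (function names are hypothetical; field positions follow proc(5)):

```python
import os

def parse_stat(stat_text):
    """Return (comm, state, cpu_ticks) from the text of /proc/<pid>/task/<tid>/stat."""
    # comm sits in parentheses and may contain spaces or ')', so split on the last ')'.
    head, _, tail = stat_text.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    state = fields[0]                                  # field 3: R, S, D, ...
    utime, stime = int(fields[11]), int(fields[12])    # fields 14 and 15
    return comm, state, utime + stime

def list_threads(pid):
    """Return [(tid, comm, state, cpu_ticks)] for a process, busiest thread first."""
    rows = []
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                comm, state, ticks = parse_stat(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue                                   # thread exited while we were reading
        rows.append((int(tid), comm, state, ticks))
    return sorted(rows, key=lambda r: r[3], reverse=True)

if __name__ == "__main__":
    for row in list_threads(os.getpid()):
        print(row)
```

Running list_threads twice a few seconds apart on the broker PID and comparing the tick counts would show which threads, if any, are consuming CPU. strace -f -p on the same PID would then follow all of them.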
Nothing appears abnormal in the broker.log.
Has anyone encountered this type of issue before, i.e. app server agents taking all the CPU?
Any ideas on how to further narrow down what is happening?
Cheers
In OE 11.3 you should be able to change the logging level of the broker and app server dynamically. You might try increasing the logging to verbose to see if that shows anything. Don't set it to extended on a prod system.
What are the app server agents doing in the database?
Are you able to kill each agent and allow the broker to restart them without impacting the application?
Look in $WRKDIR to make sure there are no java error logs and also check the admin server, app server and broker logs.
Hi,
I have a similar case with 11.7.5 on Red Hat Linux.
Debugging with strace shows that the "dbman" command freezes on "futex(0x7f278b3829d0, FUTEX_WAIT".
It seems to be Java related.
Is there a known reason why this happens?
I will do more debugging if it happens again.
/Fredrik