Rogue App Server Agents - 100% CPU

Posted by craig love on 12-Apr-2017 20:08

OpenEdge 11.3.2

QAD2014EE

I have a problem with an app server agent using all the CPU (user space specifically, not sys).
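The user/sys split for a single process can be checked from the counters in /proc/<pid>/stat. What follows is only a minimal Python sketch, assuming a Linux host; the function names parse_cpu_ticks and cpu_ticks are made up for illustration and do not come from OpenEdge.

```python
import os

def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces
    # or ')' itself, so cut at the LAST ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    return int(rest[11]), int(rest[12])

def cpu_ticks(pid):
    """Read the user and system tick counters for a live process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())

def ticks_to_seconds(ticks):
    """Convert clock ticks to seconds using the kernel's tick rate."""
    return ticks / os.sysconf("SC_CLK_TCK")
```

Calling cpu_ticks(pid) twice a few seconds apart and comparing the deltas shows whether the time is accumulating in user mode or in the kernel.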

Running strace on the PID shows it polling for an event on fd 137:

poll([{fd=137, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
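A note on reading that trace: the last argument to poll is the timeout in milliseconds, and 0 means "return immediately". The result "= 0 (Timeout)" means no descriptor was ready. A stream of such calls therefore looks like a busy loop rather than a blocking wait. The Python sketch below reproduces the pattern on an idle socket pair; count_polls is a hypothetical helper and nothing here is OpenEdge code.

```python
import select
import socket
import time

def count_polls(seconds=0.05, timeout_ms=0):
    """Poll an idle socket in a loop and count the calls that return empty."""
    a, b = socket.socketpair()  # connected pair; nothing is ever written
    try:
        poller = select.poll()
        poller.register(a, select.POLLIN | select.POLLPRI)
        calls = 0
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            # With timeout 0 the call does not block: it returns [] at once
            # when no data is pending, so the loop spins at full speed.
            if not poller.poll(timeout_ms):
                calls += 1
        return calls
    finally:
        a.close()
        b.close()
```

With timeout_ms=0 the loop completes far more iterations in the same interval than with a non-zero timeout, which is the difference between spinning and sleeping in the kernel.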

Looking at the /proc file system tells me fd 137 is a socket.

lrwx------ 1 root root 64 Apr 13 08:06 137 -> socket:[134684538]
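The number in brackets is the socket's inode, which can be matched against the kernel's socket tables. The sketch below assumes the socket is IPv4 TCP and that the host is little-endian; a Unix-domain or IPv6 socket would be listed in /proc/net/unix or /proc/net/tcp6 instead. All function names are illustrative.

```python
import re
import socket
import struct

def socket_inode(link_target):
    """Extract the inode from an fd link such as 'socket:[134684538]'."""
    m = re.fullmatch(r"socket:\[(\d+)\]", link_target)
    return int(m.group(1)) if m else None

def decode_addr(hex_addr):
    """Decode 'IIIIIIII:PPPP' as used in /proc/net/tcp (IPv4 only)."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, so the address bytes are reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)

def find_tcp_peer(inode, table_lines):
    """Return (local, remote) for the /proc/net/tcp row owning inode."""
    for line in table_lines[1:]:  # first line is the column header
        fields = line.split()
        # Column 10 (index 9) of each row holds the socket inode.
        if len(fields) > 9 and int(fields[9]) == inode:
            return decode_addr(fields[1]), decode_addr(fields[2])
    return None
```

In practice the inputs would come from os.readlink("/proc/<pid>/fd/137") and open("/proc/net/tcp").readlines(); the remote address and port can then be matched to the owning process with lsof.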

lsof and fuser let me track down the other end of the socket.

In this case it is the broker (the PID shown in the Broker PID field of asbman -query).

So what is the broker doing? Running strace on that PID suggests that it too is waiting, this time on a futex:

[root@auco-rh2 fd]# strace -p 32593

Process 32593 attached

futex(0x7f4aec5a19d0, FUTEX_WAIT, 32594, NULL

Unlike the strace of the agent, which keeps printing poll calls, the strace of the broker prints the futex call once and nothing more.
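One possible reading, which is an assumption and not something confirmed in this thread: strace -p attaches only to the thread whose ID is given, and a FUTEX_WAIT with a NULL timeout blocks until woken, so it is printed once and never completes. If the broker is multi-threaded, its other threads may still be active; strace -f -p <pid> would attach to all of them. The Python sketch below lists a process's threads and their states from /proc; list_threads is a hypothetical helper.

```python
import os

def list_threads(pid):
    """Return [(tid, name, state)] for every thread of pid, read from /proc."""
    threads = []
    for tid in sorted(int(t) for t in os.listdir(f"/proc/{pid}/task")):
        try:
            with open(f"/proc/{pid}/task/{tid}/status") as f:
                # Each line looks like "Key:\tvalue"; keep the first colon only.
                info = dict(line.split(":", 1) for line in f if ":" in line)
        except OSError:
            continue  # the thread exited while we were scanning
        threads.append((tid, info["Name"].strip(), info["State"].strip()))
    return threads
```

A thread shown as "R (running)" across repeated samples is a better strace target than the main thread, which may simply be parked waiting for the others.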

Nothing appears abnormal in the broker.log.

Has anyone encountered this type of issue before, i.e. app server agents taking all the CPU?

Any ideas on how to further narrow down what is happening?

Cheers

All Replies

Posted by cjbrandt on 12-Apr-2017 22:09

In OE 11.3 you should be able to dynamically change the logging level of the broker and app server. You might try increasing the logging level to verbose to see if that shows anything. Don't set it to extended on a production system.

What are the app server agents doing in the database?

Are you able to kill each agent and allow the broker to restart them without impacting the application?

Look in $WRKDIR to make sure there are no java error logs and also check the admin server, app server and broker logs.

Posted by fredrik.richtner on 11-Mar-2020 11:32

Hi,

I have a similar case with 11.7.5 on Red Hat Linux.

Debugging with strace shows that the "dbman" command hangs on "futex(0x7f278b3829d0, FUTEX_WAIT".

Seems to be Java related.

Is there a known reason why this happens?

I will do more debugging if it happens again.

/Fredrik

This thread is closed