Debugging Appservers

Posted by Nigel Allen on 25-May-2015 18:22


Working at a clients whose system (Solaris / 10.2A02 - yes, I know) looks like this:

dot net front end -- external appserver (state-free) -- internal appserver (stateless) -- database

Everything was working (quite) well until a week ago when we noticed that asbman -q -i internal was reporting Active Clients as 511/now and 512/peak and eventually nothing would connect to the database.

Restarted the internal appserver and all became calm again. We have noticed though that on a daily basis (we are currently restarting the internal nightly) Active climbs to between 20+ to 40+. Of the 35 available servers only a few are ever busy sending. As a perfect example, when we saw the 511/512 figures, all 35 servers were running "Available".

We started to debug the situation by running asbman -listclients, seeing that in this case it was conn handle 748 and feeding that in turn to "asbman -q -i internal -clientdetail 748". This gave us the agent pid which we used to send a SIGUSR2 to the process. Generated a protrace which had an empty ABL stack :(

We then grabbed the connection id and started to search through the broker and server logs.

A relevant few lines are these:

[15/05/25@10:01:31.234+1000] P-013643 T-000001 2 AS AS Application Server connected with connection id: (8358)
[15/05/25@10:01:31.234+1000] P-013643 T-000001 3 AS AS ASK Protocol is disabled.
[15/05/25@10:01:31.234+1000] P-013643 T-000001 3 AS AS CSS_CONN before ACK, connection= bound=0 changed=0
[15/05/25@10:01:31.236+1000] P-013810 T-000001 3 AS AS requestID= 0a356d900000023199143204602200003521580000021885
[15/05/25@10:01:31.236+1000] P-013810 T-000001 2 AS AS -- TRACE: Non-PERSISTENT Procedure 'eris/internal/ReceiptPayment.p' START. (5498)
[15/05/25@10:01:31.245+1000] P-013810 T-000001 2 AS 4GLTRACE Run CheckDiary in me/meperprc.p [Main Block - eris/internal/ReceiptPayment.p @ 775]
[15/05/25@10:01:31.245+1000] P-013810 T-000001 3 AS 4GLTRACE Return from CheckDiary [me/meperprc.p]

What I can't understand is that given the connection id I'm following is the one that ends with -7e7", it seems to disappear almost immediately.

I'm just about at my limit of appserver debugging - can anyone make any suggestions as to where I should go from here or even better point me at some half-decent documentation that might help?



Added to this, can anyone advise what the following lines mean? Repeated.

[15/05/25@10:03:26.015+1000] P-013810 T-000001 2 AS AS Application Server connected with connection id: (8358)
[15/05/25@10:03:26.015+1000] P-013810 T-000001 3 AS AS ASK Protocol is disabled.
[15/05/25@10:03:26.015+1000] P-013810 T-000001 3 AS AS CSS_CONN before ACK, connection= bound=0 changed=0



All Replies

Posted by Garry Hall on 01-Jun-2015 09:36

As this request is growing mouldy, I'll have a stab at it.

The external appserver, although it is state-free, is still an ABL client making a connection to the internal appserver and does not seem to be disconnecting. I believe the state-free agent will not automatically disconnect any appserver connections it makes. Thus the number of clients connected to the internal appserver are accumulating. This tends to match your analysis that the external agent was doing nothing (no ABL stack, waiting for a request), but still had the connection to the internal appserver. There should be a message indicating the client (external appserver) disconnected, with that same connection id (message 8359). I would assume there might be some codepath through the external appserver that does not disconnect connections that are made.

The (8358) message is logging to indicate a connect request was being processed. The "CSS_CONN before ACK" message is an internal debugging message, indicating that the agent received a connect message, and it is about to send an acknowledgement. "ASK protocol is disabled" is an internal diagnostic showing whether the AppServer Keep-Alive functionality has been turned on.

This thread is closed