Currently I'm forced to bounce PASOE a few times a day. This is the instance of PASOE that is installed on my local workstation with PDSOE.
What happens is that after a while of using it, the connections or sessions seem to be "leaked". I'm not sure which connections or sessions I'm dealing with, because every screen I have access to shows one or fewer.
Below is the exception I get from .Net:
Progress.Open4GL.Exceptions.RunTime4GLErrorException
HResult=0x80131500
Message=ERROR condition: No connections available to process request. (7211)
Source=Progress.o4glrt
StackTrace:
at Progress.Open4GL.Proxy.Procedure.RunPersistentProcedure(String requestID, String procName, ParameterSet params_Renamed, MetaSchema schema, Int32 stateModel)
at Progress.Open4GL.Proxy.Procedure.RunPersistentProcedure(String procName, ParameterSet params_Renamed)
==== log (nothing in abl ms-agent log, only in the session mgr log):
14:07:21.067/4003832 [catalina-exec-5] ERROR com.progress.appserv.Session - LocalSession(97AEA49D755BD74E1BC5A0D28A5D4E76284608D618A7.oepas1) : timeout error occurred while reserving a connection = com.progress.appserv.broker.exception.BrokerException$NoAvailableConnectionsException: Agent:No Available Connections[cannot reserve connection]:Agent. (18298)
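Since the only trace of the failure is in the session manager log, it may help to pull out when those errors start, so they can be lined up against client activity. The following is only a rough sketch: the function names are made up for illustration, the log file path is whatever your instance uses (it is passed in as an argument, nothing is assumed about its name), and it simply matches the exception class name that appears in the log line above.

```shell
# Sketch: find "No Available Connections" errors in a session manager log.
# Function names are hypothetical; the log path is supplied by the caller.

# Print the first whitespace-separated field (the time stamp and offset,
# e.g. 14:07:21.067/4003832) of every line that reports the exception.
no_connection_error_times() {
    local logfile="$1"
    grep -F 'NoAvailableConnectionsException' "$logfile" | awk '{ print $1 }'
}

# Print how many lines in the log report the exception.
no_connection_error_count() {
    local logfile="$1"
    grep -cF 'NoAvailableConnectionsException' "$logfile"
}
```

If the first time stamp printed is well before the point where clients started failing consistently, that would suggest the connections were used up gradually rather than all at once.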
It would be really nice to understand what exactly is being leaked. In OEE, there is often nothing shown under the "connections", "sessions", or "agents" for the abl application. On rare occasion I will see only one connection, but in the normal case I do not.
In the tomcat manager there is also almost nothing of interest (screenshot). In the current scenario, it was "dev_lumbertrack_dkb" that was refusing connections.
My hope is to understand what is going wrong so that if this ever happens in production we will know how to handle it. We will have a number of different abl apps and web apps running in the same tomcat instance so it will be extremely disruptive to just stop and restart the entire instance.
I've tried reading through the available docs but cannot find any clues about how to begin troubleshooting whichever component has leaked connections (e.g. https://knowledgebase.progress.com/servlet/fileField?id=0BEa0000000TisZ)
Here are some things I've tried, none of which will free up the APSV connections for use by my .Net openclient clients.
*** Kill all ms-agents using oemanager:
$OUT = Invoke-RestMethod -Method Delete -Uri ('http://localhost:' + $portnumber.ToString() + '/oemanager/applications/' + $ablapp + '/agents/' + $agent.agentId) -Credential $cred -ContentType "application/vnd.progress+json"
*** stop/restart individual webapps
*** use the tomcat manager (screenshot above) to expire tomcat sessions
NONE of these operations will free up the imaginary connections which are supposedly unavailable. I was hoping to see something in the logs, but they aren't all that helpful beyond the message you see above. I will see what I can do to increase logging in the "session manager". But I had thought that stopping and restarting the webapp would essentially clear out the session manager. Does the session manager have a mechanism for resetting only *one* abl application at a time? Or would that require bouncing all of tomcat, which is my last resort?
If anyone has suggestions about how to troubleshoot, I would greatly appreciate it. I think my current theory is that the session manager has been corrupted in some way, possibly over-counting my active connections and restricting me because it thinks I've surpassed the limit of 5. There hasn't been anything I can do to reset the session manager aside from bouncing the tomcat instance. I'd like a way to "reset" whatever portion of the session manager is responsible for this and see if I can get my tomcat to become responsive again.
Any help would be much appreciated. Hopefully most of this question makes sense. I have only a few weeks of experience with OEPAS thus far, and only with the dev license.
David,
The way OpenClients work is that when they disconnect they have to release their connections; if they do not, we should be seeing hung clients/connections. To be safe, .NET clients should configure their timeout properties (IdleSessionTimeout, ConnectionTimeout, ConnectionLifetime, etc.). On the PASOE side, we can enable IdleResourcetimeout and set the appropriate property (like idleConnectionTimeout) if we want to cancel or terminate connections that might be hanging around. This should take care of terminating stale sessions/connections in the PASOE instance and should not require any restart of the PASOE server. I am not sure whether this is relevant to your case, but I thought I would let you know in case you were looking for options.
To determine whether you have idle connections, I would prefer accessing the API - localhost:8810/.../sessions
If this gives any results, it means that you have sessions/connections hanging. I see that you tried to access sessions but did not see anything in the response.
With a PASOE development server, the maximum number of connections or requests you can make is only 5. You can actually submit more than 5 requests, but only 5 will be handled at any given time, so the remaining requests will fail with the error you mentioned. To check whether the sessions are IDLE and waiting for the next request, I would look at how many sessions you currently have and what their status is. I would first look at the MS-Agent /sessions API and see whether all 5 sessions have been used up. If any session's status is ACTIVE, I would then look at the session stacks to see what that session is doing.
MS-Agent sessions -
MS-Agent Session stacks -
Another way to see whether all the sessions were used is to look at the agent log. The MS-Agent sessions in the agent log can be identified by 'AS-<some number>'. In my case, running a development license with 50 concurrent clients, I got the same errors you experienced, and I could see in my agent log that all 5 of my MS-Agent ABL sessions were utilized. So please look at the agent log and see how many sessions were used up.
grep 'AS-' testinst.agent.log | cut -d ' ' -f 5 | sort -u
AS-10
AS-4
AS-7
AS-8
AS-9
AS-Admin
AS-Aux-0
AS-Aux-5
AS-Aux-6
AS-Listener
AS-ResourceMgr
This should tell us why you were seeing the error. It may be that the client was disconnected for some reason but the MS-Agent sessions are still running the application logic. If you do not see all 5 sessions used up, then I suspect a bug in the product for the PASOE dev license.
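If you want to turn the agent log check into something you can run repeatedly, a small sketch along these lines might do. It is not an official tool: the function names are invented for illustration, and instead of relying on the session ID being in field 5 it just looks for whole-word tokens of the form AS-<number>, which is an assumption about the log layout. The default limit of 5 is the dev license limit mentioned above.

```shell
# Sketch: count the numbered MS-Agent ABL sessions seen in an agent log.
# Function names are hypothetical. Tokens such as AS-Admin, AS-Aux-0,
# AS-Listener and AS-ResourceMgr are skipped because no digit follows "AS-".

# Print each distinct numbered session ID (AS-4, AS-10, ...) once.
list_agent_sessions() {
    grep -owE 'AS-[0-9]+' "$1" | sort -u
}

# Print how many distinct numbered sessions appear in the log.
count_agent_sessions() {
    list_agent_sessions "$1" | wc -l | tr -d ' '
}

# Succeed (exit status 0) when the session count has reached the limit.
# The second argument is the limit; it defaults to 5.
sessions_exhausted() {
    local limit="${2:-5}"
    [ "$(count_agent_sessions "$1")" -ge "$limit" ]
}
```

For example, `sessions_exhausted testinst.agent.log && echo "all sessions used"` would print the message for the log shown above, since AS-4, AS-7, AS-8, AS-9 and AS-10 make five numbered sessions.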
Hi David,
Can you please check whether you are releasing your connections properly in your .NET client? Here are the kbase articles that explain what might cause such problems when running a .NET client with PASOE.
knowledgebase.progress.com/.../P129149
knowledgebase.progress.com/.../NET-Open-Client-disconnect-messages-not-received-by-PASOE
Irfan, for the moment if we assume a developer did *not* program something exactly right in their .Net managed client (or the client application crashed or was terminated by the OS, etc), then what should we expect out of OEPAS? Are we going to have to bounce the entire Tomcat instance in production if there are these types of misbehaving clients? In my example it may be that there were misbehaving clients that had previously made connections, but these clients had long since died. Is there some period of time after which the server reclaims the "phantom" connections?
Much of our client activity comes from short-lived applications (i.e. one day or less), but I would expect the OEPAS server to be able to run for at least a whole week at a time without needing to bounce it. Nor would I expect individual misbehaving clients to impact the stability of the server. As I mentioned earlier, I currently need to bounce my oepas1 instance a couple of times a day, and that will be very inappropriate in production. ;)
How can I view/monitor the phantom connections that are giving "unavailable" errors for new requests? They don't seem to be visible in the oemanager (OEE) or in the tomcat manager, nor is the ms-agent (_mproapsv) running.
Is there some other tool I can use to investigate these phantom connections? Or is there a command to clean the slate, without restarting all of tomcat? Would you like a memory dump of the tomcat process?