Appserver Agents and Client queue depths/Active clients

Posted by MBeynon on 13-Jan-2016 04:04

Hello,

I wonder if someone could give me a little advice please?

We have a client who is concerned about the load being placed on their production Appserver, specifically the number of Agents SENDING and the client queue depth.

Currently they have a maximum of ten licensed Agents configured and, according to the stats, all ten are busy on most days. At two points during the last month, our client became concerned about the client queue depth, which began creeping up and at one point "maxed out" at 145 before dropping back down to the 40s and 50s.

They are concerned about this and the resulting slowdown in response times for incoming requests.

I've done a search on the knowledgebase and found this: http://knowledgebase.progress.com/articles/Article/P83089

which talks about possible reasons for agents being in the sending state for longer than expected:

1. In either the SENDING or LOCKED state, the Agent could be stuck because it lost the connection to the AppServer client it's currently associated with. This is typically due to network issues interrupting the TCP/IP connection, or because the client crashed, was killed, or terminated its session abnormally.

2. If the AppServer Agent stays in the SENDING state for a long time, a programming scoping issue may be causing far more data than expected to be returned, via the Broker, from the AppServer to the client (for example: temp-tables or datasets that contain many more records than expected).

With regard to point one, how does this happen, and is there anything that can be done to mitigate it?

I'm going to suggest our client "Set the AppServer logging mode to VERBOSE (srvrLoggingLevel=3 in uBroker.properties) in order to find what procedures are running by the Agents identified by their PID." and also review our code for memory leaks, but if anyone has any thoughts, comments or suggestions I'd be most grateful.

Thank you,

Mark.

P.S. The appserver is STATELESS BTW.

All Replies

Posted by Mark Davies on 13-Jan-2016 04:21

Hi Mark,

A word of advice here...

If this is a production server, I would not set the logging level to verbose. If your client already has performance issues, verbose logging will make them worse, especially if the code makes a huge number of procedure calls and has a deep call stack.

You didn't say which version of Progress they run on, but you can get the type of information you are looking for from running the progetstack <PID> script (while in a proenv environment). This will dump the stack trace (Windows and Linux/Unix). The stack trace has information about the program stack and all persistent running procedures.

Also, if they are running an AppServer that connects to a database over the network, I would strongly suggest that you look at changing it to use a shared-memory connection instead. We have seen an 80% performance increase just by doing that.

Also look at lock scoping on your database: it could be that records are locked and agents are waiting for other agents to release them first. There is a utility called HCK that comes in quite handy for checking record locks.

Check whether agents are just taking too long to run something or whether you have an influx of requests. If the latter, you probably need more AppServer agents.
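To make the "slow agents versus influx of requests" distinction concrete, here is a rough back-of-the-envelope sketch in Python. It assumes you can estimate the average request arrival rate and the average time an agent spends on a request from the broker statistics; the function names and the 80% target utilisation are illustrative assumptions, not anything the AppServer provides.

```python
import math


def offered_load(requests_per_second: float, avg_service_seconds: float) -> float:
    """Average number of agents kept busy by this traffic (offered load)."""
    return requests_per_second * avg_service_seconds


def agents_needed(requests_per_second: float,
                  avg_service_seconds: float,
                  target_utilisation: float = 0.8) -> int:
    """Agents required to carry the load at the chosen utilisation.

    target_utilisation is an assumed planning figure, not a Progress setting.
    """
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")
    load = offered_load(requests_per_second, avg_service_seconds)
    return math.ceil(load / target_utilisation)


def queue_growth_per_second(requests_per_second: float,
                            avg_service_seconds: float,
                            agents: int) -> float:
    """How fast the client queue grows once every agent is busy."""
    capacity = agents / avg_service_seconds  # requests the pool can finish per second
    return max(0.0, requests_per_second - capacity)


# Example: 10 agents, requests averaging 0.5 s each.
print(queue_growth_per_second(30, 0.5, 10))  # arrivals above capacity: queue grows
print(agents_needed(30, 0.5))                # pool size that would absorb this rate
```

If the measured arrival rate is well within capacity but the queue still builds, the more likely culprit is individual requests taking far longer than the average, which points back at the code or at locking rather than at the agent count.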

In most cases we found that badly performing code, poor use of indexes on searches and just some really bad coding style are to blame for hanging agents. Use an XREF compile to check for WHOLE-INDEX scans on searches and resolve those first. HCK is also useful here: you can see how many record reads were done on a table, and if this is high for something that should only return a few records, then that could be the cause. A bad query that does not use indexes could consume major CPU resources and add to the slow response.
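As a rough illustration of the XREF check, the following Python sketch scans XREF output for SEARCH entries flagged WHOLE-INDEX. It assumes the usual whitespace-separated layout (procedure, source file, line number, reference type, table, index, optional WHOLE-INDEX) and file names without spaces; the sample lines and the helper name are made up for the example.

```python
from typing import Iterable, List, NamedTuple


class WholeIndexHit(NamedTuple):
    source: str   # source file named on the XREF line
    line: int     # line number in that source file
    table: str    # database.table being searched
    index: str    # index the compiler chose


def find_whole_index(xref_lines: Iterable[str]) -> List[WholeIndexHit]:
    """Return every SEARCH entry that the compiler marked WHOLE-INDEX."""
    hits = []
    for raw in xref_lines:
        tokens = raw.split()
        if "SEARCH" not in tokens or "WHOLE-INDEX" not in tokens:
            continue
        pos = tokens.index("SEARCH")
        # Need source file and line number before SEARCH, table and index after it.
        if pos < 2 or pos + 2 >= len(tokens):
            continue
        try:
            line_no = int(tokens[pos - 1])
        except ValueError:
            continue  # not the layout we assumed; skip rather than guess
        hits.append(WholeIndexHit(tokens[pos - 2], line_no,
                                  tokens[pos + 1], tokens[pos + 2]))
    return hits


# Hypothetical XREF lines, for illustration only.
sample = [
    "orders.p orders.p 12 SEARCH sports.Customer CustNum",
    "orders.p orders.p 40 SEARCH sports.Order OrderNum WHOLE-INDEX",
    "orders.p orders.p 41 ACCESS sports.Order OrderStatus",
]
for hit in find_whole_index(sample):
    print(f"{hit.source}:{hit.line} scans {hit.table} via {hit.index}")
```

Running something like this over the XREF files for the procedures the agents actually execute gives a short list to review first.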

If you have long-running reports being processed by the appserver then perhaps see if those can be moved to run on a scheduler later at night or when the application is in quiet time.

Also check the hardware: if the server is under-specced for its use, it will not be able to handle the load.

Posted by MBeynon on 13-Jan-2016 07:09

Hi Mark,

Thanks for the help and advice.

I know that they're currently on 10.2A05 but I'm not sure what type of connection they are using to the DB.

We're pretty strict on DB reads and I'm fairly confident there are no WHOLE-INDEX table reads, but I'll give HCK a try to investigate possible record locking issues.

It could simply be as you say that this is due to an influx of requests but I'll do some more investigations based on your recommendations.

Many Thanks,

Mark.

Posted by Paul Koufalis on 13-Jan-2016 08:33

Are you seeing gajillions of short connects/disconnects in the logs, or relatively few long-running connections?

Has this crept up slowly over weeks and months or all-of-a-sudden?

Are the _proapsv processes consuming a lot of CPU? Or disk I/O? Or both? Excessive CPU use could point to rapid reader issues.

Check the disk I/O usage of the temp file system. Use -t -T <directory> to show linked temp files. Which ones are big, if any? Heavy temp-table use on a slow temp FS could slow down the _proapsv's.
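A minimal Python sketch for the "which ones are big" question: it lists the largest regular files in whatever directory was passed to -T. It makes no assumption about temp file naming, and the directory path in the usage comment is a placeholder.

```python
import os
from typing import List, Tuple


def largest_files(directory: str, top: int = 10) -> List[Tuple[str, int]]:
    """Return (name, size in bytes) for the biggest regular files in directory."""
    sizes = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file(follow_symlinks=False):
                    sizes.append((entry.name,
                                  entry.stat(follow_symlinks=False).st_size))
            except OSError:
                continue  # temp files can vanish between listing and stat
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


# Usage (path is a placeholder for the -T directory):
# for name, size in largest_files("/path/to/temp/dir"):
#     print(f"{size / 1_048_576:10.1f} MB  {name}")
```

Sampling this every minute or so during a busy period shows whether a few temp files balloon at the same time the queue depth climbs.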

One thing that I have seen is Apsv agents doing OS-COMMAND SILENT <something> and that something getting jammed up. Keep an eye on the _proapsv processes and their children.
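One way to keep an eye on the agents and their children on Linux/Unix is to parse `ps` output. The sketch below is an assumption-laden illustration: it relies on `ps -eo pid,ppid,pcpu,comm` being available, on the agent executable showing up as `_proapsv`, and it reports direct children only.

```python
import subprocess
from collections import defaultdict
from typing import Dict, List


def parse_ps(text: str) -> List[dict]:
    """Parse `ps -eo pid,ppid,pcpu,comm` output (header line included)."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        try:
            rows.append({"pid": int(parts[0]), "ppid": int(parts[1]),
                         "pcpu": float(parts[2]), "comm": parts[3].strip()})
        except ValueError:
            continue  # skip lines that do not match the expected columns
    return rows


def agents_with_children(rows: List[dict],
                         agent_name: str = "_proapsv") -> Dict[int, dict]:
    """Map each agent PID to its CPU share and the commands of its direct children."""
    children = defaultdict(list)
    for row in rows:
        children[row["ppid"]].append(row["comm"])
    return {row["pid"]: {"pcpu": row["pcpu"],
                         "children": children.get(row["pid"], [])}
            for row in rows if agent_name in row["comm"]}


def snapshot() -> Dict[int, dict]:
    """Take one live reading from the local machine."""
    out = subprocess.run(["ps", "-eo", "pid,ppid,pcpu,comm"],
                         capture_output=True, text=True, check=True).stdout
    return agents_with_children(parse_ps(out))
```

An agent that keeps the same child process across several snapshots is a good candidate for the jammed OS-COMMAND case described above.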

You can also install ProTop Free and drill down to the Apsv user to see what they are doing in the DB, providing your table/index range size startup parameters are set correctly. Mind you, I don't remember if _UserTableStat and _UserIndexStat were available in 10.2A05.

Finally, please contact me offline at pk@wss.com regarding something I don't want to discuss here.

Posted by Rob Fitzpatrick on 13-Jan-2016 08:41

My recollection is that the _User*Stat tables were added in a service pack of 10.1B, maybe SP02 or 03.

Posted by Rob Fitzpatrick on 13-Jan-2016 09:06

What is HCK? A search came up empty. Can someone provide a link please?
