Analysing Java Memory Usage (AppServer Brokers)

Posted by mroberts@rev.com.au on 28-Jul-2015 17:32

Hi All,

(OE 10.2B07, windows and unix 32 and 64 bit environments, appserver connections).

We have over the years had java processes (AppServer Brokers, Admin Servers) crash with Java Heap errors.

All the KBase articles and information I have gleaned point to increasing the memory footprint of the java process, which we have done.

Now that we are consolidating many customers onto the cloud, we have brought across what we now consider our default parameters for the appserver brokers. One thing we are struggling to do is capacity plan memory for these java processes. Their memory footprint is quite large, and we can see no linear relationship between the number of users or transactions passing through the broker and a heap failure. We have had the heap blow out at high volume times (kinda expected), but also had it fail with 1 or 2 people connected in quiet times. Unfortunately for AppServer brokers, failure to set the heap space correctly is catastrophic. If it hit the limit and denied future connections, then we could at least have the customer trade through with the people who are already logged on, and tune when appropriate. But the heap failure takes out the customer's entire access, as the broker stops handling requests.

We can plan for future disk based on growth, and for DB memory based on well documented processes for setting buffer pools, but the big hole in the planning is that we don't know how or when a broker will die with a Heap error, and we cannot guarantee to the business that we have effectively used the memory we are being charged for. We are trying to measure actual java memory usage so that we do not unnecessarily over-resource the virtual space we are paying for.

So I was wondering if anyone has feedback on ways to analyse the actual memory used particularly through the AppServer brokers.

Any feedback or real world experience in this area would be greatly appreciated.  

We are on 10.2B at the moment, but are planning an upgrade to 11.5 (or 11.6 at the end of the year, depending on timing) ... so solutions available in the later versions give more weight to the upgrade project priority.

Thanks

Mark Roberts.

 

 

All Replies

Posted by TheMadDBA on 28-Jul-2015 18:24

Tweaking java is always a pain and I am sad to say that some of the java versions required by more recent OE versions take up even more memory than previously. Most of the fault lies with java itself (server vs client concepts).

I assume you have already used JConsole to take a peek at your java processes? I didn't get much out of it, but it does let you get a few details and allows you to run garbage collection.
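For reference, the heap figures JConsole displays come from the JVM's standard java.lang.management API. Below is a minimal sketch of reading them in code. It reports on the JVM it runs in; reading a broker's JVM the same way would mean connecting to that process over JMX as JConsole does, and whether and how the broker JVM in a given OpenEdge release can be started to allow that is an assumption to check, not something confirmed in this thread. The class name HeapSnapshot and its helper methods are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Hypothetical helper (not part of OpenEdge): prints one heap usage sample.
public class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    /** Percentage of the maximum heap in use, or -1 if the maximum is undefined. */
    public static double percentUsed(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no maximum is defined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    /** Formats one sample: used, committed, and max heap in MB. */
    public static String describe(MemoryUsage heap) {
        long max = heap.getMax();
        String maxText = max < 0
                ? "undefined"
                : (max / MB) + " MB (" + String.format(Locale.ROOT, "%.1f",
                        percentUsed(heap.getUsed(), max)) + "% used)";
        return "used=" + (heap.getUsed() / MB) + " MB"
                + " committed=" + (heap.getCommitted() / MB) + " MB"
                + " max=" + maxText;
    }

    public static void main(String[] args) {
        // MemoryMXBean for the JVM this code is running in.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe(memory.getHeapMemoryUsage()));
    }
}
```

Logging a sample like this at intervals, rather than looking once, is what would show whether used heap creeps up over time (suggesting a leak) or spikes with particular requests.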

To me the real solution is to make sure you are using name server load balancing to spread the load between a larger number of brokers. In the past I have set up as many as 50 different appserver brokers to respond to the same logical broker name. When you are running 4000+ users it indeed sucks for a broker to hang or crash. But having only a small fraction of the users impacted for a brief amount of time is certainly better than locking everyone out.

Posted by Tjerk Coomans on 29-Jul-2015 02:33

Hi Mark,

Can you give us some more details about the setup of the application? Personally I never get problems like this, not in 10.2B08 and also not in the 11 versions. But I have to say that most installations I work on are not the biggest: they range from 50 to 2000 users, with one exception of 12000.

The only problems I sometimes have are with the adminserver, and only when OpenEdge Management is installed and many resources are monitored. The simple trick here is to never install OE Management on the db or appserver machine; always use a separate machine for this (remote monitoring). Then, if you add more resources to monitor and the heap size gets too small, only your monitoring server crashes, and you can easily change the jvm parameters and restart the adminserver without hurting the production environment.

Are your appservers running stateless or state-free? What is max client connections set to?

If something crashes, is it the broker or an agent?

Regards

Tjerk

Posted by Libor Laubacher on 29-Jul-2015 04:28

>>

OE 10.2B07

We have over the years had java processes (AppServer Brokers, Admin Servers) crash with Java Heap errors.

<<

You might want to go to 10.2B08 + HF, as an Adminserver OOM (memory leak) bug is fixed there.

Posted by David Cleary on 29-Jul-2015 08:05

You might want to consider moving to PAS for OpenEdge when you move to 11.5 or 11.6. It is much more resource friendly and, in 11.6, supports features such as Amazon Elastic Load Balancing. The biggest change would be using HTTP instead of the native AppServer protocol for ABL clients.
 
Dave
 

Posted by mroberts@rev.com.au on 30-Jul-2015 21:04

Hi Tjerk,

10.2B07 is our current version... we have had problems over various versions. The appservers we use are stateless. Number of users does not always seem to be the problem, as it can happen at times when there are only a few users on. I'm not sure whether payload could be part of the issue, with the broker having trouble handling it. Max clients is set to 512 (the default) for most of our appservers. We have a lot of customers each using their own broker and appserver pool, and we scale out sideways. We have no confidence in the broker being able to handle the volumes you talk about when ours die with the relatively minor numbers we have at the moment.

The problem I am trying to solve is the broker crashing ... agent crashes are normally caused by bone-headed developers and I can track those with protrace files, and the effect is only on the 1 user who had the agent at the time. A crash of that size is manageable.

Mark

Posted by mroberts@rev.com.au on 03-Aug-2015 21:21

Forgive my ignorance Libor ... HF is what ?

We have a separate recommendation for a 10.2B08 upgrade to resolve some OE Replication issues we have in one site ... if it also fixed the Admin Servers crashing then that would also justify the upgrade.

Thanks for your response.

Mark.

Posted by TheMadDBA on 03-Aug-2015 23:47

HF = hot fix. Meaning a special patch that has to be requested from Progress that fixes one or more issues.

These are basically mini service packs. Progress should know the HF number or maybe Libor can let you know which one.

This thread is closed