Intermittent performance issues

Posted by James Palmer on 26-May-2016 04:45

Progress 10.2B08 on Windows Server 2008 R2.

This is a client server. They have a very perplexing issue that rears its head regularly, but not regularly enough to obviously be related to anything on a schedule. 

The client receives EDI files that are processed into the database automatically by a batch process. On the whole these take a couple of minutes max to process, but at times they suddenly start taking 10+ minutes. There's nothing different about the actual files.

Unfortunately they're in a pressure situation where unprocessed files cost money as it delays orders going out/in. As a result their sys admin (a third party unfortunately) is rather trigger happy and likes to reboot the server ASAP. This sorts the problems out short term, but it means I can't easily see what is hammering the server at the time of the issues. 

I've managed to identify that the disk the databases are running on performs ok, but the C drive has very slow writes (21 seconds for the bigrow test). Personally I suspect this may well be a big part of the issue as the temp files and so on reside here. 

There are no long running transactions at the time of the problems, other than the EDI importer itself, and they are usually around 2-3 minutes long when problems are occurring. I take this to be a symptom rather than a cause. 

I would ideally like to be able to track what's going on on the server so that if the sysadmin reboots I can still get some sort of snapshot. The issue is, I am not allowed to install any products on the server - that includes tools like Protop (sorry Tom). 

Does anyone have any thoughts on what I can do to try and track the issues better? I've told the sysadmin that I need to see the server before he reboots but he's getting it in the neck from the client. All a bit of a mess.

All Replies

Posted by cverbiest on 26-May-2016 05:20

Maybe you can check for processes that aren't cleaned up.

We've had a situation where we launched notepad.exe that stayed around.

After a while there were a couple of hundred notepad processes that started slowing things down dramatically.

This is something that disappears with a reboot.

It will probably not be notepad but it could be similar.
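
One way to check for this without installing anything on the server is to capture the output of the built-in `tasklist /FO CSV` command and count processes per image name. The following is only a minimal sketch in Python, assuming the captured text is analysed on a machine where Python is already available (not necessarily the server itself) and that the column headers are the English-locale ones. The function name and the sample data are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_processes(tasklist_csv: str) -> Counter:
    """Count processes per image name from `tasklist /FO CSV` output."""
    reader = csv.DictReader(io.StringIO(tasklist_csv))
    # Lower-case the name so NOTEPAD.EXE and notepad.exe are counted together.
    return Counter(row["Image Name"].lower() for row in reader)


# Illustrative sample only, not real output from the server in question.
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage"
"notepad.exe","1204","Services","0","3,120 K"
"notepad.exe","1388","Services","0","3,096 K"
"cmd.exe","2040","Services","0","2,480 K"
"_mprosrv.exe","2512","Services","0","41,004 K"
'''

if __name__ == "__main__":
    # Most numerous image names first; a leaked process tends to float to the top.
    for name, n in count_processes(SAMPLE).most_common():
        print(f"{n:5d}  {name}")
```

Comparing a capture taken shortly after a reboot with one taken during a slow period should make any image name that keeps accumulating stand out.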

Posted by James Palmer on 26-May-2016 05:39

Thanks it's worth investigating for sure.

Posted by Keith Sudbury on 26-May-2016 07:56

Without being able to install or change anything it is going to be a little tricky :-)

If the batch process is long running - make sure the memory footprint isn't increasing over time. Memory leaks can cause a lot of issues.
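
A rough way to put a number on "increasing over time" is to sample the process's memory at intervals (for example from the "Mem Usage" column of `tasklist`) and fit a straight line through the samples. This is just a sketch under assumptions: the helper names are hypothetical, the sample figures are invented, and the parser assumes an English-locale value such as `41,004 K`.

```python
def parse_mem_kb(mem_usage: str) -> int:
    """Convert a tasklist 'Mem Usage' value such as '41,004 K' to an int (KB)."""
    return int(mem_usage.replace(",", "").replace("K", "").strip())


def growth_per_minute(samples):
    """Least-squares slope of memory (KB) against elapsed time (minutes).

    `samples` is a sequence of (minutes_since_start, memory_kb) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


if __name__ == "__main__":
    # Invented readings taken every 10 minutes from a hypothetical batch process.
    readings = [(0, "40,000 K"), (10, "41,000 K"), (20, "42,000 K"), (30, "43,000 K")]
    samples = [(t, parse_mem_kb(m)) for t, m in readings]
    print(f"{growth_per_minute(samples):.1f} KB per minute")
```

A slope that stays clearly positive across many hours of samples would be consistent with a leak; a slope near zero suggests the footprint is stable.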

If you can monitor the processes when they are good and bad... you can use Resource Monitor to see what is going on at a process level. You should be able to see how much IO is happening on the temporary files (if any).

Promon will obviously give a lot of information although not as friendly a presentation as ProTop.

Performance Monitor will also give you information about disk IO, CPU and Memory in general. If they aren't trending that already... they need to fix that.

Posted by ChUIMonster on 26-May-2016 08:00

It is always so much more rewarding when you do your analysis with your hands tied behind your back!

Pointing -T at a slow drive would certainly be a potential issue.  Especially if the -T space is actually being used.  Are the files in that -T space active, large and growing?  If they are then you should look at various client options like -Bt, -mmax, -TB, -TM and so forth and prevent as much of that disk use as possible.  If SQL is involved make sure that you have updated the statistics.  (You should also fix your code...)

FWIW and IMHO in 999 cases out of 10 where someone is destroying the evidence on a routine basis (by, for instance, rebooting before you can see what is going on...) that person's area of responsibility is where this problem and the next dozen or so problems lie.

While you are looking for processes that are not cleaned up -- check to see if there are perhaps a lot of CMD shells piling up.  That is very easy to do if someone is improperly using .BAT files in this "batch processing" that you refer to.

Posted by James Palmer on 26-May-2016 13:54

How can I work out how many _dbagent, _mprosrv and _mprshut processes I should have? There seem to be rather a lot of them.

Posted by ChUIMonster on 26-May-2016 14:38

Step #1 -- open taskmgr and get to the processes tab.  Depending on which version of windows you are suffering with, you then click the "view" menu or right click where the column names are.  Unselect useless junk columns like "description".  Definitely select "command line".  Make it as wide as possible.  Hunt everywhere for some form of "start time".  Reluctantly concede that Microsoft has not seen fit to provide that data.  Sort by command line.  You should now be able to see what those various instances of _mprosrv etc are actually doing for you.  You might decide that you have too many of something as a result.  Or you might say -- "oh, yes, that is what should be running".

Posted by Rob Fitzpatrick on 26-May-2016 15:24

Process Explorer, from Microsoft, is a better "task manager" and does provide process start time as well as a lot of other process and thread metrics that aren't in Task Manager.

And there is nothing to install.  You can just download and run it.  You can even run it directly from the web via Sysinternals Live.

Posted by Keith Sudbury on 26-May-2016 15:28

You should have 1 _mprshut process per APW, BIW, AIW, AIMGMT and WDOG process.

_mprosrv processes will be based on how many brokers you start and the -Mi, -Ma and -Mpb settings... as well as how many connections have been made. promon R&D 1 17 will show the _mprosrv processes arranged by broker.
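
The _mprshut rule of thumb above is simple arithmetic, sketched here in Python for one database. The function name and the helper counts are hypothetical; substitute whatever the startup scripts on the box actually launch.

```python
def expected_mprshut(apw=0, biw=0, aiw=0, aimgmt=0, wdog=0):
    """One _mprshut process per APW, BIW, AIW, AIMGMT and WDOG helper."""
    return apw + biw + aiw + aimgmt + wdog


if __name__ == "__main__":
    # Hypothetical configuration: 2 APWs, a BIW, an AIW and a watchdog.
    expected = expected_mprshut(apw=2, biw=1, aiw=1, wdog=1)
    observed = 9  # made-up count read off Task Manager for this database
    print(f"expected {expected}, observed {observed}, surplus {observed - expected}")
```

A persistent surplus is a hint to look at the command lines of the extra processes to see what started them.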

Posted by ChUIMonster on 26-May-2016 15:38

But he's not allowed to install anything.

Posted by Rob Fitzpatrick on 26-May-2016 15:40

Right.  And Process Explorer doesn't have to be installed.

Posted by ChUIMonster on 26-May-2016 15:51

Are they shipping it by default now?  When did that start?

Posted by Rob Fitzpatrick on 26-May-2016 16:03

It didn't.  It's not in-box (though it should be, IMO).  But it can be run from the web without making any permanent changes to the box, i.e. "installing" it.  And to be clear I'm not suggesting doing it without permission.  It doesn't hurt to ask.

I've worked in customer environments where there were installation restrictions.  And of course there are valid reasons for such rules, i.e. to keep suspect third-party code off the box.  *Any* software is a possible risk.  But if a customer is going to trust Microsoft enough to use their OS on a mission-critical box then I think it's not unreasonable, when push comes to shove and there's a production problem to solve, to ask them to permit using a Microsoft tool to diagnose it.

I don't know whether it's the customer or the third-party that imposes the no-install rule.  But the customer is losing money and the third party is "getting it in the neck" so they both have incentive to work toward a solution.

Posted by James Palmer on 01-Jun-2016 03:56

Just a quick update on this. The customer has moved the C: drive to SSD and the problems seem to have gone away, at least today. They still need their stuff tuning but it's a start at least.
