A Consultant's checklist?

Posted by James Palmer on 18-Dec-2014 05:19

Morning all, 

I'm not after people revealing their trade secrets, I'm just interested in starting discussions and hopefully trying to learn something :) 

The scenario: You've just arrived on site with a new client. They're running a Progress Database and have brought you in because it's under-performing. You've got your coffee and mint imperials. Do you have a particular checklist that you work through to sort out the problems? What do you look at, and in what order? Do you use 3rd party tools (eg ProTop) or do you rely on promon etc? 

Thanks! :) 

All Replies

Posted by Paul Koufalis on 18-Dec-2014 08:06

The very first thing I do is interrogate the customer.  "Under-performing" is so vague that it could be anything. Some pointed questions can at least let me know if it's a new problem, an old problem, it's always been like this...etc.  Then I want to know if it's a read or write issue so I ask a few questions around that.  Also I want to know if it's at a particular time of day that it's bad as I don't want to be investigating at 1pm when the problem only occurs at 10am.

With that info in hand, I start on the outside and work my way in: memory, CPU and disk are first. I'm looking for low-hanging fruit. Then I check all the DB startup parameters for any egregious errors (for example -B 1000, a tiny buffer pool).
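A minimal sketch of that startup-parameter sanity check, assuming the parameters live in a .pf-style text file with one or more "-flag value" pairs per line and "#" comments. The function names and the MIN_BUFFERS floor are hypothetical, chosen only for illustration; the right value for -B depends on the site.

```python
# Hypothetical floor for -B, used only to illustrate the check.
# It is not a vendor recommendation.
MIN_BUFFERS = 10_000


def parse_pf(text):
    """Return {flag: value or None} from parameter-file style text."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        tokens = line.split()
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if tok.startswith("-"):
                value = None
                # A flag takes the next token as its value unless that
                # token is itself another flag.
                if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
                    value = tokens[i + 1]
                    i += 1
                params[tok] = value
            i += 1
    return params


def flag_small_buffer_pool(params, minimum=MIN_BUFFERS):
    """Return a warning string if -B looks suspiciously small, else None."""
    raw = params.get("-B")
    if raw is None:
        return "-B is not set in this file"
    try:
        buffers = int(raw)
    except ValueError:
        return f"-B has a non-numeric value: {raw!r}"
    if buffers < minimum:
        return f"-B {buffers} is below the assumed floor of {minimum}"
    return None


if __name__ == "__main__":
    sample = "-db sports\n-B 1000  # buffers\n-spin 5000\n"
    print(flag_small_buffer_pool(parse_pf(sample)))
```

The same pattern extends to other parameters: parse once, then run a list of small checks against the resulting dictionary.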

After that the path I take really depends on the information I've gathered so far.  I can start in promon but protop gives me more information, more easily, more quickly and in a more condensed fashion (I'm not flipping back-and-forth through screens as much like in promon).  And since it's tiny and free the client never minds that I install it.

Paul

Posted by James Palmer on 18-Dec-2014 08:08

Thanks Paul - very helpful, and you get a free beer the next time we meet as a thank you for being the first to respond! :)

Posted by ke@iap.de on 18-Dec-2014 08:59

Hi James,

For one customer I created a list of questions that we handed out to the users. They were to answer yes or no and had some space for free text.

(Translated on the fly from German by me :)

ProgramName is always too slow?

Starting takes too long?

Programs start too slowly?

Are you able to show the problem?

Reporting is slow?

Sometimes application is a bit notchy?

Sometimes application freezes?

Is it getting worse every day?

It is worse every day?

Then I analyze the technology. I use the monitoring tools from the OS (nmon, top, vmstat, iostat, Windows Performance Monitor...). Then we talk to the storage or hardware guys. The standard answer is: "My part is fast. The others are slow". Most of the time this is not true :)

For OE I start with promon etc., and next come the VSTs (at DB and program level).

So the path is:

a) find out what "slow" means

b) look at hardware

c) talk to all hardware guys

d) analyze DB and application with VSTs

Sometimes it is easy (-B), sometimes it is medium level (VST), sometimes it is hard to find (the SAN is saturated, but that is hidden). Sherlock Holmes is my idol :)

Klaus

Posted by James Palmer on 18-Dec-2014 09:03

Vielen Dank Klaus. Very helpful. I'd buy you a beer as well, but German beer is far superior to English so it would be a waste of money! :)

Posted by Mike Fechner on 18-Dec-2014 09:09

Well Klaus is from a part of Germany not that famous for its beer. And not all English beers are that bad…
 

Posted by TheMadDBA on 18-Dec-2014 11:27

First thing I would do is throw away the mints :)

Paul and Klaus gave out some really good information. I basically follow the approach Paul outlined, usually skipping to the DB fairly quickly after some initial sanity checks of the system (memory, disk, CPU and network). A relatively high percentage of the time there are a few common errors, like Paul said: mostly parameters set either too low or too high.

After looking at the basic stuff I use my own home-grown tools to look at the VSTs for common problems like excessive reads of certain tables/indexes, storage layout and a few other things.
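As a rough illustration of the "excessive reads" check (not the poster's actual tooling), the sketch below assumes two snapshots of cumulative per-table read counters have already been exported, for instance from the _TableStat VST, into plain dictionaries. The function name, the table names and the sampling interval are all hypothetical.

```python
def busiest_tables(before, after, interval_seconds, top=5):
    """Rank tables by reads per second between two counter snapshots.

    `before` and `after` map table name -> cumulative read count.
    Tables with no new reads in the interval are left out.
    """
    rates = []
    for table, end_count in after.items():
        delta = end_count - before.get(table, 0)
        if delta > 0:
            rates.append((table, delta / interval_seconds))
    # Highest read rate first.
    rates.sort(key=lambda item: item[1], reverse=True)
    return rates[:top]


if __name__ == "__main__":
    # Made-up counters sampled 60 seconds apart.
    first = {"customer": 1000, "order": 500, "item": 200}
    second = {"customer": 1600, "order": 60500, "item": 200}
    for table, rate in busiest_tables(first, second, 60):
        print(f"{table}: {rate:.1f} reads/sec")
```

Working from deltas rather than raw totals matters because the counters are cumulative since startup; a table that was hammered hours ago would otherwise hide the one being hammered now.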

One of the other things to look out for is OS-specific quirks, especially around memory management. For example, different AIX versions have different ideas of how the AIX buffer cache is handled.

Posted by Paul Koufalis on 18-Dec-2014 11:30

lru_file_repage is your friend.

Posted by TheMadDBA on 18-Dec-2014 12:24

Right. Just pointing out that things are sometimes handled differently between AIX versions (5, 6, 7). And what works right for AIX doesn't always hold true for other Unix flavors.

More than once a client has kept AIX 5 settings (ioo, vmo, etc.) on AIX 6/7 that ended up causing issues.
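One way to spot carried-over settings is to diff the current tunables against a baseline for the OS level in question. The sketch below assumes the tunables have been captured as "name = value" text (the general shape of a tunable listing such as `vmo -a`); the function names and the baseline values are hypothetical placeholders, not recommended settings.

```python
def parse_tunables(text):
    """Parse 'name = value' lines into a dict; other lines are ignored."""
    tunables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            tunables[name.strip()] = value.strip()
    return tunables


def differing_tunables(current, baseline):
    """Return {name: (current_value, baseline_value)} where they differ.

    Only tunables present in both dictionaries are compared.
    """
    return {
        name: (current[name], expected)
        for name, expected in baseline.items()
        if name in current and current[name] != expected
    }


if __name__ == "__main__":
    captured = "  minperm% = 3\n  maxperm% = 80\n  lru_file_repage = 1\n"
    # Placeholder baseline: substitute the values agreed for the target OS level.
    baseline = {"maxperm%": "90", "lru_file_repage": "0"}
    diffs = differing_tunables(parse_tunables(captured), baseline)
    for name, (now, expected) in diffs.items():
        print(f"{name}: current={now} baseline={expected}")
```

Values are compared as strings on purpose, so nothing is assumed about which tunables are numeric.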

Posted by Paul Koufalis on 18-Dec-2014 12:35

I agree 100%.  I was just being silly.  AIX jokes aren't that funny live and obviously even less effective on a discussion forum.

An AIX Admin and a Windows Admin walk into a bar...

Posted by TheMadDBA on 18-Dec-2014 12:38

The Linux Admin ducks?

Posted by Paul Koufalis on 18-Dec-2014 12:44

Ba-dump!  We're here all week people.  

This thread is closed