dbmon.sh script version 3 is now available

Posted by George Potemkin on 23-Jun-2016 07:23

There are many tools to monitor Progress databases: ProTop, OE Management, ProMonitor, the Promon gather script [1].

The dbmon.sh script is one of them. It starts a promon session, walks through its menus to gather almost [2] all available information, and repeats these steps for the specified number of sampling intervals. The first version of dbmon.sh was written many years ago, and our customers have used it to gather database statistics over a whole day (usually with 10-30 min sampling intervals) as well as during incidents (such as a sudden performance degradation) with short sampling intervals.

New in dbmon.sh Version 3:

1. If you do not specify a list of databases (strictly speaking, masks for the database paths), dbmon.sh gathers information for all running databases that use the Progress version specified by the DLC environment variable [3].

2. dbmon.sh prepares the databases for monitoring:
2.1 dbmon.sh puts ACO blocks [4] into DAABP [5]. This protects promon from contention on the LRU latch;
2.2 dbmon.sh finds a directory with "drwxrwxrw" permissions and sets it in promon as the "Directory for Statement Cache Files". This protects 4GL sessions against hangs on the STCA lock [6].

3. dbmon.sh launches three "roasting" promon sessions: "rare", "medium" and "well done":
3.1 The main promon session gathers most of the information in promon using the basic sampling interval [7];
3.2 The status promon session gathers "Blocked Clients", "Buffer Lock Queue" and "Active Transactions" [8] using 10 sec intervals or the basic sampling interval, whichever is smaller;
3.3 The latch promon session uses the "Activity: Latch Counts" menu to capture the current owners of the latches. The session checks once per second. If it catches a latch owner, the script also reports the latch locks and naps for the last one-second interval [9].

4. dbmon.sh launches a 4GL procedure (dbmon.p) that collects the _TabStat/_IndexStat statistics, selects the most active sessions, collects the _UserTabStat/_UserIndexStat statistics for those sessions, and uses the statement cache to find the procedures they are running at that moment.

5. dbmon.sh gathers other miscellaneous information.

6. At the end, all logs are archived automatically and the archive is sent to the recipients if you specified their email addresses.
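Step 2.2 above (finding a directory with "drwxrwxrw" permissions) can be sketched roughly as follows. This is only an illustration, not the code from dbmon.sh: the function name find_open_dir and the list of candidate directories are assumptions.

```shell
#!/bin/sh
# Print the first candidate directory whose mode string starts with
# "drwxrwxrw" (the last permission character may be "x" or the sticky "t").
# The candidate list comes from the caller; it is an assumption here.
find_open_dir() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # "ls -ld" prints the mode string as the first field
        mode=$(ls -ld "$dir" | awk '{print $1}')
        case "$mode" in
            drwxrwxrw*) printf '%s\n' "$dir"; return 0 ;;
        esac
    done
    return 1
}

# Example: try two common locations (hypothetical choices)
find_open_dir /tmp /var/tmp || echo "no suitable directory found"
```

The directory found this way is the one you would then enter in promon as "Directory for Statement Cache Files".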

There is another side of dbmon.sh, where it uses its SWAT skills. If "civilians" cannot connect to a database, the dbmon.sh script will still try to, or at least it will collect some useful information.
The dbmon.sh script recognizes the following situations:
a. Database is starting (a long crash recovery);
b. Database is shutting down (shutdown is already initiated);
c. Sessions can't connect to a database due to a login semaphore/USR latch lock;
d. Sessions hang immediately after connecting to the database (most likely the LRU latch is locked);
e. Too many users connected to a database (error 5291).

In cases "c" and "e", dbmon.sh will try to run promon for the given database.
In cases "c" and "d", the script will create a protrace file for the probe connection session that is used as a first step before starting the real monitoring processes.
In all cases the script will monitor what is going on in the database at the file level. It reports changes in db extent timestamps, changes in their sizes, and changes in the .lk file.
You can enable file-level monitoring for "normal" databases as well, using the -filemon option. I don't think it adds anything useful to the information collected by regular promon, but at least you can test file-level monitoring before you need it.
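The idea behind file-level monitoring can be sketched as taking two listings of the database files and comparing them. This is a simplified illustration, not the code used by dbmon.sh: the function names are hypothetical, and it assumes file names without spaces and the usual "ls -l" column layout.

```shell
#!/bin/sh
# Print size, modification time and name for every file matching a path prefix.
# Assumed "ls -l" layout: $5 = size, $6-$8 = modification time, $9 = name.
snapshot() {
    ls -l "$1"* 2>/dev/null | awk '{print $5, $6, $7, $8, $9}'
}

# Compare two saved snapshots and print the new state of files that changed.
# Lines that diff prefixes with ">" come from the later snapshot.
changed_files() {
    diff "$1" "$2" | sed -n 's/^> //p'
}

# Usage (paths are placeholders):
#   snapshot /path/to/db > before.txt
#   sleep 4
#   snapshot /path/to/db > after.txt
#   changed_files before.txt after.txt
```

Note that "ls -l" shows timestamps with one-minute resolution, so a real implementation would need a finer-grained source for the timestamps.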

You can also start the dbmon.sh script with the -protrace option, which takes the following values:
-protrace enable - allows the script to create protrace files for all processes that are using a database that cannot be connected to;
-protrace all - tells the script to create protrace files for all processes running against all databases that were selected for monitoring;
-protrace dbname - tells the script to create protrace files only for the processes that are running against the specified database.

In all these cases the script uses the lsof command to get the list of processes that are connected to a database. If lsof is not available, the script parses the output of the "ps -ef" command.
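A minimal sketch of this lsof-with-fallback approach is shown below. The function names are hypothetical and the matching rule for the "ps -ef" output (a plain substring match on the database path) is an assumption; the real script may select processes differently.

```shell
#!/bin/sh
# Extract PIDs from "ps -ef" output (read from stdin) for lines that mention
# the database path given as $1. The PID is the second column; the header
# line is skipped. This is a substring match, so /db/sports also matches
# /db/sports2, and it only works when the path is on the command line.
pids_from_ps() {
    awk -v db="$1" 'NR > 1 && index($0, db) {print $2}'
}

# List PIDs of processes using the database file $1.
db_pids() {
    if command -v lsof >/dev/null 2>&1; then
        # -t makes lsof print bare PIDs, one per line
        lsof -t "$1" 2>/dev/null || true
    else
        # Capture the ps output first so that the awk process started
        # afterwards cannot match its own command line.
        ps_out=$(ps -ef)
        printf '%s\n' "$ps_out" | pids_from_ps "$1"
    fi
}

# Usage (the path is a placeholder):
#   db_pids /path/to/db.db
```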

If the OS allows [10] getting the working directories of running processes, the script moves the protrace files to its own working directory, renames them by adding the db name and the name of the executable, and finally adds the protrace files to an archive.
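The move-and-rename step could look roughly like the sketch below. It assumes that the protrace file is named protrace.<pid> and sits in the working directory of the process, and that pwdx prints "<pid>: <directory>". The function names and the target naming scheme are illustrative and not taken from dbmon.sh.

```shell
#!/bin/sh
# Build the archived name for a protrace file: protrace.<pid>.<db>.<exe>
# (this naming scheme is an assumption made for the example).
protrace_name() {
    printf 'protrace.%s.%s.%s\n' "$1" "$2" "$3"
}

# Move protrace.<pid> from the process's working directory into $4.
# Arguments: pid, db name, executable name, target directory.
collect_protrace() {
    pid=$1 db=$2 exe=$3 workdir=$4
    # Without pwdx we cannot find the working directory of the process
    command -v pwdx >/dev/null 2>&1 || return 1
    # Strip the leading "<pid>: " from the pwdx output
    cwd=$(pwdx "$pid" 2>/dev/null | sed 's/^[0-9]*:[[:space:]]*//')
    [ -f "$cwd/protrace.$pid" ] || return 1
    mv "$cwd/protrace.$pid" "$workdir/$(protrace_name "$pid" "$db" "$exe")"
}

# Usage (values are placeholders):
#   collect_protrace 12345 sports _progres /path/to/dbmon/workdir
```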

Use the -help option for information on how to use the dbmon.sh script.

Default parameters: all running databases, 4 sec sampling interval, 5 intervals - total monitoring duration is 20 sec.

In case of any "strange" situation with your database just run the script with the default parameters and it will capture all data that might be needed to analyze the situation.

The dbmon.sh script can also be run by cron: for example, run it every day for 24 hours with a 10 min sampling interval.
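When planning a run it helps to work out how the sampling interval and the number of intervals relate to the total duration. The helper functions below are hypothetical and only do the arithmetic; how the values are actually passed to dbmon.sh is described by its -help output and is not shown here.

```shell
#!/bin/sh
# Total monitoring duration = sampling interval * number of intervals.
total_duration() {
    echo $(( $1 * $2 ))
}

# Number of intervals needed to cover a duration with a given interval
# (both arguments in the same unit).
interval_count() {
    echo $(( $1 / $2 ))
}

total_duration 4 5        # defaults: 4 sec * 5 intervals = 20 sec
interval_count 1440 10    # 24 hours (1440 min) at 10 min = 144 intervals
```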

The script is almost insensitive to the version of Progress. It was tested with versions from 10.2B to 11.6 (and a bit higher ;-). If you are using older versions, you simply will not get the information that is available only in recent ones. For Progress versions before 10.2A SP01 you need to change the probe connection procedure: use a 4GL session instead of a promon session. Just uncomment the corresponding code inside the script. The script was also tested with Progress V10.1C.

The dbmon.sh script was tested on Linux, Solaris, HP-UX and AIX. If it does not run well on your Unix flavor then send me the errors and I'll try to fix the issue.

The main promon session creates a large log with a variety of information. You can "split" the log by promon menu using the dbmonSplit.p procedure.

Sasha Kraljevic wrote programs 8 years ago to parse promon's output and load the data into a database. I do not use them (I prefer the "slide show" after dbmonSplit.p) but they are still available.

The script dbmon.sh and the programs can be downloaded here:
ftp.progress-tech.ru/.../

Notes:
[1] Promon gather script (gather.sh) - see Article: Script for gathering database information on Unix.
knowledgebase.progress.com/.../000010526
[2] dbmon.sh skips some promon menus. For example, it skips "Status: Lock Table" because an application can create a huge number of record locks. Saving this information would be time consuming and the logs would be very large. But you can easily uncomment the corresponding menu inside the script.
[3] If you run dbmon.sh when DLC is not set, it will automatically locate a Progress directory, provided all running databases are using the same Progress version.
[4] ACO blocks - Area Control Object blocks (bk_type 12 & objectId 0). Their contents are used, for example, by prostrct statistics.
[5] DAABP = Dynamically Allocated Alternate Buffer Pool(s) - an undocumented but supported (I think ;-) feature that has been available since 10.2B SP06.
[6] Setting a directory for Statement Cache Files with the right permissions will protect your database against the bug OE00237502 / PSC00258643. See kbase knowledgebase.progress.com/.../000040544
[7] Unfortunately I have to follow promon's rule: the sampling interval is rounded to an even number. Hence it is not possible to use a one-second interval.
[8] If the total monitoring duration is longer than 5 min then "Active Transactions" will be gathered by the main promon session.
[9] Latch locks and naps are not direct indicators of how busy a latch is. A better indicator is how often you see an owner of the latch compared to the number of attempts (= the number of sampling intervals).
[10] For example, HP-UX does not have a command to get the current working directory of the running processes but you can find a solution:
community.hpe.com/.../5093802
The script will search for the "pwdx" command.
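The indicator described in note [9] is a simple ratio: the number of samples in which an owner of the latch was seen, divided by the number of samples taken. A small sketch follows; the function name is hypothetical and the counts in the example are made-up illustration values, not measurements.

```shell
#!/bin/sh
# Percentage of samples in which a latch had a visible owner.
# $1: samples where an owner was seen, $2: total samples taken.
latch_busy_pct() {
    awk -v seen="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * seen / total
    }'
}

# Example: an owner was seen in 3 of 20 one-second samples
latch_busy_pct 3 20
```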

All Replies

Posted by danielb on 26-Jun-2016 06:42

Wow, this looks extremely useful. I don't suppose anyone has already done a Windows port for it yet? Or, something similar?

Posted by George Potemkin on 26-Jun-2016 10:41

It would be rather hard to port the "SWAT" part of the dbmon script to Windows. The "normal" monitoring can be done in Progress 4GL using VSTs, so tools like ProTop can be used on Windows. AFAIK, ProTop has an option to send a current snapshot of the database statistics as plain text by email.

Posted by Thomas Mercer-Hursh on 26-Jun-2016 10:48

It's easy to port ... just apply the Linux patch! :)

Posted by gus on 26-Jun-2016 11:08

bash is coming for windoze.

then it will almost work.

regards,

gus

3 Logicians walk into a bar.

The bartender asks if they'd all like a beer.

The first logician says "I don't know".

The second says "I don't know".

Then the third exclaims "Yes!"


This thread is closed