PASOE concurrency problems in tcman command

Posted by dbeavon on 14-Mar-2018 12:59

I'm wondering if anyone knows of restrictions on the use of the tcman command in PASOE. Generally I would expect tcman env to respond as follows (output below).

proenv>cd C:\OpenEdge\WRK\oepas1\bin\

proenv>tcman env
catalina home:     C:\Progress\OpenEdge\servers\pasoe
catalina base:     C:\OpenEdge\WRK\oepas1
catalina tmpdir    C:\OpenEdge\WRK\oepas1\temp
catalina pid:      C:\OpenEdge\WRK\oepas1\logs\catalina-oepas1.pid
java home:         C:\Progress\OpenEdge\jdk
jre home:
manager http port: 8815
manager https port:8816
manager shut port: 8817
manager URL:       http://localhost:8815/manager
config type:       instance
config alias:      oepas1
config parent:     C:/Progress/OpenEdge/servers/pasoe
server running:    1
instance tracking: True
instance file:     C:\Progress\OpenEdge\servers\pasoe\conf\instances.windows
server process-id: 9464
window title:
security model:    developer
service:           true

 

The documentation for the command is found here:

https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/ompas%2Fthe-tcman-command.html%23

Unfortunately there seem to be timing issues with this. Depending on what else is going on in the system, tcman env may respond with other messages instead, such as this one:

Exception: The process cannot access the file 'C:\OpenEdge\WRK\oepas1\logs\catalina-oepas1.pid' because it is being used by another process.
in tcmanager at line : 
Exiting with fatal exception condition

 

Or this one:

rm : Cannot find path 'C:\OpenEdge\WRK\oepas1\temp\catalina-pid.tmp' because it does not exist.
At C:\Progress\OpenEdge\servers\pasoe\bin\tcmanager.ps1:589 char:13
+             rm -path "$_tmppidfile"
+             ~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\OpenEdge\WRK...atalina-pid.tmp:String) [Remove-Item], ItemNotFoundEx 
   ception
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.RemoveItemCommand
 
catalina home:     C:\Progress\OpenEdge\servers\pasoe
catalina base:     C:\OpenEdge\WRK\oepas1

 

A lot of stuff in PASOE depends upon the successful operation of tcman env and the PowerShell script that it uses internally (tcmanager.ps1). I would like the output of this to be less fragile.
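For anyone consuming this output from a script, here is a minimal sketch of a more tolerant reader. It assumes only the label names visible in the sample output above; the function names and the `run` callable are hypothetical, and nothing here reflects how Progress's own tooling reads the output. It skips the interleaved PowerShell error lines and retries when expected fields are missing.

```python
import time

# Labels copied from the sample "tcman env" output above. In that output
# "catalina tmpdir" has no colon, so the colon is treated as optional.
KNOWN_LABELS = (
    "catalina home", "catalina base", "catalina tmpdir", "catalina pid",
    "java home", "jre home", "manager http port", "manager https port",
    "manager shut port", "manager URL", "config type", "config alias",
    "config parent", "server running", "instance tracking", "instance file",
    "server process-id", "window title", "security model", "service",
)


def parse_tcman_env(text):
    """Return {label: value} for recognised lines; ignore everything else."""
    fields = {}
    for line in text.splitlines():
        for label in KNOWN_LABELS:
            if not line.startswith(label):
                continue
            rest = line[len(label):]
            # The label must be followed by a colon, whitespace, or nothing.
            if rest and rest[0] not in ": \t":
                continue
            fields[label] = rest.lstrip(":").strip()
            break
    return fields


def missing_labels(fields, required=("catalina home", "catalina base")):
    """List the required labels that did not appear in the parsed output."""
    return [label for label in required if label not in fields]


def env_with_retry(run, attempts=3, delay=0.5):
    """Call run() (hypothetical: returns the command's text) until complete."""
    for attempt in range(attempts):
        fields = parse_tcman_env(run())
        if not missing_labels(fields):
            return fields
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError("output still incomplete after %d attempts" % attempts)
```

Here `run` could wrap whatever launches the command, for example a `subprocess.run` call that captures stdout and stderr. Retrying only hides the contention; it does not remove it.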

I suspect the timing issues are related to other concurrent operations that may be underway at the same time (e.g. perhaps from OEE or the admin server). I suspect that the tcman command hasn't yet been tested for concurrency in a large server environment. It is especially suspicious that the error messages indicate the use of hard-coded temp files, which multiple tcman commands (or other related operations) may all be trying to create, exclusively lock, or delete at the same time, e.g.:

  • C:\OpenEdge\WRK\oepas1\temp\catalina-pid.tmp
  • C:\OpenEdge\WRK\oepas1\logs\catalina-oepas1.pid
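To illustrate the general point about fixed temp file names, here is a small sketch of the usual alternative: give each invocation its own uniquely named temp file, and tolerate the file already being gone on cleanup. This is not what tcmanager.ps1 does, and I have not seen its full source; the function names and the `catalina-pid-` prefix are made up for the example.

```python
import os
import tempfile


def write_pid_snapshot(temp_dir, pid):
    """Write pid to a uniquely named file in temp_dir and return its path.

    mkstemp picks a name that does not exist yet and creates the file
    atomically, so two concurrent callers never share a path.
    """
    fd, path = tempfile.mkstemp(prefix="catalina-pid-", suffix=".tmp",
                                dir=temp_dir)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(pid))
    return path


def remove_quietly(path):
    """Delete path, ignoring the case where it has already been removed."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

With a shared fixed name, one process can delete the file while another still expects it to exist, which matches the shape of the "Cannot find path" error above.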

Any pointers would be appreciated. Given that tcman env takes about a second to run and is run quite frequently, concurrency issues here seem problematic.

All Replies

Posted by dbeavon on 14-Mar-2018 13:41

I've also just reported this as a PASOE bug with product support. It will be nice to hear if I am doing something wrong with the use of tcman. I suspect there has never been any stress-testing of these types of management commands.

I'll report back anything I discover.  I'd expect that others may be encountering these error messages intermittently as well, whether they are fully aware of it or not.

Posted by gus bjorklund on 14-Mar-2018 15:56

> On Mar 14, 2018, at 2:01 PM, dbeavon wrote:
>
> Given the fact that tcman env takes about a second to run and is run quite frequently . . .

Is there some reason why you run tcman quite frequently? While you should not be running into the problem you observe, there should also be no good reason to run it frequently.

Posted by dbeavon on 14-Mar-2018 16:38

I don't often run it directly (or at least not deliberately, or because I want that information myself). I've found that simply using OEE and running various admin commands internally causes a ton of repetitive calls to tcman env (which in turn runs PowerShell). I think it happens from some other places too (e.g. using the app server view in PDSOE and so on). tcman env seems to provide some pretty fundamental information, and it seems to be needed on a repeating basis. If you run procmon for a day on a PASOE server and filter on powershell.exe, you will see what I'm talking about.

I normally only see problems indirectly, but I've also seen the error messages myself on occasion while using the tcman command. As one example of an indirect symptom, you can see one of these errors in the logs quite a bit: "The process cannot access the file 'C:\OpenEdge\WRK\oepas1\logs\catalina-oepas1.pid' because it is being used by another process". I guess I finally learned where that weirdness is coming from.

Matt Baker said earlier that in 11.7.2 they were going to reduce the number of times that tcman env is called (see community.progress.com/.../36187).

But he had not yet said anything about any concurrency or timing problems like the ones I'm seeing now.  Perhaps it is a regression that came up since I moved to 11.7.2 and picked up those other changes.

Posted by dbeavon on 20-Mar-2018 10:20

I've reported the problem to Progress, and they acknowledge that the intermittent tcman failures are a defect.

A workaround I've been using is to parse the appserver.properties and catalina.properties files myself (in the conf folder of catalina base).
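A minimal sketch of that kind of workaround follows. It assumes the files use the ordinary Java `key=value` properties layout and that no value relies on line continuations or backslash escapes. The function names are hypothetical, and the key names in the usage note are placeholders rather than real PASOE property names.

```python
import re

# key, optional whitespace, optional '=' or ':' separator, then the value.
_PROPERTY = re.compile(r"([^=:\s]+)\s*[=:]?\s*(.*)")


def parse_properties(text):
    """Parse simple Java-style properties text into a dict of strings.

    Blank lines and lines starting with '#' or '!' are skipped.
    Line continuations and escape sequences are NOT handled.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        match = _PROPERTY.match(line)
        if match:
            props[match.group(1)] = match.group(2)
    return props


def read_properties(path):
    """Read a properties file from disk; latin-1 is the classic encoding."""
    with open(path, encoding="latin-1") as handle:
        return parse_properties(handle.read())
```

For example, `parse_properties("example.http.port=8815")` returns `{"example.http.port": "8815"}`. Reading the files directly avoids launching PowerShell at all, at the cost of depending on file contents that may change between releases.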

A KB article should be published shortly, as I understand it. I'm not sure about the ETA for a bug fix; hopefully one is available before OE 12.

Posted by dbeavon on 22-Mar-2018 15:04

There is now a KB: knowledgebase.progress.com/.../File-contention-errors-on-catalina-instance-name-pid-and-catalina-pid-tmp-when-using-tcman-env

Apparently the tcman command has a concurrency problem on Windows. It is not that uncommon to see the symptoms of the problem; if you read the KB you may recognize the related error messages, especially if you are already running PASOE on the Windows platform.

There is no current ETA for a fix.  We may have to wait for OE 12.

This thread is closed