OE Replication Plus bug?

Posted by ChUIMonster on 23-Feb-2016 07:42

I'm not finding anything in the kbase but perhaps my search terms are inadequate...

OE 11.5 SP1 (64-bit) on Windows Server 2012 R2

I have a nice, happy install of OpenEdge.  Everything works.  Clients can connect, the database hums along, everyone is happy!

Until...

I enable site replication.

Replication works great.

But, suddenly, none of the remote 4gl or sql servers will start.  Nobody can connect (except me, being the dinosaur that I am, I use self-service character sessions on the server via proenv...)

I *can* use proserve to start "m3" servers.

But, so far, no incantations that I have come up with will convince "dbman" or OEE to start those servers.

If I disable site replication everything returns to normal.

I have heard a rumor that this is a "known bug".  But I cannot find any evidence of that in the kbase.  And I would rather have a workaround than a kbase entry anyway ;)

All Replies

Posted by James Palmer on 23-Feb-2016 07:47

Sounds like a perfect candidate for the Linux patch Tom! :)

I used to manage replication plus on a very similar configuration and had no issues. I assume you've set the correct machines as source and target? I know it sounds a stupid question and I almost feel bad posting it...

Posted by James Palmer on 23-Feb-2016 07:52

Are you getting any errors in logs etc Tom?

Posted by ChUIMonster on 23-Feb-2016 08:08

If it were up to me I would patch it in a heartbeat.

The source and target are fine.  Remember that replication works perfectly?  No issues -- except that turning it on apparently results in deciding not to launch the 4gl and sql servers.

No, there are no errors or log entries of any kind anywhere.  Which, so far as I can tell, is perfectly normal when dealing with anything that involves %dlc%\properties.

The only indication of a problem is that the servers don't start and people cannot connect.  If you look deeper you discover that the servers have not started.  So far as I can tell no attempt is ever made to start them.

Posted by Peter Judge on 23-Feb-2016 08:10

I'd suggest giving TS a call.
 

Posted by James Palmer on 23-Feb-2016 08:13

Sounds very odd. Nothing in admserv.log either, then?

I'd go with Peter's suggestion. Very odd though.

Posted by sfgaarthuis on 23-Feb-2016 08:14

How many users?
A -minport/-maxport issue?

Kind regards,
Simon Gaarthuis
+31653998444


Posted by ChUIMonster on 23-Feb-2016 09:13

Nope, not related to any of that.  I've messed around with all of those - just in case.

Posted by ChUIMonster on 28-Feb-2016 15:12

Here is what I have discovered:

0) My dislike for properties files and Windows is even more well founded than I imagined and has not been improved in any way by this experience.  Applying the Linux patch gets more attractive every time I see a Windows login screen :(

1) There were several minor configuration issues.  These were very aggravating but, in the end, immaterial.  The biggest problem with these is that anything involving properties files is basically trial and error.  There is no useful logging by default.  You can turn on more logging, but that requires restarting the admin server (not so easy on a server that is being used by others), and when you do that you get flooded with a whole lot of drivel -- finding anything useful is nearly impossible.

2) "dbman" et al are opaque.  They often do things silently -- no matter your logging level they write nothing anywhere.  And then simply return no information.  You get to "try it" to see if it works.  The trial and error cycle is very time consuming.

3) There are a whole lot of very knowledgeable and helpful people out there -- thank you all!  But especially Libor who pointed out the biggest problems and convinced me to try a couple of things that I was sure didn't matter.

... drum roll...

4) I'm not very smart.  For years I have been implementing replication working on the assumption that:

   proserve dbname -DBService replserv -S port#

is the canonical command to start a db with replication enabled.  I had this misguided idea that the -S was "tied" to the -DBService.  I don't know why I thought that.  In retrospect I should have known better.  (Although, in my defense, there are a lot of examples in the documentation that look like this.)

Nonetheless -- this approach of setting aside a port and specifying it on the proserve command line has worked very well and continues to work well in scripted environments (UNIX).  I expected it to work and it took a lot of persuading (thanks Libor) to convince me that it was causing problems.

I had also *thought* that you needed a port dedicated to the replication service.  This seems to be untrue.  Apparently you can just put any old 4gl broker in dbname.repl.properties (I didn't try sql brokers).

Anyhow... the root problem seems to have been that adding

   -DBService replserv -S port#

to "otherargs" somehow prevents any of the "server groups" in conmgr.properties from starting.  Even if the port# is not in conflict with anything.  For reasons that escape me there do not appear to be any messages in any log files which would reveal that this is happening.  You just get to wonder why you cannot connect when all that you (think that you) did is to start replication services.

Furthermore, as it turns out, "dbman" magically figures out that it needs to start a replication server (a fact which is not visible in any log file that I've been able to find).  Which makes the addition of -DBService to otherargs pointless and possibly harmful.
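
Since nothing gets logged when this happens, a quick scan of conmgr.properties for the offending arguments may save some trial and error.  The following is only a rough sketch: the sample fragment, the database name "mydb", both port numbers and every key other than "otherargs" are assumptions made up for illustration, not copied from a real install.  The script just flags any "otherargs" line carrying -DBService or -S.

```python
# Hypothetical conmgr.properties fragment. The database name, port numbers,
# and all keys other than "otherargs" are illustrative assumptions.
SAMPLE = """\
[configuration.mydb.defaultconfiguration]
    database=mydb
    otherargs=-DBService replserv -S 12345
    servergroups=mydb.defaultconfiguration.defaultservergroup

[servergroup.mydb.defaultconfiguration.defaultservergroup]
    configuration=mydb.defaultconfiguration
    port=20000
"""

# Arguments that, per the description above, kept the server groups from starting.
SUSPECT = ("-DBService", "-S")


def find_suspect_otherargs(text):
    """Return (section, flagged_args) pairs for otherargs lines with suspect flags."""
    hits = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        # Track the most recent [section] header.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "otherargs":
            flagged = [tok for tok in value.split() if tok in SUSPECT]
            if flagged:
                hits.append((section, flagged))
    return hits


if __name__ == "__main__":
    for section, flagged in find_suspect_otherargs(SAMPLE):
        print(f"{section}: otherargs contains {' '.join(flagged)}")
```

To check a real file, read its contents and pass the text to find_suspect_otherargs() in place of SAMPLE.  An empty result only means these two arguments were not found in otherargs; it says nothing about other configuration problems.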

So, in the end, it was basically "operator error".

Thanks to everyone who pitched in and helped out!

This thread is closed