Hardware replication vs OpenEdge Replication

Posted by Jens Dahlin on 04-Feb-2011 03:12

Hi,

Any thoughts on choosing hardware-based replication rather than OpenEdge Replication? We are considering either:

2 x HP P4000 iSCSI based replication

or

2x HP P2000 plus OpenEdge Replication.

We haven't gotten very far with this yet, but overall I see that the hardware-based replication is a lot cheaper. Our database is not very big (<10 GB), but we have a fair number of users, so licensing is really the big cost.

We have no big need for the replicated database to be used for running statistics and reports. We just need it as a standby in case of emergencies.

More important than the cost is of course that the replication really works. So, my simple question is: does it?

Best regards

Jens

All Replies

Posted by kevin_saunders on 04-Feb-2011 03:35

OE Replication works and works well.

Hardware replication? I'm sure it works, but does it understand the way OE works? Keep in mind that hardware replication cannot see what is in shared memory, and only the stuff on the disks is replicated. Of course if you are not 24x7, then having the DBs down for the replication is fine and should work without issue (at a previous client we successfully used snapshotting to replicate databases which were then backed up and it worked very well, with minimal downtime).

Posted by Jens Dahlin on 04-Feb-2011 05:21

As I understand it, it's not really replication but rather something they call storage clustering. Two sets of disks behave as one. Data written to the virtual disk is written to both physical sets. If one of the two physical sets goes down, the other continues without stopping.

Posted by Thomas Mercer-Hursh on 04-Feb-2011 12:32

Sounds like mirroring, but across two boxes. Not really the same thing as replication. Mirroring within a box is certainly a good thing for reducing risk from single disk failure, but it does nothing to protect against controller failure, building burning down, etc. Real replication should be off site.

Posted by Jens Dahlin on 07-Feb-2011 01:24

Since our LAN spans two different buildings it will be off site. So the conclusion here is that nobody has tried hardware replication?

Posted by Thomas Mercer-Hursh on 07-Feb-2011 11:39

Not that simple.

First, let's work on terminology so that we are clear about what we are talking about.

In mirroring ... a word usually applied to disk pairs within the same cabinet ... every disk write or delete is directed simultaneously to two different disks so that if one fails the other has an absolutely current copy of the data and the application can keep running.

In replication ... a word usually applied to two separate databases on two separate machines ... all active transactions occur against the primary database and any one of several mechanisms is used to capture each transaction and ship it over to the secondary database where it is then applied. Thus, at any given instant in time the secondary database is some interval in time behind the primary, but there is the expectation that, if the primary fails, one can still apply the captured transactions against the secondary, bringing it current, and then switch to the secondary for production use. Catching up and doing the switch are not instantaneous, so there is some loss in availability while the switch is made, but the approach is highly suitable for a replicate which is at considerable distance from the primary and thus unlikely to be affected by the same natural disaster, etc.
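To make the distinction concrete, here is a minimal Python sketch of the two models. It is a toy under stated assumptions, not how any particular product works: the class names `Mirror` and `Replicated` are made up for illustration, databases are plain dictionaries, and failover assumes the captured changes survive the loss of the primary.

```python
from collections import deque


class Mirror:
    """Synchronous mirroring: a write returns only after both copies have it."""

    def __init__(self):
        self.disk_a = {}
        self.disk_b = {}

    def write(self, key, value):
        self.disk_a[key] = value
        self.disk_b[key] = value


class Replicated:
    """Replication: commit on the primary, queue the change for the secondary."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.in_flight = deque()  # captured changes not yet applied remotely

    def write(self, key, value):
        self.primary[key] = value
        self.in_flight.append((key, value))

    def ship(self, count=1):
        """Apply up to `count` queued changes to the secondary, oldest first."""
        for _ in range(min(count, len(self.in_flight))):
            key, value = self.in_flight.popleft()
            self.secondary[key] = value

    def fail_over(self):
        """Bring the secondary current (assumes the queue survived the failure)."""
        self.ship(len(self.in_flight))
        return self.secondary


mirror = Mirror()
repl = Replicated()
for i in range(5):
    mirror.write(i, f"row {i}")
    repl.write(i, f"row {i}")

repl.ship(2)                 # only part of the backlog has arrived so far
lag = len(repl.in_flight)    # changes the secondary is still missing
standby = repl.fail_over()   # catch up, then switch over
```

The mirror never has a backlog but pays for the second write on every operation; the replicated pair lets the primary run ahead and pays at failover time instead.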

What you seem to be talking about is something that functions like mirroring, but in which the databases/disks are separated across the LAN. So, the first thing to do is to quiz the supplier closely as to what is really going on. For starters, LANs are slow compared to disks so how can the secondary really be kept up to date? Is there some kind of cache involved to buffer this process? What happens if one does something like a DB restore which will overwhelm any cache? Name names and products since there is probably someone in the Progress world with concrete experience and knowledge.

Note too that I believe the current SEC standard for a replicate database is 300 miles.  Across the parking lot doesn't do a lot for you in an earthquake/hurricane/tornado/local disaster of choice.

Posted by Jens Dahlin on 08-Feb-2011 02:52

The product under consideration is the HP StorageWorks P4300. According to the vendor it should work with up to 3 ms of network latency, and we have a lot less than that between the buildings. What we really want to protect ourselves from are shorter power outages, possible hardware failures, fire, simple stuff like that. We don't have earthquakes or tornadoes here. We already have off-site backup for real disasters.

And as far as being slow, today we have the database on an almost 10-year-old Sun Fire V240 with disks in an external SCSI box, and as far as I understand it that is way slower than the offered solution. I'm not (as you might understand) a professional sysadmin though, so I cannot verify that myself. Also, important to note when talking about replication: we have a ratio of something like 1000:1 reads to writes.
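A rough back-of-the-envelope estimate of what that ratio implies can be sketched in Python. The assumptions are mine and unverified: only writes cross the network to the second disk set, reads are served locally, and every write pays the full 3 ms the vendor quotes as its limit. The function name is made up for illustration.

```python
def average_added_latency_ms(reads_per_write, write_penalty_ms):
    """Average extra latency per I/O when only writes pay the mirroring penalty."""
    total_ops = reads_per_write + 1
    return write_penalty_ms / total_ops


READS_PER_WRITE = 1000    # ratio quoted above
WRITE_PENALTY_MS = 3.0    # assumption: each write pays the full 3 ms limit

avg = average_added_latency_ms(READS_PER_WRITE, WRITE_PENALTY_MS)
print(f"{avg:.4f} ms added per I/O on average")
```

The average hides the fact that each individual write still waits the full penalty, so a write-heavy burst such as a restore would look very different from normal running.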

Product specs:

http://h10010.www1.hp.com/wwpc/us/en/sm/WF06b/12169-304616-3930449-3930449-3930449-4118659-4118705-4118707.html

Posted by ChUIMonster on 08-Feb-2011 11:48

HW replication and OE replication solve two different problems.

Hardware and OS focused solutions will very happily replicate all sorts of things that you don't really want to replicate -- stuff like "rm -rf *"...  or the ever popular "for each customer: delete customer. end.".

You may laugh at the above but, in my experience, human error along those lines is at least as likely as crashed servers or bad hardware.  And it is far more damaging.  It doesn't have to be malicious either -- at one customer site a certain "storage engineer" was infamous for trashing production while thinking he was working on test systems.  Once is unfortunate, twice is puzzling.  I lost count somewhere north of 5.

Anyhow, OE replication implemented independently of hardware schemes ensures that the business transactions are protected and recoverable.  One important fact to keep in mind is that you have to implement after-imaging before OE replication.  So the after-image log files provide yet another source of redundancy (very handy for those "for each customer" screw ups).
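As a toy illustration of why a transaction log helps where a mirror does not, here is a minimal Python sketch. It is not the OpenEdge after-image format or tooling: `apply_entry`, `roll_forward`, and the tuple-based log are hypothetical stand-ins for "restore a backup, then replay logged changes up to a chosen point".

```python
def apply_entry(db, entry):
    """Apply one logged operation to an in-memory table."""
    op, key, value = entry
    if op == "put":
        db[key] = value
    elif op == "delete":
        db.pop(key, None)


def roll_forward(backup, log, stop_before):
    """Rebuild from a backup by replaying log entries [0, stop_before)."""
    db = dict(backup)
    for entry in log[:stop_before]:
        apply_entry(db, entry)
    return db


backup = {"cust1": "Alice"}        # last full backup
log = [
    ("put", "cust2", "Bob"),
    ("put", "cust3", "Carol"),
    ("delete", "cust1", None),     # the accidental mass delete starts here
    ("delete", "cust2", None),
    ("delete", "cust3", None),
]

# A mirror or block-level replica sees every write, including the deletes.
mirrored = roll_forward(backup, log, len(log))

# Replaying the log only up to the first bad entry recovers the good state.
recovered = roll_forward(backup, log, 2)
```

Here `mirrored` ends up empty while `recovered` still holds all three customers, which is the redundancy the log adds on top of any disk-level copy.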

OE replication is famously finicky. Setting it up, running it, and managing it is a bit of a pain. It seems like every little thing ends up being a "get a new backup and restart replication" exercise. If you don't really need real-time fail-over then you might be better served with a simpler implementation of after-imaging.

Posted by Thomas Mercer-Hursh on 08-Feb-2011 11:57

3ms over network latency doesn't tell you much without including the network latency .... that is, after all, the slow part.

This thread is closed