How to maintain data integrity?

Posted by Joost Haakman on 13-Feb-2014 05:16

Hi All,

I want to implement online backup and after imaging for our application. I am actually a bit reluctant to do so because there are some “features” in our application that worries me in terms of losing the integrity of the data that is divided over multiple databases.

There are 2 things that bother me about our application:

We have quite a few transactions that write data to multiple databases at the same time.
I can’t be sure that all the “logical” transactions mentioned at point 1 , aren’t in fact 2 or more physical transactions.

What worries me is that when I restore data and I have made my backups online and/or have after imaging enabled, I can never be sure that the records that are divided over all these databases and belong together are all restored. Resulting in an application that will behave erratically.

I know about Two-phase commit and if I understand it correctly this would solve data integrity issues that would otherwise be the result of the transactions mentioned at point 1.
I don’t believe this would prevent the loss of data integrity in the case of a transaction mentioned at point 2.

This is a long introduction for the following questions I actually want to ask you:

- What steps should be taken to ensure data integrity over multiple databases when performing an online backup?

- What steps should be taken to ensure data integrity over multiple databases when after imaging is enabled?

- What are the conditions under which data integrity over multiple databases can be assured?

I want to thank you for reading this far and I hope you can provide me with some insight in this matter. Or perhaps some answers.

Kind regards,

Joost.

OpenEdge RDBMS - Forum

Posted by Paul Koufalis on 13-Feb-2014 11:20

The easiest solution is to combine the DBs into one. With modern versions of Progress and Type II storage areas, the need for multiple DBs is more difficult to justify. And you don't need much code change: just add DB aliases in your startup program. The only hiccup I can see is if you have the same table name in multiple DBs.

Your point #2 sounds like a programming/architecture weakness. No one can help you with that.

To answer your questions: You cannot ensure LOGICAL data integrity across databases with online backup nor with AI if you don't have 2PC enabled. In practice, you can restore and roll forward to the same point in time and cross your fingers that a multi-DB transaction didn't commit in one DB but not in the other at that very exact second.

Paul Koufalis

Progresswiz informatique

Posted by Thomas Mercer-Hursh on 13-Feb-2014 11:46

I can’t be sure that all the “logical” transactions mentioned at point 1 , aren’t in fact 2 or more physical transactions.

That's not just a problem with backup, but with the integrity of the DB in production.

All Replies

Posted by Paul Koufalis on 13-Feb-2014 11:20

Your point #2 sounds like a programming/architecture weakness. No one can help you with that.

Paul Koufalis

Progresswiz informatique

Posted by Thomas Mercer-Hursh on 13-Feb-2014 11:46

I can’t be sure that all the “logical” transactions mentioned at point 1 , aren’t in fact 2 or more physical transactions.

That's not just a problem with backup, but with the integrity of the DB in production.

Posted by Rob Fitzpatrick on 13-Feb-2014 13:33

Regardless of the architectural design challenges you face, without after imaging you face another significant problem in terms of data loss in the event of media failure, DB corruption, a rogue transaction, etc.

Are you doing daily backups? Are they tested and known to be good? If you need to restore one can you afford a day's worth of data loss and the downtime required to recover from a backup?

I don't see this as an "either-or" situation. First make sure your data is safe. Recheck your backup strategy and enable after imaging. Make sure you have a plan for what you're going to do with your AI extents (copy off prod box, roll forward, archive, culling old files, etc.) and test it. Then look at possibly restructuring your application. As Paul said, if your object names are unique it should be a straightforward change to combine the DBs.

Posted by mike.perkin on 13-Feb-2014 21:27

Another reason to combine into one DB is that if you ever consider OE Replication it doesn't support DBs where 2PC is enabled.

Posted by Joost Haakman on 14-Feb-2014 07:08

Thanks everybody for the feedback. This really gives me a clearer picture.

This is what I am going to do:

I will present point #2 to my superiors as a programming/design weakness and have them take a standpoint. When they accept it as a calculated risk, I can then implement 2PC, AI and Online backups. If they can’t accept it, they have to accept that in case of a system failure or other situation where the data might be compromised we will have to go back to the last offline backup and possibly lose a days worth of data.

We actually have the same table names in different databases so until we can get rid of those I can’t merge the data into one database.

Thanks again!

Joost.

Posted by Ruanne Cluer on 17-Feb-2014 02:38

Another option that should be explored is of course Multi-Tennancy?

This thread is closed