Archiving data: possibilities

Posted by gdb390 on 26-Feb-2019 15:01

Hi,

What are common data archiving strategies in OpenEdge?

We would like to archive data from the database so that it is only available for consultation.

Keep in mind that a table and any tables linked to it should be archived in the same operation.

We work with both workgroup & enterprise databases.

Kind regards

Gerd 

All Replies

Posted by ChUIMonster on 26-Feb-2019 16:10

IMHO archiving, per se, is a fool's errand that is almost always a hold-over from days of yore when storage constraints were much different than they are today.

You either need the data or you don't.  Your data retention policies (which should be largely driven by legal requirements) should define how much data to keep and for how long.

"All of it, forever" is hardly ever the right policy and can lead to serious legal problems, as well as the expense of having to search it or make it available in the event of legal action.  So you probably need a well-defined *purge* policy and process a lot more than you need an archive strategy.
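To make the idea of a purge policy concrete, here is a minimal sketch of a retention rule expressed as data plus a cutoff calculation. It is Python rather than ABL, and the record types and retention periods in RETENTION_YEARS are invented for illustration; real values would have to come from your own legal requirements.

```python
from datetime import date

# Hypothetical retention periods in years, per record type (assumed values).
RETENTION_YEARS = {"invoice": 7, "web_log": 1}


def purge_cutoff(record_type: str, today: date) -> date:
    """Return the date before which records of this type may be purged."""
    years = RETENTION_YEARS[record_type]
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def is_purgeable(record_type: str, record_date: date, today: date) -> bool:
    """A record is purgeable once it is strictly older than the cutoff."""
    return record_date < purge_cutoff(record_type, today)
```

Keeping the periods in one table like this means the policy can be reviewed and changed without touching the code that does the deleting.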

If you think you have performance problems due to the volume of data, you are much more likely to _actually_ have performance problems due to poor indexing or poorly written queries.

In a few cases you could legitimately argue that the "working set" of data is distinct from that which you would archive and that some operations, like backup and restore, take substantially longer because you are carrying the "dead weight" of that old data.  But you mention that you have workgroup databases so I am skeptical that that situation applies here.

Aside from the above -- one huge problem that always occurs when people archive old data is that they fail to keep the old code that knew how to operate on that data.  They either assume that the code will never change or that the schema will never change and then, very quickly, whatever code they are (probably not actually) using to access the archived data can no longer make sense of it.  Which they eventually find out years down the road when that unexpected legal demand for 10 year old data is served.

In my experience people who successfully "archive" old data generally do it by exporting the important bits to an external system (such as a data warehouse) and aggressively pruning the live working set.
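As a rough sketch of that export-and-prune pattern, including the requirement that linked tables go out in the same operation: the example below uses Python with SQLite purely as a stand-in, not OpenEdge. The orders and order_lines tables, their columns, and the export_and_prune function are all hypothetical. Each old parent row is written out together with its child rows in a self-describing format (JSON lines), and the deletes happen in a single transaction.

```python
import json
import sqlite3


def export_and_prune(conn, cutoff, out):
    """Export orders dated before `cutoff` (ISO date string) together with
    their order lines, then delete them. Returns the number of orders moved."""
    with conn:  # commits on success, rolls back the deletes on an exception
        orders = conn.execute(
            "SELECT id, order_date FROM orders WHERE order_date < ? ORDER BY id",
            (cutoff,),
        ).fetchall()
        for order_id, order_date in orders:
            lines = conn.execute(
                "SELECT item, qty FROM order_lines WHERE order_id = ? ORDER BY rowid",
                (order_id,),
            ).fetchall()
            record = {
                "id": order_id,
                "order_date": order_date,
                "lines": [{"item": item, "qty": qty} for item, qty in lines],
            }
            # One self-contained JSON document per order, child rows included.
            out.write(json.dumps(record) + "\n")
            # Delete child rows first, then the parent.
            conn.execute("DELETE FROM order_lines WHERE order_id = ?", (order_id,))
            conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
    return len(orders)
```

Writing each parent with its children as one document means the export can still be read without the original schema or application code. Note that the transaction only protects the database side: if the run fails partway, the output file may contain records whose deletes were rolled back, so the export should be written to a temporary file and kept only after a successful commit.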

Posted by gus bjorklund on 26-Feb-2019 17:28

> On Feb 26, 2019, at 11:11 AM, ChUIMonster wrote:
>
> one huge problem that always occurs when people archive old data is that they fail to keep the old code that knew how to operate on that data

including the old versions of operating systems and other software (and possibly hardware too) needed to read the old data.

this can be a yuuge problem. people tend to forget things.

when you save backups for years, can you actually read them today? are your software upgrades and changes documented well enough that you can determine what is needed to read a three year old backup? or seven years?

if you archive to tape, do you have drives that can read the cartridges to which you wrote them? is there documentation for how to read the data? where is it? how do you know?

if you get a subpoena for data from 2010, how will you respond? the best response is "we do not have that data. in accordance with our data retention policies, we deleted it in 2017".

when you do save old data, how do you know you still need it? "just in case" is not good enough - maybe you only need a few summaries.

Posted by gdb390 on 14-Mar-2019 10:03

Thank you for your responses. I will take your concerns into account.
