I'm looking for input from anyone who has experience with a cloud backup service that claims to back up only the "changes to your files" to the cloud. I know from previous PUG conferences that OpenEdge DBs don't de-dupe well. Has anyone had experience with companies that say they can de-dupe your data, send it to the cloud, and let you recover your files as of that point in time?
I see the process as follows:
Do a probkup to a file; the backup file gets sent to the service, which 1) stores a local copy, 2) uploads only the changes since the last copy to the cloud, and 3) lets you recover the copy from the cloud to a new location or to a cloud server.
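For reference, the backup step I mean is just a plain probkup, plus recording a checksum of the resulting .bak so whatever comes back from the service later can be compared against it. A minimal sketch (the paths are made up, and it assumes the OpenEdge probkup utility is on the PATH):

```python
import hashlib
import subprocess

DB = "/db/production"           # hypothetical database path
BAK = "/backup/production.bak"  # hypothetical backup target

# Take an online backup of the database to a single .bak file.
subprocess.run(["probkup", "online", DB, BAK], check=True)

# Record a SHA-256 of the finished .bak; the file never changes after
# probkup completes, so this is the fingerprint any retrieved copy
# should match byte for byte.
h = hashlib.sha256()
with open(BAK, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
print(h.hexdigest())
```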
The IT department sees this working with all the Windows files, the Exchange server, MSSQL databases, etc. But I don't trust the process with everything I've heard and read about a Progress DB. Since all the extents of the DB change when you start up/shut down, how are they only tracking the changes?
Is anyone using a service like this? What are your experiences? P.S. The database is around 600 GB.
Thanks for any info.
Even if all files are changed when you open an OE DB, not every byte in every file is modified; probably only a few actually change. I'm no dedup expert, but I know they check "blocks" for changes, whatever a block means. Let's say they compare 4 KB at a time, so they can extract just the changed blocks.
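A rough sketch of that idea, assuming fixed 4 KB blocks and two consecutive .bak files (the file names are made up): hash every block of each file and count the positions where the hashes differ. Real appliances often use variable-size, content-defined chunking instead, because with fixed offsets a single inserted byte shifts every block after it and defeats the dedup, which is one reason backup files can dedupe poorly.

```python
import hashlib

BLOCK = 4096  # assume the appliance compares fixed 4 KB blocks

def block_hashes(path):
    """Return a SHA-256 hash for every fixed-size block of a file."""
    hashes = []
    with open(path, "rb") as f:
        while block := f.read(BLOCK):
            hashes.append(hashlib.sha256(block).hexdigest())
    return hashes

old = block_hashes("monday.bak")   # hypothetical earlier backup
new = block_hashes("tuesday.bak")  # hypothetical later backup

# Blocks whose hash differs (plus any length difference) are what a
# block-level dedup scheme would actually have to upload.
changed = sum(1 for a, b in zip(old, new) if a != b) + abs(len(new) - len(old))
print(f"{changed} of {len(new)} blocks differ; only those would be uploaded")
```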
As with any backup solution, the only way to know it is actually working is to go ahead and restore the backup. Restore it, check the database, and if everything comes out clean, then you know the bytes of the backup file were correct at the time the restore operation read them. After the restore read those bytes they might still have been corrupted by whatever corrupts backups :)
Mike, if I understood correctly, the dedup processing would occur on backup files, and those don't change after probkup has finished with them.
So the OP would send a .bak to a service that would store this first .bak entirely. Then on subsequent backups they would only store diffs.
As long as the service can provide me with a byte-by-byte exact copy of what I send them, I don't really care how they store things. But again, can they? The only way to be sure is to test backups regularly; a comparison like the one below is a cheap first check before a full restore.
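A minimal sketch of that check, assuming you still have the original .bak on hand (the paths are made up):

```python
import filecmp

# shallow=False forces a byte-for-byte comparison of file contents,
# not just a check of size and timestamps.
same = filecmp.cmp("/backup/production.bak",
                   "/restore/production.bak", shallow=False)
print("byte-identical" if same else "MISMATCH - do not trust this copy")
```

If the files match, you still want to prorest the copy and check the restored database, since a byte-identical .bak only proves the service gave back what it was given.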
For backups I wouldn't be terribly concerned about integrity per se. As Mike says, if the data is at rest then it should be fine. I would have some concerns about how effective the solution is and the operational issues around it:
- does it really eliminate much data?
- how much overhead does it take to do that?
- is it more, or less, impactful than simply zipping the backup files? (a quick baseline check is sketched after this list)
- how about restores? How long does it take to get your data back when you need to restore?
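On the zipping question, a cheap baseline is to compress one .bak yourself and see what ordinary compression already saves; the vendor's claimed dedup ratio has to beat that to be worth the extra moving parts. A sketch, with a made-up path:

```python
import gzip
import os
import shutil

BAK = "/backup/production.bak"  # hypothetical backup file

# Baseline: how much does plain gzip save on the same file?
with open(BAK, "rb") as src, gzip.open(BAK + ".gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

orig = os.path.getsize(BAK)
comp = os.path.getsize(BAK + ".gz")
print(f"gzip: {orig} -> {comp} bytes ({comp / orig:.1%} of original)")
```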
If they are talking about de-duping the live database on disk I would be very, very worried and insist on a very high level of proof.
This would only be 'backing up to the cloud' the Progress probkup file (database.bak). The .bak file will be copied to a local appliance ("disk to disk" copy), then that appliance dedupes the data and sends it to the cloud. The appliance is supposed to keep 4-5 days' worth of local copies of the data for recovery. The cloud is more of a DR solution for when the local environment is gone. Other than the overhead of the initial copy, there is no overhead I'm responsible for.
I haven't even gotten to the real recovery process because I don't believe in the backup dedupe process. The local appliance is supposed to dedupe the local copy before sending it to the cloud.
I can see it working for file-based systems; I just don't see it working for a probkup file.