11.5.1 Win 32: Why is a backup online to an exiting file so

Posted by Stefan Marquardt on 10-Jun-2016 04:11

I noticed that if I do not delete the backup file before the online backup is running much slower (2-3 times)
It's easy to add a del db.bak in the script but I am interesting about the possible cause.
Does anybody knows it?

All Replies

Posted by e.schutten on 10-Jun-2016 04:26

We have the same over here. When there is an old backup file, the OpenEdge database backup take much more time than when there is no backup file. My quess is that Progress is reordering the file (with BI blocks?)

BTW We have this on Windows.

Posted by Gareth Vincent on 10-Jun-2016 04:34

I thought I would quickly test this on Linux (OE11.6).  No significant difference

Posted by George Potemkin on 10-Jun-2016 04:38

Correct, it's an issue on Windows only. And it's not Progress issue, me think.

Posted by Stefan Marquardt on 10-Jun-2016 05:35

Not a progress issue?

probkup have to delete the file itself or throw a warning or something else.

Nobody would expect that an existing file is the problem for a long running backup.

PSC: I like to get the source code please to know what's the secret for this behavior.

Posted by George Potemkin on 10-Jun-2016 05:42

Probkup was created to write initially to a tape rather than to a file.

Posted by Stefan Marquardt on 10-Jun-2016 06:08

For Windows too? A virtual device is not very common in windows backup software, isn't it?

The measured times:

20 Minutes to an non existing file

~40 Minutes redo the same backup but the previous file is still there

Posted by e.schutten on 10-Jun-2016 06:18

I don't think that it is a Windows issue. When we do SQL server backup, than it is fast. So my guess is that OpenEdge is doing something with the backup file. I think that the BI blocks are at begin of the file and then the data blocks is written. So when there is more or less BI blocks, than OpenEdge is trying to reorder something in the backup file. But that is an assumption of my!!

Posted by George Potemkin on 10-Jun-2016 06:52

> I think that the BI blocks are at begin of the file and then the data blocks is written. So when there is more or less BI blocks, than OpenEdge is trying to reorder something in the backup file

I'm sure that the size of BI file in Stefan's database is much smaller than the size of data areas. Then we can ignore the existence of BI blocks. And since V11.2 online probkup dumps only active BI clusters. BTW, probkup reports the number of blocks it dumped. Stefan, can you post the numbers?

> 20 Minutes to an non existing file

>~40 Minutes redo the same backup but the previous file is still there

Check db log:

BACKUP 5: (1362) Full backup started.
BACKUP 5: (16866) Dumped 256 active BI blocks.

BACKUP 5: (5461) Begin backup of Data file(s).
BACKUP 5: (5462) End backup of Data file(s).

Does backup slow while dumping BI blocks? Or Data files? Or both?

Does the issue exist for offline backup as well? Or for two sequential online backup with no database activity in-between? In other words when backups are totally identical. Then what Progress would reorder in backup file?

Posted by Rob Fitzpatrick on 10-Jun-2016 06:55

> And since V11.2 online probkup dumps only active BI clusters.

That feature was introduced in 11.3 and was fixed to be usable in 11.5.

Posted by George Potemkin on 10-Jun-2016 07:27

Message 16866 (Dumped <n> active BI blocks) was added in 11.2. Yes, I know it's wrong to reply on promsgs file. ;-)

Posted by Rob Fitzpatrick on 10-Jun-2016 07:45

So we know the message was there in 11.2.  Was the feature there?  

Posted by gus on 10-Jun-2016 09:06

> On Jun 10, 2016, at 6:43 AM, George Potemkin wrote:

>

> Probkup was created to write initially to a tape rather than to a file.

true, but this makes almost no difference for what it does, which is to write sequentially to the output file or device. it certainly does not explain this behaviour, which does not occur on other operating systems.

Posted by James Palmer on 10-Jun-2016 09:42

Testing on 11.6 34 bit:

500MB backup file

Backup online to new file:

[2016/06/10@15:37:09.732+0100] P-7036       T-308   I BACKUP  6: (1362)  Full backup started.

[2016/06/10@15:37:15.484+0100] P-7036       T-308   I BACKUP  6: (1364)  Full backup successfully completed.

Backup online with existing file, no database activity between: 7 seconds.

[2016/06/10@15:37:25.936+0100] P-5964       T-428   I BACKUP  6: (1362)  Full backup started.

[2016/06/10@15:37:32.610+0100] P-5964       T-428   I BACKUP  6: (1364)  Full backup successfully completed.

Backup online with existing file and database activity between: 7 seconds.

[2016/06/10@15:38:32.596+0100] P-6716       T-1632  I BACKUP  7: (1362)  Full backup started.

[2016/06/10@15:38:39.426+0100] P-6716       T-1632  I BACKUP  7: (1364)  Full backup successfully completed.

Posted by Thomas Mercer-Hursh on 10-Jun-2016 09:50

One might note that the safe thing to do is never to write over and over again to the same backup file name.  Even if you think you have a process which then copies that file somewhere else, if you are ever wrong that it got copied and your backup fails, then you have just destroyed your last backup.  Write to a backup file name with a date-time stamp name and this issue will never arise.

Posted by gus on 10-Jun-2016 09:56

indeed. good advice thomas.

the last time i had to restore a customer's database, the last nightly backup was no good because the had done just that and then the backup failed. and had failed several times. fortunately they were using after-image journalling and had a week old backup.

Posted by Thomas Mercer-Hursh on 10-Jun-2016 10:02

Many years ago I had a customer who set up a backup process, I think direct to tape.  The operator was supposed to rotate the tapes in some fashion like a new tape each day of the week and then save the Friday tape for a while and the end of month tape for longer, but at some point, for reasons unknown, the operator stopped bothering to change the tape at all.  Eventually, the DB failed and it turned out that the last good backup was close to a year old.  Ferocious rekeying!!!

Posted by Paul Koufalis on 10-Jun-2016 10:02

If I may add my $0.02CAD:  On WIndows, when probkup starts and fails, it still changes the timestamp on the physical DB file.

I remember a specific situation where a customer had no more empty AI extents so the backup had been failing for weeks. But every day the existing 400 GB probkup file was touched and the admin was incorrectly relying on the file time stamp to assure himself that his backups were good.  

Posted by George Potemkin on 10-Jun-2016 11:24

Customer's example.

Backup time of 144.6 GBytes:
1. 3934.319 sec = 01:05:34.319
2. 59646.163 sec = 16:34:06.163
3. 61128.497 sec = 16:58:48.497

Backup time of 144.6 GBytes:
1.  3934.319 sec = 01:05:34.319
2. 59646.163 sec = 16:34:06.163
3. 61128.497 sec = 16:58:48.497


[2015/03/07@00:05:01.551+0300] P-4688       T-3456  I BACKUP 11: (12850) Backup blocks will be written to e:\backup\dbname.bkp.
[2015/03/07@00:05:01.551+0300] P-4688       T-3456  I BACKUP 11: (1362)  Full backup started.
[2015/03/07@00:05:01.552+0300] P-4688       T-3456  I BACKUP 11: (6686)  37437691 active blocks out of 45569365 blocks in e:\db\dbname will be dumped.
[2015/03/07@00:05:01.553+0300] P-4688       T-3456  I BACKUP 11: (6688)  222784 BI blocks will be dumped.
[2015/03/07@00:05:01.554+0300] P-4688       T-3456  I BACKUP 11: (9285)  Backup requires an estimated 144.8 GBytes of media.
[2015/03/07@00:05:01.554+0300] P-4688       T-3456  I BACKUP 11: (9286)  Restore would require an estimated 37960087 db blocks using 144.5 GBytes of media.
[2015/03/07@00:05:02.009+0300] P-4688       T-3456  I BACKUP 11: (16866) Dumped 256 active BI blocks.
[2015/03/07@00:05:02.010+0300] P-4688       T-3456  I BACKUP 11: (5461)  Begin backup of Data file(s).
[2015/03/07@01:10:36.329+0300] P-4688       T-3456  I BACKUP 11: (5462)  End backup of Data file(s).
DATA BACKUP TIME: 3934.319 sec = 01:05:34.319
[2015/03/07@01:10:36.331+0300] P-4688       T-3456  I BACKUP 11: (13625) Wrote a total of 1103359 backup blocks using 143.1 GBytes of media.
[2015/03/07@01:10:36.406+0300] P-4688       T-3456  I BACKUP 11: (1364)  Full backup successfully completed.


[2015/03/08@00:05:01.301+0300] P-7664       T-6404  I BACKUP 11: (12850) Backup blocks will be written to e:\backup\dbname.bkp.
[2015/03/08@00:05:01.301+0300] P-7664       T-6404  I BACKUP 11: (1362)  Full backup started.
[2015/03/08@00:05:01.301+0300] P-7664       T-6404  I BACKUP 11: (6686)  37449527 active blocks out of 45569365 blocks in e:\db\dbname will be dumped.
[2015/03/08@00:05:01.303+0300] P-7664       T-6404  I BACKUP 11: (6688)  222784 BI blocks will be dumped.
[2015/03/08@00:05:01.304+0300] P-7664       T-6404  I BACKUP 11: (9285)  Backup requires an estimated 144.9 GBytes of media.
[2015/03/08@00:05:01.304+0300] P-7664       T-6404  I BACKUP 11: (9286)  Restore would require an estimated 37971947 db blocks using 144.6 GBytes of media.
[2015/03/08@00:05:01.666+0300] P-7664       T-6404  I BACKUP 11: (16866) Dumped 256 active BI blocks.
[2015/03/08@00:05:01.667+0300] P-7664       T-6404  I BACKUP 11: (5461)  Begin backup of Data file(s).
[2015/03/08@16:39:07.830+0300] P-7664       T-6404  I BACKUP 11: (5462)  End backup of Data file(s).
DATA BACKUP TIME: 59646.163 sec = 16:34:06.163
[2015/03/08@16:39:07.864+0300] P-7664       T-6404  I BACKUP 11: (13625) Wrote a total of 1103709 backup blocks using 143.2 GBytes of media.
[2015/03/08@16:39:07.898+0300] P-7664       T-6404  I BACKUP 11: (1364)  Full backup successfully completed.


[2015/03/09@00:05:01.020+0300] P-3684       T-1648  I BACKUP 11: (12850) Backup blocks will be written to e:\backup\dbname.bkp.
[2015/03/09@00:05:01.021+0300] P-3684       T-1648  I BACKUP 11: (1362)  Full backup started.
[2015/03/09@00:05:01.021+0300] P-3684       T-1648  I BACKUP 11: (6686)  37458573 active blocks out of 45569365 blocks in e:\db\dbname will be dumped.
[2015/03/09@00:05:01.022+0300] P-3684       T-1648  I BACKUP 11: (6688)  222784 BI blocks will be dumped.
[2015/03/09@00:05:01.023+0300] P-3684       T-1648  I BACKUP 11: (9285)  Backup requires an estimated 144.9 GBytes of media.
[2015/03/09@00:05:01.024+0300] P-3684       T-1648  I BACKUP 11: (9286)  Restore would require an estimated 37981012 db blocks using 144.6 GBytes of media.
[2015/03/09@00:05:01.427+0300] P-3684       T-1648  I BACKUP 11: (16866) Dumped 256 active BI blocks.
[2015/03/09@00:05:01.428+0300] P-3684       T-1648  I BACKUP 11: (5461)  Begin backup of Data file(s).
[2015/03/09@17:03:49.925+0300] P-3684       T-1648  I BACKUP 11: (5462)  End backup of Data file(s).
DATA BACKUP TIME: 61128.497 sec = 16:58:48.497
[2015/03/09@17:03:49.946+0300] P-3684       T-1648  I BACKUP 11: (13625) Wrote a total of 1103974 backup blocks using 143.2 GBytes of media.
[2015/03/09@17:03:49.977+0300] P-3684       T-1648  I BACKUP 11: (1364)  Full backup successfully completed.

Posted by gus on 10-Jun-2016 13:06

yes. we know that this happens on windows. but we do not know why.

Posted by George Potemkin on 10-Jun-2016 23:20

Comparing the results posted by Stefan and James and the results from our customer I think we can say that the ratio between the times to backup to an existent vs new file is growing as a power of backup's size. And taking into account the units of measurement we, of course, should use a ratio of backup's size to a size of something else. The only candidate that come to my mind is the system memory. James' example: 500MB backup file is, of course, smaller than available RAM and the backup time is the same when it uses an existent or new file. Stefan's example: backup to an exiting file is 2-3 times slower. "20 Minutes to an non existing file" - I guess the size of backup is approximately 40 GB (estimating the backup's writes as 30 MB/sec). Most likely backup's size is a few times larger than RAM. In the case of our customer the size of backup file is almost 150 GB - much bigger than available memory and backup to an exiting file is 16-17 times slower.

If the issue indeed depends on ratio between the size of backup and system memory then I would say it's not a Progress issue.

I guess the issue can be reproduced by a simple 4GL program (provided you have 40 GB of free space on your disk and available RAM is a few times smaller than 40 GB):

OUTPUT TO VALUE(OutputFile) APPEND.
SEEK OUTPUT TO 0.
DO  i = 10000000:
  PUT StringWith4KSize.
END.
OUTPUT CLOSE.

Posted by gus on 11-Jun-2016 12:34

some further thought

- It makes sense that the backup would slow down when the in-memory filesystem cache was exhausted. But that should be so whether we are writing to a new file or an existing one.

- Writing over an existing file should be /faster/ since disk space has alreay been allocated.

- If backup blocks are not a multiple of the file system blocksize (and I do not recall what the backup blocksize is) then that could introduce an inefficiency in writing. If a backup block were say 3 and a half filesystem blocks, then writing a backup block would overwrite 3 filesystem blocks and update one, which requires reading it and pasting new data over half of it. That might require waiting for a full rotation of the disk platter. When writing a new file, no reading is necessary. Maybe that is the cause.

Posted by George Potemkin on 11-Jun-2016 12:57

Yes, I totally forgot that probkup uses "XXL size" backup blocks: the -bf * db blocksize where the default for the blocking factor parameter is 34. IMHO, it would be a good idea to try to run probkup to an existent file with the -bf 1 just for a testing.

And the -verbose parameter can shed a light if probkup is slow from the beginning or the slowness is increasing over time: probkup will display "Backed up n blocks in hh:mm:ss" every 10 seconds.

> then writing a backup block would overwrite 3 filesystem blocks and update one, which requires reading it and pasting new data over half of it.

But it should not be 16 times slower, could it be?

Posted by Stefan Marquardt on 13-Jun-2016 05:22

-bf 1 with existing file is fast:

OpenEdge Release 11.5.1 as of Wed May  6 19:01:46 EDT 2015

11543502 active blocks out of 11669279 blocks in [db] will be dumped. (6686)

16384 BI blocks will be dumped. (6688)

Backup requires an estimated 88.3 GBytes of media. (9285)

Restore would require an estimated 11579658 db blocks using 88.2 GBytes of media. (9286)

Backed up 11559886 db blocks in 00:21:50

Wrote a total of 4014039 backup blocks using 61.2 GBytes of media. (13625)

Backup complete. (3740)

Posted by Stefan Marquardt on 13-Jun-2016 05:24

Backup to SAN which shall be SSD, so there isn't any rotation of something (hopefully)

Posted by Richard Shulman on 13-Jun-2016 07:31

Hello All,

I made an article for this in the Progress Knowledgebase siting the two known workarounds.

I believe the improvement noted when using -bf 1 suggests it is something happening in the buffering of data by the probkup online but this will have to be investigated more internally.

I'll file a bug for it.

Thank you

This thread is closed