we have a basic 160mb database.
Using win32 progress 10.1B...
Using the Progress editor (pro), it takes roughly 33 minutes to load this 160 MB database.
Using a bulkloader file, the data loads in about 10 seconds. Then another 1 minute to run an index build.
The server is a quad core xeon 2.xxx ghz proc with 16gb ram. Raid-10 disk setup.
On my presumed much slower laptop running progress 10.1b in linux, this same load takes 12 minutes, not 33, using the editor. The bulkload/idxbuild alternative runs super fast as well.
Is there something about win32 Progress that makes it run significantly slower in the editor, and is there any way around it?
thanks for any input
I'd surmise that the lack of a -B may be your problem with the editor load: the default for a single-user session is pretty small. Most likely it's also doing one transaction per record, which is incredibly slow.
First suggestion would be to try
pro db-name -B 5000
If that doesn't do it for you, then write a small import program that groups the record creates into transactions of 100 records or so, instead of one transaction per record.
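To illustrate the "one transaction per group of records" idea outside of Progress, here is a minimal Python sketch. It uses SQLite as a stand-in database purely to show the commit-every-N pattern; it is not Progress code, and the table `t`, the function `load_in_batches`, and the 250 sample rows are all made up for the example.

```python
import sqlite3
from itertools import islice


def load_in_batches(conn, rows, batch_size=100):
    """Insert rows, committing once per batch rather than once per row.

    Returns the number of commits performed.
    """
    it = iter(rows)
    commits = 0
    while True:
        # Pull the next group of up to batch_size rows.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", batch)
        conn.commit()  # one transaction per batch
        commits += 1
    return commits


# Hypothetical data: 250 rows loaded into an in-memory stand-in table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
rows = [(i, f"row{i}") for i in range(250)]
commits = load_in_batches(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(commits, count)
```

With 250 rows and a batch size of 100, the load commits three times instead of 250 times, which is the saving the suggestion above is after.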
There are a lot of variables.
1) The dictionary load is easily the slowest commonly used way to load data.
2) Disk IO is usually the biggest bottleneck. Not CPU speed. But RAID10 vs a laptop disk should have worked out differently. My guess is that the disks that you think are RAID10 are not. They might be RAID10/2.
3) Neither the dictionary load nor bulk load will multi-thread.
4) Various startup parameters and config settings will impact load times. You haven't said anything about that so it is hard to judge.
5) And so will other activity on the OS.
6) The problem with win32 Progress would be that it is Windows.
With a mere 160mb the magnitude of the bulk load improvement may be partly due to the data being cached from the initial dictionary load.
Just guessing but is your laptop a single-core laptop?
Do you have a server running or are you doing these loads in single-user mode?
Where is the bi file located in these tests? What is the bi cluster size?
Did you try loading with -i?
Did you set -B?
How about -T?
-TB and -TM?
If a server is running did you set -spin?
Tim makes a good point. I somehow assumed that you were using the data dictionary from the editor.
If you are instead using something along the lines of:
input from "data.d".
/* each iteration of the repeat block is its own transaction */
repeat:
  create tableName.
  import tableName.
end.
input close.
then you have found the "even slower than the data dictionary" method.
_mprosrv -B 50000 -spin 50000 -L 50000
160mb in 33mins!!
server has 16gb ram and quad xeons. the raid 10 is raid 10
the same setup on my laptop running linux in vmware with 768mb ram and a dual core 2ghz cpu takes 12 mins.
the bulkload runs in about 10 seconds and then another minute or so for the index rebuild.
is windows just that much slower than linux? it's obnoxious. unfortunately the customer won't go linux.
yes i'm doing this from the data dictionary in the editor.
_mprosrv -B 50000 -spin 50000 -L 50000
Try cutting -spin down, that's way too high. I attended Rich B's session on db latches and such, and setting -spin too high can result in the CPU spending more time spinning than doing actual work.
There's something weird going on here. However, a dictionary load's not that fast either.
i guess i'm just noticing a huge performance difference between the typical linux systems we work with and then this windows system in terms of the loading times.
we have standard data sets that get loaded at installation time and it's 3 to 4 times slower on a similarly configured windows vs linux box.
and then to see my laptop in vmware smoked the server...
even win32 progress on my laptop edged it out by 5 minutes.
Did you do a promon on the db while it was loading? That should provide some clues as to why things are going so slow.
I can think of a number of other possibilities, but to me 33 min for 160MB can only be interpreted as something's seriously different between your win and linux platforms which is causing the long load times.
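For a sense of scale, the timings quoted in this thread can be turned into rough throughput figures. This is only back-of-the-envelope arithmetic in Python: it assumes the "160mb" figure approximates the amount of data actually moved, and the function name `throughput_mb_per_s` is made up for the example.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average throughput in MB/s for a load of size_mb taking `seconds`."""
    return size_mb / seconds


SIZE_MB = 160  # database size as quoted in the thread

# Elapsed times quoted in the thread, converted to seconds.
timings = {
    "Windows server, dictionary load": 33 * 60,
    "Linux laptop, dictionary load": 12 * 60,
    "bulk load + index build": 10 + 60,
}

for label, seconds in timings.items():
    rate = throughput_mb_per_s(SIZE_MB, seconds)
    print(f"{label}: {rate:.3f} MB/s")
```

That works out to roughly 0.08 MB/s for the Windows dictionary load, 0.22 MB/s for the Linux laptop, and about 2.3 MB/s for the bulk load plus index build. All three are far below what even a single disk can sustain for a sequential copy.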
Like Tim said, -spin 50,000 is probably too high. Try 5,000. (But I doubt that that is the main problem.)
What makes you so sure that it is really RAID10? It sure isn't acting like RAID10.
Is the bi file on the same "drive" as the database? What about ai?
One more thought... Windows is the slow box, right? Are you running antivirus software? How about anti-spyware software? Do you have the Windows search thingie enabled? Have you shut off the screen saver?
Oh, wait, a new last thought! Are you starting a local self-service session to do the load, or have you connected client/server to do the load? IOW, did you use -S in your client session startup?
Is the db on a network drive? (Please say "no"!)
everything is local. the guys built the raid array as raid 10. i can only trust that they are smarter than monkeys and picked the right option.
bi is on the same logical drive, drives are NOT networked and i am connecting shared memory not client/server
bi location shouldn't matter because i'm not worried about performance tuning here, just the relative speed difference between windows and linux.
this smokes on a linux box.
no AI running for this load
Given what you've written so far, I'd find some Windows performance monitoring tools and see how busy your system is. If the CPU's pegged, or the disk drive isn't too busy, there's a problem somewhere besides the db.
Also, the BI location does affect performance depending on what hard drive subsystem it's on.
How fast can this system do a straight DOS or Explorer copy of a 160 MB file? That would be the theoretical upper limit of its performance.
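If a scripted version of that copy test is handier than a stopwatch, something along these lines could work. It is a rough Python sketch, not a benchmark tool: the function name `time_copy` is made up, the file is written to the system temp directory (which may not be on the same drive as the database), and OS file caching can make the copy look faster than the disks really are.

```python
import os
import shutil
import tempfile
import time


def time_copy(size_mb=160):
    """Write a size_mb file of random bytes, copy it, and time the copy.

    Returns (bytes_copied, elapsed_seconds).
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")

        # Build the source file 1 MB at a time.
        chunk = os.urandom(1024 * 1024)
        with open(src, "wb") as f:
            for _ in range(size_mb):
                f.write(chunk)

        # Time only the copy itself, not the file creation.
        start = time.perf_counter()
        shutil.copyfile(src, dst)
        elapsed = time.perf_counter() - start

        copied = os.path.getsize(dst)
    return copied, elapsed
```

Calling `time_copy(160)` and dividing the byte count by the elapsed time gives a copy rate to compare against the load times reported above. To test the database drive specifically, the temp directory would need to point at that drive.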