Very slow Furgal Test, internal SSD

Posted by James Palmer on 08-Mar-2018 05:54

Got a new laptop (Dell XPS 9560), running Windows 10. 1TB SSD. Just done the Furgal test and it's taken over 30 seconds to grow the bi to 96MB. 

This is the same with OE10.2B, so it doesn't seem to be Progress version related. 

                Thu Mar  8 11:39:58 2018
[2018/03/08@11:39:58.947+0000] P-4964       T-5692  I DBUTIL   : (451)   Bigrow session begin for JamesPalmer on CON:. 
[2018/03/08@11:39:58.950+0000] P-4964       T-5692  I DBUTIL   : (15321) Before Image Log Initialisation at block 0  offset 0. 
[2018/03/08@11:40:21.485+0000] P-4964       T-5692  I DBUTIL   : (7129)  Usr 0 set name to AzureAD\JamesPalmer. 
[2018/03/08@11:40:21.486+0000] P-4964       T-5692  I DBUTIL   : (6600)  Adding 2 clusters to the Before Image file. 
[2018/03/08@11:40:32.211+0000] P-4964       T-5692  I DBUTIL   : (15109) At Database close the number of live transactions is 0. 
[2018/03/08@11:40:32.213+0000] P-4964       T-5692  I DBUTIL   : (15743) Before Image Log Completion at Block 0 Offset 235. 
[2018/03/08@11:40:32.240+0000] P-4964       T-5692  I DBUTIL   : (334)   Bigrow session end. 

I'm presuming therefore this must be something configuration related. The laptop is running plugged in to the wall. It's a similar story on Mike Fechner's identical laptop. 

Would be nice to understand why it's this slow, and maybe get it faster! 

All Replies

Posted by Paul Koufalis on 08-Mar-2018 09:05

James - I also have a Dell XPS 15. On my machine it took 1.125 sec.

Posted by James Palmer on 08-Mar-2018 09:21

Hmmm so something is very wrong!

Posted by ChUIMonster on 08-Mar-2018 09:23

You're running Windows on a shiny new laptop.  Assuming you don't immediately come to your senses and apply The Linux Patch my first avenue of investigation would be to see what sort of extra crapware you're running.  There's  nothing quite like a virus scanner to bring perfectly good hardware to its knees.

Posted by Peter Judge on 08-Mar-2018 09:24

Is it the French-to-english-to-german-and-back translation? :D

Posted by jmls on 08-Mar-2018 09:36

nah, it's the default windows 10 telemetry settings, sending each change of the bi file to MS for "analysis" and then forwarded to the NSA ...

Posted by James Palmer on 08-Mar-2018 11:16

Completely disabled AV and it's still slow. Also slow in a Linux Hyper-V VM.

Posted by ChUIMonster on 08-Mar-2018 11:24

Is your Windows also in a VM?

Posted by ChUIMonster on 08-Mar-2018 11:28

Did you install a FAT16 filesystem instead of something reasonable?

Posted by ChUIMonster on 08-Mar-2018 11:29

Does it have Meltdown and Spectre patches installed?

Posted by Mike Fechner on 08-Mar-2018 11:56

Windows is native (you know me).
 
File system is NTFS

Posted by ChUIMonster on 08-Mar-2018 12:07

Do you know if any of the Meltdown/Spectre patches have been installed?

Posted by James Palmer on 08-Mar-2018 12:21

All patches etc have been installed so I imagine it is fully melting down!

Posted by George Potemkin on 08-Mar-2018 13:24

Is the -zextendSyncIO implemented on Windows?

Tests on my laptop (with/without -zextendSyncIO):

4.206 / 4.408 sec on HDD

1.800 / 1.750 sec on SSD

Posted by James Palmer on 08-Mar-2018 13:26

SyncIO makes very little difference here.

Posted by George Potemkin on 08-Mar-2018 13:32

With/without -zextendSyncIO: 14.317 sec vs 8.918 sec on VMware (Linux) running on the same laptop

proutil sports -C bigrow 2 -r

0.216 sec on HDD

0.094 sec on SSD

Test on USB memory (formatted as FAT32):

bigrow: 9.356 sec vs bigrow -r: 4.637 sec

So the -zextendSyncIO is indeed not required on Windows.
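For anyone wanting to repeat these timings the same way on several machines, here is a minimal sketch of a wall-clock timer around an arbitrary command. The helper name time_command is made up for illustration. It only measures elapsed time and knows nothing about Progress.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


# Example call, using the command line quoted above (assumes proutil is on
# the PATH and a database named sports exists in the current directory):
#   elapsed, code = time_command(["proutil", "sports", "-C", "bigrow", "2", "-r"])
#   print(f"{elapsed:.3f} sec (exit code {code})")
```

The figure includes process start-up, so it will read slightly higher than the difference between the log timestamps.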

Posted by Stefan Drissen on 08-Mar-2018 14:51

Takes 0.9 seconds on my laptop's SSD - but then you seem to have this kind of problem more often James - community.progress.com/.../25200

Have you really rebooted (not the fake windows restart) your laptop?

Posted by ChUIMonster on 08-Mar-2018 15:51

Back in ancient times there used to be some settings buried somewhere like "My Computer" or some such place that, by default, had values such as "Let Windows choose what is best for my computer"...  I'm pretty sure that that option is mutually exclusive with "high performance".

If I recall it was something like "System Properties" -> Advanced -> "Performance Options"

The oh so helpful disk indexer is also probably a bad idea.

Posted by Rob Fitzpatrick on 08-Mar-2018 16:04

Correct Tom.  It's under System Properties; still there in Windows 10.  The "Let Windows choose..." is under Visual Effects; mainly shell animation/rendering options.  There are also options for DEP, paging file, and scheduler quantum.  

But I'm not sure how much tweaking any of that will help with James' problem of slow synchronous writes.

Posted by mfurgal on 08-Mar-2018 16:05

Tom:

That parameter was to determine how much memory to use for file cache vs how much memory to allow processes to consume.

If the bigrow is taking too long and it’s consistent, there is likely a hardware problem.  Look at device drivers, make sure the disk ones are up to date.  Run perfmon to make sure that you are driving the IO during the test.  Somewhere between the _dbutil process issuing a write() call and the disk, something is the bottleneck.  The trick is to find it and sometimes that can be a hard trick.  

Mike
-- 
Mike Furgal
Director – Database and Pro2 Services
PROGRESS Bravepoint
617-803-2870 


Posted by ChUIMonster on 08-Mar-2018 16:48

Hopefully you haven't chosen to have a compressed disk either.

Isn't there also some stuff with Windows tracking changes behind the scenes to create save points for disk recovery purposes?

Do you just have one gigantic C: drive?  Or did you split it into several logical drives?

Are you ready to install The Linux Patch yet?

Posted by Mike Fechner on 08-Mar-2018 16:52

Linux patch not approved by management.
 
One large c:\ drive.

Posted by dbeavon on 08-Mar-2018 18:03

Not sure what Linux has to do with anything.  We had a recent call with the OpenEdge product group, and they assured us that there were no technical reasons why their products should perform substantially differently in Linux vs Windows.  Aside from system calls, the machine code is the same, and runs on the same chips.  Vendors that sell disk are equally happy making money from both OSes.  I've been very happy using OE products on Windows, especially after coming from HP-UX, which is a pretty outdated platform.

If you have disk i/o problems, they should affect other applications too.  You can isolate disk i/o problems first, before assuming the fault is with Progress.

I'd highly recommend a disk benchmarking tool like "sqlio" (don't be scared by the name, it is independent of SQL server). 

You can download it from here:
https://www.microsoft.com/en-us/download/details.aspx?id=20163

Here are some basic instructions to getting started.
https://www.red-gate.com/simple-talk/sql/database-administration/the-sql-server-sqlio-utility/


You can compare your sqlio results between a system that performs poorly and one with acceptable I/O performance. The tool will allow you to measure both random and sequential I/O.  And it will allow you to specify the number of outstanding/concurrent I/O operations. It will also let you specify your choice of block size.

30 seconds sounds like a very long time for this. Since you are using Windows, couldn't you simply open Resource Monitor and see if there is a disk issue from there?  I'd suggest looking at the total throughput in bytes/sec and also at the response times.  An SSD's response time should stay very low (<3 ms) for most single-user activities.  Note that you can see virus scanning activity from here too (e.g. MsMpEng).

 

If you don't see anything of interest on the Disk side of things then flip over to the CPU tab and see if you are capped on a CPU core.  Remember that you can be maxed out on CPU *before* you reach 100%.  For example if you have four logical cores and have a synchronous workload, you may only see CPU reach the 25% mark but this still implies that your process is in fact CPU-bound.

It is quite common for Progress database operations to be CPU-bound, as frustrating and counter-intuitive as it may seem.  And when they do become CPU-bound, often times the operations neglect to use more than one of your logical cores at a time.   
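The 25% example above is simple arithmetic, sketched here as a small helper. The function names and the 0.9 tolerance are arbitrary assumptions for illustration, not anything Resource Monitor or Progress defines.

```python
def single_core_ceiling(logical_cores):
    """Overall CPU % displayed when exactly one logical core is fully busy."""
    return 100.0 / logical_cores


def looks_single_thread_bound(overall_cpu_percent, logical_cores, tolerance=0.9):
    """True if overall CPU sits at (or near) the one-core ceiling.

    tolerance is a made-up fudge factor: a single-threaded process rarely
    shows exactly 100/N percent because of sampling and other activity.
    """
    return overall_cpu_percent >= tolerance * single_core_ceiling(logical_cores)


print(single_core_ceiling(4))             # 25.0
print(looks_single_thread_bound(24, 4))   # True
print(looks_single_thread_bound(10, 4))   # False
```

Bear in mind the scheduler may move a single-threaded process between cores, so the per-core graphs will not necessarily show one core pegged even when this check is true.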

Posted by bronco on 09-Mar-2018 02:19

I got the same problem on my XPS15 9560, 512GB SSD. 26 seconds...

On my previous laptop (4y old by now) it took ~3s...

Posted by George Potemkin on 09-Mar-2018 02:22

> I got the same problem on my XPS15 9560, 512GB SSD. 26 seconds...

Is 'bigrow -r' fast in your case?

Posted by Mike Fechner on 09-Mar-2018 02:22

The old one also Windows 10?
 
It’s just a database. Wouldn’t call it a problem though 😉

Posted by bronco on 09-Mar-2018 03:18

Both are running W10, latest and greatest. Drivers and everything else are up to date, as always. Although diskspd.exe suggests that everything is pretty fast, I do have the system freezing for a couple of seconds (unable to move the mouse etc).  All in all it's not that the laptop is slow, it's just not as fast as I expected it to be (every now and then). The "furgal" test seems like a symptom.

bigrow -r is fast.

Posted by James Palmer on 09-Mar-2018 04:11

With -r it is almost instant for me.

Posted by George Potemkin on 09-Mar-2018 04:30

As far as I understand the synchronous I/O on Windows are performed by process itself while asynchronous I/O - by the kernel (a process just sends I/O request to the kernel):

msdn.microsoft.com/.../aa365683(v=vs.85).aspx

Could it be just a matter of a privilege for account that run proutil?

Is there any difference if you run proenv as administrator?

Posted by James Palmer on 09-Mar-2018 04:39

With the way my system is implemented I have to run proenv with elevated permissions.

Posted by James Palmer on 09-Mar-2018 06:33

@dbeavon Thanks for your detailed insights. I've run a disk benchmark on it and the usual stuff is very quick. Quicker than I've seen on most systems I've run the same benchmarker on.

Posted by ChUIMonster on 09-Mar-2018 08:09

> Not sure what Linux has to do with anything

If I failed to mention The Linux Patch the forum admins would be likely to conclude my account has been hacked.   Think of it as my "safe phrase".

Aside from general light hearted ribbing -- while the point that you and Progress make about the same hardware etc is valid there is one very obvious  difference.  A Windows laptop or desktop OS is, by default, going to heavily favor the user's GUI experience.  It is definitely not going to be optimized in a server like manner.  By default Linux is a much more server oriented platform.

As surprising to some people as it may seem I do happen to agree that Windows *can* run a Progress DB just as well as Linux on the same hardware.  In my experience it is, however, less likely that Windows will do so out of the box and when there are issues it is less likely that you will find appropriate knowledge and resources available.

Posted by ChUIMonster on 09-Mar-2018 08:10

Do these disk IO benchmarks that you guys are running have options for synchronous IO?
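If the benchmark tool at hand has no such option, a rough sketch like the following can show whether forcing every write to disk is what hurts. It writes the same amount of data twice, once leaving it to the OS cache and once calling os.fsync after every block. The 16 MB total and 16 KB block size are arbitrary assumptions, and this is not a reproduction of what _dbutil does internally (Progress may open the file with different flags), so treat the numbers as indicative only.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size, sync_each_write):
    """Write at least total_bytes of zeros to path; return elapsed seconds."""
    block = b"\0" * block_size
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffer; the OS cache still applies.
    with open(path, "wb", buffering=0) as f:
        written = 0
        while written < total_bytes:
            written += f.write(block)
            if sync_each_write:
                # Ask the OS to flush this file to the device before continuing.
                os.fsync(f.fileno())
    return time.perf_counter() - start


def compare(total_bytes=16 * 1024 * 1024, block_size=16 * 1024, directory=None):
    """Return {'buffered': seconds, 'synced': seconds} for the same workload."""
    results = {}
    for label, sync in (("buffered", False), ("synced", True)):
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[label] = timed_write(path, total_bytes, block_size, sync)
        finally:
            os.remove(path)
    return results


# Example: point directory at the drive under test, e.g.
#   print(compare(directory="."))
```

A large gap between the two numbers on one machine, and a small gap on another with comparable hardware, would point at the flush path (driver, firmware, write-cache settings) rather than raw throughput.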

Posted by dbeavon on 09-Mar-2018 09:55

As far as the GUI point goes, yes, there are server and desktop versions of windows. For the OE dbms you would probably want to run on a *server* version of windows in production.  

You also want to disable all the "bells and whistles"; turn off DEP, and set your virus scanning exclusions, and disable offline files in the sync center, etc.   It would be very important to disable all the "unnecessary" security overhead to put the two platforms on a level playing field.  Beyond that I would need to see specific evidence to believe Linux is faster in a given scenario.  I'd guess it depends primarily on the specific app in question,  and on the related developers.  Most applications developers would probably optimize for the platform they use every day and then port things to the others (or let some other outsource developer do that part of the work).  I must say that this is one thing that is so painful about HP-UX.  Nobody is optimizing for HP-UX anymore.  The bar is low - it either works, or you don't get to check that box.  HP-UX is not going to be the focus for any performance optimizations.

David

PS.

It's not hard evidence of anything, but I wanted to tell a Windows vs Linux story of my own.  In one case I believe I had observed that Progress had done their initial software development on Windows and then ported to Linux/UNIX, overlooking a performance degradation.  Here is the KB.  knowledgebase.progress.com/.../State-Free-AppServer-Calls-on-UNIX-Linux-May-Perform-Poorly .  The repro I used when I reported this case was quite basic.  It involved making a number of simple appserver calls in a tight loop with a prodataset.   Given the overhead (an extra ~50-100 ms per round-trip) I don't think it is something a software developer would overlook if it happened on a platform they used every day.

Posted by Libor Laubacher on 09-Mar-2018 12:27

>> You're running Windows on a shiny new laptop.  Assuming you don't immediately come to your senses and apply The Linux Patch

Says someone who is using mac.

[2018/03/09@19:19:33.776+0100] P-22076      T-22612 I DBUTIL   : (6600)  Adding 2 clusters to the Before Image file.

[2018/03/09@19:19:34.203+0100] P-22076      T-22612 I DBUTIL   : (15109) At Database close the number of live transactions is 0.

And that Win10 laptop is not even new and definitely not shiny with loads of crapware as noted above (both AV and Defender are on).

@James - get Process Monitor from sysinternals, catch the output of the bigrow and see what's going on the machine during the bigrow.

Posted by jmls on 09-Mar-2018 12:31

>>Says someone who is using mac.

this. lmfao :)

Posted by ChUIMonster on 09-Mar-2018 15:30

> Says someone who is using mac.

I don't run Progress databases on it.  I use it to open SSH Windows to servers that run Progress databases.

Posted by n.lewers on 10-Mar-2018 11:23

Are there any settings on the  Intel® Rapid Start Technology app that you can  change?

Posted by James Palmer on 12-Mar-2018 04:37

@Libor Laubacher Tried procmon and there's nothing that jumps out at me as being unusual. There's a lot going on though! Might play around with it a bit more later. _dbutil hits the file immediately so it's not like there's a delay or anything.

@n.lewers Tried turning off the only obviously optional setting, rebooted, and retested. No joy.

Posted by George Potemkin on 12-Mar-2018 04:51

> _dbutil hits the file immediately so it's not like there's a delay or anything.

It's expected.

[2018/03/08@11:39:58.950+0000] P-4964       T-5692  I DBUTIL   : (15321) Before Image Log Initialisation at block 0  offset 0. 
[2018/03/08@11:40:21.485+0000] P-4964       T-5692  I DBUTIL   : (7129)  Usr 0 set name to AzureAD\JamesPalmer. 

Dbutil added four initial bi clusters (22.535 sec).

[2018/03/08@11:40:21.486+0000] P-4964       T-5692  I DBUTIL   : (6600)  Adding 2 clusters to the Before Image file. 
[2018/03/08@11:40:32.213+0000] P-4964       T-5692  I DBUTIL   : (15743) Before Image Log Completion at Block 0 Offset 235. 

Dbutil added another two bi clusters (10.727 sec).

Time per cluster is approximately the same in both phases.
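The subtraction above can be reproduced from the log lines themselves. This is a minimal sketch that assumes the timestamp layout seen in this thread ([YYYY/MM/DD@HH:MM:SS.mmm+ZZZZ]); the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches the leading timestamp of a database log line as shown in this thread.
STAMP = re.compile(r"\[(\d{4}/\d{2}/\d{2})@(\d{2}:\d{2}:\d{2}\.\d{3})([+-]\d{4})\]")


def parse_stamp(line):
    """Return the timezone-aware datetime at the start of a log line."""
    m = STAMP.search(line)
    if m is None:
        raise ValueError("no timestamp found in: " + line)
    date, clock, offset = m.groups()
    return datetime.strptime(f"{date} {clock} {offset}", "%Y/%m/%d %H:%M:%S.%f %z")


def seconds_between(first_line, second_line):
    """Elapsed seconds from first_line to second_line."""
    return (parse_stamp(second_line) - parse_stamp(first_line)).total_seconds()


init = "[2018/03/08@11:39:58.950+0000] P-4964 T-5692 I DBUTIL : (15321) Before Image Log Initialisation"
name = "[2018/03/08@11:40:21.485+0000] P-4964 T-5692 I DBUTIL : (7129) Usr 0 set name"
add2 = "[2018/03/08@11:40:21.486+0000] P-4964 T-5692 I DBUTIL : (6600) Adding 2 clusters"
done = "[2018/03/08@11:40:32.213+0000] P-4964 T-5692 I DBUTIL : (15743) Before Image Log Completion"

phase1 = seconds_between(init, name)  # four initial clusters
phase2 = seconds_between(add2, done)  # two added clusters
print(round(phase1, 3), round(phase2, 3))
print(phase1 / 4, phase2 / 2)
```

That gives 22.535 and 10.727 seconds, so roughly 5.6 and 5.4 seconds per cluster. If the 96MB from the opening post is spread evenly over these six clusters (an inference, not something the log states), that is 16MB per cluster and a write rate of roughly 3MB/s.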

Posted by kirchner on 13-Mar-2018 07:09

> As far as I understand the synchronous I/O on Windows are performed by process itself while asynchronous I/O - by the kernel (a process just sends I/O request to the kernel)

All kinds of I/O are performed by the kernel. The usermode does not have direct access to the devices/drivers/etc. I believe it goes this way in every OS, doesn't it?

Also, from the kernel POV pretty much everything in the I/O stack is async. If the request can be served immediately, say from the filesystem cache, then it returns control to the application and off it goes. But if the request needs to hit the actual device then the kernel dispatches a request to the driver which will eventually set the device in motion. Once the device is in motion the driver returns and there is no waiting/blocking in the kernel. If the request was a sync one, the requesting thread (usermode) is put to sleep until the request is completed. For async requests the usermode thread gets control back and continues its execution.

> Could it be just a matter of a privilege for account that run proutil?

AFAIK there is no such thing as sync/async privileges. Once you can do I/O to a handle, you can do it sync or async, buffered or unbuffered, whatever suits better.

Posted by gus bjorklund on 13-Mar-2018 09:06

while it is true that the bigrow test performs synchronous writes, that is a technical detail that may be interesting to discuss but it is entirely beside the point. the point is that if this test runs poorly, whatever it does, the database and application probably also run poorly.

Posted by Ruanne Cluer on 12-Apr-2018 05:54

You're not backing up your bi files to OneDrive or similar?

Posted by James Palmer on 12-Apr-2018 07:09

No but thanks for the question :) This is in a completely scratch folder with a Sports2000 database.

This thread is closed