Compiler Performance in OE Dev Studio (slow access to databa

Posted by dbeavon on 16-Aug-2012 17:30

Compiling large projects in OE Dev Studio can be very slow.  I've only found a few tricks for speeding it up.

The most substantial obstacle to good compiler performance seems to be the network interaction with the remote OE database.  I believe this network chatter is happening in order to retrieve and/or validate database schema.

If I bring over an empty database copy to my local machine, I can get my local projects to compile very quickly, using about 80% CPU utilization.  If the database is remote, however, it takes over twice as long to compile, and the local CPU stays fairly idle at only 30% utilization.
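As a rough sanity check on those two utilization figures, the sketch below (Python, purely illustrative) assumes the compiler does the same total amount of CPU work in both cases and that the utilization numbers are averages over the whole build. Under those assumptions, which are not confirmed anywhere in this thread, the drop from 80% to 30% utilization would by itself predict a slowdown of a bit more than 2.5x, which is in line with "over twice as long". The function name is made up for this example.

```python
def slowdown_from_utilization(local_util: float, remote_util: float) -> float:
    """Estimate the remote/local wall-clock ratio from average CPU utilization.

    Assumption (not from the thread): the total CPU work is identical in
    both builds, so wall-clock time is inversely proportional to utilization.
    """
    if not (0.0 < local_util <= 1.0 and 0.0 < remote_util <= 1.0):
        raise ValueError("utilization values must be in the range (0, 1]")
    return local_util / remote_util


# 80% and 30% are the figures reported in the post above.
ratio = slowdown_from_utilization(0.80, 0.30)
print(f"estimated slowdown: {ratio:.2f}x")
```

If the measured slowdown is much larger than this estimate, something other than idle waiting on the database (for example the IDE itself) is probably contributing.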

Since I work in a fairly large team (other developers make schema changes and code changes regularly), it is not practical to maintain an empty copy of the database on my local machine.

So I'm investigating how to limit all the network chatter that is going on between the _progres runtime and the remote database during compiles.  I discovered how to get a local schema cache file and add it to my client connection using the "-cache" parameter.  The use of this schema cache file doesn't seem to improve the performance of compile operations (even though a technician on a recent Progress support call said it would).  Using the Windows Resource Monitor in Windows 7, it is easy to see that there is just as much chatter with the database when the "-cache" parameter is used as there is when it is omitted.


Can anyone confirm that the "-cache" file is not used while executing COMPILE statements?  I want to get some feedback from some other developers before I make another call to Progress support.

Is there any other way to minimize the network chatter with the remote OE database during compiles?  I've tried other things like increasing "-B" and "-fc".  These don't seem to impact the performance of COMPILE statements either.

Any help would be much appreciated.

Thanks, David

All Replies

Posted by Sasha Kraljevic on 17-Aug-2012 07:09

As far as I remember, only index and table information is preserved in the local schema cache file.

However, table fields information is fetched during compilation and cached based on -fc parameter.

Try increasing -fc, as the default of 128 is quite low. From online help: "For best results, set this parameter to the total number of fields retrieved  from the database."

Posted by dbeavon on 17-Aug-2012 09:39

Thanks for the response.

Based on your feedback, I went back and did all my "-fc" testing again.  First I noted that this startup parameter applies to the "client session".  Then I added it to my Project AVM under startup parameters.  I tried using "-fc 0" as a benchmark (along with the other obvious benchmark, which is to omit the "-fc" parameter and use the defaults).  Then I kept increasing "-fc" all the way to 100000.  Nothing seems to work.  Compiling my programs never improves in performance, and the compiles always use massive amounts of network bandwidth to talk to the remote database.  This network chatter occurs for every single program while compiling the application.

As I said before, the things I've tried (unsuccessfully) in order to minimize database chatter during compiles are:

  • -cache file
  • -B
  • -fc

My ultimate goal would be that large application compiles should be able to get each piece of schema information just once and never use the network bandwidth again.  This would be comparable to exporting a delta df from the remote database and loading it on a local (shared memory) database before compiling against that.

I am disappointed that the -cache file may not be the answer to improving compiler performance.  The documentation for this parameter seemed very promising.  Any other ideas?  The worst-case scenario is that I have to create a local version of the database every time I recompile the entire application.  This seems crazy, but jumping through these hoops is actually much better than waiting on all that network chatter.  If there is another option, I'd be happy to hear it.

Posted by Sasha Kraljevic on 17-Aug-2012 09:54

In the case of OE Development Studio, network messages are exchanged between Eclipse and the AVM (_progres) as well as between the AVM and the remote database. Since you said that the total compilation time is much lower when you use a local copy of the database, I'd conclude that it isn't the network traffic between Eclipse and the AVM that is causing the problem, but the one you suggested: between the AVM and the remote database. It would be nice to see the results of a test compilation run directly from the application compiler in _progres rather than from Dev Studio. Network latency and bandwidth are important factors.

I'd suggest that you gather this information and contact technical support to log an enhancement request for that issue.

Posted by Sasha Kraljevic on 17-Aug-2012 10:01

I forgot to add: when testing directly with _progres, remember that you would need to measure time on the second compilation, as the first compilation should populate the cache, so then you would be able to compare results and actual cache performance.

So start the session, run the first compilation, then start timing the second compilation while still in the same session.

Hope this makes sense.
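The measurement pattern described above (run once to warm the cache, then time the second run without restarting) can be sketched generically as follows. This is Python rather than ABL, and `fake_compile` is a hypothetical stand-in for whatever actually triggers the compile inside a single _progres session; it is not a Progress API. Only the shape of the measurement is the point here.

```python
import time
from typing import Callable, Tuple


def time_first_and_second_run(action: Callable[[], None]) -> Tuple[float, float]:
    """Run `action` twice back to back and return (first_seconds, second_seconds).

    The first run is expected to populate any caches; the second run is the
    one whose timing reflects warm-cache behaviour.
    """
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1]


calls = []


def fake_compile() -> None:
    # Hypothetical placeholder: a real test would compile the same set of
    # programs each time, inside the same client session.
    calls.append(1)
    sum(range(10_000))


first, second = time_first_and_second_run(fake_compile)
print(f"first run: {first:.4f}s, second run (cache warm): {second:.4f}s")
```

If the second run is not noticeably faster than the first, the cache being tested is either not used for that work or is not where the time is going.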

Posted by dbeavon on 17-Aug-2012 11:31

Yes, I understand.  But even if my test involves compiling 10,000 copies of the exact same program, the compiler doesn't get any faster and the network bandwidth doesn't decrease, even by the 10,000th copy compiled.

In my network auditing, the same type of chatter goes on for each program that is compiled.  The chatter is greater when there are more tables referenced in the program that is being compiled.  But even if I have 10,000 copies of MESSAGE "HELLO WORLD", I still get a little bit of network chatter with the database for each copy compiled.  On average I'm using about 5 Mbps for the entire duration of time while I'm compiling my application.  And the application takes about 5 times longer to compile because of this network chatter with the remote database (vs. when I create a local copy of the database in "shared memory" on the same machine that is running Progress Dev Studio).

I'm starting to think that the network chatter with the database might not consist of schema information "as such".  It may be some type of pseudo-schema, such as CRC checking.  Even if the remote database schema is cached and is available locally, the COMPILE statements might need to continually re-check the schema for consistency.  If that is the case, this mechanism isn't very efficient because the network chatter ramps up quite a bit for just a handful of tables (vs. a HELLO WORLD program).

If my theory is correct - that this is some type of CRC checking - then it may be the case that the "-cache" file parameter is actually being respected (like my Progress support technician had claimed).  But that parameter is just not doing nearly enough for performance sake.  While that feature might retrieve a local copy of the entire database schema "up front", it doesn't really matter much in the long run because the larger network cost, by far, is the continual re-checking of CRC's for dozens of tables in thousands of programs being compiled.
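To put the "5 times longer" figure in perspective, here is a small back-of-the-envelope sketch in Python. It assumes the local (shared memory) build is a fair baseline for pure compile work and that the extra remote time is spent waiting on serial round trips to the database. The 5x slowdown is the number reported above; the per-program compile time and the round-trip time are hypothetical values chosen only to show the arithmetic, and both function names are made up for this example.

```python
def network_wait_fraction(slowdown: float) -> float:
    """Share of the remote build's wall-clock time not explained by local compile time."""
    if slowdown < 1.0:
        raise ValueError("slowdown must be >= 1")
    return 1.0 - 1.0 / slowdown


def implied_round_trips(local_s_per_program: float, slowdown: float, rtt_s: float) -> float:
    """Serial round trips per program that would account for the extra time."""
    if rtt_s <= 0.0:
        raise ValueError("rtt_s must be positive")
    extra_s = local_s_per_program * (slowdown - 1.0)
    return extra_s / rtt_s


# 5.0 is the slowdown reported in this post.
fraction = network_wait_fraction(5.0)

# Hypothetical inputs: 50 ms of local compile time per program, 1 ms round trip.
trips = implied_round_trips(local_s_per_program=0.05, slowdown=5.0, rtt_s=0.001)

print(f"share of remote build spent waiting: {fraction:.0%}")
print(f"implied round trips per program: {trips:.0f}")
```

Under these assumptions most of the remote build is spent waiting rather than compiling, so anything that only removes a one-time, up-front schema fetch would barely change the total.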

Thoughts?

Thanks, David

PS.

I suppose I can ask Progress for a performance enhancement, I'm just surprised to be the guy to do it (considering I'm new to OE Dev Studio).  You had also mentioned another significant issue that slows down the performance while compiling large applications.  That loopback activity between eclipse and _progres for every compiled program is just pretty crazy.  Based on my testing you could compile an application many times faster if this integration was designed a bit differently.  I expect that most organizations must be using third-party or custom tools that work outside of eclipse to recompile their larger applications.  It seems a bit redundant.  A software development IDE should be up to this task, right?

Posted by dlauzon on 17-Aug-2012 14:08

I had a similar issue a couple of years ago under 10.1C01 with Eclipse/Dev Studio being very slow to parse ABL code (it might not have anything to do with your problem, though).  I don't know if it has been corrected; it was Bug# OE00177634.

I troubleshot it using Process Monitor (incredible tool: http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx) and it turned out that javaw.exe of Eclipse was the culprit.


If the physical name of the DB is a file name (e.g. I:\Temp-db\temp-db.db), this file (I:\Temp-db\temp-db.db) is flooded with queries (a couple thousand).  When the file happens to be on a network drive, opening the source file is very slow overall.  If the physical name is just the name (e.g. temp-db), the root of the project is queried (e.g. c:\someProject\temp-db) and the root of the workspace is queried (e.g. c:\...\Workspace101C\temp-db).  There are still many thousands of queries and everything works the same, but if those paths are local, it's much faster, probably also because those files/directories (i.e. c:\someProject\temp-db and c:\...\Workspace101C\temp-db) don't exist.


The number of repeated calls seems to be directly proportional to the number of lines in the source code (with some lines of code adding more than one call).

The workaround was to use just the db name without the path (see screenshot).  The path isn't needed anyway as long as the DB is connected through a TCP/IP port.

The other thing I tweaked is the compilation PROPATH (see my full description here of compilation propath VS execution propath and the performance implications: http://www.oehive.org/node/1464).

Posted by Tim Kuehn on 20-Aug-2012 10:41

The compiler has to resolve all the _file, _field, _index, and related queries as it goes through the source code, much like any FIND or FOR EACH for data tables. This is true regardless of the type of client connection the compiler makes (self-serve or remote).

This is where all the network chatter you're seeing is coming from.

The only way to make this go faster is to make a local db with a copy of the remote db's schema and connect to that during your compile.

Posted by dbeavon on 20-Aug-2012 11:11

Yes, I've given up hope on any client connection parameter/optimization that queries the remote database schema just *once* per build (vs. querying the same schema over and over again thousands of times).

When I opened a support case with Progress, they told me that the "-cache" parameter would do the trick.  They said it would allow schema references to use the local cache file while COMPILE'ing.  Unfortunately this wasn't the case.  It can be clearly shown that a ton of network chatter still takes place.

Whatever else "-cache" may do, it certainly doesn't significantly improve the performance of COMPILE'ing.  Thanks for all the responses.  When I have some time I will update my support case with your feedback.

This thread is closed