Architect loses connection to AVM when syntax-checking

Posted by bronco on 29-Aug-2011 13:47

I've got something strange. When syntax checking a rather lengthy procedure, 8000+ lines of code, preprocessed +/- 2.6MB (it's legacy code), after 15 seconds OEA seems to lose connection to the AVM. The AVM is still busy for a while, but no message is returned to OEA. When I try to syntax check again the menu entry is disabled, hence my conclusion of the lost connection.

Does this sound familiar? Or better, does anyone have a fix/workaround?

Thanks,

Bronco

PS. Yes, I know when you want to eat an elephant you have to chop it in pieces, that's not an option however (for now).

All Replies

Posted by jmls on 29-Aug-2011 13:55

That happens when the AVM crashes. You need to restart the AVM

(Openedge->Restart avm)

What version of Progress?

Posted by bronco on 29-Aug-2011 14:02

Well, the AVM doesn't crash. No protraces whatsoever, and it's still in the task manager. I'm wondering why it stops talking to OEA. It somehow seems that the socket connection between OEA & AVM has a time-out.

I know how to restart the AVM (after working with OEA for several years), I just want my syntax check to complete.

Btw, version is 10.2b04, as mentioned in the tags. What might be interesting however is that I'm running Eclipse 3.6.0 (because of MercurialEclipse)

Posted by jmls on 29-Aug-2011 14:10

I used to get that a lot in older versions of progress.

What version of Progress are you using?

Posted by bronco on 29-Aug-2011 14:12

I accidentally submitted. I edited some more info into my previous reply...

Posted by Thomas Mercer-Hursh on 29-Aug-2011 14:24

Want a recipe for elephant stew?

There is an historical issue, supposedly improved in more recent releases, with very large files, particularly if there are lots of includes, particularly nested includes, and compounded if one does things like reference a remote disk for source.  It might help to describe the elephant a bit ... but it actually sounds like something you should bring to tech support.

Posted by bronco on 29-Aug-2011 14:33

Well, indeed the elephant has a lot of (nested) includes. If I open the procedure in the Procedure Editor (with the PE button in OEA) and compile this monster, it actually does so successfully after 40 seconds (or so). So the AVM itself doesn't have a problem, with compiling that is.

I agree with you that it looks like a tech support case. I was trying to avoid the ordeal of making a reproducible case for them, but it looks inevitable.

Posted by Thomas Mercer-Hursh on 29-Aug-2011 14:39

If you have lots of nested includes, then I would suspect that it is an instance of the same problem.  This was OEA specific, i.e., not an issue with the AVM itself.

Posted by bronco on 29-Aug-2011 14:45

If I preprocess this it has 72000+ lines of code (well, that includes comments as well). Huge by any standard I suppose.

I'll let someone refactor it into smaller pieces.

Posted by Thomas Mercer-Hursh on 29-Aug-2011 15:09

Recognize that, if you do report it to TS, it is fairly likely that they will say "we confirm, but fix will be in the next release", so your immediate action may have to be either to do without OEA or to put the elephant on a diet!

Posted by Admin on 29-Aug-2011 17:45

Even when you solve this by changing the code, make sure you report it to tech support.

They really work on fixing these errors, and it has kept getting better with every release/SP.

Posted by bronco on 30-Aug-2011 01:52

Having worked for Progress for 8 years, I know the drill

Posted by Matt Baker on 30-Aug-2011 15:00

To clear up some misunderstanding here about sockets...

There is no timeout on the socket itself.  OEA will happily stay connected to an inactive socket to an AVM all night.  Sockets "generally" don't time out unless you're trying to do a connect, a read, or a write on the socket and there is no response within a fixed timeout (~30 minutes on Windows) or the network itself becomes unavailable.  There is also no timeout on the compile job in the OEA code.  Check syntax does have a timeout set at 15 seconds so as not to hang the UI forever in case something goes awry.  But when the check syntax finishes everything should become available again.
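As a rough illustration of the socket point (a Python sketch, not OEA code; the function name and the short 0.2 second read timeout are made up for the demo): a timeout belongs to an individual read, and a read that times out leaves the connection itself intact.

```python
import socket


def idle_then_read(read_timeout=0.2):
    """Return (timed_out, data) for a connected socket pair that starts out silent.

    Hypothetical helper for illustration only; nothing here is taken from OEA.
    """
    a, b = socket.socketpair()
    try:
        # The timeout applies to each blocking call on `a`, not to the connection.
        a.settimeout(read_timeout)
        try:
            a.recv(16)           # nothing has been sent yet, so this read times out
            timed_out = False
        except socket.timeout:
            timed_out = True
        # The connection is still open: the peer can answer later and we can read it.
        b.sendall(b"done")
        data = a.recv(16)
        return timed_out, data
    finally:
        a.close()
        b.close()


timed_out, data = idle_then_read()
print(timed_out, data)
```

The first read gives up, yet the later reply still arrives on the same connection, which matches the idea that an idle socket by itself does not expire.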

Anything sent to the AVM blocks everything else sent to that AVM for some period of time.  So if your compile/check syntax takes 40 seconds normally, it is going to take 40 seconds in OEA.  But if you're doing a check syntax, after 15 seconds, the monitor in the UI (the checking syntax dialog) gives up and goes home, but the call to the AVM will still finish.  That means you can't do anything like check syntax, run a program or compile something else until the AVM finishes.  UI elements such as menu picks get turned off in OEA if the AVM fails to respond in a reasonably short amount of time (these have timeouts on them) and get turned back on when it does become available.
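The monitor-versus-call behaviour described above can be sketched roughly like this in Python. This is only a model under assumptions: a lock stands in for the single-threaded AVM, a sleep stands in for the compile, and the durations are scaled down from 40 s / 15 s to fractions of a second. None of the names come from the OEA code base.

```python
import threading
import time


def check_syntax_with_monitor(work_seconds, monitor_timeout):
    """Return (monitor_saw_result, avm_still_busy, job_eventually_finished)."""
    avm_lock = threading.Lock()    # stand-in for the AVM: one request at a time
    started = threading.Event()
    finished = threading.Event()

    def avm_job():
        with avm_lock:             # the AVM is occupied for the whole job
            started.set()
            time.sleep(work_seconds)   # pretend to compile
        finished.set()

    worker = threading.Thread(target=avm_job)
    worker.start()
    started.wait()                 # make sure the job really holds the "AVM"

    # The UI monitor waits only for its own timeout, then gives up.
    monitor_saw_result = finished.wait(timeout=monitor_timeout)

    # Giving up does not cancel the job: the AVM may still be busy right now,
    # so any new request (another check syntax, a run) would have to wait.
    avm_still_busy = avm_lock.locked()

    worker.join()                  # the original call still runs to completion
    return monitor_saw_result, avm_still_busy, finished.is_set()


slow = check_syntax_with_monitor(work_seconds=0.3, monitor_timeout=0.05)
fast = check_syntax_with_monitor(work_seconds=0.0, monitor_timeout=1.0)
print(slow, fast)
```

In the slow case the monitor returns without a result while the job keeps the AVM occupied, which is the window in which menu entries would be disabled. In the fast case the monitor sees the result and the AVM is free again.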

If the AVM never becomes responsive again, then check the warning/error list coming out of the compiler for this procedure.  There may be a problem trying to build up the results of the compile, especially if it becomes silly long.  Fix the warnings/errors. If you have frame sizing warnings, the ABL compiler likes to spit out lots and lots of messages.  If this happens, the response from the project AVM back to OEA may get silly long, which can cause its own problems.

Check the workspace's .log file for error messages.

If you're not getting lots of warnings...it's something else and you'll probably have to provide tech support with a reproducible case.

Posted by bronco on 05-Sep-2011 13:48

Thanks for the comprehensive explanation. My question would be, if OEA assumes a lost connection (or a non-responding process), why not notify the user in a decent fashion (<> logfile)?

Since my syntax check never returns the result when OEA assumes a lost connection (stalled process), it would be good practice (imho) to notify with a message or, better, a choice whether you want to wait or proceed with the result.

just my 2c

Posted by Matt Baker on 07-Sep-2011 09:50

I had a really long detailed post that I started with here, and then deleted it because it sounded more like a rant.

So to sum up most of it:

1.  Hindsight is 20-20.

2.  There is no difference in the AVM between "hung" and "still busy doing a syntax check" because the AVM is single threaded, and hence no "assumed hung".

3.  The 15 seconds seemed like a reasonable timeout on a check syntax..."no one's code should take more than that to compile".

4.  The usual...tell tech support so that someone in development can adjust the timeout and provide some better feedback when the timeout on check syntax expires.

Posted by bronco on 07-Sep-2011 14:22

I agree with you that >15s is quite a long time. However, we switched to Mercurial, which better enables us to develop truly locally. The databases are a different story; we're still running them over a VPN connection (when working at home). That's why it takes so long.

I already notified TechSupport.

Thanks for sharing your insights.

B

Posted by edebeij on 08-Sep-2011 12:39

Hi,

I ran into a similar issue. It was due to 'Automatically check syntax' (OpenEdge Architect Editor Build).

That option tries to check the syntax, before saving, on every letter you type. When you turn it off you only get syntax checking on save or on request.

(Just check the number of times _idecompile.p is called to see what it is trying to compile and how many times it tries.)
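For the counting, a throwaway Python sketch along these lines would do, assuming you have some trace or log in which every call shows up as a line containing the procedure name. The sample lines and file names below are invented for illustration; a real trace will look different.

```python
def count_calls(log_lines, procedure="_idecompile.p"):
    """Count the lines that mention the given procedure name."""
    return sum(1 for line in log_lines if procedure in line)


# Hypothetical trace lines, not real OEA or AVM output.
sample_trace = [
    "RUN _idecompile.p for elephant.p",
    "RUN _idecompile.p for elephant.p",
    "RUN something_else.p",
    "RUN _idecompile.p for small.p",
]

calls = count_calls(sample_trace)
print(calls)
```

With a real file you would pass the open file object instead of the list, since iterating over it yields lines as well.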

But of course it is still an issue.
