ABL's SEH is missing error stack by default

Posted by dbeavon on 10-Oct-2017 14:41

My understanding is that most programming languages that support structured error handling (SEH) will provide the callstack to indicate where an unhandled exception originated.

For example in C#, you have the callstack frames in a member called "StackTrace".

If the information is not of interest, or sensitive, then it can be hidden fairly easily.  For example, by throwing a new exception without referencing the original, it will appear that the problem originated at the current location, instead of the original one.  However, for most internal business applications, the exception call stack is almost always of interest for troubleshooting purposes.

Is there any plan for ABL to make the necessary improvements in SEH to allow stack information to be available by default?  It takes quite a lot more effort for me to find bugs in *production* environments where -errorstack is not enabled by default.  For example, if a problem is difficult to reproduce or if it only happens on rare occasion, then it is hard to weigh the cost of adding the -errorstack option for long periods of time.

Ideally the exception stack in ABL could be available by *default* like it is in other languages, and not have such a high performance penalty associated with it.

Note that this question was asked in another post, where it was pointed out that there is a performance penalty for using "-errorstack":

https://community.progress.com/community_groups/openedge_development/f/19/t/16889

I understand that there are programming workarounds that can be applied on a case-by-case basis.  But it would be far better if Progress would do the work once for all of us, instead of making it every ABL developer's job to find a way to keep track of their own error stacks or to compromise performance by turning on "-errorstack".  Neither of these options is very appealing.

All Replies

Posted by Brian K. Maher on 10-Oct-2017 14:46

Look into the PROGRAM-NAME(n) function.  You can generate your own call stack at any time.
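
As a rough illustration of that suggestion, the sketch below walks PROGRAM-NAME(n) until it returns the unknown value (?) and joins the entries into one string. The function name GetCallStack is hypothetical (it is not part of ABL), and the sketch only helps if it is called while the stack of interest is still intact.

```abl
/* Hypothetical helper: builds a call stack string from PROGRAM-NAME(n). */
FUNCTION GetCallStack RETURNS CHARACTER ():
    DEFINE VARIABLE iLevel AS INTEGER   NO-UNDO INITIAL 2.
    DEFINE VARIABLE cStack AS CHARACTER NO-UNDO.

    /* PROGRAM-NAME(1) is this function itself, so start at its caller.
       PROGRAM-NAME returns ? once iLevel exceeds the stack depth. */
    DO WHILE PROGRAM-NAME(iLevel) <> ?:
        cStack = cStack
               + (IF cStack = "" THEN "" ELSE "~n")
               + PROGRAM-NAME(iLevel).
        iLevel = iLevel + 1.
    END.

    RETURN cStack.
END FUNCTION.

/* Usage: show the stack as seen from the current point of execution. */
MESSAGE GetCallStack() VIEW-AS ALERT-BOX.
```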
 

Posted by dbeavon on 10-Oct-2017 18:41

The problem with that suggestion is that you would have to implement it on the "first-chance" occurrence of the error (the top of the callstack).  If you do it after the callstack starts to unwind, you lose the details.  Similarly, in order to capture the details yourself as you suggested, you would have to know *exactly* in which part(s) of the code you need the custom CATCH blocks.  But most of the time you don't even know exactly what part of the code is failing, only that it happens somewhere during a round-trip to appserver (which is enclosed in an outer DO/CATCH block).

In practice a developer normally needs a *single* DO/CATCH block to wrap the nested business logic.  The CATCH section would be where you place the consistent behavior for any failure that occurs at *any* time within the block.  That "consistent behavior" often needs to save the original callstack details (which are not available by default in ABL).
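
A minimal sketch of that pattern, assuming stack tracing has been switched on for the session (via SESSION:ERROR-STACK-TRACE here; the -errorstack startup parameter is the other way). RunBusinessLogic is a hypothetical procedure standing in for the nested business logic, and the MESSAGE statement stands in for whatever logging the application really uses.

```abl
/* Make errors in nested routines and blocks propagate to the caller. */
BLOCK-LEVEL ON ERROR UNDO, THROW.

/* Without this (or -errorstack), e:CallStack below is the unknown value. */
SESSION:ERROR-STACK-TRACE = TRUE.

DO ON ERROR UNDO, LEAVE:
    RUN RunBusinessLogic.

    /* Single handler for any failure raised anywhere inside the block. */
    CATCH e AS Progress.Lang.Error:
        MESSAGE e:GetMessage(1) SKIP e:CallStack
            VIEW-AS ALERT-BOX ERROR.
    END CATCH.
END.

/* Hypothetical stand-in for deeply nested business logic. */
PROCEDURE RunBusinessLogic:
    DEFINE VARIABLE iValue AS INTEGER NO-UNDO.

    /* Raises "** Invalid character in numeric input" at run time. */
    iValue = INTEGER("abc").
END PROCEDURE.
```

With the stack available, the handler can report which routine raised the error without a separate CATCH block in each nested procedure.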

Posted by Laura Stern on 11-Oct-2017 10:44

You're implying that we could have implemented this differently such that when we do generate the stack trace it will take no time.  That is unrealistic. I'm sure there was more than one way to implement this, but ultimately there is some kind of processing that needs to happen at some point.  But having said that, the time it takes is probably negligible in the scheme of things, and it only happens when an error occurs.  So maybe we were being overly cautious and the feature should have just been on by default.  But since it isn't, and you need it, just use -errorstack or turn it on in your main .p via the SESSION attribute (SESSION:ERROR-STACK-TRACE = yes) and stop worrying about it!

Posted by Peter Judge on 11-Oct-2017 13:44

As you note, you can enable it with -errorstack or SESSION:ERROR-STACK-TRACE = TRUE.
 
I’d suggest adding an Idea in Communities to change the default behavior for this.
 
You can, today, add the switch to your application’s PF file(s) for clients and appservers and have it always-on.  
 
 

Posted by dbeavon on 12-Oct-2017 18:37

I just wanted to point out that most programmers who use structured error handling in other languages know that there is an overhead cost associated with throwing and catching exceptions. The inclusion of the stack frames is just part of that cost and should be done by default. Without the stack frames we sometimes have to search through hundreds or thousands of lines of buried code within all code paths leading out of the DO/CATCH.  It's like searching for a needle in the haystack - even if you *do* know the error message.

After many months of using SEH in OE, I'm getting pretty discouraged by that exercise.  But whenever I read about the errorstack parameter, the documentation is written in a way that will frighten people away from it in production, or else they might try to only use it temporarily.  If it can't be used by default then Progress has more work to do.

It is not fair or reasonable to force ABL programmers to make large performance compromises, or create their own custom mechanisms for capturing stack frames.  Furthermore I should probably say that I always try to *avoid* features that aren't enabled by default in OE because I'd rather not stray away from the herd, for fear of obscure bugs.

Posted by Laura Stern on 13-Oct-2017 06:39

Sounds like we just need to fix our documentation so that it doesn’t scare people unnecessarily!  We'll take a look at that.

Posted by dbeavon on 15-Oct-2017 12:54

Here is an example of a place in your documentation where application developers are being frightened into trying to get along without the benefit of any stack details:

https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dverr/enabling-stack-tracing-with-error-objects.html

  • "Examining the call stack is a valuable debugging feature, but it can potentially consume resources and it is not recommended in a production environment"

In contrast, I don't recall C# or Java ever claiming that exception callstacks are disallowed in production.  They are an integral part of any exception and provide necessary context to interpret the error message. Note that errors are common in any app and can be raised in production just as easily as in development, because of innumerable types of unpredictable data-related issues (commonly found in database applications).

There are different types of errors.  Some may be an indication of a software bug which should have been found during development.  But many others are handled as a matter of course (e.g. the user enters data which creates a divide-by-zero scenario, and you must tell them to knock it off).

For reference, here is a fairly straightforward Stack Overflow question about how stack traces are used to troubleshoot unexpected exceptions in Java apps.  The discussion is also relevant to ABL (if you substitute "error" for the "exception" terminology).

https://stackoverflow.com/questions/3988788/what-is-a-stack-trace-and-how-can-i-use-it-to-debug-my-application-errors

Sorry for the long post about errors, but I want to make sure I am not misunderstanding ABL errors or how they are intended to be used.  Initially it seemed possible to me that OpenEdge was trying to take a slightly different approach to SEH than the other languages.  But the implementation of SEH in those other languages works the way it does for very good reasons (ie. as I'm finding in my ABL lessons that are learned the hard way).

Posted by Laura Stern on 16-Oct-2017 09:34

Ah.  I see your point!  But here is the reality.  I just did some timings where an error was generated resulting in a 10-level deep stack.  Constructing the call stack takes roughly 0.01 to 0.05 milliseconds for each error, depending on the machine.  Windows seems to be fastest relative to Solaris and Linux!  In addition, we do not generate the call stack if you are using NO-ERROR.  So we are talking about taking this extra time only when there is an unexpected error and you are using a CATCH block. That does not seem at all onerous to me and certainly does NOT warrant that statement in the documentation.

So no, we are not taking a different approach to structured error handling from other languages.  We are just sensitive about performance and took the cautious approach, for better or worse.  So again, this statement in the doc is way too strong and we will amend it.  Therefore, as I originally said, you should just turn on -errorstack and not worry about it.
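
For contrast with the CATCH-based case described above, here is a small sketch of the NO-ERROR style, where the error is suppressed on the statement and inspected through ERROR-STATUS. According to the post, no call stack is constructed on this path even when -errorstack is on. The variable name is illustrative.

```abl
DEFINE VARIABLE iValue AS INTEGER NO-UNDO.

/* NO-ERROR suppresses the error on this statement; nothing is thrown. */
iValue = INTEGER("abc") NO-ERROR.

/* The failure is visible only through the ERROR-STATUS system handle. */
IF ERROR-STATUS:ERROR THEN
    MESSAGE ERROR-STATUS:GET-MESSAGE(1) VIEW-AS ALERT-BOX.
```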

Posted by Mike Fechner on 16-Oct-2017 09:58

Thanks Laura! That’s helpful (including the Windows vs. Unix comparison)

Posted by dbeavon on 16-Oct-2017 11:50

It will be good to start turning on these stacks. Now I'll immediately know where to look for a problem when my program execution encounters an error like this: "** Invalid character in numeric input ?."  Unfortunately a vague data-related error message like that one might easily come from **any** part of our legacy ABL codebase.  Errors like that one have been very hard to troubleshoot without call stacks. They are usually intermittent, so it is difficult to enable errorstack temporarily and capture one of them when you need it.

If it weren't for performance concerns, then the only other reason I can think of why errorstacks might have been disabled by default is so that OE ISVs can hide that level of information from their customers in the event of a bug.  (I.e. they may want to hide stacks for reasons that are similar to why they "encrypt" their ABL source files.)

But hiding information has a cost in terms of maintenance and support.  It may not be worth hiding stacks in those kinds of situations either.  We develop ABL for in-house software and we certainly have no reason to hide call stacks.

Thanks for all the help.  
