ABL performance bugs and technical support

Posted by dbeavon on 04-Nov-2019 21:11

Does anyone have any experience reporting performance bugs to Progress technical support?  Has anyone successfully received a fix for a performance bug?  

We now run PASOE from a remote host (via client/server connections).  When working with large datasets, I try to structure my programs in a way that retrieves all the prerequisite data in large batches (eg. starting with some initial queries that fill ABL temp tables via "FOR-EACH-NO-LOCK" or via ODBC queries).

I had always imagined that where client/server performance is concerned, the "hard part" is simply to get the data out of the database and into the PASOE session (into local temp tables).  I had hoped that we would be home free after that "hard part" was finished.

However, I'm now finding that even after the data has been placed in the PASOE session, we aren't yet out of the woods.  The ABL runtime can still spend another 50% of the time doing CPU-bound work.  If I look in the profiler, it seems that CPU bottlenecks can be located in the last places you would expect (running constructors, BIND'ing datasets, invoking methods, etc).  I recently asked about performance problems with method invocations here:

https://community.progress.com/community_groups/openedge_development/f/19/t/60023

Some people compare performance issues to the layers of an onion.  You can peel back one performance bottleneck to reveal another and another.  I understand that it is a never-ending effort.  And I suspect that it is extremely hard to get a performance issue prioritized by technical support, considering that it doesn't cause a production outage and there is always a workaround (ie. wait longer).  But I'm wondering if anyone has ever tried to report an ABL performance "bug" and successfully convinced Progress to fix it.  I'd love to hear about it, and it may help me get brave enough to start opening tickets for some of the things that seem excessively slow in ABL.  Please let me know.

All Replies

Posted by Thomas Mercer-Hursh on 04-Nov-2019 21:36

One has to be careful about what one calls a bug.  We have certainly seen bugs over the years where a particular release had a sudden, dramatic decrease for an operation which previously had performed adequately and mostly I think these get fixed with reasonable promptness.  But, it seems to me that the kind of thing you are having an issue with are more inherent to the nature of ABL ... not that one can't imagine some ABL like language performing better on that feature ... and, indeed, some pretty dramatic performance improvements have been introduced over time ... but that the performance you see today is similar to the performance you have seen since the feature was first introduced in the language.

Some of these are things that only show up in tight loops with limited internal processing, i.e., operations which are not actually very characteristic of real world processing.  If one takes, for example, the time between the operator signaling completion of the current screen and the time that a fresh piece of work is offered to the operator, a time when the operator is generally more tolerant of small delays, then it is only in certain specialized applications that the processing needed at that point is large enough to cause a meaningful delay end to end, so one has no concern that some particular piece of it is 100X slower than C or whatever.  There are, of course, cases where that is not true, like packing or routing problems, which are a challenge in any language and are good use cases for an external language.

So, relative to your issue, I would say it depends.  If there is something you find that used to perform meaningfully better than it does now, then certainly report it and I would expect it to receive meaningful priority.  If it is something where you construct a test of a loop larger than one would ever likely use in a real world application and the test of slowness is a comparison to what you can do in C, then you might bring it to someone's attention but I would be surprised to see it get a lot of attention.  I would also ask around and do some thinking about whether there may be a different way to approach the problem that is likely to be more performant in a language like ABL.  E.g., running a method or internal procedure of an already instantiated class or procedure avoids the significant cost of instantiating a new class or procedure.

Posted by onnodehaan on 04-Nov-2019 21:47

I have reported a bug where a loop with a lot of string concatenations became more than exponentially slower.

Had something to do with string allocation.

It has been fixed somewhere in 2018 I think.

Posted by dbeavon on 04-Nov-2019 22:55

I can understand how a large regression in performance would get some attention.  That seems straightforward.

What about if I can prove that, everything else being equal, something in particular like "BIND" parameters take substantially more time on one supported OS/platform than another?

Or what if a method has an artificial concurrency issue that forces the ABL to wait on a resource which shouldn't be shared?  E.g.

community.progress.com/.../57795

Or what about local method invocations that take 1 ms each?  or 100 ms? or 1000 ms?  What is the upper limit before Progress would be convinced that it is a performance bug?

It seems like there should be some guidance that we could reference to help us determine what is worth submitting as a bug and what is not.  I'm finding that the CPU bottlenecks in ABL become very significant when working with "larger" datasets (eg. data for 10,000 or 100,000 transactions).  One technique to overcome these CPU bottlenecks is to initiate work on PASOE in *parallel* (eg. using TPL in .Net with Parallel.ForEach).  This can allow us to send five or ten requests to PASOE to run in parallel on multiple ABL sessions at once.  A core that is being used by a .Net process can keep up with quite a lot (5-10?) of cores running ABL.

This discussion is, of course, related to some "real-world processing" that we are dealing with.  The fact that I'm working with "larger" datasets (10,000 transactions of data rather than 10 transactions of data) doesn't make the problem any less "real".  The question may ultimately depend on what Progress considers to be a "large" dataset.

Are there any sample ABL applications that ship with OpenEdge or can be found on github?  Maybe there are some OOABL programs to go along with the "sports2000" database?  It would be nice to have some benchmark/reference for my comparisons.  Maybe something like this would help whenever I'm opening a meaningful & convincing support case about ABL performance.  Ideally Progress would be a partner in solving performance problems.  There is a limit to the types of performance issues that we customers can fix on our own (especially when the profiler identifies problems outside of our own custom code).

Posted by frank.meulblok on 05-Nov-2019 09:25

Below are my opinions. Other people's opinions may vary.

[quote user="dbeavon"]

What about if I can prove that, everything else being equal, something in particular like "BIND" parameters take substantially more time on one supported OS/platform than another?

[/quote]

That would be worth reporting to Tech Support, but may not be a bug in OpenEdge. It may still be that the difference is tied to hardware differences or differences in OS-level functions/implementations.

[quote user="dbeavon"]

Or what if a method has an artificial concurrency issue that forces the ABL to wait on a resource which shouldn't be shared . Eg.

community.progress.com/.../57795

[/quote]

That should be treated as a bug in OpenEdge, especially if it concerns moving to the PASOE server, because it causes provable regressions in your code, and it's the changes in architecture between OE components that introduce it.

[quote user="dbeavon"]

Or what about local method invocations that take 1 ms each?  or 100 ms? or 1000 ms?  What is the upper limit before Progress would be convinced that it is a performance bug?

[/quote]

Here things get trickier. But you should still be able to get the help of Technical Support to work out specific scenarios.

- If calling the same method repeatedly with the same inputs gets slower over the number of iterations, that's likely a defect. (Not guaranteed, especially not if the method also leaves artifacts floating around elsewhere, be it temp-table records, left-over handle-based objects or whatever.)

- If it depends on what's being passed in/returned, that may be just because ABL passes everything by-value by default. That includes temp-tables & datasets, and that's where you'll really notice that larger amounts of data passed as parameters means more time spent on deep-copying the data. If you run into that, you really want to start looking into by-reference / bind passing. (Binding/unbinding/rebinding references should have a relatively constant cost at least.)

- If you have something that performs slowly, and an alternative approach that outperforms it unexpectedly, then you at least have a provable bottleneck. So that can be reported to Tech Support for further investigation as well. The argument for why the performance difference is unexpected is important here.
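To make the by-value versus by-reference point from the second bullet concrete, here is a minimal sketch that times both calling styles against the same temp-table. All names (ttOrder, ttOrderParam, countOrders) and the row and call counts are hypothetical, chosen only for illustration. The callee deliberately uses a second temp-table with the same schema as its parameter, on the assumption that this forces a real copy in the by-value case. Actual timings will depend on the release and platform.

```abl
/* Sketch only: ttOrder, ttOrderParam, countOrders and the row/call counts
   are hypothetical names and values chosen for illustration. */
DEFINE TEMP-TABLE ttOrder NO-UNDO
    FIELD OrderNum AS INTEGER
    FIELD Amount   AS DECIMAL
    INDEX idxOrder IS PRIMARY UNIQUE OrderNum.

/* Separate table with the same schema, used as the callee's parameter. */
DEFINE TEMP-TABLE ttOrderParam NO-UNDO LIKE ttOrder.

DEFINE VARIABLE iRow     AS INTEGER NO-UNDO.
DEFINE VARIABLE iCall    AS INTEGER NO-UNDO.
DEFINE VARIABLE iCount   AS INTEGER NO-UNDO.
DEFINE VARIABLE iStart   AS INTEGER NO-UNDO.
DEFINE VARIABLE iElapsed AS INTEGER NO-UNDO.

/* Fill the caller's temp-table with sample rows. */
DO iRow = 1 TO 100000:
    CREATE ttOrder.
    ASSIGN ttOrder.OrderNum = iRow
           ttOrder.Amount   = iRow * 1.5.
END.

/* Default (by-value): each call copies ttOrder into ttOrderParam. */
iStart = ETIME.
DO iCall = 1 TO 20:
    RUN countOrders (INPUT TABLE ttOrder, OUTPUT iCount).
END.
iElapsed = ETIME - iStart.
MESSAGE "by-value:" iElapsed "ms," iCount "rows seen per call".

/* BY-REFERENCE: the callee works on the caller's ttOrder instance. */
iStart = ETIME.
DO iCall = 1 TO 20:
    RUN countOrders (INPUT TABLE ttOrder BY-REFERENCE, OUTPUT iCount).
END.
iElapsed = ETIME - iStart.
MESSAGE "by-reference:" iElapsed "ms," iCount "rows seen per call".

PROCEDURE countOrders:
    DEFINE INPUT  PARAMETER TABLE FOR ttOrderParam.
    DEFINE OUTPUT PARAMETER piCount AS INTEGER NO-UNDO.

    /* Count the rows visible through the parameter table. */
    piCount = 0.
    FOR EACH ttOrderParam:
        piCount = piCount + 1.
    END.
END PROCEDURE.
```

If the by-reference timing stays roughly flat as the row count grows while the by-value timing grows with it, that matches the copying cost described above. A small comparison like this is also the kind of reproducible evidence that can be attached to a support case.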

A clear example of an unexpectedly faster alternative is appending to longchars, where you can compare appending in memory to writing the result to a temp-file and copying that file into the longchar at the end of the process.  You would reasonably expect the temp-file approach to be slower due to disk IO etc. In earlier OpenEdge releases, that was not the case; in current OpenEdge releases it should be, because several bottlenecks got fixed.
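The longchar comparison can be sketched roughly as follows. This is an illustration under stated assumptions, not a benchmark from the thread: the variable names, the line count and the temp-file name (longchar-test.tmp, written to the working directory) are all made up.

```abl
/* Sketch only: variable names, the line count and the temp-file name
   are assumptions made for illustration. */
DEFINE VARIABLE lcInMemory AS LONGCHAR  NO-UNDO.
DEFINE VARIABLE lcFromFile AS LONGCHAR  NO-UNDO.
DEFINE VARIABLE cTempFile  AS CHARACTER NO-UNDO.
DEFINE VARIABLE iLine      AS INTEGER   NO-UNDO.
DEFINE VARIABLE iStart     AS INTEGER   NO-UNDO.
DEFINE VARIABLE iElapsed   AS INTEGER   NO-UNDO.

DEFINE STREAM sOut.

/* Approach 1: append to the LONGCHAR in memory.
   Start from an empty string so the appends never hit the unknown value. */
lcInMemory = "".
iStart = ETIME.
DO iLine = 1 TO 50000:
    lcInMemory = lcInMemory + "line " + STRING(iLine) + "~n".
END.
iElapsed = ETIME - iStart.
MESSAGE "in-memory append:" iElapsed "ms".

/* Approach 2: write the lines to a temp-file, then load the file into
   the LONGCHAR with a single COPY-LOB at the end. */
cTempFile = "longchar-test.tmp".
iStart = ETIME.
OUTPUT STREAM sOut TO VALUE(cTempFile).
DO iLine = 1 TO 50000:
    PUT STREAM sOut UNFORMATTED "line " + STRING(iLine) + "~n".
END.
OUTPUT STREAM sOut CLOSE.
COPY-LOB FROM FILE cTempFile TO lcFromFile.
iElapsed = ETIME - iStart.
MESSAGE "temp-file + COPY-LOB:" iElapsed "ms".

/* Clean up the temp-file. */
OS-DELETE VALUE(cTempFile).
```

On a release where the in-memory approach is clearly slower than the temp-file approach, the two timings side by side are the "provable bottleneck" described above.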
