OO and Performance

Posted by Thomas Mercer-Hursh on 08-Mar-2010 13:29

There have been remarks in a couple of threads recently about performance problems using OO in ABL.  Some of this is directed at the Too Many Temp-Tables (TMTT) issue.  Some at the virtues of keeping data in a ProDataSet (PDS) versus in simple PABLO objects.  And some seems to relate to other factors which are not yet clear.  I am going to be doing some testing myself and will report results here, but I'd like to get a discussion going in which we look at some specifics.

I have two different goals in this exploration.  One of these is to identify real issues which we can bring to PSC as a business case and hopefully get them to make some changes or at least look into the possibilities.  The other is to consider whether the problem is genuine or not.

For example, the TMTT problem is clearly a real issue with known causes and some known ameliorations.  Tim Kuehn has proposed lazy instantiation as an amelioration, but that isn't going to help anyone who is actually using all those temp-tables.  I have proposed support for fields of type Progress.Lang.Object in work-tables for the case of collections which do not require proper indexing and which are small enough that the ability to slop to disk is not relevant.  I would like to document these and other options here further and explore what kinds of real world use cases genuinely require large numbers of co-existing temp-tables and whether there are design patterns which might help avoid the problem.

Similarly, it was recently observed that a FOR EACH on a temp-table, e.g., in order to total up the extended price of some order lines, was necessarily going to be higher performance than having to iterate through a collection of order line objects, access the extended price property, and total that.  This seems likely, but let's actually test how big that difference really is and consider how often such an operation needs to happen.  Is it material or not?  And, what alternatives might there be?  E.g., I proposed having an event on the order line object which would fire every time the extended price changed.  If that event contained the before and after values, then the order could subscribe to that event and adjust a running total every time there was a change and thus never need to do an actual iteration through the order line objects, except perhaps as an initialization or check.  What are the pros and cons of these alternatives?
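
To make that concrete, here is a minimal sketch of the event idea in OOABL; the class, property, and event names are hypothetical, not from any existing framework:

/* OrderLine.cls -- hypothetical PABLO-style line entity. */
CLASS OrderLine:

    DEFINE PUBLIC EVENT ExtendedPriceChanged SIGNATURE VOID
        (INPUT pdOldPrice AS DECIMAL, INPUT pdNewPrice AS DECIMAL).

    DEFINE PUBLIC PROPERTY ExtendedPrice AS DECIMAL NO-UNDO
        GET.
        SET (INPUT pdNewPrice AS DECIMAL):
            /* Publish the before and after values so a subscriber can
               maintain a running total without iterating the lines. */
            ExtendedPriceChanged:Publish(ExtendedPrice, pdNewPrice).
            ExtendedPrice = pdNewPrice.
        END SET.

END CLASS.

/* Order.cls -- subscribes once per line it owns. */
CLASS Order:

    DEFINE PUBLIC PROPERTY OrderTotal AS DECIMAL NO-UNDO
        GET.
        PRIVATE SET.

    METHOD PUBLIC VOID AddLine (INPUT poLine AS OrderLine):
        poLine:ExtendedPriceChanged:Subscribe(LinePriceChanged).
        OrderTotal = OrderTotal + poLine:ExtendedPrice.
    END METHOD.

    METHOD PRIVATE VOID LinePriceChanged
        (INPUT pdOldPrice AS DECIMAL, INPUT pdNewPrice AS DECIMAL):
        /* Adjust the running total; no FOR EACH over the lines is needed. */
        OrderTotal = OrderTotal + (pdNewPrice - pdOldPrice).
    END METHOD.

END CLASS.

The Order only needs whichever line is currently being changed to exist in memory, which is part of what makes the event approach interesting.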

Let's hear from people about the problems they have encountered and what, if anything, they have been able to do about them.

All Replies

Posted by guilmori on 08-Mar-2010 15:24

Let's begin from the start, instance creation.

Where should we expect OOABL instance creation to be between

   running a .p
   other OO languages
?

In my tests, creating instances and keeping them in a "temp-table collection" is 4X slower than creating temp-table rows of data.

Should this be considered a problem?
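
For concreteness, a minimal sketch of the kind of comparison I mean, assuming a hypothetical, essentially empty OrderLine class (absolute timings will obviously vary by machine and release):

DEFINE TEMP-TABLE ttLine NO-UNDO
    FIELD LineNum  AS INTEGER
    FIELD ExtPrice AS DECIMAL
    INDEX idxLine IS PRIMARY UNIQUE LineNum.

/* A "temp-table collection": the TT holds object references. */
DEFINE TEMP-TABLE ttObj NO-UNDO
    FIELD LineNum AS INTEGER
    FIELD oLine   AS Progress.Lang.Object
    INDEX idxObj IS PRIMARY UNIQUE LineNum.

DEFINE VARIABLE i    AS INTEGER NO-UNDO.
DEFINE VARIABLE oNew AS Progress.Lang.Object NO-UNDO.

ETIME(TRUE).
DO i = 1 TO 10000:                     /* plain rows of data */
    CREATE ttLine.
    ASSIGN ttLine.LineNum  = i
           ttLine.ExtPrice = i * 1.5.
END.
MESSAGE "Rows only:" ETIME "ms".

ETIME(TRUE).
DO i = 1 TO 10000:                     /* NEW an instance per row and keep the reference */
    oNew = NEW OrderLine().
    CREATE ttObj.
    ASSIGN ttObj.LineNum = i
           ttObj.oLine   = oNew.
END.
MESSAGE "Instances in a TT collection:" ETIME "ms".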

Posted by Thomas Mercer-Hursh on 08-Mar-2010 16:04

First, I'd like to clarify because it looks like you have two points.

Your second point appears to be that creating an object and putting it in a temp-table is 4X slower than simply putting the data in a temp-table directly.  Correct?

That hardly seems surprising since you are running code in addition to handling the data.  But, let's consider that in the context of overall usage.  If one is using a pattern like Model-Set-Entity and one is going to actually do something to each of those "objects", then one is going to instantiate the BE at the time one is ready for the processing, whereas, if you created the object before putting it into the TT, then it already exists.  So, it seems to me that, in the end, you are going to be in the same place.  Now, if you use one of the patterns in which you just do all your operations on the TT directly, i.e., within the object containing the TT, then you are going to save that object creation, but you are also going to have something that is not very OO-like at all.  In particular, one is going to have set-oriented logic and instance-oriented logic all in the same place.

In the end, the question is whether it is too slow to be functional.  If so, that is clearly a problem.  If, however, it is a small increment in the context of processing as a whole and the whole thing works, then I don't see any reason to be concerned about it since there will be maintenance and design benefits from the OO approach.

I'm not so sure what you mean by your first question:

Where should we expect OOABL instance creation to be between

    running a .p
    other OO languages

There seem to be two possible assertions or questions there.  One is ABL versus other OO languages.  The answer to that, I think, is the same as above.  I.e., it is a p-code implementation of a 4GL, so of course it is slower at some things.  It is also very fast at some things where a little bit of code corresponds to something complex.  The bottom line is: is it fast enough, or is there some usage that gets in the way of a successful implementation?  The same issue applies to ABL generally.  If your application involves solving equations in linear algebra, then ABL is probably not the right language.  But, for Order Processing, it is fine.  What I don't know is whether there are things specific to OOABL that are unsatisfactory when their non-OO counterpart is satisfactory.

The second part of this seems to have something to do with running a .p, but I'm not sure what.  Is the question whether newing a .cls is slower than running a .p?  I don't know, is it?

Posted by guilmori on 08-Mar-2010 16:46

Sorry, I wasn't clear.

I meant: where should OOABL class instantiation performance be situated in comparison with the following?

  other OO languages class instantiation

Where the item to the left is more efficient than the one to its right.

Posted by Thomas Mercer-Hursh on 08-Mar-2010 17:04

I question the term "more efficient" since they are not comparable units of work.

Naturally, a fully compiled 3GL is likely to instantiate a class faster than ABL.  Does that matter?  Those languages instantiate classes faster than one can run a .p also, but here we are writing huge applications with huge databases and huge numbers of users and we do just fine.  I.e., at some level, it only matters if all your application does is to create classes.

Now, I suppose there might be an issue if you have something in a user interface where a user is expecting quick response and clicking a button means that you need to instantiate 1000 classes before the next thing happens on screen, but are there real requirements to do that?  From a legacy ABL perspective, this is why one instantiates things in advance so that they don't need to be done in-line.

I think the relationship to creating a row in the TT is covered above.

Posted by Admin on 09-Mar-2010 01:02

Naturally, a fully compiled 3GL is likely to instantiate a class faster than ABL.  Does that matter?  Those languages instantiate classes faster than one can run a .p also, but here we are writing huge applications with huge databases and huge numbers of users and we do just fine.  I.e., at some level, it only matters if all your application does is to create classes.

 

One (almost natural) characteristic of procedural code is spaghetti code (check http://en.wikipedia.org/wiki/Spaghetti_code for some fun). Object oriented code would break up that same task into many classes (favor composition over inheritance, separation of concerns, etc.).

So for me it appears to be natural that with OO code - in the purist style advocated by you - you'll be instantiating a large number of classes, where a procedure to do some sort of optimization on an incoming Order with 100 Orderlines would simply work on 1 or 2 temp-tables, maybe a ProDataSet, and potentially access the database directly to read control or master data (non-OERA approach, I know - but please forgive me).

If you disagree that this can be done without having a large number of instances in memory that at some point need to be instantiated in bulk, then I need a sample to understand your pattern.

But especially since you suggested the event to update the OrderTotal when an OrderLine quantity is changed, I assume everything needs to be in memory together.  A persistent store can't react to an event on its own.

And I know, you can always question the relevance - but people are really concerned.  (Presentations about "coding for performance" were usually well attended when there still were conferences, and there we were talking about the benefit of grouping ASSIGN statements, etc.)

Posted by Thomas Mercer-Hursh on 09-Mar-2010 11:33

I'm a little confused by your response.  Yes, I would agree that spaghetti code is probably more common in procedural code than OO, but one can certainly write ugly code in either paradigm.  If one really takes OO principles to heart, it should lead to cleaner code, but I think a lot of OO coders have learned the form without the concept.

Yes, if I am going to do some active processing on an Order with 100 lines, then I suppose I would have to instantiate an object for each line.  Is that a lot?  Surely 3GL OO packages do that sort of thing all the time.  If there were a TT in every object one might be pushing towards the TMTT problem (more on that soon), but with PABLO BEs an orderline object has no TT and is a fairly compact piece of code because it only deals with one thing, the logic of the line.  And, isn't r-code re-entrant, so that there will only be one copy of the code for all those objects?  What is the actual problem?

There are situations where lazy instantiation makes sense, e.g., Phil's test of all customers and all orders, add/delete/change one line each.  If that were a real world problem it would be a natural for both lazy instantiation and limiting the transaction scope ... e.g., is there any business reason not to deal with one customer's orders at a time and then to get rid of all those objects?  But, I don't see 100 orderlines as being a huge number of objects.

Yes, I understand the appeal of using a PDS ... been using TTs for a great many years.  It is one way to solve the problem.  Is it a better way or is OO better?  And, if the advocates for OO are right about its benefits, why not go all the way?

I don't disregard performance ... especially if it reaches the point of violating non-functional requirements.  But, certainly, there are lots of times where people get anal about performance differences that make no real difference, i.e., the fraction of a millisecond saved on a faster assign when it is in the context of a database operation which will be hundreds or even thousands of times longer.  Coding for every scintilla of performance is a good way to produce unreadable code.  I believe that coding for clarity and maintainability at the expense of a little performance is a trade-off which is well worthwhile.

Note, btw, that the event approach to order total does not actually require all lines to be in memory at the same time.  The Order must be there to receive the event, but only one line needs to exist at a time.  This is a technique one could use with M-S-E.

Posted by Thomas Mercer-Hursh on 09-Mar-2010 16:37

One of the claims about possible OO performance issues has been the TMTT problem related to a large number of active collections, e.g., all order lines for some large number of orders implies one collection per order (if we don't consider lazy instantiation).

Over on a TMTT thread here http://communities.progress.com/pcom/message/83206#83206 I just reported some computations and testing with creating TTs.  See that thread for details, but let me include a couple bottom line numbers here.

-Bt can be set up to 50,000.  With -tmpbsize 1 that is 55MB of RAM ... non-trivial, but hardly enormous ... and that is enough RAM to fit over 6,100 TTs entirely in memory.

Doing some performance tests with a modification of a program created by Tom Bascom, which runs a program recursively to create the desired number of TTs, I could set -Bt to the minimum of 10 and create 1,000 TTs in 1.45s, 2/3 of which was the time required to just run the program recursively with no temp-tables at all, i.e., less than 1/2 a second for creating 1,000 TTs on disk, and less than 1/3 of a second if they are all in memory.
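
The shape of that kind of test, as a minimal sketch (ttdepth.p is a hypothetical name):

/* ttdepth.p -- each active recursion level holds one more live TT instance. */
DEFINE INPUT PARAMETER piDepth AS INTEGER NO-UNDO.

DEFINE TEMP-TABLE ttDummy NO-UNDO
    FIELD iValue AS INTEGER
    INDEX idxValue IS PRIMARY iValue.

CREATE ttDummy.
ttDummy.iValue = piDepth.

IF piDepth > 1 THEN
    RUN ttdepth.p (INPUT piDepth - 1).

Timing RUN ttdepth.p (INPUT 1000) with ETIME(TRUE) before and ETIME after gives the figure for 1,000 co-existing TTs; the same harness with the TEMP-TABLE definition removed gives the run-only baseline.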

Is this really an OO performance issue?

I understand that if one has a ChUI app with some tired hardware and legacy disks, then providing reasonable -Bt per session could be an issue and the disk activity might well be slower than this.  But, the hit on the disk would only really be meaningful if every session was initiating that many TTs.  I believe that might have been the case in Tim's original TMTT problem since the TTs were architectural, but in terms of OO use they are going to be situational, i.e., applicable only to particularly large, complex processes, not every session.  To be sure, using a TT in every object could certainly get one into trouble pretty easily, but leaving that aside, how often is one likely to get into TMTT trouble if one pays attention to the parameters?

Anyone got a use case with numbers up to or above this range?

Note that there is a transaction scope implication here.  I.e., if one wants to process every line of every one of 10,000 orders, then that would mean 10,001 TTs, but only if the transaction scope was around all 10,000 orders.  If the scope is only per order, then one needs only two collections -- one for the orders and one for the order lines of the current order.

Posted by Thomas Mercer-Hursh on 09-Mar-2010 17:02

Apropos speed of object creation, I wrote a little program which would NEW an essentially empty object and put it into an array.  1000 objects took .89s and 2000 took 1.86s.  That is about 80% of the time for the same number of RUN statements in a recursive run test that I did in parallel with the TMTT test.  Is that too slow for a real world use case?
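
Roughly the kind of harness involved, where EmptyThing stands in for the essentially empty class:

DEFINE VARIABLE oThing AS Progress.Lang.Object EXTENT 2000 NO-UNDO.
DEFINE VARIABLE i      AS INTEGER NO-UNDO.

ETIME(TRUE).
DO i = 1 TO 2000:
    oThing[i] = NEW EmptyThing().   /* hypothetical, essentially empty class */
END.
MESSAGE "2000 objects in" ETIME "ms" VIEW-AS ALERT-BOX.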

Posted by jmls on 09-Mar-2010 17:12

An interesting discussion on temp-tables and objects was held during the initial exploratory design on proparse.jar (now since discontinued): http://www.oehive.org/node/1250

We start talking about performance at http://www.oehive.org/node/1250#comment-899

Julian


Posted by Thomas Mercer-Hursh on 09-Mar-2010 17:34

Yes, I remember the discussion.  The question is, how does this relate to real ABL applications?  Clearly, for doing something like Proparse or ProLint, one needs it to be very fast and there are a very large number of entities.  ABL is not going to be good at that, any more than it is going to be good at the string parsing which is the first part of the process.  This is no different than ABL not being the right tool for linear algebra solutions.

My question though, is whether there is a performance problem related to real world problems of the type that one would normally write in ABL.  In what cases do I need a transaction scope which covers 10,000 objects?

Posted by Tim Kuehn on 09-Mar-2010 23:57

tamhas wrote:

Apropos speed of object creation, I wrote a little program which would New an essentially empty object and put it into an array .  1000 objects took .89s and 2000 took 1.86s.  That is about 80% of the time for the same number of run statements in a recursive run test that I did in parallel with the TMTT test.  Is that too slow for a real world use case?

What were you running this test on? My "TMTT demo" code instantiated something like 10K empty objects in very short order - on a PC.

Posted by ChUIMonster on 10-Mar-2010 07:59

Don't get trapped into thinking that only legacy ChUI apps have large scale on the server side.

An app-server based application can also have thousands of sessions running on a single server.  Several famous partner applications start an app server instance for every GUI client.  They do have customers with thousands of users in real life.  So far as I know none of them currently have a TMTT problem -- but history suggests that a confluence of worst practices in some future release is not out of the question.

Posted by Phillip Magnay on 10-Mar-2010 10:44

tamhas wrote:

There are situations where lazy instantiation makes sense, e.g., Phil's test of all customers and all orders, add/delete/change one line each.  If that were a real world problem it would be a natural for both lazy instantiation and limiting the transaction scope ... e.g., is there any business reason not to deal with one customer's orders at a time and then to get rid of all those objects?  But, I don't see 100 orderlines as being a huge number of objects.

The test that I specified scoped the transaction at each Customer where all Orders of a given Customer are updated by adding/modifying/deleting Orderlines. The use case is obviously somewhat artificial but I don't believe that the scale of processing is.

Also, I believe that lazy instantiation should be leveraged as much as possible in order to optimize performance no matter which implementation approach (M-S-E, PABLO, whatever).  Why instantiate objects which are not going to be used?  If a given use case merely adds a new orderline to an existing order which already has 100 existing orderlines, why instantiate all those 100 orderline objects?

tamhas wrote:

Is it a better way or is OO better?  And, if the advocates for OO are right about its benefits, why not go all the way?

Because it's never that simple. Benefits in one area usually involve a cost somewhere else. If the benefits from "all the way" OO (posited as more flexible design, code clarity, lower maintenance costs, etc) outweigh the costs in performance, then sure, it might make sense.  But that's a big "if" which has yet to be demonstrated. Indeed, testing to date has only shown that the additional performance cost of "all the way OO" is so high that the proposed benefits are quite pale in comparison.  Another problem is that these proposed benefits are future promises that can be difficult to realize and concretely measure.  So there is a threshold of skepticism to overcome when considering solutions which mean suffering a certain pain today while hoping to realize an uncertain gain tomorrow. 

If the future reward for the present cost was demonstrated to be more concrete and more certain, then I have no doubt ABL developers would be very receptive. But said demonstration is yet to be seen.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 11:24

You say "empty objects", but also reference "TMTT demo".  Are the objects being NEWed?  Or are these TTs.

For the TTs, I did get higher rates if one factors out the RUN which goes with each TT in Tom's code.

For the .cls objects I seem to be getting higher rates in another test which is not yet complete.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 11:37

Being a fan of browser clients, I'm aware that there can be lots happening on the server.  And, I am well aware that architectural stupidity can create problems where none need exist.  It is eminently possible that an AP will write an application in such a way that it either performs poorly or requires far more hardware to run well than it would if they wrote it properly.  However, I don't see that as PSC's problem, beyond the educational opportunity to help them avoid doing it.

What I'm after in this thread is whether or not there are real performance issues when using good architectural practice.  E.g., the idea of putting the knowledge attributes of a single BE in a one-row TT in order to use the TT's built-in serialization strikes me as architectural abuse.  It is using a sledge hammer where a needle-nose pliers would be appropriate.  Therefore, it doesn't concern me that it also leads to a TMTT issue.  Using a TT for a Collection is also overkill except for Map classes, but not to the same degree, and there aren't really any good alternatives until or unless we get something like PLO support in work-tables.  So, TMTT problems from too many TT-based collections would bother me because there are not yet any good alternatives.  But, I am still looking for a use case where one actually needs a large enough number of collections that it should be a problem with proper tuning and, even if the use case exists, it appears to be something that will impact a single session doing a particular task, not the architecture as a whole.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 12:04

Would it be possible for you to share the code for your test so that others could explore whether it was possible to improve on your results?

And, if not, could you share the actual timings.  50X clearly sounds like a lot, but the difference between, say 1ms and 50ms would not be material in the context of overall process while 1h versus 50h would be.

Why instantiate objects which are not going to be used?

No argument there, for sure.  Definitely an area for some good patterns, particularly with respect to read versus instantiate.

What you say about trade-offs in costs and benefits is all true, but let me add a couple of other thoughts.

There is some tendency to talk as if there were a scale with procedural, traditional ABL on one end and "pure" OO ABL on the other end.  The landscape is, of course, much more complex than that.  But, there is also some tendency to talk in terms of starting at the traditional end and wondering how far to move toward the other end.  For a traditional ABL programmer, that may well be what takes place, but it is also possible for someone with an OO background to start at the other end and wonder how far, if any, to move toward the traditional ABL end.  This is particularly apt given the historical problem many places have found in finding trained ABL programmers, while finding experienced OO programmers is not a problem.

Also, I think there are important issues of perspective here.  It isn't just a question of what language elements one uses and where one uses them, but also a question of how they are used.  It is very easy for an experienced ABL programmer coming to OO to build classes and structures and operations according to a relational mentality rather than a real OO mentality.  If that produces performance or other problems, there is then a tendency to retreat to more familiar and proven traditional ABL patterns, but the real problem might have been not having the right OO concept of how things should work so that a different OO approach might not have had those same problems.

And, of course, there is the possibility that there is a performance problem which shouldn't exist and then one needs to expose that and try to get it fixed.

Posted by Mike Ormerod on 10-Mar-2010 12:12

I don't think anyone would argue against the fact that there are good & bad practices, no matter which 'end' of the scale you're starting from.  History is full of cases where people moved from v6 to v7 and simply coded in the same way they had been for years and then wondered why they had problems!

Mike


Posted by Thomas Mercer-Hursh on 10-Mar-2010 12:21

To be sure, but one of the things one needs to ferret out is when someone is doing something like that and then saying "it's broken".  I.e., we need to figure out whether it is the product or the programmer that is broken!

Posted by Phillip Magnay on 10-Mar-2010 12:44

tamhas wrote:

Would it be possible for you to share the code for your test so that others could explore whether it was possible to improve on your results?

And, if not, could you share the actual timings.  50X clearly sounds like a lot, but the difference between, say 1ms and 50ms would not be material in the context of overall process while 1h versus 50h would be.

The tests were conducted by creating a PABLO factory on top of the existing CloudPoint DA layer.  This is the same DA that the M-S-E uses.  I am not permitted to simply give out this code.  At least not yet.

IIRC, for a given Customer, the processing of all Orders (adding/modifying/deleting an Orderline respectively), then modifying all Orders and then the Customer, the average across all Customers was in the order of 40ms (using Ubuntu on a 2.66GHz PC) using M-S-E.  For PABLO, the average processing time across all Customers was over 2 seconds.

The additional overhead appeared to have less to do with instantiating objects and inserting them into TT-based collections (although the insertion into collections certainly did add some).  The bulk of the additional overhead appeared to be more about: moving the data (retrieved from the database into a PDS within the DA layer) into the data members of the object instances upon instantiation; maintaining before image information inside the object instances; moving after-image and before-image data out of the data members of the object instances into a PDS within the DA layer for subsequent update to the database.  The retrieval from the database into a PDS and update from the PDS to the database was done the same way and incurred the same overhead in each case and thus cancelled out.

Posted by Phillip Magnay on 10-Mar-2010 12:45

tamhas wrote:

To be sure, but one of the things one needs to ferret out is when someone is doing something like that and then saying "it's broken".  I.e., we need to figure out whether it is the product or the programmer that is broken!

Or the approach itself.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 13:18

This is the same DA that the M-S-E uses

It is a bit unclear to me what this means.  My understanding of the M-S-E data access was that the Model is passed to the DL and "decorated" with data access components.  How would that translate to supplying a PABLO factory?

On the surface, those timings seem to have a good news, bad news flavor.  The PDS solution is a lot faster, but 2s for all orders of a customer doesn't seem that bad.

So, I would say we need some more focused tests.  There are a number of architectural choices which one might handle differently than it sounds like you handled them in that particular context.  For example, does one really want to make it the responsibility of the individual object to provide its own before image handling or should that be done in some other way?  E.g., something like a memento pattern?
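
To make the memento idea concrete, a rough sketch (all names are hypothetical): the BE hands out an immutable snapshot of its state, and the before-image lives in that snapshot rather than inside the entity.

/* OrderLineMemento.cls -- an immutable snapshot of an OrderLine's state. */
CLASS OrderLineMemento:

    DEFINE PUBLIC PROPERTY LineNum       AS INTEGER NO-UNDO GET. PRIVATE SET.
    DEFINE PUBLIC PROPERTY Quantity      AS INTEGER NO-UNDO GET. PRIVATE SET.
    DEFINE PUBLIC PROPERTY ExtendedPrice AS DECIMAL NO-UNDO GET. PRIVATE SET.

    CONSTRUCTOR PUBLIC OrderLineMemento
        (INPUT piLineNum AS INTEGER,
         INPUT piQty     AS INTEGER,
         INPUT pdPrice   AS DECIMAL):
        ASSIGN LineNum       = piLineNum
               Quantity      = piQty
               ExtendedPrice = pdPrice.
    END CONSTRUCTOR.

END CLASS.

/* On the OrderLine BE itself: */
METHOD PUBLIC OrderLineMemento CreateMemento ():
    DEFINE VARIABLE oSnapshot AS OrderLineMemento NO-UNDO.
    oSnapshot = NEW OrderLineMemento(LineNum, Quantity, ExtendedPrice).
    RETURN oSnapshot.
END METHOD.

Whoever requested the change (the DL, say) keeps the memento and compares or restores it at persistence time, so the entity itself carries no before-image machinery.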

There is an inherent problem in testing.  At one level, the real bottom line is how does it all work together, so one needs to test whole integrated systems.  But, testing whole integrated systems, one often can't identify the exact cause of a performance difference which might stem from a particular operation, a particular design approach, or an architectural issue about how the pieces fit together.

Posted by Phillip Magnay on 10-Mar-2010 13:38


tamhas wrote:

This is the same DA that the M-S-E uses

It is a bit unclear to me what this means.  My understanding of the M-S-E data access was that the Model is passed to the DL and "decorated" with data access components.  How would that translate to supplying a PABLO factory?

You are correct. In M-S-E, the model is passed to DL and data access decorators are applied. The decorators are then removed and the model returned to the BL.

These same data access components are used to populate to and/or update from a PDS. This PDS is then passed to a PABLO factory in order for the subsequent instantiation of the PABLOs and insertion into TT-based collections.

tamhas wrote:

On the surface, those timings seem to have a good news, bad news flavor.  The PDS solution is a lot faster, but 2s for all orders of a customer doesn't seem that bad.

It's a credit to you that you can see that the glass is half-full in this result. But others can be forgiven for concluding that it is not a favorable comparison.

tamhas wrote:

So, I would say we need some more focused tests.  There are a number of architectural choices which one might handle differently than it sounds like you handled them in that particular context.  For example, does one really want to make it the responsibility of the individual object to provide its own before image handling or should that be done in some other way?  E.g., something like a memento pattern?

I'm open to other approaches for before-image handling. And memento certainly appears to be a functional alternative. But I would be very interested to determine that such an approach did not further add to the performance overhead.

tamhas wrote:

There is an inherent problem in testing.  At one level, the real bottom line is how does it all work together, so one needs to test whole integrated systems.  But, testing whole integrated systems, one often can't identify the exact cause of a performance difference which might stem from a particular operation, a particular design approach, or an architectural issue about how the pieces fit together.

To be sure. But in this instance everything was almost identical with the exception of replacing M-S-E with PABLOs created by a different factory object.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 13:57

I think the "half full" perspective comes from a history of people making comparisons like how long it takes to instantiate a simple class in ABL versus C#.  Yes, it is a big number, but that doesn't always translate into violating non-functional performance requirements.

In the PABLO test, did you instantiate all lines or just the ones to be processed?

Structurally, I would have been inclined to leave the PDS in the DL.

Certainly, we can do some testing on alternate BI functionality.  If the PDS is hanging around in the DL, I don't know that we even need BI functionality in the object since that can be established when the data gets back to the PDS for persisting.

That it was nearly identical is both a pro and a con.  It means that it was a good test for comparing one strategy to another in context.  But, it is also assuming a great deal of context and it may well be that an entirely different approach would yield very different comparisons.  I'm quite sure, for example, that one could create a PDS-based implementation that performed much worse than M-S-E!

Posted by Tim Kuehn on 10-Mar-2010 14:05

tamhas wrote:

You say "empty objects", but also reference "TMTT demo".  Are the objects being NEWed?  Or are these TTs.

For the TTs, I did get higher rates if one factors out the RUN which goes with each TT in Tom's code.

For the .cls objects I seem to be getting higher rates in another test which is not yet complete.

The code in question generated a user-specified number of objects with a user-specified number of TT definitions in each object, as well as a ".p" to instantiate everything.

Running the program when it was set to "zero" TTs per object was quite fast, even for a "high" number of object instances.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 14:54

One more thought.  This test is one that is based on knowing up front that one will be processing all customers, all orders, and many lines.  There are some common tasks which might be like that, but far more common is to process a much smaller number of things at once.  It seems to me that the mass update scenario is a natural fit for a PDS.  But, what happens if it is not done as a mass update?  What happens if it is a whole lot of individual, isolated updates?

I.e., one of the things that concerns me a bit about M-S-E is that there is a lot of "stuff" there if what one is doing involves only one record in one TT.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 17:24

Ah, the importance of little details ... I added -q to the parameters and got 10,000 in .77s.  That's more like it!

Posted by jmls on 10-Mar-2010 17:42

http://www.oehive.org/node/1267#comment-960

http://www.oehive.org/node/1267#comment-992

http://www.oehive.org/node/1267#comment-949

Really must retry this stuff in 10.2B

Julian


Posted by Phillip Magnay on 10-Mar-2010 17:59

tamhas wrote:

I think the "half full" perspective comes from a history of people making comparisons like how long it takes to instantiate a simple class in ABL versus C#.  Yes, it is a big number, but that doesn't always translate into violating non-functional performance requirements.

I guess. But who would choose such a poorly performing approach when the benefits are so unclear and so uncertain?

tamhas wrote:

In the PABLO test, did you instantiate all lines or just the ones to be processed?

Just the ones being processed.

tamhas wrote:

Structurally, I would have been inclined to leave the PDS in the DL.

OK. The PABLO factory is a DA component. Done.

tamhas wrote:

Certainly, we can do some testing on alternate BI functionality.  If the PDS is hanging around in the DL, I don't know that we even need BI functionality in the object since that can be established when the data gets back to the PDS for persisting.

Sure. The BI functionality of PDSs is definitely much faster and much more robust. And the BI functionality of PDSs was one important factor behind M-S-E being built around the PDS.

But how can BE objects disconnected from PDSs leverage this functionality?  There might be a way but I don't immediately see it.

tamhas wrote:

That it was nearly identical is both a pro and a con.  It means that it was a good test for comparing one strategy to another in context.  But, it is also assuming a great deal of context and it may well be that an entirely different approach would yield very different comparisions.  I'm quite sure, for example, that one could create a PDS-based implementation that performed much worse than M-S-E!

I suppose there is a glimmer of hope there for you. But *everything* except object factory was identical so the comparison is really quite clear.

I suppose one could create a PDS implementation that performed worse than M-S-E. But it would be impossible to create a PDS implementation that performed worse than PABLO.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 18:08

My test is like the one in the first link, except that I am actually saving all 10000 objects in an array.  Compiling the class drops it to .72.

Posted by Phillip Magnay on 10-Mar-2010 18:11

tamhas wrote:

One more thought.  This test is one that is based on knowing up front that one will be processing all customers, all orders, and many lines.  There are some common tasks which might be like that, but far more common is to process a much smaller number of things at once.  It seems to me that the mass update scenario is a natural fit for a PDS.  But, what happens if it is not done as a mass update?  What happens if it is a whole lot of individual, isolated updates?

I.e., one of the things that concerns me a bit about M-S-E is that there is a lot of "stuff" there if what one is doing involves only one record in one TT.

There really isn't a lot of "stuff" there.  But the more important point is that M-S-E has been developed to scale from single record updates to large batch updates.  I do not want the business logic implemented/executed differently for a particular BE type depending on whether it is a single object update, a multiple object update, or a large batch update.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 18:32

But who would choose such a poorly performing approach

I think we have a fair amount of testing left to do before we can characterize the performance fully.  But, if one were to use performance on such things as the driving indicator, then shouldn't we all be writing in C#?

The PABLO factory is a DA component

Not my current thinking, but that's whole different topic.

But how can BE objects disconnected from PDSs leverage this  functionality?  There might be a way but I don't immediately see it.

Depends on a lot of things, but one of the simple possibilities is that the DL contains a PDS which is the proximate data source and when the data comes back from the BL it is put back into the PDS such that it uses the BI functionality.  No different really than the sort of thing one might do with PDS data going to a non-ABL client.  The client sends back either a changed copy of the data or just the changes and the server side logic applies those to the source PDS and then persists the data in the usual fashion.
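
A rough sketch of that division of labour, assuming a hypothetical Order BE and a ttOrder member of the DL's ProDataSet; the before-image never leaves the DL:

/* Inside the DL: ttOrder belongs to the source ProDataSet. */
METHOD PUBLIC VOID ApplyOrderChanges (INPUT poOrder AS Order):

    TEMP-TABLE ttOrder:TRACKING-CHANGES = TRUE.

    FIND ttOrder WHERE ttOrder.OrderNum = poOrder:OrderNum.
    ASSIGN ttOrder.OrderDate  = poOrder:OrderDate
           ttOrder.OrderTotal = poOrder:OrderTotal.

    TEMP-TABLE ttOrder:TRACKING-CHANGES = FALSE.

    /* Before/after images now sit in the dataset's before-table,
       so the usual save logic (e.g., SAVE-ROW-CHANGES) applies. */
END METHOD.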

Seems like a good idea to think of the DL and BL as separate subsystems in this way.

so the comparison is really quite clear.

Quite clear for what it was, but not necessarily a thorough exploration of PABLO options.  At the very least, one needs to instrument sufficiently to be able to identify *why* it is slower and consider whether or not there is an approach that would change that.  One also ought to have a mix of tests, including ones that focus on lots of little updates in addition to large batch.

But it would be impossible to create a PDS implementation that performed  worse than PABLO.

I'm more optimistic than you are about people's ability to create poorly performing systems!

But, in any case, it isn't really the point.  The point is to understand what is and is not possible within the performance requirements.

One of the really big gaps here is that we have no explanation for why your PABLO test was so much slower.  Back in grad school, my dissertation advisor used to beat into our heads that when we had run some complex multivariate analysis we had nothing, until we were able to understand the results in terms of the underlying biology.   The same principle applies here.  Until we understand why the PABLO version was slower, we don't really know anything except that the particular design you tried performed poorly.  My first guess was that the PABLO example was being done with individual database updates instead of FILL().  Now we know that wasn't the case.  My second guess was that you were instantiating all the lines, even though most were not touched.  You also say that wasn't the case.  Either would have been a reason for a fairly large difference in performance and both would have made the comparison not really equivalent since other designs would have been possible.  So, if it isn't those factors, what is it?  Both should have had to create an approximately equal number of objects.  The PABLO version could have had to create a lot of TTs for collections, if they were all done at once, but again a design shift would have only required one order line collection at a time.  Updating order and customer totals by a FOR EACH on the objects could have added a lot, but there's an alternative for that too.  Someplace in there is a reason for the difference.  Maybe it is inherent and maybe it isn't.  Until we know what it is, we don't really know much.

Posted by Thomas Mercer-Hursh on 10-Mar-2010 18:41

No one questions that you have worked very hard on the design and tested it to work in many different situations.  But, it is using a PDS or at least a TT even in the case in which there is a single entity.  That may perform perfectly well ... one of my on-going points here is that as long as something performs well enough, it performs well enough even if something else might perform better ... but a TT is a "lot of stuff" for a single instance of a few values.  Moreover, those single instances are very, very common in real applications.  For most operations, I am dealing with one customer, one order, one invoice, one check, one item ... and I'm going to persist that entity before I move on to the next.  Chances are, there is no 50X performance difference in those cases.  Who knows, maybe the performance advantage even goes in the other direction.  We don't know any of that until we have tested it.

Posted by Phillip Magnay on 10-Mar-2010 20:55

tamhas wrote:

No one questions that you have worked very hard on the design and tested it to work in many different situations.  But, it is using a PDS or at least a TT even in the case in which there is a single entity.  That may perform perfectly well ... one of my on-going points here is that as long as something performs well enough, it performs well enough even if something else might perform better .. but a TT is a "lot of stuff" for a single instance of a few values.  Moreover, those single instances are very, very common in real applications.  For most operations, I am dealing with one customer, one order, one invoice, one check, one item ... and I'm going to persist that entity before I move on to the next.

As I said, I want an approach that scales from single instance updates to batch updates for large numbers of instances using the same business logic.

tamhas wrote:

Chances are, there is no 50X performance difference in those cases.

I don't see it.  Perhaps you could explain why the performance differential would change between single instances versus multiple instances when everything is identical except PABLO versus M-S-E?

tamhas wrote:

Who knows, maybe the performance advantage even goes in the other direction.

Keep hoping.

Posted by Phillip Magnay on 10-Mar-2010 21:10

tamhas wrote:

I think we have a fair amount of testing left to do before we can characterize the performance fully.

We already have, and the results are quite clear to us - the performance of PABLO is unacceptable.  By all means perform your own tests and come to your own conclusions.

tamhas wrote:

But, in any case, it isn't really the point.  The point is to understand what is and is not possible within the performance requirements.

One of the really big gaps here is that we have no explanation for why your PABLO test was so much slower.  Back in grad school, my dissertation advisor used to beat into our heads that when we had run some complex multivariate analysis we had nothing, until we were able to understand the results in terms of the underlying biology.   The same principle applies here.  Until we understand why the PABLO version was slower, we don't really know anything except that the particular design you tried performed poorly.  My first guess was that the PABLO example was being done with individual database updates instead of FILL().  Now we know that wasn't the case.  My second guess was that you were instantiating all the lines, even though most were not touched.  You also say that wasn't the case.  Either would have been a reason for a fairly large difference in performance and both would have made the comparison not really equivalent since other designs would have been possible.  So, if it isn't those factors, what is it?  Both should have had to create an approximately equal number of objects.  The PABLO version could have had to create a lot of TTs for collections, if they were all done at once, but again a design shift would have only required one order line collection at a time.  Updating order and customer totals by a FOR EACH on the objects could have added a lot, but there's an alternative for that too.  Someplace in there is a reason for the difference.  Maybe it is inherent and maybe it isn't.  Until we know what it is, we don't really know much.

As I have indicated in a couple of posts today, the additional performance overhead appears to be caused by inserting the instantiated PABLOs into collections and moving data from the PDSs to the BEs and back again, things that M-S-E doesn't need to do.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 00:45

I want an approach that scales from single instance updates to batch  updates for large numbers of instances using the same business logic.

As do we all ... but it is unlikely that any one design is superior in performance to all other designs in all circumstances.  We certainly have no data on which to make that judgement here.

There are at least three broad categories of operations that come to mind -- operations on single entities, batch operations like the FOR EACH on a table for a total, and large volumes of operations where each operation is on a single entity.  These are each very different requirements.  At best, you have reported results from a single specific implementation versus another specific implementation in one of these categories.  We have no data by which to explain why the performance difference happened in that test, nor do we know how the results might have turned out in one of the other categories, nor do we know how the results might have been with a different implementation.

Until you can explain your results in terms which can be extrapolated to other models and other tests, we essentially have no meaningful data.  We don't yet *understand* anything except that you ran a test and got some results.

Perhaps you could explain why performance differential would change  between single instance versus multiple instances when everything is  identical except PABLO versus M-S-E?

For an operation which is intrinsically focused on an individual BE, you are creating an object and PABLO is creating an object.  In M-S-E, any data involved in that operation needs to be retrieved by a generic interface on a delegate object which is getting data from a buffer on a temp-table.  In a PABLO implementation, it is accessing data in a property on the BE itself.  Can you expect that the M-S-E approach is going to be more efficient than the PABLO approach in that context?  It just makes no sense.  Sure, it might be efficient enough to be acceptable, but there certainly is no reason to expect a 50X performance *advantage* for the dynamic buffer.
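
In rough code, the kind of difference being described (the names and the dynamic-buffer detail are illustrative, not M-S-E's actual implementation):

DEFINE VARIABLE oLine   AS OrderLine NO-UNDO.   /* hypothetical PABLO BE */
DEFINE VARIABLE hBuffer AS HANDLE    NO-UNDO.   /* buffer of the entity's temp-table */
DEFINE VARIABLE dPrice  AS DECIMAL   NO-UNDO.

/* PABLO: the value is a property on the object itself. */
dPrice = oLine:ExtendedPrice.

/* Delegate/TT style: each access resolves a field on a dynamic buffer. */
dPrice = hBuffer:BUFFER-FIELD("ExtendedPrice"):BUFFER-VALUE.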

Keep hoping.

It isn't about hope.  It is wanting real facts, real data, real understanding of why there is an advantage to one way over another and real metrics about how significant that advantage is.  The data you have presented from your test so far makes no sense.  The same number of objects, the same DB retrieval strategy, operations are on the objects.  Where does any performance advantage come from in there, much less a 50X one?  Mike *might* have a point that a FOR EACH on a table is faster than a FOR EACH on a collection of objects ... depending on design, but we don't even have that in your test since the operations are on the BEs, not the table.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 00:58

We already have and the results are quite clear to us

What is clear is that you ran a test and got a result.  I have seen nothing yet that explains that result.  Same number of BEs, same access to the DB, a vague idea that maybe assigning properties to a BE is more expensive than something.  It isn't an explanation.  There are many different implementations possible.

Yes, I will be running my own tests, although, without your code, I can't compare my tests to your implementation.  I will, however, be exploring what, if anything, is inefficient.  I haven't really found it yet, other than a recognition we had beforehand that a TT is a pretty heavyweight implementation for a simple collection and it would be nice to have something lighter.  And, if you need a lot of them, better attend to the parameters or you might have a problem.

As I have indicated in a couple of posts today, the additional performance overhead appears to be caused by inserting the instantiated PABLOs into collections and moving data from the PDSs to the BEs and back again, things that M-S-E doesn't need to do.

There are some fairly obvious tests one could do to quantify such suspicions.  There are also a significant number of alternate ways to instantiate the values in a BE.  Not to mention the issue of accessing data that is right in the BE versus a dynamic access through a delegate to a buffer in order to get any data.  Get one value from the BE and the cost of the initialization might dominate the cost of the dynamic access.  Get 100 accesses to the values in the BE and you think the dynamic access is still going to be faster?

I suppose that if you build a lot of BEs and then don't do very much with them, not having to initialize their data or capture their data for persistence might be a saving.  But, what if you are actually going to do a lot of processing with the BE?

Posted by guilmori on 11-Mar-2010 09:07

tamhas wrote:

My test is like the one in the first link, except that I am actually saving all 10000 objects in an array.  Compiling the class drops it to .72.

And... what is your conclusion from this test?  Are you happy with this result?  Almost one second for doing nothing business-related yet?  Do you think 10,000 records (if we're talking about 1 record = 1 instance) is an exceptionally huge amount of data to work with?

Moreover, I don't think using an array directly is a representative test.  I think someone building an enterprise application would favor the use of a collection interface.

I did NEW 10,000 objects and store them in your com.cintegrity.Lib.Collection.List; it took 1.4s using .r code and -q.

1.4s, and we're still not calling any properties or methods on those objects.

[edit] I'm using 10.1C02

Posted by Phillip Magnay on 11-Mar-2010 09:15

tamhas wrote:

What is clear is that you ran a test and got a result.  I have seen nothing yet that explains that result.  Same number of BEs, same access to the DB, a vague idea that maybe assigning properties to a BE is more expensive than something.  It isn't an explanation.  There are many different implementations possible.

It's not a vague idea. It's actually reasonably clear to us. Managing the collections is a significant factor. But copying data from the PDS to the data members of the BE, then copying data back from the data members of the BE back to the PDS is the primary factor driving the poor performance.

tamhas wrote:

Yes, I will be running my own tests, although, without your code, I can't compare my tests to your implementation.  I will, however, be exploring what, if anything, is inefficient.  I haven't really found it yet other than a recognition we had beforehand that a TT is a pretty heavy weight implementation for a simple collection and it would be nice to have something lighter.  And, if you need a lot of them, better attend to the parameters or you might have a problem.

You absolutely should conduct your own tests. In fact, I urge everyone who is seriously considering the PABLO approach to make their own investigations and conduct their own tests. I am quite confident that the results of such tests will be in line with ours.

tamhas wrote:

There are some fairly obvious tests one could do to quantify such suspicions.  There are also a significant number of alternate ways to instantiate the values in a BE.  Not to mention the issue of accessing data that is right in the BE versus a dynamic access through a delegate to a buffer in order to get any data.  Get one value from the BE and the cost of the initialization might dominate the cost of the dynamic access.  Get 100 accesses to the values in the BE and you think the dynamic access is still going to be faster?

I suppose that if you build a lot of BEs and then don't do very much with them, not having to initialize their data or capture their data for persistence might be a saving.  But, what if you are actually going to do a lot of processing with the BE?

If you can develop a PABLO implementation (and a series of realistic tests) that demonstrates that PABLO can be a viable option, then I'm sure everyone would be quite interested in seeing it. 

But so far our investigations into PABLO (and those of several customers) have only shown that the performance is completely unacceptable.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 10:32

Well, yes, I guess I think that 10,000 objects in less than a second is a level of performance we can probably live with.  How often, after all, do you do any operation on 10,000 things without any intervening steps or interaction?  Mostly, it seems to me, hitting that kind of volume is associated with reporting.  Not only am I likely to be using a third party SQL-based tool for reporting, but if I am doing something like bringing up a summary on a screen for interaction, then I am going to do the heavy duty record flogging in the DL and pass the summary data to the BL and UI, so I won't have 10,000 objects.  And, in the event of something like receiving stock for a back-ordered item and allocating it to 10,000 open lines, that is a process which is going to run in batch, so who cares about a second or two.

What's your use case for 10,000 objects being created at once in a context where 1 second is too long?

From Tim's numbers ... and yours too, really ... it appears that storing them in a TT is not as fast, but close.  I'll need to test this on my own box to get a really comparable number.  But, again, I don't see putting 10,000 things into a collection.

Actually, arrays are an interesting thing here.  We already have arrays where we can dynamically size the array prior to use, so in any context in which one knows the count up front and there is no possibility of new entries, like a message data packet, an array makes a fine, simple implementation of a collection without the overhead of a TT, as long as all one needs is sequential access.  If we could get resizeable arrays, I would leap for them.

And, yes, we need more testing doing real work, but at this point I haven't found a place where OO is performing drastically slower than other ABL.  Slower than C# perhaps, but that is an entirely different issue.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 10:43

But copying data from the PDS to the data members of the BE, then  copying data back from the data members of the BE back to the PDS is the  primary factor driving the poor performance.

You have some test results focused on that point?

I think I have half a dozen different ideas about how to initialize a BE with data.  Which ones did you try?

Is there a test-specific aspect here?  E.g., if you have a BE with 30 properties and the data is actually in the BE, then one has to assign 30 properties and read 30 properties, but if the test is accessing only one of those properties, then the M-S-E delegate or accessor approach only has to read or write one property.  In a different test, a much larger number of properties might be accessed.

In fact, I urge everyone who is seriously considering the PABLO approach  to make their own investigations and conduct their own tests.

I wouldn't confine the recommendation to PABLO approaches, but rather would suggest applying it to any approach.  I am very confident that people are capable of making far less efficient PDS-based approaches than you have created with M-S-E and, at the current state, no one is likely to be able to replicate M-S-E without engaging Progress Professional Services.

Posted by guilmori on 11-Mar-2010 10:49

tamhas wrote:

if I am doing something like bringing up a summary on a screen for interaction, then I am going to do the heavy duty record flogging in the DL and pass the summary data to the BL and UI, so I won't have 10,000 objects.

So you suggest interacting with entities sometimes directly through database records, and other times through their classes?

You don't mind using different interfaces to work with the same data, and duplicating a lot of code?

Posted by Thomas Mercer-Hursh on 11-Mar-2010 11:04

My context is very specific.  If the BL has a need for summarized or processed data, then I see no reason that the DL shouldn't do the compilation for it.  I'm not talking about duplicating *any* logic.  This is very focused on questions like "give me a list of customers ranked by total open order volume".  That requires only an ID, name, and total amount.  That is the kind of compilation which I think should be done in the DL.

Posted by Phillip Magnay on 11-Mar-2010 11:26

tamhas wrote:

You have some test results focused on that point?

Our guys dug quite extensively into the underlying causes of PABLO's poor performance. They showed that the copying of the data from the PDS to the BE and back again was the most significant factor.

tamhas wrote:

I think I have half a dozen different ideas about how to initialize a BE with data.  Which ones did you try?

Upon instantiation of the object(s), the object factory simply reads the field values from a buffer in the PDS (after the PDS has been populated) and assigns these values to the relevant properties on the BE.  We tried direct assignment to public properties and indirect assignment by passing the values through the constructor.  Nothing very complicated.  The objects were then inserted into a TT-based collection and then the collection was returned back to the requesting client class.

Upon update, the object collection was passed to a data access component which then walked through each of the objects in the collection to determine which ones had been updated.  For those that had been updated, the before image was added into the relevant table of a PDS with change tracking turned on, then the after image was overlaid.  Then the standard data access components which update PDSs to the database were used.
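
A stripped-down sketch of that factory step, with all class and field names hypothetical; this shows the pass-the-values-through-the-constructor variant:

/* Inside a hypothetical PABLO factory; ttOrderLine belongs to the
   ProDataSet that the DA layer has already filled. */
METHOD PUBLIC Collection BuildOrderLines ():

    DEFINE VARIABLE oLines AS Collection NO-UNDO.   /* TT-based collection class */
    DEFINE VARIABLE oLine  AS OrderLine  NO-UNDO.

    oLines = NEW Collection().

    FOR EACH ttOrderLine:
        oLine = NEW OrderLine(ttOrderLine.OrderNum,
                              ttOrderLine.LineNum,
                              ttOrderLine.Quantity,
                              ttOrderLine.ExtendedPrice).
        oLines:Add(oLine).
    END.

    RETURN oLines.
END METHOD.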

If you have an alternative implementation that has been proven to be faster or more streamlined than this approach, then we would like to see it.

Indeed we would be very interested in seeing any of your half-dozen ideas working. If any of these ideas make PABLO viable, then we would certainly look at it.

tamhas wrote:

In fact, I urge everyone who is seriously considering the PABLO approach  to make their own investigations and conduct their own tests.

I wouldn't confine the recommendation to PABLO approaches, but rather would suggest it applies to any approach.  I am very confident that people are capable of making PDS-based approaches far less efficient than what you have created with M-S-E and, at the current state, no one is likely to be able to replicate M-S-E without engaging Progress Professional Services.

Sure. People might be capable of producing any number of less efficient approaches, PDS-based or otherwise. But a top-notch team of developers with many years of experience in ABL and OO has only been able to prove that PABLO is not viable due to unacceptable performance.

Posted by jmls on 11-Mar-2010 11:47

You can have resizable arrays - to a point

DEF VAR a AS CHAR EXTENT NO-UNDO.   /* indeterminate-extent array */

EXTENT(a) = 5.                      /* size it to 5 elements */

a[5] = "test5".

EXTENT(a) = ?.                      /* reset to indeterminate so it can be re-sized */

EXTENT(a) = 10.                     /* re-size to 10 elements */

a[10] = "test10".

MESSAGE a[5] SKIP a[10].

Julian


Posted by Peter Judge on 11-Mar-2010 11:57

You can have resizable arrays - to a point


Do you see 'test5' still? (I hope so, but I don't with a somewhat recent build).

I've ended up having to do something like the below if I want to keep the existing values:

assign extent(cTempPrimitive)   = Depth            /* size a temp array to the current depth */
       cTempPrimitive           = mcPrimitiveStack /* save the existing values */
       extent(mcPrimitiveStack) = ?                /* clear so the member array can be re-sized */
       iMax = Size.

assign extent(moObjectStack)    = piDepth          /* re-size both member arrays */
       extent(mcPrimitiveStack) = piDepth
       Size = 0.

do iLoop = 1 to iMax:                              /* push the saved values back on */
    PrimitivePush(cTempPrimitive[iLoop]).
end.

(sorry for the OT; back to your regularly scheduled discussion)

-- peter

Posted by Thomas Mercer-Hursh on 11-Mar-2010 11:59

Our guys dug quite extensively into the underlying causes of PABLO's  poor performance. They showed that the copying of the data from the PDS  to the BE and back again was the most significant factor.

You do understand that, unless I know what the actual test was, I don't know what factors might have influenced it or whether some other design might have behaved differently?

Upon update, the object collection was passed to the data access component, which then parsed through each of the objects in the collection to determine which ones had been updated

This sounds like instantiating all lines rather than lazy instantiation.  Which was it?

If you have an alternative implementation that has been proven to be  faster or more streamlined than this approach, then we would like to see  it.

They are all still ideas at this point, so I'll let you know as testing proceeds.

People might be capable of producing of any number of less efficient  approaches PDS-based or otherwise.

I would say that, unless you publish details of how to do M-S-E, the likelihood is that most PDS-based solutions will be significantly less efficient than M-S-E, if for no other reason than the person-years worth of effort which have gone into M-S-E at this point.

Which said, let me focus on two points.

1. I have said previously that the test you describe is a test of one particular type of scenario in an application ... a pretty atypical one.  That doesn't make it a bad test, by any means, because it is always worth seeing what happens when you really bang away at something.  But, to have a good understanding of how a model will behave in the overall context of an application, one should also be testing other scenarios.  In particular, one needs to test the opposite extreme, i.e., situations characterized by heavy update of a single entity or a separately fetched series of single entities.

2. Does it seem reasonable to you that what amounts to two assignment statements would result in a 50X degradation of performance?

Posted by Thomas Mercer-Hursh on 11-Mar-2010 12:05

Yes, it has occurred to me that one could implement collections using two arrays.  When the array needs to extend, extend the other one, copy across, and keep alternating back and forth.  But, it sounds like a lot of overhead if one is doing something like adding elements to a collection without knowing up front how many one might need.  Of course, one might get around that a lot by starting with 1000 or something, i.e., bigger than a lot of collections would ever get.
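
For what it's worth, a minimal sketch of such an array-backed collection (growing via a temporary copy rather than two alternating members; class and member names are illustrative, and doubling is just one growth strategy):

CLASS ObjectArrayList:

    DEFINE PRIVATE VARIABLE moItems AS CLASS Progress.Lang.Object EXTENT NO-UNDO.
    DEFINE PRIVATE VARIABLE miCount AS INTEGER NO-UNDO.

    CONSTRUCTOR PUBLIC ObjectArrayList():
        EXTENT(moItems) = 1000.     /* start bigger than most collections ever get */
    END CONSTRUCTOR.

    METHOD PUBLIC VOID Add(poItem AS CLASS Progress.Lang.Object):
        DEFINE VARIABLE moTemp AS CLASS Progress.Lang.Object EXTENT NO-UNDO.
        DEFINE VARIABLE iLoop  AS INTEGER NO-UNDO.

        IF miCount = EXTENT(moItems) THEN
        DO:
            /* grow: copy to a temp array, re-dimension the member, copy back */
            EXTENT(moTemp) = EXTENT(moItems).
            DO iLoop = 1 TO EXTENT(moItems):
                moTemp[iLoop] = moItems[iLoop].
            END.
            EXTENT(moItems) = ?.
            EXTENT(moItems) = EXTENT(moTemp) * 2.
            DO iLoop = 1 TO EXTENT(moTemp):
                moItems[iLoop] = moTemp[iLoop].
            END.
        END.

        ASSIGN miCount          = miCount + 1
               moItems[miCount] = poItem.
    END METHOD.

    METHOD PUBLIC CLASS Progress.Lang.Object GetItem(piIndex AS INTEGER):
        RETURN moItems[piIndex].
    END METHOD.

END CLASS.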

Posted by Thomas Mercer-Hursh on 11-Mar-2010 12:29

I don't suppose you would care to share the full code somewhere?

Posted by Phillip Magnay on 11-Mar-2010 12:36

tamhas wrote:

Upon update, the object collection was passed to the data access component, which then parsed through each of the objects in the collection to determine which ones had been updated

This sounds like instantiating all lines rather than lazy instantiation.  Which was it?

Lazy instantiation. Objects were instantiated only when a collection is requested. A collection could be requested containing one object. Or multiple objects. Not perfect but simple implementation of the concept. No doubt you will have a better approach.

tamhas wrote:

Which said, let me focus on two points.

1. I have said previously that the test you describe is a test of one particular type of scenario in an application ... a pretty atypical one.  That doesn't make it a bad test, by any means, because it is always worth seeing what happens when you really bang away at something.  But, to have a good understanding of how a model will behave in the overall context of an application, one should also be testing other scenarios.  In particular, one needs to test the opposite extreme, i.e., situations characterized by heavy update of a single entity or a separately fetched series of single entities.

2. Does it seem reasonable to you that what amounts to two assignment statements would result in a 50X degradation of performance?

It is only an atypical testing scenario from your standpoint. The scale may vary but it is a very typical scenario for many of the customers we work with.

But in order to make the best use of time and bandwidth...

Develop your own alternative PABLO implementation and your own tests, and demonstrate that you can overcome PABLO's unacceptable performance. We would definitely be interested in seeing it.

Posted by Peter Judge on 11-Mar-2010 12:49

tamhas wrote:

I don't suppose you would care to share the full code somewhere?

It's still (somewhat) of a work in progress, but attached is my Stack code. It's part of a larger body of work that will be published 'properly' (for some value of properly ) in the (hopefully) not-too-distant future. I'd suggest any discussion on this code be in a new thread.

-- peter

Posted by Tim Kuehn on 11-Mar-2010 13:18

pmagnay wrote:

Our guys dug quite extensively into the underlying causes of PABLO's poor performance. They showed that the copying of the data from the PDS to the BE and back again was the most significant factor.

Upon instantiation of the object(s), the object factory simply reads the field values from a buffer in the PDS (after the PDS has been populated) and assigns these values to the relevant properties on the BE. We tried direct assignment to public properties and indirect assignment by passing the values through the constructor. Nothing very complicated. The objects were then inserted into a TT-based collection and the collection was then returned to the requesting client class.

Upon update, the object collection was passed to the data access component, which then parsed through each of the objects in the collection to determine which ones had been updated. For those that had been updated, the before image was added into the relevant table of a PDS, tracking changes was turned on, then the after image was overlaid. Then the standard data access components which update PDSs to the database were used.

If moving the data to the PDS's .bi and .ai images is where the slowdown is happening, then why not do a direct single-record update instead of running it through a PDS?

Posted by Thomas Mercer-Hursh on 11-Mar-2010 13:26

BTW, this indentation is driving me nuts.  I have to look at the e-mail version or respond in order to read a post.

Objects were instantiated only when a collection is requested. A  collection could be requested containing one object. Or multiple  objects. Not perfect but simple implementation of the concept

OK, just to clarify.  In your test, you added a line, deleted a line, and modified a line.  What was the logic for deleting and modifying, i.e., did the lines need to all exist first or did you just instantiate those two lines?

No doubt you will have a better approach.

Maybe, maybe not.  I need to understand it first.

It is only an atypical testing scenario from your standpoint.

I have been doing distribution applications since the late 70s so I do have a pretty rich experience base to draw on.  And I've been involved professionally in some major benchmarking efforts from the early 70s ... 60s really, so I have a little experience there as well.  One of the problems I keep having in a number of these scenarios is wondering what real world task actually corresponds to a proposed test.  There are, of course, lots of different kinds of test.  There are times when one wants a very simple test just to get a feel for whether something is possible, what one might call limit tests.  E.g., knowing that one can create 10,000 empty objects in .7s provides us with a kind of limit.  If our requirement is 100,000 per second of real objects, then we know we have a problem.  But, if it is likely that the real requirement is significantly less than 10,000, then we can feel like it is OK to go ahead.  A second kind of test is to model some real world problem, especially if it is one that has a performance requirement.  E.g., I need to be able to do an MRP run in under an hour kind of thing.  And, then, there is testing where one creates a mix of jobs and sees how the thing works as a system.

I'm having trouble thinking of a real world scenario corresponding to your test.  The closest I come to where there is a mass update is something like a mass allocation of a back-ordered item or an item substitution or item repricing.  That is going to be all open order lines of a particular item number and possibly all order headers containing those lines.  And, it is the kind of operation which is typically done as a batch run with a report, so millisecond performance is not critical, especially since the task is only run a few times a day at most.  So, I'm not sure that it is telling me a lot about the way most functions are going to perform.

Compare, for example, to a reporting requirement which is not supported by a good index, i.e., the sort of thing where one is likely to read a lot of records in a way which is performant for the database and create a temp-table and then do the report from the temp-table.  If the access mode for the temp-table was something that one needed to do in real time, then one would add an index or some other mechanism to avoid having to use a TT intermediary, but, because it is a report, the performance penalty of having to use a TT is acceptable.

As for working on my own models, yes I plan to.  But, I don't think this removes the need for trying to understand the rather surprising results which you have reported.  I ask again, does it seem to you to be reasonable that two batch assignment statements would result in a 50X difference in performance?   Do you think I should expect to be able to replicate that behavior in an isolated test?

Posted by Phillip Magnay on 11-Mar-2010 13:28

timk519 wrote:

If moving the data to the PDS's .bi and .ai images is where the slowdown is happening, then why not do a direct single-record update instead of running it through a PDS?

With the aim of making it an apples-to-apples comparison, we wanted to use the same DA layer.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 13:30

Thanks.  Yes, I agree this should be a new thread if we are going to pursue it, although it would be interesting to test it out for collection performance.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 13:44

Understandable.  It is fairly easy to re-use existing work and keeps a lot of things fairly common.  Mostly good things.  But, getting the results you have, it does seem like one might want to wonder if there is some coupling problem that is getting in the way.

Posted by Phillip Magnay on 11-Mar-2010 14:14

tamhas wrote:

OK, just to clarify.  In your test, you added a line, deleted a line, and modified a line.  What was the logic for deleting and modifying, i.e., did the lines need to all exist first or did you just instantiate those two lines?

Only the objects for the lines that were created/modified/deleted were instantiated.

tamhas wrote:

I have been doing distribution applications since the late 70s so I do have a pretty rich experience base to draw on.  And I've been involved professionally in some major benchmarking efforts from the early 70s ... 60s really, so I have a little experience there as well. 

This is beside the point. I'm sure there are other developers here on this forum that are at least as experienced as you who have a different viewpoint.

tamhas wrote:

I'm having trouble thinking of a real world scenario corresponding to your test.  The closest I come to where there is a mass update is something like a mass allocation of a back-ordered item or an item substitution or item repricing.  That is going to be all open order lines of a particular item number and possibly all order headers containing those lines.  And, it is the kind of operation which is typically done as a batch run with a report, so millisecond performance is not critical, especially since the task is only run a few times a day at most.  So, I'm not sure that it is telling me a lot about the way most functions are going to perform.

I was working with a customer *yesterday* on a problem which followed a very similar scenario where a complex multi-record update was being performed via a graphical UI. Without violating customer confidentiality, the use case involved a master record maintenance where changes on the master were propagated to many child records and the changes on the children were then propagated to their many children - all in a single transaction scoped to the master. Response time and performance was the critical issue.

You might think the test is not typical. Others are entitled to a different view.

tamhas wrote:

As for working on my own models, yes I plan to.  But, I don't think this removes the need for trying to understand the rather surprising results which you have reported.

These results are not uncommon. Guillaume (a very pro-OO participant in these discussions here) has also clearly indicated that he has experienced similar unacceptable performance issues. Several customers we've been working with have, in a similar fashion, wanted to adopt OO and tried the PABLO approach. Their efforts have all run into unacceptable performance.

tamhas wrote:

I ask again, does it seem to you to be reasonable that two batch assignment statements would result in a 50X difference in performance?   Do you think I should expect to be able to replicate that behavior in an isolated test?

There is much more to it than just two batch assign statements. If you actually go through the process of developing your own PABLO implementation, then you will finally discover that for yourself.

As for you being able to replicate the same behavior in an isolated test, I'll keep an open mind.  If you actually develop a PABLO implementation which delivers better performance than what we've experienced to date, then I'm sure people will want to take a look.

Posted by Tim Kuehn on 11-Mar-2010 14:14

pmagnay wrote:

timk519 wrote:

If moving the data to the PDS's .bi and .ai images is where the slowdown is happening, then why not do a direct single-record update instead of running it through a PDS?

With the aim of making it an apples-to-apples comparison, we wanted to use the same DA layer.

Then change the DA layer to allow for single-record updates as well as supporting PDS-sourced updates.

Posted by Phillip Magnay on 11-Mar-2010 14:24

timk519 wrote:

Then change the DA layer to allow for single-record updates as well as supporting PDS-sourced updates.

That's an interesting idea. Perhaps you could give me more detailed guidance as to how one would go about creating a DA layer that discriminates transparently between single-record updates and PDS-sourced updates without creating redundant code.

Posted by Thomas Mercer-Hursh on 11-Mar-2010 14:41

You might think the test is not typical.

What it sounds like you are describing is something I can imagine, i.e., edit customer, change discount code, read all open orders of customers, inspect all lines of those orders and update with the new discount.  That seems like a different order of magnitude than all orders for all customers.

I am  looking at Guillaume's tests and will comment as I get results.

The point, Phil, is that, if there is an inherent problem, we ought to be able to understand why that problem exists.  Possibly it is something that needs to be addressed by development and we are out of luck until they do.  Possibly it is something that will respond to a different design.  Until we know what it is, we are left making a decision based on little more than guesses.

You were the one who pointed the finger at assigning and reading the properties.  Doesn't it seem a bit inexplicable to you that, reading and writing the same number of records (DB activity often swamping what happens in the session), creating and deleting the same number of objects (except for the collection objects, which isn't a large number), and executing the same add/delete/modify operations, there should be a 50X difference?  It just doesn't make sense.

Posted by Tim Kuehn on 11-Mar-2010 14:42

pmagnay wrote:

I was working with a customer *yesterday* on a problem which followed a very similar scenario where a complex multi-record update was being performed via a graphical UI. Without violating customer confidentiality, the use case involved a master record maintenance where changes on the master were propagated to many child records and the changes on the children were then propagated to their many children - all in a single transaction scoped to the master. Response time and performance was the critical issue.

You might think the test is not typical. Others are entitled to a different view.

I've got a customer with a business process that does something similar to this all the time. The notion of reading data into a PDS, doing something to it, and then writing it back to the db gives me the shudders - both in terms of implementation and performance challenges.

The only thing the code passes in right now is "X happened to Y at time T", and the system makes the initial adjustment and all cascading changes from there.

Now, if the PABLO part of the system was coded to handle events and then ran lower level 'drivers' to carry out the business implications of those events, that would be a more performant, albeit not-pure-OO solution (unless the 'drivers' were all static classes....)

Posted by Thomas Mercer-Hursh on 11-Mar-2010 14:43

Why redundant?  One is a FIND and the other is a QUERY.  Those tend to have very different contexts.
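
To illustrate, the single-record path is essentially just a method along these lines (illustrative names, not anyone's actual DA layer; OrderLineBE is a hypothetical entity class):

METHOD PUBLIC VOID SaveOrderLine(poLine AS OrderLineBE):
    DEFINE BUFFER bOline FOR OrderLine.

    DO TRANSACTION:
        FIND bOline
             WHERE bOline.Ordernum = poLine:Ordernum
               AND bOline.Linenum  = poLine:Linenum
             EXCLUSIVE-LOCK.
        ASSIGN bOline.Price = poLine:Price     /* copy just the changed properties to the record */
               bOline.Qty   = poLine:Qty.
    END.
END METHOD.

The PDS-sourced path, by contrast, typically applies the before-table rows with SAVE-ROW-CHANGES, so the two don't have to share much code to coexist in the same DA class.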

Posted by guilmori on 11-Mar-2010 15:15

What about this case.

We need to calculate a project score based on shop production data.

There are many conditions that govern how the score is calculated depending on product type and other factors.

On some projects, this means going through 25k records.

You do all this in DL ?

Posted by Thomas Mercer-Hursh on 11-Mar-2010 15:47

I'd need to know more details, but possibly.  E.g., consider wanting a list of the top ten customers by open order value.  That list is really just a piece of information.  After you get the list, you might drill down or something, but that is a separate operation to getting the list in the first place.  Can you see any point in marshalling all that data into the BL just to accumulate totals, sort, and then select the top 10?
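
For instance, a minimal sketch of that kind of DL-side compilation against sports2000 (the temp-table, and the OrderStatus test as an approximation of "open", are illustrative assumptions):

DEFINE TEMP-TABLE ttRanked NO-UNDO
    FIELD CustNum   AS INTEGER
    FIELD CustName  AS CHARACTER
    FIELD OpenValue AS DECIMAL
    INDEX idxValue OpenValue DESCENDING.

DEFINE VARIABLE iCount AS INTEGER NO-UNDO.

FOR EACH Customer NO-LOCK:
    CREATE ttRanked.
    ASSIGN ttRanked.CustNum  = Customer.CustNum
           ttRanked.CustName = Customer.Name.
    FOR EACH Order NO-LOCK
        WHERE Order.CustNum = Customer.CustNum
          AND Order.OrderStatus <> "Shipped",        /* "open" approximated by status */
        EACH OrderLine OF Order NO-LOCK:
        ttRanked.OpenValue = ttRanked.OpenValue + OrderLine.ExtendedPrice.
    END.
END.

/* only the ID, name, and total for the top ten ever leave the DL */
FOR EACH ttRanked BY ttRanked.OpenValue DESCENDING:
    iCount = iCount + 1.
    IF iCount > 10 THEN LEAVE.
    /* copy ttRanked.CustNum, ttRanked.CustName, ttRanked.OpenValue into whatever goes to the BL */
END.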

Posted by Thomas Mercer-Hursh on 11-Mar-2010 16:22

Bringing this back out to the left ....

Done a couple of tests now ... I think there is some further diagnosis needed, but let me put a couple of things out there.

In Phil's results he is reporting an average of about 2 seconds per customer.  With 1117 Customers, that is about 2234 seconds for the full task compared with apparently about 45s for the M-S-E version.

With 1117 customers and 3953 Orders and three lines per order (add/delete/modify) that is a total of 16929 objects which need to be created, values assigned and values read.  Once the values are assigned, the same logic will run in both cases, only the M-S-E version will need to get or put any data into the model instead of just having it locally.  Then the values need to be read again.

I converted my test for creating empty classes to create classes with 10 properties and then assigned values to each property of each class as it was newed.  My first run on this got me 1.3s for 10000, but subsequent runs dropped it to half that ... not sure why.  I then tried to do 16929 and the time wasn't going up appreciably, so I began to wonder if something funny was going on with using an array so I switched to a TT.  That is now getting me 1.75s for 10,000 objects and about 4.5s for 16929.  There is something oddly non-linear there since 12500 is 2.97s and 15000 is 3.22s.  In particular, note that not all of these classes need to exist at the same time, so I am putting stress on memory here that wouldn't exist in the actual problem.  But, worst case my figure for creating the objects and populating the properties is about 10% of the M-S-E figure and about 0.2% of the apparent result of Phil's test.  Now, to be sure, there is a lot of logic missing in this test compared to Phil's but it does demonstrate that one can generate the same number of PABLO objects and assign values to them in a tiny fraction of the time reported for the PABLO solution in the test.
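
A minimal sketch of that kind of harness (TestEntity being a hypothetical class whose constructor assigns ten properties):

DEFINE TEMP-TABLE ttObject NO-UNDO
    FIELD oEntity AS Progress.Lang.Object.

DEFINE VARIABLE iLoop AS INTEGER NO-UNDO.
DEFINE VARIABLE iTime AS INTEGER NO-UNDO.

iTime = ETIME(YES).                 /* reset the millisecond timer */

DO iLoop = 1 TO 16929:
    CREATE ttObject.
    /* TestEntity stands in for a class with ten properties set via its constructor */
    ttObject.oEntity = NEW TestEntity("v1", "v2", "v3", "v4", "v5",
                                      "v6", "v7", "v8", "v9", "v10").
END.

iTime = ETIME.
MESSAGE "Created and populated 16929 objects in" iTime "ms".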

Something just doesn't compute here.

Admittedly, I am comparing numbers on my box to numbers on Phil's since I don't have his code to run here.  Still, it seems like there are a couple of orders of magnitude of missing explanation.

Posted by Tim Kuehn on 11-Mar-2010 17:29

pmagnay wrote:

timk519 wrote:

Then change the DA layer to allow for single-record updates as well as supporting PDS-sourced updates.

That's an interesting idea. Perhaps you could give me more detailed guidance as to how one would go about creating a DA layer that discriminates transparently between single-record updates and PDS-sourced updates without creating redundant code.

I'm having a hard time trying to figure out whether this is a serious request.

Are you assuming wide TTs or a generic dynamically-driven DA layer?

Posted by guilmori on 12-Mar-2010 07:08

Could you try making your class inherit some base class hierarchy and also implement some interfaces, to see what kind of overhead it adds.

Posted by Thomas Mercer-Hursh on 12-Mar-2010 11:11

I will put that on my list of things to test, but I don't think it is germane to the immediate problem since those are issues which should be resolved at compile-time, not run-time.

Posted by Thomas Mercer-Hursh on 13-Mar-2010 17:22

In talking about all of this to someone else, I have realized that there is even a bigger mystery here than I first realized.  We have been talking as if the big difference between the two was the need to assign all properties of the PABLO objects and read all properties of the PABLO objects at the end, but actually, there is a lot more going on.  Given data that starts in and ends up in a PDS, consider these two descriptions of what happens when a BE is created:

M-S-E

Create the BE object

Create a delegate object

Delegate object dynamically creates a buffer on the TT and points it to the right record

Assign the delegate object to the BE

PABLO

Create the BE object

Assign all properties of the object.

During use, we have the same logic in both, but if we need to use or change data, the PABLO object has the data right there as an existing property, but the M-S-E BE needs to use the delegate which will create a dynamic access on the buffer based on the field name.

When we are done with the BE, we get this:

M-S-E

Delete the BE

BE's destructor deletes the delegate

Data is already in the PDS

PABLO

Read the properties of the BE and assign to PDS

Delete the BE

So, yes, the PABLO approach requires two block assignment statements which M-S-E doesn't require, but M-S-E needs to create and delete a delegate object which needs to create a dynamic buffer and all data access requires dynamic code.
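
For concreteness, the two block assignments in the PABLO case amount to something like this (illustrative names; oLine is the PABLO BE, ttOline the PDS buffer):

/* at instantiation: PDS buffer -> BE properties */
ASSIGN oLine:Ordernum = ttOline.Ordernum
       oLine:Linenum  = ttOline.Linenum
       oLine:Itemnum  = ttOline.Itemnum
       oLine:Price    = ttOline.Price
       oLine:Qty      = ttOline.Qty.

/* when persisting: BE properties -> PDS buffer */
ASSIGN ttOline.Itemnum = oLine:Itemnum
       ttOline.Price   = oLine:Price
       ttOline.Qty     = oLine:Qty.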

One might also ask, if the before-image features of the PDS are being used for the update to the DB in M-S-E, why isn't that also being used in the PABLO test?  The M-S-E BEs don't appear to have any BI functionality since they don't have any data in them.  Why is there BI functionality in the PABLO objects?

Posted by Phillip Magnay on 13-Mar-2010 20:07

"One might also ask, if the before-image features of the PDS are being used for the update to the DB in M-S-E, why isn't that also being used in the PABLO test?"

How would a PABLO use the BI functionality of a PDS if there is no PDS in the BL?

Sent from a traveling Blackberry...

Posted by Thomas Mercer-Hursh on 14-Mar-2010 01:08

No reason for it not to be still in the DL.

Per your test, the PDS and the DL access and persistence elements were among the parts held consistent across the test.  To be sure, there are a bunch of strategies one might use, but my understanding of your test is that the DL part was substantially the same for both models.

Posted by Phillip Magnay on 14-Mar-2010 09:47

Yes, in our tests the DL was substantially the same. However, in our tests the PABLOs and the PABLO collections are completely decoupled from the PDS that was used to create them. Therefore the PABLOs could not utilize the PDS BI functionality - they had to implement their own.

Again, if these PABLOs and PABLO collections are decoupled from the PDS, how would the PABLOs utilize PDS BI functionality? Or is the suggestion now that these PABLOs indeed be coupled to a PDS in order to utilize PDS BI functionality?

If so, isn't this moving away from PABLO and closer to M-S-E?

Also, if PABLOs are now coupled to a PDS (to utilize PDS functionality), why not utilize all of the functionality of the PDS and just adopt M-S-E?

Sent from a traveling Blackberry...

Posted by Thomas Mercer-Hursh on 14-Mar-2010 10:47

They could use the PDS in the same way one does with a non-ABL client.  Bring back the BE to be persisted, turn on tracking changes, and update the PDS record.

I'm not saying one way or the other that this is my recommendation, just that it is a possibility.  I think the idea of BEs carrying around their own BI functionality is a dubious notion though.

Posted by Phillip Magnay on 14-Mar-2010 11:37

If our PABLO implementation does not meet your standards then show us all how a PABLO implementation should be done. Show us all how the woeful performance of PABLO can be solved.

Sent from a traveling Blackberry...

Posted by Thomas Mercer-Hursh on 14-Mar-2010 12:17

There are two entirely separate questions here.

One question is, if one was going to use a PABLO approach what are the pros and cons of various design models and what might cause us to pick one over another.  I'm working on a discussion and testing  to help with this consideration.

But, really, before one can go into detail on that, one has the question of whether the PABLO approach is even feasible from a performance perspective.  You are making the claim that it is not, by a 50X factor.  In fact, one gets the impression that you are relying on those results to advise customers that a PABLO approach is not feasible from a performance perspective and that one must use a PDS-based system.

It is this second question we are addressing here.  Thus far, my testing suggests that there is no 50X difference.  I can't yet really say what difference there is or which might be faster for any given task, but it seems on the face of it that there must have been something involved in your test which is not apparent, unless, of course, it was the overhead of your BI implementation in the BE.

To me, if one gets a test result like that, it behooves one to figure out why one got the result.  Is there some operation in the PABLO approach which is exceptionally slow?  Is it intrinsic to the PABLO approach or might there be a design which avoided that slowness?  Is there some error in the design so that the test is flawed?

How can one know whether or not one has a valid result without making that kind of inquiry?  Aren't you concerned that you might be directing people away from using a PABLO approach on faulty evidence?

Posted by Admin on 14-Mar-2010 12:32

Aren't you concerned that you might be directing people away from using a PABLO approach on faulty evidence?

So far I've heard (off line) comments from a number of people that PABLO is not interesting because there are no samples of a large enough scale that one can make one's own assessment based on facts like performance, maintainability, ease of coding, ABL-ness, etc. - and because of the overall theoretical nature of threads like this one.

Phil has already laid out that he can't share the code right now due to legal restrictions.

How about your ability to share your test code?

Posted by Phillip Magnay on 14-Mar-2010 12:42

"In fact, one gets the impression that you are relying on those results to advise customers that a PABLO approach is no feasible from a performance perspective and that one must use a PDS-based system."

Actually, customers are coming to us after trying PABLO and doing their own testing, and rejecting PABLO because the performance is unacceptable.

"Aren't you concerned that you might be directing people away from using a PABLO approach on faulty evidence?"

Not at all. We are quite confident that our testing has shown PABLO not to be viable.

Sent from a traveling Blackberry...

Posted by Thomas Mercer-Hursh on 14-Mar-2010 12:53

I will be sharing code after I develop it.  I can't share what doesn't exist.  When I share it, it will not be a single monolithic model, but a bunch of ideas that people can consider and pick from according to their own needs and preferences.

But, there isn't any point in my trying to work on a PABLO approach if there really is an intrinsic 50X penalty, is there?  Why try if that is real?

Frankly, I don't think that this thread is theoretical at all.  I think it is about testing to find out where and why performance problems exist.  If people are making design choices based on performance issues, it is important that those performance issues are valid.  It is important whether there are alternatives that avoid the performance issues.

Phil's test, which I am a bit surprised hasn't been mentioned before this, is a big WOW.  If it is true, it is a major signpost in the road to a design saying "Don't go this way!"  There clearly are some valid signposts like that, such as needing to be moderate in the number of simultaneous temp-tables.  But, is this one a valid sign?  If so, why?  What makes it so?  Hard to know whether alternatives exist until we understand that.

BTW, "WOW" is a term used in some investment circles for Wall of Worry.  It means some public event that causes investors to lose confidence in a stock resulting in a major price drop.  But, specifically it relates to cases where the reaction is unjustified, e.g., an unfounded rumor or an interpretation of some news which turns out to be a misinterpretation.  A confident investor looks for a WOW event because it presents a buying opportunity for a stock which has an artificially depressed price.

Posted by Thomas Mercer-Hursh on 14-Mar-2010 13:05

We are quite confident that our testing has shown PABLO not to be  viable.

Why are you confident?  Doesn't one need to be able to explain a result in order to have confidence in it?  So far, the proposed explanation doesn't seem to be supported by additional testing.  Doesn't that suggest that either there was something wrong with the test or that there is a different explanation?

Actually, customers are coming to us after trying PABLO and doing their  own testing, and rejecting PABLO because the performance is  unacceptable.

So, the question should be, is there some intrinsic reason why PABLO is not going to provide acceptable performance or is it simply a question of people making some error in their design which produces the bad performance?  We know, for example, that people have tried a form of PABLO in which they put a one-line TT in every object in order to take advantage of the built-in serialization.  OK, we understand why that is a performance problem, but it doesn't thereby mean that every possible PABLO implementation is going to have the same performance problem.
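
For anyone who hasn't seen that anti-pattern, it looks roughly like this (illustrative names): every instance defines its own one-row temp-table just to get the built-in serialization, so every NEW creates another temp-table and runs straight into the TMTT problem.

CLASS OrderLineBE:

    /* each instance of this class gets its own temp-table instance */
    DEFINE PRIVATE TEMP-TABLE ttSelf NO-UNDO
        FIELD Ordernum AS INTEGER
        FIELD Linenum  AS INTEGER
        FIELD Price    AS DECIMAL
        FIELD Qty      AS INTEGER.

    /* the properties delegate to the single ttSelf row so the TT's built-in
       (de)serialization can be reused; thousands of instances mean thousands of TTs */

END CLASS.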

Understand, I'm not trying to say that PABLO is the only way.  There are pros and cons to every approach.  I just want people to be able to make intelligent, informed decisions based on real attributes and real data.  At this point, this test doesn't seem to me to be passing the sniff test as real data.  I don't have to build an entire alternate PABLO implementation to question the data.

Posted by ChUIMonster on 14-Mar-2010 13:22

If there really were a 50x difference in performance and people were coming forward to complain about it then examples should be readily available, easy to produce and a significant benefit to the community.  Not doing so stretches credibility to the breaking point.

Posted by Admin on 14-Mar-2010 13:37

I will be sharing code after I develop it.  I can't share what doesn't exist. 

I'm a bit surprised right now. Given the amount of energy you spend convincing people that PABLO is the one and only (my interpretation) good OO/OERA implementation, I was expecting you to have some samples available that prove your theories right.

Frankly, I don't think that this thread is theoretical at all.  I think it is about testing to find out where and why performance problems exist. 

It is theoretical as long as we are discussing a solution into which we are not allowed insight and a solution that does not (yet) exist in code.

Posted by Thomas Mercer-Hursh on 14-Mar-2010 13:40

I suppose the customer experiences are confidential.  Whatever they are, they are also incomplete if all they are is an anecdote about a particular company trying something and failing to succeed.  On the other hand, if the customers ran some test and were able to point a finger and say "we don't think we can go this route because this test shows that the performance of X will not meet our requirements", then we would have something that was not confidential (we only need to know about the test, not who ran it), repeatable, and concrete enough to sink our teeth into.

Posted by Thomas Mercer-Hursh on 14-Mar-2010 13:54

Given the amount of energy you spend convincing people that PABLO is the one and only (my interpretation) good OO/OERA implementation

I don't know where you get this impression.  You were there for the talk in Rotterdam ... all pros and cons.  Sure, I think there are some really good reasons to be exploring the PABLO approach or I wouldn't be doing it, but haven't you noticed that I quite regularly praise M-S-E for the things they have worked out well?  Not only do I make no pretense of being privy to the One Right Way, I don't believe there is such a thing as a One Right Way.  There are choices and consequences of choices.  If I am ardent in trying to defend PABLO from detractors, it is simply because there is a strong bias in many camps toward PDS based solutions.  Those approaches have their champions.

It is theoretical as long as we are discussing a solution into which we are not allowed insight and a solution that does not (yet) exist in code.

Go back and read the original post that started this thread.  The whole question here is not "my pattern is better than your pattern" or "my pattern performs better than your pattern".  The point of the thread is that there have been performance claims that suggest working in OOABL is a problem.  This isn't even confined to PABLO vs M-S-E because some of those claims are about any OO.  The purpose here is to figure out what is and isn't a problem and put some numbers on it.  If there is a problem, are there alternatives and what are the implications of choosing an alternative?  Is there something that seems wrong that we should be lobbying PSC about?  What do we need to be aware of in making choices in our designs?

Posted by Phillip Magnay on 14-Mar-2010 13:57

I believe Guillaume distributed his code to this forum and he had similar results to our own. I am simply not permitted to distribute any code from these projects without obtaining express permission and that is like getting a colonoscopy. If that is too incredible for you, then it must be nice to live in wonderland.

But it's absolutely OK to question these tests. As I stated earlier, I urge anyone who is interested in PABLO to satisfy themselves by doing their own tests. But we haven't come across a customer who has done so and hasn't rejected it due to unacceptable performance.

Sent from a traveling Blackberry...

Posted by Phillip Magnay on 14-Mar-2010 14:02

It's OK. You don't have to accept our tests or our experiences. We have satisfied ourselves that the performance of PABLO is unacceptable. Again. If anyone is interested in PABLO, I urge them to do their own investigations and own tests.

Sent from a traveling Blackberry...

Posted by Thomas Mercer-Hursh on 14-Mar-2010 14:14

I have Guillaume's code and will be working through it.  I've been distracted from that by exploring your claim.  But, from what I have seen, Guillaume's tests are not directed at the same question.  They are focused on very specific issues.  E.g., one tests instantiating a large number of simple objects with no logic versus objects with the same data members, but a bunch of methods.  He is testing the presumption that the code should be re-entrant and therefore the time to instantiate the second set should be the same as the first set, which it doesn't seem to be.

That is not a test which relates to the feasibility of using PABLO versus PDS.  Guillaume does seem to think there are issues there, but I'm not clear at this point what he has actually tried and not tried or what it is that he is doing instead.

I don't think anyone expects you to distribute customer code.  It does seem that if you are going to educate people about M-S-E, you are eventually going to have to get around to releasing some sample code.

But, I don't think that is the issue here.  The issue is whether or not there are specific performance problems which militate against seriously pursuing a PABLO approach ... and which presumably are not a problem for M-S-E since you seem satisfied with its performance.  It doesn't take full blown complex implementations to expose performance problems.  In fact, full blown complex implementations conceal the source of the performance problem unless the problem is in the implementation itself.  If the problem is something fundamental, then a simpler test will show the problem more clearly and point the finger directly at the issue.

So, what I am asking is for you to help us point that finger.  Help us figure out why you got the results you did.  So far, my testing doesn't seem to be finding the problem and seems to suggest that the problem doesn't exist unless it is a problem in the specific implementation.  We need to understand the problem before we can decide what our response should be.

Posted by Thomas Mercer-Hursh on 14-Mar-2010 14:16

Again, why are you satisfied?  If you understand why you got the results you did, share the reasons with us so that we can verify them for ourselves and consider the implications.  If you don't understand why you got the results, why are you satisfied?

Posted by Tim Kuehn on 14-Mar-2010 14:25

pmagnay wrote:

It's OK. You don't have to accept our tests or our experiences. We have satisfied ourselves that the performance of PABLO is unacceptable. Again. If anyone is interested in PABLO, I urge them to do their own investigations and own tests.

Why? If Progress has information that indicates that using a given technique is a Really Bad Idea, then they should be making that information available to its partners so they don't invest a bunch of resources into a technological dead-end.

At the very least you should be able to post some sample code which illustrates the overall structure that's being used and indicate where the performance hit is the greatest.

Posted by Phillip Magnay on 14-Mar-2010 14:37

"If you don't understand why you got the results, why are you satisfied?"

I've stated our explanation a number of times now. But you're never satisfied with it. So we go 'round and 'round and 'round and arrive at this same point again. I am not going to explain it again after this one last time.

We found the poor performance of PABLO to be primarily caused by the fact that PABLOs are decoupled from the PDS that is used to retrieve data from and update data to the database. This decoupling requires data from the PDS to be moved into the PABLOs, the before-image data to be managed inside the PABLO, and then the after-image and before-image data to be moved back into the PDS for update. Inserting and parsing through TT-based collections of PABLOs also adds some overhead.

That's it. That's our explanation. If you dismiss it once again with "all you need is a couple of batch assigns" and some better, yet unspecified, way of handling the before-image problem, then please don't ask me to go through this explanation again.

Posted by Thomas Mercer-Hursh on 14-Mar-2010 15:42

At the level you have just stated your explanation, it is effectively "this whole bunch of things we did here" is 50X slower than "this bunch of things we did there".  That might be a test to indicate your specific PABLO implementation was flawed, but it is in no way a condemnation of PABLOs as a whole.

Let's break it down a little.  What you seem to be saying is:

1. We had to move the data into the BE (as opposed to accessing it through the delegate)

2. We managed the BI information in the BE (as opposed to using TRACKING CHANGES on the PDS)

3. We had to move the data back out of the BE into the PDS.

So, I have done a test which strongly suggests that the assignment necessary to move data into the BE is not a performance problem.  Frankly, it would have been pretty astounding if it had been.  Well, that's parts 1 and 3.

So, does this mean that the 50X came from #2?  If so, I have to wonder about your implementation, especially since I wouldn't be inclined to make an object responsible for its own BI unless that was built-in.  There certainly are many other ways to get the job done, especially since you are putting the data back into the PDS at the end.  Why not use TRACKING CHANGES on the PDS anyway?  Or, if there is a reason you don't like that, try any one of a number of other techniques.  I can give you a whole list.

There is no smoking gun here; no identification of a language feature which inherently makes the PABLO approach 50X slower than M-S-E.

I'm working on a more exhaustive and comparative test, but if you just ignore it and say that I must be missing something then we aren't going to advance the state of knowledge very much.

When I was in grad school, a professor asked me to write a program to implement the algorithm in a paper he gave me.  It related to a method for estimating the equilibrium point if one had a group of populations which had genetic differences and limited migration between them.  The algorithm was an incredibly complex thing full of theta transforms and such which I was grasping to understand, but something kept niggling my brain about the very first equation.  Eventually, a light bulb went on, I did a minor piece of algebra on that first equation, and presto I had a classic formulation of a thing called the "characteristic value equation".  What was magic about that is that the characteristic value equation is heavily studied and one simply has to extract a thing called the first eigenvector from the matrix and there is the equilibrium condition ... not an estimate, but the exact solution.  Moreover, there were well-defined conditions that would tell one simply whether a solution existed.  I wrote to the authors explaining my discovery.  They wrote back to me and my professor announcing that they were sure I had made a mistake, but they were too busy to find it right now, and, oh, would I be interested in coming to their lab for the summer for a job.

"You must have made a mistake somewhere" or "That's not quite the same thing" is not an explanation.

Posted by guilmori on 15-Mar-2010 08:57

I'm not sure that my tests are directly related to your comparison of PABLO vs M-S-E.

They are really basic samples trying to illustrate some performance overhead in very specific OO areas, versus a traditional procedural/relational approach. They also compare some results with C#, since our OO experience comes from C#.

Some of these tests, and the help of PSC consulting, were used 2 years ago to guide us in building an OO architecture, but the result was more a PDS-centric model with .p replaced by .cls, without making much use of all those great OO concepts. M-S-E seems light years ahead of what we have. It is quite a shame that we didn't hear that M-S-E was/would be in the works back then... as we will probably not pay the price of a whole re-architecture again.

Posted by Thomas Mercer-Hursh on 15-Mar-2010 11:43

Phil, when you encountered the 50X performance difference, did it occur to you to report this to tech support or the development team?  If there is a genuine language performance problem here, rather than just an issue with the specific implementation, it seems like a report would have been warranted.

In particular, I am wondering if there is a problem that might be impacting other parts of the application.  My understanding is that M-S-E delivers BEs and ESs which look and behave very much like regular BEs and collections to the client and that the other parts of the application are presumably good old objects of the ordinary sort.  Might they not be vulnerable too?  Certainly, they get properties assigned to and read from them.

Posted by Thomas Mercer-Hursh on 15-Mar-2010 11:48

FWIW, Guillaume, moving to M-S-E *might* be less expensive than you might think since the M-S-E components are produced by MDA from UML models.  So, to the extent you could reverse engineer your existing code into UML models, you might be able to get M-S-E output.  Unfortunately, you are also on the wrong side of the pond for this to be simple, I think.  People on your side are doing the iMo thing, which shares that UML foundation, but which does not appear to be as highly evolved as M-S-E.

Posted by Admin on 15-Mar-2010 18:29

It does seem that if you are going to educate people about M-S-E, you are eventually going to have to get around to releasing some sample code.

 

I may be repeating myself. But I see a much higher demand for PABLO sample code.

So far, my testing doesn't seem to be finding the problem and seems to suggest that the problem doesn't exist unless it is a problem in the specific implementation. 

You seem to have some "PABLO"-like code to run relevant tests. Can't you share that with us? You know I'm pretty skeptical about PABLO for a number of reasons (performance is only one of them). But I'm open to being convinced of the opposite. By working code - not additional megabytes of posts on this forum.

Posted by Thomas Mercer-Hursh on 15-Mar-2010 19:23

I may be repeating myself. But I see a much higher demand for PABLO  sample code.

Why?  The total released code base of M-S-E code is 0 lines.  Yes, we know that somewhere in PPS and their customers it is actually in use, but we can't see any of it.  The total publicly released documentation is one UML diagram and scattered descriptions in various posts, many of which are by me.

You seem to have some "PABLO" like code to run relevant tests. Can't you  share that with us?

There are a couple of significant points here.

First of all, a PABLO object is not some complex mysterious thing that we need a ton of documentation to get a handle on.  Define a class, add whatever properties it needs, include a few methods if it has behavior, and presto you have a PABLO object.  Now, there are a bunch of other things that one needs to figure out to have a complete PABLO-based architecture, like where the factory is, who is going to handle persistence, what the DL is going to look like, how the DL and BL will communicate, what one is going to do about before-image handling, etc.  Each of these has 5 or 10 different possibilities that might appeal to certain people.  A PABLO solution is never going to be a single, one-size-fits-all monolithic pattern.
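
To make the first point concrete, here is about all there is to a PABLO object itself (names purely illustrative):

CLASS OrderLineBE:

    DEFINE PUBLIC PROPERTY Ordernum AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY Linenum  AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY Itemnum  AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY Price    AS DECIMAL NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY Qty      AS INTEGER NO-UNDO GET. SET.

    /* a bit of behavior, since a PABLO is not just a data bag */
    METHOD PUBLIC DECIMAL ExtendedPrice():
        RETURN Price * Qty.
    END METHOD.

END CLASS.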

Second, the context of this thread is an exploration of possible performance problems with OO in general, not just with PABLO approaches.  We have had a claim that PABLO is 50X slower than M-S-E.  If that is true and can't be altered by different PABLO approaches then there is no point going to the effort to build any complete PABLO solution because it will inherently fail.  So, it is critically important to discover why it is true, if it is true.  If it is true, then we need to ask whether a different approach might eliminate the problem area.  If it is not true, then we can proceed to explore various options and keep testing.

Thus far, we don't seem to have any enlightenment on the source of the problem which can be measured by a test outside of the context of the test which PPS did.  We can't verify any theories.  That is critical.

I can publish the tests I have done and reported here, but they're trivial.  They don't pretend to be a test of entire PABLO systems, but diagnostics looking for the source of this supposed 50X performance problem.  I am working on a more complete test which will be closer to a stripped down version of the PPS test, but it isn't going to pretend to be a model for a complete production approach to PABLO, just a way to verify that one can accomplish all of the required operations with adequate performance.  If that works, then all one has to do to get to a production system is to add to it to provide the desired features without adding anything that destroys performance unacceptably.  That's not a 5 minute job, of course, but it is the way that one tests the feasibility of the idea.

Posted by Thomas Mercer-Hursh on 16-Mar-2010 15:05

I would like to create a new test or tests which we can use to get a better idea what does and doesn't work.  Let me say up front that any performance testing we do should be capable of very precise interpretation.  Among other things, this means to me that if a complex test produces a surprising result, one needs to break down the complex test into simple tests to find the source of the surprise.  If the source of the surprise is not found, then one needs to explain why the combination performs poorly and test that.  In the end, one should have a fairly simple test which illustrates the problem and then one can decide whether that is fundamental or whether a design change can avoid the problem.

I have some objections to the PPS test design since it doesn't seem to me to be representative of common tasks, but I am willing to work with it for now just because it gives us a starting point and I think it is important to either verify or refute the finding.  I would be very interested in hearing people's ideas for other tests which they think would be meaningful stress tests.

PPS were starting with a complete production model in M-S-E which they then adapted to make a PABLO version.  For comparison testing purposes, I don't think this is ideal since production code is necessarily complex and this obscures our ability to see the specific sources of performance variation.  Therefore, the tests I would like to do would be to create stripped down versions that do all of the same basic operations, but without all of the specific mechanics.  There are, however, some aspects of the test which are not yet clear to me so first I need some clarification.

My understanding of the test is as follows:

1. FILL a PDS with all customers, all orders, and all lines(1)

2. Instantiate a collection for customers and instantiate all customer objects(2)

3. For each Customer in the collection, create an order collection and instantiate all orders.(3)

4. For each Order in the collection:

   a. Add a new line.  Implies assign of all values.

   b. Delete an existing line.

   c. Modify a value in an existing line.

   This appears to imply instantiating two existing lines and one new one.(4)

5. After completion of #4, persist all changes to the database(5)

6. Clean up.

To keep my test simple, I am going to leave out any effort to create a strict DL/BL separation and won't use the M-S-E "decorators" for the DB access.  Likewise, there will be no subclasses or their M-S-E equivalents as that is better tested separately.

For the PABLO version, I will create PABLO objects for each object indicated above.(6)  I will probably do the persistence by putting the new values back into the original PDS.  This is only one of many possible PABLO models, but it is directly comparable to what M-S-E is doing so it seems like the appropriate strategy to use for this test.  We can compare persistence strategies in a separate test.  Collections will be done in temp-tables.

For the M-S-E-like version, I will create both a BE and a delegate object in the fashion of the current M-S-E(7)  and all data read and write will be through the delegate into the PDS.  I'll do something to imitate an ES, but that might take a little experimentation.
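
As a sketch of what I mean by "through the delegate" (my own illustration, not the actual M-S-E code; names are made up):

CLASS OrderLineDelegate:

    DEFINE PRIVATE VARIABLE mhBuffer AS HANDLE NO-UNDO.

    /* point a private dynamic buffer at the right row of the PDS member table */
    CONSTRUCTOR PUBLIC OrderLineDelegate(phDataset AS HANDLE, prRow AS ROWID):
        CREATE BUFFER mhBuffer FOR TABLE phDataset:GET-BUFFER-HANDLE("ttOline").
        mhBuffer:FIND-BY-ROWID(prRow).
    END CONSTRUCTOR.

    /* every read and write goes through BUFFER-FIELD by name, i.e., dynamic access;
       character fields are assumed here just to keep the sketch short */
    METHOD PUBLIC CHARACTER GetValue(pcField AS CHARACTER):
        RETURN STRING(mhBuffer:BUFFER-FIELD(pcField):BUFFER-VALUE).
    END METHOD.

    METHOD PUBLIC VOID SetValue(pcField AS CHARACTER, pcValue AS CHARACTER):
        mhBuffer:BUFFER-FIELD(pcField):BUFFER-VALUE = pcValue.
    END METHOD.

    DESTRUCTOR PUBLIC OrderLineDelegate():
        IF VALID-HANDLE(mhBuffer) THEN DELETE OBJECT mhBuffer.
    END DESTRUCTOR.

END CLASS.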

So, what do I have wrong and/or what am I missing?

Notes and Questions

(1) Not all lines are actually used, but accessing two lines per order is nearly 8000 lines and there are only 13000 lines total, so I'm inclined to read them all.  This is an element which could be tested separately for different strategies.

(2) If the transaction scope is the Customer, one could instantiate, process, and delete each customer individually, but the same number of operations are going to be required either way.  It appears that nothing is modified in the customer using Sports, which may not be typical of production systems.

(3) Orders also could be created one at a time, but again, the same number of operations is required.  There is also nothing modified in the order, again not likely to be true in production systems.

(4) Do all existing orders have two lines?  If not, what happens?

(5) Could also be done at the end for all customers instead.

(6) Even though the Customer and Order objects are not modified,  they need to exist if one is attempting to model OO interaction since  the assignment of Customer.CustNum to Order.CustNum and Order.Ordernum  to Orderline.Ordernum is an assignment between objects.

(7) I don't know if PPS' test was done on the latest release or on the earlier accessor method approach.

Posted by Håvard Danielsen on 16-Mar-2010 16:28

---- 

My understanding of the test is as follows:

1. FILL a PDS with all customers, all orders, and all lines(1)

2. Instantiate a collection for customers and instantiate all customer objects(2)

3. For each Customer in the collection, create an order collection and instantiate all orders.(3)

4. For each Order in the collection:

   a. Add a new line.  Implies assign of all values.

   b. Delete an existing line.

   c. Modify a value in an existing line.

   This appears to imply instantiating two existing lines and one new one.(4)

5. After completion of #4, persist all changes to the database(5)

6. Clean up.

----

So, what do I have wrong and/or what am I missing?

I think you miss an important piece. The collection needs to be traversed and the properties need to be returned/accessed to give the full picture of the architecture's performance. This is, after all, how the classes are being used both in Business Logic and Presentation.

Posted by Thomas Mercer-Hursh on 16-Mar-2010 16:54

Isn't traversing the collections the OO version of the loop?

Doesn't creating a new BE imply setting all of its properties?

Doesn't modifying a BE imply setting that property?

And, of course, the PABLO version is going to have to set all the BE properties as a part of creating the BE and do something equivalent to reading them all in order to persist the results.

Seems like this is covered ... what am I missing?

Mind you, I agree that a more real-world test would involve a lot more execution of logic.  It might be interesting to find out whether simply accessing the properties which are just "right there" in a PABLO BE, versus having to fetch and set them through the delegate, has performance implications.  But I think that is something which could be easily and appropriately tested in its own test.

Posted by Håvard Danielsen on 16-Mar-2010 19:31

tamhas wrote:

Isn't traversing the collections the OO version of the loop?

Doesn't creating a new BE imply setting all of its properties?

Doesn't modifying a BE imply setting that property?

And, of course, the PABLO version is going to have to set all the BE properties as a part of creating the BE and do something equivalent to reading them all in order to persist the results.

Seems like this is covered ... what am I missing?

Mind you, I agree that a more real-world test would involve a lot more execution of logic.  It might be interesting to find out whether simply accessing the properties which are just "right there" in a PABLO BE, versus having to fetch and set them through the delegate, has performance implications.  But I think that is something which could be easily and appropriately tested in its own test.

I actually thought this was more than one test, or at least more than one measurement point.  Yes, you are already traversing collections and setting the properties, but I would also include read performance and property gets (and not just the implied PABLO read, which is implementation specific) in a single test, if the intention is to get an idea of the overall performance of an architecture (probably before I tested delete).

Posted by Thomas Mercer-Hursh on 16-Mar-2010 21:24

I have done and previously reported a simple test of just creating a bunch of objects, assigning properties, and reading the properties.  It was very fast.  As things move along, I should categorize and create a reporting page for these things on OE Hive or something and include the code.
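That earlier test was essentially the following shape; this is a sketch of the idea rather than the exact code I ran, reusing the illustrative OrderLine class from earlier in the thread:

DEFINE VARIABLE i      AS INTEGER   NO-UNDO.
DEFINE VARIABLE oLine  AS OrderLine NO-UNDO.
DEFINE VARIABLE dTotal AS DECIMAL   NO-UNDO.

ETIME(TRUE).                          /* reset the millisecond timer */
DO i = 1 TO 100000:
    oLine = NEW OrderLine().          /* create    */
    ASSIGN oLine:OrderNum = i         /* assign    */
           oLine:Price    = 1.23.
    dTotal = dTotal + oLine:Price.    /* read back */
    DELETE OBJECT oLine.
END.
MESSAGE "100,000 create/assign/read cycles took" ETIME "ms."
    VIEW-AS ALERT-BOX.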

In the current context, though, the reason I am proposing this specific test is that Phil has reported a 50X advantage of M-S-E over PABLO in a test run by PPS.  I haven't been able to identify any single component of that test where I can find a performance issue which would explain that difference.  The only remaining areas that I can think of which are not tested are the specifics of the before-image handling and some architectural problem in the implementation, neither of which I can test or diagnose without their code.  Therefore, I am proposing the current test as a simplified version of their tests with all of the same basic elements, except a different handling of the before image issue.  If this test shows a major performance issue, it should be straightforward to instrument it and discover the specific source(s) of the problem and to devise a really simple test to illustrate the problem.  Then, we can consider whether it is avoidable or intrinsic.  If the test does not show a major performance issue, then we will have to conclude that the result of the PPS test was due to an architectural issue with their PABLO implementation (BI or otherwise) and that it is not an indicator of a necessary problem with all PABLO implementations.  One can then move forward to figure out a production PABLO architecture, testing alternatives as one goes on.

Posted by Thomas Mercer-Hursh on 27-Mar-2010 15:24

I'm working on my own benchmark code, both versions simplified to essential operations so that we can more easily see and measure the source of any performance issues.  This should either make the source of the problem clear or illustrate that there is no problem in the simplest case.  If the latter, then it will be a question of adding things in gradually to approach a more full production implementation to see where the problem arises.

I do have a couple of questions about the test itself, in order to try to do the same thing.

Modifying a line and deleting a line seems to require two existing lines.  A bit under 700 of the orders have only one line, so what happens to those?

For each customer one needs a collection of orders.  For each order one needs a collection of lines.  Does M-S-E clear and re-use entity sets or delete and create for each?  I am going to assume the latter since it seems cleaner.  Whichever it is, both could do the same.

Is there any basis for selecting what lines to modify or delete?  I'm going to assume modify the first and delete the second.

Based on that assumption, for one line orders I will modify the one that is there and skip the delete.
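In other words, per order, something like the following, expressed here directly against the PDS temp-table for brevity (the temp-table names are from my own test, not from the PPS code):

/* assumes we are positioned on ttOrder in an enclosing FOR EACH */
FIND FIRST ttOline WHERE ttOline.OrderNum = ttOrder.OrderNum NO-ERROR.
IF AVAILABLE ttOline THEN
    ttOline.Qty = ttOline.Qty + 1.    /* modify the first line        */
FIND NEXT ttOline WHERE ttOline.OrderNum = ttOrder.OrderNum NO-ERROR.
IF AVAILABLE ttOline THEN
    DELETE ttOline.                   /* delete the second, if it exists */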

I'm going to do the initial PABLO implementation using a PDS factory in the DL in order to start with as comparable a foundation to M-S-E as possible. This means that I will use the PDS TRACKING-CHANGES to handle the updates, just like M-S-E presumably does.
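Concretely, I mean something along these lines, assuming ttOline is defined with a BEFORE-TABLE (bttOline here) and attached to a DATA-SOURCE on the OrderLine database table; the names are mine:

TEMP-TABLE ttOline:TRACKING-CHANGES = TRUE.

/* ... apply the add / modify / delete against ttOline here ... */

TEMP-TABLE ttOline:TRACKING-CHANGES = FALSE.

FOR EACH bttOline TRANSACTION:
    BUFFER bttOline:SAVE-ROW-CHANGES().   /* write each change back through the data source */
END.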

If that performs acceptably, I will then explore some other techniques for the DL and measure them against the same baseline.

While I won't be trying to imitate the BI within the BE that you used in your test, I would be interested in hearing something about how that was handled.  Might we see a copy of one of your PABLO BEs?  Since these are not a part of CloudPoint, I wouldn't think the same issues applied.

Also, can you say something about the process you used for persisting the data given that you have a BE with both BI and current data?
